deployment - Options for deploying R models in production


There don't seem to be many options for deploying predictive models in production, which is surprising given the explosion in big data.

I understand that the open-source PMML can be used to export models as an XML specification, which can then be used for in-database scoring/prediction. However, it seems that to make this work you need to use the PMML plugin by Zementis, which means the solution is not truly open source. Is there an easier open way to map PMML to SQL for scoring?

Another option would be to use JSON instead of XML to output model predictions. But in that case, where would the R model sit? I'm assuming it would always need to be mapped to SQL... unless the R model could sit on the same server as the data and run against the incoming data using an R script?

Are there any other options out there?

The answer really depends on what your production environment is.

If your "big data" is on Hadoop, you can try a relatively new open source PMML "scoring engine" called Pattern.

Otherwise you have no choice (short of writing custom model-specific code) but to run R on your server. You would use save to save your fitted models in .RData files, and then load them and run the corresponding predict on the server. (That is bound to be slow, but you can always try to throw more hardware at it.)
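As a minimal sketch of that save/load/predict round trip, assuming a linear model fitted on the built-in mtcars data as a stand-in for a real model: the file location and the helper name `score` are hypothetical, chosen only for illustration.

```r
# Training side: fit a model and persist it to disk.
# mtcars and lm() are stand-ins for a real training set and model.
fit <- lm(mpg ~ wt + hp, data = mtcars)
model_path <- file.path(tempdir(), "fit.RData")  # hypothetical location
save(fit, file = model_path)

# Server side: restore the model into a private environment and score new rows.
score <- function(newdata, path = model_path) {
  env <- new.env()
  load(path, envir = env)  # recreates the object named `fit` inside env
  predict(env$fit, newdata = newdata)
}

# Incoming data must carry the same predictor columns used in training.
incoming <- data.frame(wt = c(2.5, 3.2), hp = c(110, 150))
preds <- score(incoming)
print(preds)
```

Loading into a separate environment keeps the restored object from overwriting anything already in the server's workspace. Loading once at startup rather than on every call would avoid repeated disk reads.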

How you do that really depends on your platform. Usually there is a way to add "custom" functions written in R. The term is UDF (user-defined function). In Hadoop you can add such functions to Pig (e.g. https://github.com/cd-wood/pigaddons), or you can use RHadoop to write simple map-reduce code that loads the model and calls predict in R. If your data is in Hive, you can use Hive TRANSFORM to call an external R script.
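For the Hive TRANSFORM route, the external script receives rows as tab-separated lines on standard input and writes tab-separated lines to standard output. The following is only a sketch of what such a scorer might look like: the function name `score_stream`, the column names, and the model are assumptions, and the demo uses temporary files so that it runs on its own.

```r
# Read tab-separated rows, append a prediction column, write them back out.
# `input` and `output` may be file paths or connections.
score_stream <- function(input, output, fit, col_names) {
  rows <- read.table(input, sep = "\t", header = FALSE,
                     col.names = col_names, stringsAsFactors = FALSE)
  rows$prediction <- predict(fit, newdata = rows)
  write.table(rows, output, sep = "\t", quote = FALSE,
              row.names = FALSE, col.names = FALSE)
}

# Stand-in model; on a real server this would come from load() on a saved file.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Demo with temporary files in place of the streamed input and output.
in_path <- tempfile()
out_path <- tempfile()
writeLines(c("2.5\t110", "3.2\t150"), in_path)
score_stream(in_path, out_path, fit, c("wt", "hp"))
scored <- readLines(out_path)
print(scored)
```

In an actual streaming script the last call would instead be something like `score_stream(file("stdin"), stdout(), fit, c("wt", "hp"))`, and the script (say, a hypothetical score.R) would be shipped with ADD FILE and invoked with `SELECT TRANSFORM (...) USING 'Rscript score.R' AS (...)`.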

There are also vendor-specific ways to add functions written in R to various SQL databases. Again, look for UDF in the documentation. For instance, PostgreSQL has PL/R.

