Easily export trained Apache Spark ML models and pipelines
Published: 2019-06-23



In recent years, machine learning has become ubiquitous in industry and production environments. Academic and industry institutions alike once focused on training and producing models, but the emphasis has shifted to productionizing those trained models. We now hear from more and more machine learning practitioners who are searching for the right model deployment options.

In most scenarios, deployment means shipping the trained models to some system that makes predictions based on unseen real-time or batch data, and serving those predictions to some end user, again in real-time or in batches.

This is easier said than done. There are a number of challenges that organizations face when deploying these models:

  • Upfront Complexity – Deploying a model into production can require a lot of upfront work that can slow down the deployment process by weeks or more.
  • Disjointed Teams – Sharing models across teams for training and deployment can create challenges as teams try to deal with persistence formats, library dependencies, and different deployment environments.
  • Featurization Logic – There is almost always data processing and featurization logic that precedes the model application step, which adds yet another component to implement when deploying a model.
  • Inconsistent Deployment Environments – Different deployment systems for different scenarios can cause machine learning prediction logic to behave differently, giving subtly incorrect results.

Introducing Machine Learning Export

We are happy to announce the general availability of a powerful new feature called Databricks ML Model Export. This Databricks feature furthers our efforts to unify analytics across data engineering and data science by allowing you to export models and full machine learning pipelines from Apache Spark MLlib. These exported models and pipelines can be imported into other (Spark and non-Spark) platforms to do scoring and make predictions.

This new capability serves as an alternative to batch and streaming prediction within Spark, allowing companies to build low-latency and lightweight machine learning-powered applications with ease.

Seamless Deployment of Models

When speaking with customers, one of the consistent pieces of feedback was that they love doing data science on our platform, but then have to re-implement their code in a different system to deploy to production. With this new export feature, Databricks can truly serve as an end-to-end platform to build, train, and deploy machine learning models into production with blazing speed and higher reliability.

More Information

To learn more about how to get started with Databricks Machine Learning Export as well as other relevant information, check out the following resources:

  •  – presented by Joseph Bradley and Sue Ann Hong.

Also, look out for a follow-on blog post that will dive deeper into the inner workings of this new feature.

Reposted from: https://my.oschina.net/u/2306127/blog/1784890
