
Deployed your Machine Learning Model? Here’s What you Need to Know About Post-P ...

Source: 分析大师 | 2019-10-08 | Published by: BOB体育娱乐平台之家

So you’ve built your machine learning model. You’ve even taken the next step – often one of the least spoken about – of putting your model into production (model deployment). Great – you should be all set to impress your end users and your clients.

But wait – as a data science leader, your role in the project isn’t over yet. The machine learning model you and your team created and deployed now needs to be monitored carefully. There are different ways to perform post-deployment monitoring, and we’ll discuss them in this article.

We will first quickly recap what we covered in the first three articles of this practical machine learning series. Then, we will understand why and how “auto-healing” in machine learning is a red herring and why every professional should be aware of it. Finally, we will dive into two types of post-production monitoring and understand where and how to use each.

This is the final article of my four-article series on the various components involved in successfully implementing a data science project. In this series on Practical Machine Learning for Leaders, we have so far discussed:

Once the optimal end-to-end system is deployed, do we declare victory and move on? No! Not yet, at least. In this fourth (and final) article, we will discuss the various post-production monitoring and maintenance-related aspects that the data science delivery leader needs to plan for once the Machine Learning (ML)-powered end product is deployed.
The adage “Getting to the top is difficult, staying there is even harder” is most applicable in such situations.

There is a popular and dangerously incorrect myth that machine learning models auto-heal. In particular, the expectation is that a machine learning model will continuously and automatically identify where it makes mistakes, find optimal ways to rectify those mistakes, and incorporate those changes into the system, all with almost no human intervention. The reality is that such auto-healing is at best a far-fetched dream.

Only a handful of machine learning techniques today are capable of learning from their mistakes as they try to complete a task. These techniques typically fall under the umbrella of Reinforcement Learning (RL). Even in the RL paradigm, several of the model parameters are carefully hand-tuned by a human expert and updated periodically. And even if we assume that plenty of such products are deployed in real-life situations, the existing data architectures (read: data silos) within organizations would have to be completely overhauled for data to flow seamlessly from the customer-facing compute environment to the compute environment used for building the machine learning models. So it is safe to say that, in today’s world, the “auto” in auto-healing is almost non-existent for all practical purposes.

Let us now see why machine learning systems need healing in the first place. Several aspects of the data ecosystem can have a significantly negative impact on the performance of the system. I have listed some of these below.

A typical machine learning model is trained on about 10% of the possible universe of data, either because of the scarcity of appropriately labeled data or because of the computational constraints of training on massive amounts of data. The choice of the machine learning model and the training strategies should provide generalizability on the remaining 90% of the data.
But there will still be data samples within this pool where the model output is incorrect or less than optimal.

In all real-world deployments of machine learning solutions, a subset of the input data will come from a system that the data science team has little control over. When those systems change the input, the data science teams are not always kept in the loop (this happens largely due to the inherent complexities of the data world). Simple changes in the input data, like a type change from scalar to list, can be detected relatively easily through basic sanity checks. But there is a variety of changes that are difficult to catch, have a substantially detrimental impact on the output of the machine learning system, and unfortunately are not uncommon.

Consider, for example, a system deployed to automatically control the air conditioning of a server room. The machine learning system would obviously take the ambient temperature as one of its inputs. It is fair to assume that the temperature sensors are controlled by a different ecosystem, which may decide to change the unit of temperature from Celsius to Fahrenheit without necessarily informing the machine learning system’s owner. This change in input will have a significant impact on the performance of the system, with absolutely no run-time exception thrown. As systems get complex, it is almost impossible to anticipate all such likely changes beforehand and encode exhaustive exception handling.

The landscape of just about eve
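The temperature example above can be made concrete with a small monitoring sketch: a basic type check catches the scalar-to-list kind of change, while a simple distribution-shift alarm can flag a silent Celsius-to-Fahrenheit switch even though no exception is ever thrown. This is a minimal illustration, not the article’s method; the class name, window size, and z-score threshold are all hypothetical choices.

```python
# Hypothetical post-deployment input monitor for one numeric feature.
# All names and thresholds here are illustrative assumptions.
from collections import deque
import statistics

class TemperatureMonitor:
    """Flags inputs whose type or recent distribution departs from training."""

    def __init__(self, train_mean, train_std, window=100, z_threshold=4.0):
        self.train_mean = train_mean      # baseline from training data
        self.train_std = train_std
        self.window = deque(maxlen=window)  # rolling window of recent inputs
        self.z_threshold = z_threshold

    def check(self, value):
        # Basic sanity check: reject type changes (e.g. scalar -> list).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return "type_error"
        self.window.append(float(value))
        # Compare the recent running mean against the training baseline.
        recent_mean = statistics.fmean(self.window)
        z = abs(recent_mean - self.train_mean) / self.train_std
        return "drift" if z > self.z_threshold else "ok"

# Model trained on Celsius readings around 21 °C:
monitor = TemperatureMonitor(train_mean=21.0, train_std=2.0)
print(monitor.check([21.0]))  # list instead of scalar -> "type_error"
print(monitor.check(22.5))    # plausible Celsius reading -> "ok"
print(monitor.check(70.0))    # Fahrenheit-like reading -> "drift"
```

In a real system the alerts would feed a dashboard or pager rather than a return value, and the baseline statistics would be computed per feature from the training set, but the core idea is the same: monitor inputs, not just outputs.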