
Operationalizing Machine Learning (Rajeev Dutt, CEO, Co-Founder, DimensionalMechanics)

by 날고싶은커피향 2018. 11. 22.


Operationalizing Machine Learning (Rajeev Dutt, CEO, Co-Founder, DimensionalMechanics) :: AWS Techforum 2018
1. Operationalizing Machine Learning. Rajeev Dutt, DimensionalMechanics Inc. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
2. Challenge: ML is not IT Ready
3. Key considerations for enterprise AI: Data Provenance, Scope, Quality, Versioning, Integration, Deployment, Discoverability, Security, Monitoring, Support, Lifetime
4. For AI to become ubiquitous, IT has to be able to manage the AI model development and deployment process!
5. NeoPulse™ automates the creation, distribution and management of AI. Humans don't have to create AI Models: AI Studio can do this for you. (Copyright DimensionalMechanics 2018)
6. Ubiquitous (adjective): present, appearing, or found everywhere. NeoPulse™ is about making AI ubiquitous – on every device, in the cloud, and on premise for every business large and small for any machine learning problem. NeoPulse™ reduces the barrier to entry so that any developer, regardless of machine learning experience, can create, deploy and manage custom AI models in one third the time and at 10% of the cost of comparable platforms.
7. NeoPulse® can help with: Data Provenance, Scope, Quality, Versioning, Integration, Deployment, Discoverability, Security, Monitoring, Support, Lifetime
8. Introducing NeoPulse™: One platform for all Enterprise AI needs. Integrate, Plan, Deploy, Design, Train, Analyze, Manage
9. NeoPulse® Framework
• Portable Inference Models (PIM): A neural network that is encapsulated in a container that can be queried using a runtime layer, also referred to as an AI Model.
• NeoPulse® AI Studio (AI to build AI): Server application with a powerful AI called "the oracle" that is capable of automating the process of creating sophisticated AI Models.
• NeoPulse™ Query Runtime: A program that is licensed by the organization to allow any application in the enterprise to access the AI model using a web-based (REST) API.
• NeoPulse™ Modeling Language (NML): An intuitive DSL (domain-specific language) developed by DimensionalMechanics™ that is executed by the NeoPulse™ AI Studio to automate the creation of new AI Models.
10. Capabilities
Learning methods:
• Classification
• Regression
• Some unsupervised (e.g. auto-encoders, GAN)
Data types supported:
• Audio
• Video, single frame
• Video, multiple frames (motion video)
• Images
• Text
• Numerical
• Time series
• Medical data (DICOM)
Enterprise features:
• Automated AI engineering
• Multi-platform (Cloud, PCs, ARM64 devices)
• RESTful interfaces
• Extensive logging
• Integration with enterprise workflows
• Enterprise scaling
• Nvidia CUDA GPU computing
11. NeoPulse® Workflow [Diagram: NML File + .CSV → NeoPulse® AI Studio → PIM → NeoPulse® Query Runtime → REST → Application]
12. NeoPulse® Workflow [Diagram: NML File + .CSV → NeoPulse™ AI Studio → PIM] With a simple command, the PIM can be deployed anywhere there is a runtime, including on ARM64 devices.
13. Create intelligent applications in 7 steps
14. Step 1: Curate your data and construct a CSV file (lung.csv)
Assuming that you have high-quality, properly formatted images, a simple script can construct the CSV file in less than an hour. Images without tumors are labeled 0; images with tumors are labeled 1.

label  path
0      /images/negative/img_n_0001.jpg
0      /images/negative/img_n_0002.jpg
0      /images/negative/img_n_0003.jpg
…
0      /images/negative/img_n_<…>.jpg
1      /images/positive/img_p_0001.jpg
1      /images/positive/img_p_0002.jpg
1      /images/positive/img_p_0003.jpg
…
1      /images/positive/img_p_<…>.jpg
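The "simple script" mentioned on this slide could look roughly like the sketch below. It assumes the directory layout shown in the table (a `negative` and a `positive` folder of `.jpg` files under one root); the function name `build_csv` and the comma-separated output are illustrative assumptions, not something the slides specify.

```python
import csv
from pathlib import Path


def build_csv(image_root, out_path):
    """Write a label,path CSV: 0 for images in negative/, 1 for images in positive/."""
    root = Path(image_root)
    rows = []
    # Folder-to-label mapping follows the slide: negative -> 0, positive -> 1.
    for label, folder in ((0, "negative"), (1, "positive")):
        for image in sorted((root / folder).glob("*.jpg")):
            rows.append((label, image.as_posix()))
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["label", "path"])  # header names match the slide's table
        writer.writerows(rows)
    return len(rows)
```

Calling `build_csv("/images", "lung.csv")` would write one row per image and return the row count.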
15. Step 2: Create the NML script (lung_classify.nml)
Copy one of the examples listed on the DM GitHub page and modify it for your needs; this takes less than an hour.

    oracle("mode") = "classification"
    source:
      bind = "/DM-Dash/medical/lungtumor/lung.csv" ;
      input: x ~ from "path" -> image: [shape=[28, 28], channels=1] -> ImageDataGenerator: [rescale= 0.003921568627451];
      output: y ~ from "label" -> flat: [2] -> FlatDataGenerator: [] ;
      params: batch_size=32, number_validation=10000 ;
    architecture:
      input: x ~ image: [shape=[28, 28], channels=1] ;
      output: y ~ flat: [2] ;
      x -> auto -> y ;
    train:
      compile: optimizer = auto, loss = auto, metrics = ['accuracy'] ;
      run: epochs = 4 ;
    dashboard: ;
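The `rescale` constant in the script, 0.003921568627451, is 1/255, which maps 8-bit pixel values into the range 0 to 1. The plain-Python sketch below illustrates that arithmetic. The one-hot encoding is only an assumed reading of `flat: [2]`; the slides do not say how NeoPulse encodes the label internally.

```python
RESCALE = 1 / 255  # approximately 0.003921568627451, the value in the NML script


def rescale_pixels(pixels):
    """Map 8-bit grayscale values (0-255) into the range [0, 1]."""
    return [value * RESCALE for value in pixels]


def one_hot(label, num_classes=2):
    """Encode an integer label as a vector of length num_classes (assumed meaning of flat: [2])."""
    return [1.0 if index == label else 0.0 for index in range(num_classes)]
```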
16. Step 3: Compile and start training. Inputs to NeoPulse™ AI Studio: lung_classify.nml and lung.csv. Compiling the NML code (assuming no syntax errors) is immediate, taking seconds; training is another matter.
17. Step 4: Training. Training can take time, depending on the volume of data and the compute resources available. There is nothing for you to do, but the machine will be busy for a couple of days or more for a decent model. Fortunately, NeoPulse™ AI Studio employs a queuing model, so it doesn't stop you from starting the next project.
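The slides do not describe how AI Studio's queue works internally. As a generic illustration of a queuing model, the sketch below shows a first-in, first-out job queue where submitting a job returns immediately and jobs are trained one at a time. The class and method names are hypothetical.

```python
from collections import deque


class TrainingQueue:
    """FIFO queue: submitting a job returns at once; jobs are run one at a time."""

    def __init__(self):
        self._pending = deque()
        self.finished = []

    def submit(self, nml_file):
        """Add a job and return its position in the queue."""
        self._pending.append(nml_file)
        return len(self._pending)

    def run_next(self, train):
        """Pop the oldest job and run the supplied training callable on it."""
        if not self._pending:
            return None
        job = self._pending.popleft()
        self.finished.append((job, train(job)))
        return job
```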
18. Step 5: Export a PIM file (lung_tumor_model.pim). Exporting a PIM file from NeoPulse™ AI Studio is simple: choose from a set of models based on accuracy (for example) and export in a single call. The PIM is a file that can be moved from one machine to another (either locally or in the cloud).
19. Step 6: Import a PIM file. Once the model has been built, you can move the resulting PIM file (lung_tumor_model.pim) from machine to machine, as long as the NeoPulse™ Query Runtime has been installed. Importing the model into the runtime is a simple command that takes just a couple of seconds.
20. Step 7: Call the model via a REST API from an application. After importing the model, NeoPulse™ Query Runtime automatically generates a RESTful API that allows applications to query the model directly. You don't need to build any custom APIs to call your model.
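The slides do not show the generated API's URL layout or payload format, so the sketch below uses hypothetical placeholders for both (`/models/<name>/query` and a JSON body with a `path` field). It only illustrates the general shape of calling a model over REST from an application, using Python's standard library.

```python
import json
import urllib.request


def build_query(base_url, model_name, image_path):
    """Build (but do not send) a JSON POST request for one image.

    The URL layout and the payload field are hypothetical placeholders.
    """
    payload = json.dumps({"path": image_path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/models/{model_name}/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_model(request, timeout=10):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect and test without a running runtime.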
21. NeoPulse® Advantages: 50x less code, 18x lower project cost, 4x lower developer cost, 3x shorter project duration. [Bar chart, Competition relative to DM (DM = 1): Project Duration 3, Development Cost 4, Project Cost 18, Lines of Code 50]
22. CASE STUDY: AMORE PACIFIC
• Sept 2015: Amore Pacific added to Forbes "Most Innovative Company" list. Revenue: $4.5B.
• May 2018: Amore Pacific sent a team of four engineers without ML experience for a one-month training session on NeoPulse®.
• June 2018: The four engineers developed 20 models in 20 days during initial training, then returned to South Korea with 100% recommending DM as the company platform for AI model development.
• Sept 2018: Company hires Chief Digital Technology Officer to oversee AI team. Amore Pacific puts first DNN model into production with $1200/month revenue stream to DM.
"DimensionalMechanics has created a platform called NeoPulse that makes it possible for companies like AMORE PACIFIC to do machine learning at scale. We also found the training useful and comprehensive. By using NeoPulse, I am confident that it will be a great help in achieving our #1 beauty AI company vision." - Kevin Choi, Leader, Digital IT Innovation Team, Amore Pacific
23. Case Studies
• A Seattle-based staffing startup was quoted nearly $450,000 to develop a solution for their AI platform. NeoPulse developed the solution for under $10,000.
• A Seattle-based medical startup was paying $20,000/month for 6 months to develop and maintain a solution to create an AI model that was 74% accurate. In three days, NeoPulse built a solution with 86% accuracy, costing the company only $4,000.
• Stanford University physicians were able to create a model to differentiate between normal and abnormal PET/CT images on NeoPulse with no prior knowledge of AI. Quality was high enough that they published the results, and the paper was accepted at RSNA.
• University of Washington ran a study comparing NeoPulse to a standard vision recognition algorithm called VGG16. It took 20,000 iterations using VGG16 to get to 95% accuracy. It took NeoPulse 30 iterations to reach the same accuracy.
24. Material Science: Auto-generation of materials • With University of Washington, we are running a program to automatically understand material properties and then generate new materials with those properties • Imagine designing new ultra-strong materials automatically and then 3D printing them… [Figure: original material and AI-generated materials]
25. Available as AMIs on AWS AI Marketplace
26. NeoPulse® AI Studio
27. NeoPulse® Query Runtime
28. Watch out for a major announcement at AWS re:Invent
29. Thank you
30. NeoPulse® AMI Architecture Optional
31. Enterprise ML Pitfalls: Model Quality. How accurate is our model? What is the false positive/false negative rate? ROC curve? Is my model overfitted? Bias vs. variance of model?
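The model-quality questions above can be made concrete with a few lines of code. The sketch below computes the false positive rate, false negative rate, and ROC points for a binary classifier from labels and predicted scores. It is a plain-Python illustration, not something NeoPulse provides, and it assumes both classes are present in the data.

```python
def confusion_counts(labels, scores, threshold):
    """Count TP, FP, TN, FN for binary labels (0/1) at a score threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def error_rates(labels, scores, threshold=0.5):
    """Return (false positive rate, false negative rate)."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr


def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct score used as a threshold."""
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        fpr, fnr = error_rates(labels, scores, threshold)
        points.append((fpr, 1.0 - fnr))
    return points
```

Comparing these rates on training data and on held-out data is one simple way to approach the overfitting question.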
32. Enterprise ML Pitfalls: Version Control. What version of the model am I working with? When was it created? Can I roll back to a previous model if the current model does not perform?
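One way to answer these questions is to keep a small registry that records each model version, its creation time, and which version is active. The sketch below is a hypothetical in-memory illustration; the class names and the idea of tracking PIM file paths are assumptions, not a NeoPulse API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ModelVersion:
    version: int
    pim_path: str
    created_at: datetime


@dataclass
class ModelRegistry:
    """Tracks every registered version and which one is currently active."""
    versions: list = field(default_factory=list)
    active: int = 0  # 0 means nothing is active yet

    def register(self, pim_path):
        """Record a new version, stamp its creation time, and make it active."""
        entry = ModelVersion(len(self.versions) + 1, pim_path, datetime.now(timezone.utc))
        self.versions.append(entry)
        self.active = entry.version
        return entry

    def rollback(self):
        """Reactivate the previous version; raises if there is none."""
        if self.active <= 1:
            raise ValueError("no earlier version to roll back to")
        self.active -= 1
        return self.versions[self.active - 1]
```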
33. Enterprise ML Pitfalls: Integration. Can the process of creating and deploying AI models be integrated in an enterprise workflow? How easy is it to integrate the models into enterprise applications? Are there any standards?
34. Enterprise ML Pitfalls: Deployment. Where has my model been deployed? How many versions exist? Who determines when and how the model is deployed? What is the target environment (OS, memory, hardware config.)?
35. Enterprise ML Pitfalls: Discoverability. What models exist? How do I access them? Where do the models exist?
36. Enterprise ML Pitfalls: Monitoring. Can I retrieve statistics about my model?
• Number of queries
• Errors
• Time taken per query
• Batch vs. real time
• Overall performance metrics: CPU and memory utilization
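The first three statistics in this list can be gathered with a thin wrapper around whatever function performs the query. The sketch below is a generic, hypothetical illustration; it does not cover CPU or memory utilization, which would need operating-system level tooling.

```python
import time


class QueryStats:
    """Accumulates query count, error count, and total latency for one model."""

    def __init__(self):
        self.queries = 0
        self.errors = 0
        self.total_seconds = 0.0

    def record(self, func, *args):
        """Run func(*args), timing it and counting any exception as an error."""
        start = time.perf_counter()
        self.queries += 1
        try:
            return func(*args)
        except Exception:
            self.errors += 1
            raise
        finally:
            self.total_seconds += time.perf_counter() - start

    def mean_latency(self):
        """Average seconds per query, or 0.0 if nothing has been recorded."""
        return self.total_seconds / self.queries if self.queries else 0.0
```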
37. Enterprise ML Pitfalls: Data Provenance. Where did the data to train the model come from? How trustworthy is the data? Is it biased?
38. Enterprise ML Pitfalls: Scope. What does my model do? (e.g. classification of dogs/cats/lemurs) What are its limits: what can it do and what can't it do? Who is the audience of the model? What technology is used by the model? Where can it run? Who created it? For what purpose?
39. Enterprise ML Pitfalls: Security. Has anyone tampered with the training data? Can I tell? How do I know that I can trust the model and that no one has tampered with it? Access controls on the model? Is it possible to reverse engineer the model or the training data?
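A common way to detect tampering with a training file or a model file is to record a cryptographic digest when the file is trusted and compare against it later. The sketch below uses SHA-256 from Python's standard library. It is a general technique, not a NeoPulse feature, and it only helps if the recorded digest itself is stored somewhere an attacker cannot change. It does not address reverse engineering.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path, expected_digest):
    """True if the file still matches the digest recorded earlier."""
    return sha256_of_file(path) == expected_digest
```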
40. Enterprise ML Pitfalls: Support. Do I have the staff with the right skills to support the solution? What kind of problems am I likely to see? How do I validate/test the model in the wild?
41. Enterprise ML Pitfalls: Lifetime & Deprecation. Does the underlying training data change? How often? Are the fundamental model statistics changing over time? Should I retrain?
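One simple way to check whether a statistic is changing over time is to compare its recent mean with a baseline, measured in baseline standard deviations. The sketch below is a deliberately crude illustration of that idea; the threshold of 3.0 is an arbitrary assumption, and production drift detection usually relies on more robust statistical tests.

```python
import statistics


def mean_shift(baseline, recent):
    """Shift of the recent mean from the baseline mean, in baseline standard deviations."""
    spread = statistics.stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation")
    return abs(statistics.mean(recent) - statistics.mean(baseline)) / spread


def should_retrain(baseline, recent, threshold=3.0):
    """Flag retraining when the mean moved more than `threshold` standard deviations."""
    return mean_shift(baseline, recent) > threshold
```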

