Day 29 in MIT Sloan Fellows Class 2023, Introduction to Operations Management 2 - Machine Learning Process Flow Diagram
Process flow diagram and capacity analytics
You can find the basics of these concepts in my previous blog article.
Based on what I learned, I tried to express the optimization problem in an ML workflow as a simple diagram, because capacity problems are highly relevant to the software industry as well.
ML workflow in a nutshell
Then, we identify the bottleneck using the information above (the amount of data flowing into each step and each step's capacity to process it).
According to the bottleneck analysis, the primary issue is the cleansing process for data source A. Even if we resolve this bottleneck, we still need to tackle the "Merge" process.
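To make the bottleneck analysis concrete, here is a minimal Python sketch. The stage names and all numbers (in GB/hour) are hypothetical, chosen only to mirror the situation described above; they are not the values from my diagram. The idea is to compute the implied utilization (data flow divided by capacity) for each step and pick the step with the highest value.

```python
# Hypothetical stages and numbers (GB/hour); not taken from the original diagram.
stages = {
    "cleanse_A": {"demand": 100.0, "capacity": 50.0},
    "cleanse_B": {"demand": 40.0, "capacity": 80.0},
    "merge": {"demand": 140.0, "capacity": 100.0},
    "train": {"demand": 140.0, "capacity": 200.0},
}


def implied_utilization(stages):
    """Return demand / capacity for every stage."""
    return {name: s["demand"] / s["capacity"] for name, s in stages.items()}


def find_bottleneck(stages):
    """Return the stage with the highest implied utilization."""
    util = implied_utilization(stages)
    return max(util, key=util.get)


util = implied_utilization(stages)
bottleneck = find_bottleneck(stages)

# What happens if we double the capacity of the current bottleneck?
improved = {name: dict(s) for name, s in stages.items()}
improved[bottleneck]["capacity"] *= 2
next_bottleneck = find_bottleneck(improved)

print(bottleneck, util[bottleneck])
print(next_bottleneck, implied_utilization(improved)[next_bottleneck])
```

With these made-up numbers, any step with an implied utilization above 1 cannot keep up with its incoming data, and fixing the worst step simply moves the bottleneck to the next-worst one.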
Waiting time and solution
So, you can find some bottlenecks in this data flow.
There are several ways to make ML data flows more efficient.
- Increase the capacity of the cleansing virtual machines
- Run the cleansing and ML processes in parallel
- Introduce batch processing
- Build a scalable application that automatically adds computing resources
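For the capacity-related options in the list above, a rough sizing rule is to add machines until the step's total capacity covers its incoming data flow. The sketch below assumes identical virtual machines whose capacities simply add up, which is a simplification; the function name and the numbers are hypothetical.

```python
import math


def vms_needed(demand, capacity_per_vm):
    """Smallest number of identical VMs whose combined capacity covers demand."""
    if capacity_per_vm <= 0:
        raise ValueError("capacity_per_vm must be positive")
    return math.ceil(demand / capacity_per_vm)


# Hypothetical numbers: 100 GB/hour arrives, one cleansing VM handles 30 GB/hour.
required = vms_needed(demand=100.0, capacity_per_vm=30.0)
print(required)
```

An autoscaling application would, in effect, re-evaluate this kind of calculation as the incoming data flow changes.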
Waiting time is calculated as the area under the graph below.
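As a minimal sketch of that calculation, I assume here that the graph is a backlog (inventory build-up) curve: data arrives faster than the step can process it for a while, the backlog grows, and then it drains once arrivals stop. All rates and durations below are hypothetical, and the sketch assumes the arrival rate is higher than the capacity. The area under the backlog curve gives the total waiting time in GB-hours, and dividing by the total amount of data gives the average wait.

```python
def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (time, backlog) points."""
    area = 0.0
    for (t0, q0), (t1, q1) in zip(points, points[1:]):
        area += (q0 + q1) / 2 * (t1 - t0)
    return area


# Hypothetical numbers; assumes arrival_rate > capacity.
arrival_rate = 100.0   # GB/hour arriving at the step
capacity = 50.0        # GB/hour the step can process
arrival_hours = 4.0    # how long data keeps arriving

# Backlog grows at (arrival_rate - capacity) while data arrives...
peak_backlog = (arrival_rate - capacity) * arrival_hours
# ...and drains at the step's capacity once arrivals stop.
drain_hours = peak_backlog / capacity

points = [
    (0.0, 0.0),
    (arrival_hours, peak_backlog),
    (arrival_hours + drain_hours, 0.0),
]

total_waiting = trapezoid_area(points)      # GB-hours
total_data = arrival_rate * arrival_hours   # GB
average_wait = total_waiting / total_data   # hours per GB of data

print(total_waiting, average_wait)
```

Raising the capacity in this sketch lowers the peak and shortens the drain phase, so the area (and therefore the waiting time) shrinks on both sides of the triangle.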