Over the past few years, the number of business data processing projects has grown steadily, and the development of analytical tools has accelerated along with it.
Data gives us valuable knowledge, and knowledge gives us a competitive advantage. A growing number of organizations understand the value of data analytics and opt for advanced analytical tools, which in practice means building solutions based on Machine Learning. The times when a single Data Scientist worked alone on a project are a thing of the past. Nowadays, more and more companies want dedicated solutions. Entire teams are formed and tasked with implementing and improving established solutions, as well as with researching new ones.
Developing an application that uses ML is more complex than conventional software development. The first step is, of course, data collection and exploration, that is, checking what the data contains, mainly with statistical methods. Then we determine which model can be built from this data and what business benefits it can bring. Next, we prepare the data for the chosen model (cleaning, normalization, etc.). This is the first phase of creating a solution based on ML algorithms. The second phase is model preparation: extracting the relevant variables, creating the model, testing it, and saving it as a new version in an appropriate version control system. Finally, the model must be made available in a suitable way, usually as a separate service with its own API. In the following stage, the deployed service is verified and tested. Once the model is in the production environment, it should be monitored to ensure stability and to guide improvements in subsequent versions.
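The two phases described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the "model" is a simple least-squares line, the version control system is reduced to numbered JSON files, and all names (`clean`, `normalize`, `fit_line`, `build_and_save`, the `max_error` threshold) are hypothetical and not taken from any particular tool.

```python
import json
import statistics
import tempfile
from pathlib import Path


def clean(rows):
    """Phase 1: drop rows that contain missing values."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def normalize(xs):
    """Phase 1: scale to zero mean and unit variance, keeping the parameters."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs) or 1.0  # avoid division by zero for constant input
    return [(x - mean) / std for x in xs], mean, std


def fit_line(xs, ys):
    """Phase 2: least-squares fit of y = slope * x + intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict(model, x):
    """Apply the stored normalization, then the fitted line."""
    scaled = (x - model["mean"]) / model["std"]
    return model["slope"] * scaled + model["intercept"]


def save_version(model, directory):
    """Phase 2: store the model as the next numbered version (model_v<n>.json)."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    version = len(list(directory.glob("model_v*.json"))) + 1
    path = directory / f"model_v{version}.json"
    path.write_text(json.dumps(model))
    return path


def build_and_save(raw_rows, directory, max_error=0.5):
    """Run both phases; refuse to save a model that fails the test step."""
    rows = clean(raw_rows)
    xs, mean, std = normalize([x for x, _ in rows])
    slope, intercept = fit_line(xs, [y for _, y in rows])
    model = {"slope": slope, "intercept": intercept, "mean": mean, "std": std}
    # Test step: here we only check the fit on the cleaned data itself.
    worst = max(abs(predict(model, x) - y) for x, y in rows)
    if worst > max_error:
        raise ValueError(f"model rejected, max error {worst:.3f}")
    return model, save_version(model, directory)


if __name__ == "__main__":
    data = [(1.0, 3.0), (2.0, 5.0), (None, 4.0), (3.0, 7.0), (4.0, 9.0)]
    with tempfile.TemporaryDirectory() as tmp:
        model, path = build_and_save(data, tmp)
        print(path.name, round(predict(model, 5.0), 3))
```

In a real project each function would typically be a separate pipeline step, the test step would use held-out data, and the saved file would be replaced by an entry in a model registry.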
DevOps practices are used on a large scale to streamline the software development process and to ensure smooth, fast deployment of subsequent versions of an application. Developers of applications based on ML models want the same benefits when building at scale. However, these systems differ from conventional applications, which is why we hear the term “MLOps” more and more often: it is the DevOps approach applied to ML-based applications.
Despite many similarities, the differences between these approaches are significant. They result primarily from the specifics of ML projects, which are usually staffed by specialists in mathematical models who often have no experience with building production-grade services. Tasks related to broadly defined artificial intelligence are research tasks, which means that models change very quickly and data often needs to be prepared differently for different problems. MLOps is inherently a more experimental field than DevOps. Models are based on data, so any change in the data schema requires adjustments, something the classic software development process does not take into account.
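To make the point about schema changes concrete, here is a minimal sketch of a check that could run before data reaches a model. The schema, the field names and the `schema_problems` function are assumptions made up for this example.

```python
# Hypothetical schema the model was trained on: field name -> expected type.
EXPECTED_SCHEMA = {"customer_id": int, "age": int, "monthly_spend": float}


def schema_problems(record, schema=EXPECTED_SCHEMA):
    """Return a list of human-readable differences between record and schema."""
    problems = []
    for name, expected_type in schema.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif not isinstance(record[name], expected_type):
            actual = type(record[name]).__name__
            problems.append(
                f"wrong type for {name}: expected {expected_type.__name__}, got {actual}"
            )
    for name in record:
        if name not in schema:
            problems.append(f"unexpected field: {name}")
    return problems


if __name__ == "__main__":
    # A renamed column upstream shows up as one missing and one unexpected field.
    print(schema_problems({"customer_id": 1, "age": 30, "spend": 99.5}))
```

A check like this does not fix anything by itself, but it turns a silent change in the data into an explicit failure that the pipeline can react to.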
Looking at the entire development process, the differences between the MLOps and DevOps approaches are very clear:
- Continuous Integration is not solely about testing and validating code and components. It also covers data, datasets, and models.
- Continuous Delivery becomes even more complex. It is not about a single service or package, but about whole systems in which models are handled as services.
- There is an emerging need to monitor models and to use the monitoring information to retrain them. Models should be retrained automatically, without external intervention, which is an additional challenge.
- An additional element, Continuous Testing, covers the testing and validation of models, which depends on the problem being solved. It is no longer just about integration or unit testing.
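The last two points can be sketched as two small decision functions: one that turns monitoring data into a retraining trigger, and one that acts as a validation gate before a retrained model replaces the current one. This is a simplified illustration. Accuracy as the metric, the function names, and the `max_drop` and `min_gain` thresholds are all assumptions chosen for the example.

```python
from statistics import fmean


def needs_retraining(recent_accuracy, baseline_accuracy, max_drop=0.05):
    """True when average recent accuracy fell more than max_drop below the baseline."""
    return baseline_accuracy - fmean(recent_accuracy) > max_drop


def may_promote(candidate_accuracy, production_accuracy, min_gain=0.0):
    """Validation gate: the candidate must be at least min_gain better than production."""
    return candidate_accuracy - production_accuracy >= min_gain


if __name__ == "__main__":
    # Hypothetical monitoring window compared with the accuracy measured at release.
    if needs_retraining([0.80, 0.78, 0.79], baseline_accuracy=0.90):
        print("trigger retraining pipeline")
    # Hypothetical evaluation results for the retrained model.
    print("promote candidate:", may_promote(0.92, 0.90))
```

In practice the metric and thresholds depend on the problem being solved, which is exactly why model validation cannot be reduced to generic unit or integration tests.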
All of this means that to start working as an MLOps engineer, a person with DevOps experience also needs to be sufficiently familiar with the ML domain, which is very broad. MLOps draws on knowledge from both DevOps and ML, which is why we are looking for people who are familiar with DevOps and who also have some knowledge, even basic, of the fields mentioned above, or simply a willingness to learn about them. Following theoretical and practical training tailored to individual needs, these people join a project where they can broaden their knowledge in practice. This approach allows us to build expert teams equipped with both theoretical knowledge and practical preparation at the highest level.
As with the DevOps methodology, MLOps relies on tools that enable the implementation and automation of processes. As mentioned earlier, models are of crucial importance here, so these tools extend the DevOps approach. The number of tools is growing continuously. Some are used for serving models in production, e.g. Cortex, TensorFlow Serving or TorchServe. Others support models throughout their lifecycle, such as MLflow or Neptune AI. Platforms that provide end-to-end process support are also emerging, for example Kubeflow or Algorithmia. A list of popular MLOps tools is available here: https://github.com/kelvins/awesome-mlops#cicd-for-machine-learning. Cloud providers also offer services that help put models into applications, such as SageMaker from AWS or Azure ML. In the next post, we will show you how one of the tools for implementing MLOps in a project works.
MLOps is a concept that is still developing. It is being created before our eyes. New tools appear almost every day, and there is no single standard that solves even the majority of problems in this area. It is a very complex field that requires many aspects to be considered, which is exactly why it is both fascinating and challenging.
Do you want to grow as an MLOps engineer at Billennium?
Check out our open position: MLOps engineer https://billennium.com/job-offers/mlops-engineer/