MLOps – what is it and what can you do with it?
Using machine learning for various tasks can mean a variety of different things depending on the use case, and the same goes for what they require in terms of management and maintenance. MLOps offers a set of guidelines and best practices that help approach the entire lifecycle of machine learning in a systematic way. In this blog post, I’ll go through the basics of MLOps, starting from the concept itself as well as the concrete benefits it can offer in real-world use.
MLOps or Machine Learning Operations is an emerging concept in the field of data that helps project teams to utilize the full potential of their talent on developing machine learning models rather than manually wrangling data or running deployments to move the application code from local development to test environment or production use. MLOps offers a systematic approach to monitoring, scaling, and evaluating data pipelines and machine learning models that benefit all kinds of projects, whether complex applications, proof-of-concept work, or anything in-between.
With the application of MLOps principles, the data scientists can focus on the core development of machine learning models, which often is the most interesting part, while the MLOps practices take care of manual tasks such as data cleaning, quality control, and model versioning.
But applying MLOps not only brings ease and benefits to data scientists and engineers working with application development. It also equally benefits the business owners and clients. Automation allows for faster development leading to faster results and more reliable machine learning models ensuring functionality from the beginning. This leads to shorter development times that in turn bring faster end-result delivery and cost-effectiveness. Finally automated quality control ensures more reliable solutions that are ensured and tested to function as intended reducing the risk of faulty deployments.
What is MLOps?
At its core, MLOps refers to the lifecycle management of machine learning projects. It is a key concept of modern machine learning application development, and its purpose is to make training, deploying and maintaining of machine learning applications seamless and efficient. MLOps is not a set of specific technologies or approaches but rather an umbrella term for activities focused on building reliable and well functioning machine learning models. It includes both development work practices and ways of working as a project team. To put it simply, you could call MLOps a set of best practices for machine learning application development.
In software development, DevOps has been applied for years to automate integrations, testing, and deployments of code, easing the work of software developers and creating less error-prone development practices. MLOps aims to bring the same best practices to machine learning development to ease the work of ML engineers, allowing them to focus on the core aspect of algorithm development, and create more reliable applications.
Machine learning solutions solve increasingly complex problems within a variety of industries every day. The applications we build must be reliable, scalable, and efficient to develop, maintain, and use. Traditionally, processing data, creating, updating, and validating models, creating deployments, and maintaining applications have all required a huge amount of hands-on work from experts, which can be repetitive and error-prone. MLOps brings a set of tools for automating many of these steps and allows for example manual data processing and model validation to become parts of an automated pipeline.
Why should you consider MLOps and what are its benefits?
The biggest benefit of MLOps is that it helps in automating certain steps of ML processes, allowing experts to focus on core functionalities instead. In short, whether MLOps makes sense for you boils down to one question – why spend time creating pipelines, automated validations and deployments instead of hopping to algorithm development instantly?
The benefits of utilizing MLOps manifest in the later stages of a project. Development, deployment and maintenance processes become increasingly efficient, allowing experts to focus on the core problems and enabling faster delivery. Utilizing MLOps enables us to build end-to-end solutions faster, providing improved focus on creating genuine business value. Generally speaking, the major benefits of applying MLOps to project work come from increased efficiency and reliability.
Next, we’ll take a closer look at the core benefits of the approach at each key phase of the machine learning application development process.
Development
The process of developing a machine learning model follows an iterative process: The model is trained, validated, and tested in an iterative manner to obtain the best possible model for the use case.
When starting a data-focused project development process, the first step is commonly to preprocess and validate the data. This step benefits hugely from automation as the process can be both time-consuming and prone to errors. Sufficient and consistent preprocessing ensures well functioning models after training. Similarly, the data validation process can be automated to ensure the data meets the same criteria every time.
During model development, training is a key step. During the training process the model is introduced to a dataset representing the use case and the model learns parameters best describing that dataset. After training the model is able to apply the defined parameters to new data and create for example predictions or classifications, depending on the selected use case. With utilization of MLOps practices the model training – which can take from minutes to days depending on the amount of data – can be initiated automatically after each change in the model, reducing the required amount of manual work.
With the help of automation, keeping track of the different model versions becomes increasingly straightforward, and testing and validation can be introduced to the pipeline, which in turn helps ensure sufficient model quality control. Automated model version control also allows the use of a feature store, where certain highly successful features and characteristics of a model can be stored, tracked and used later on as needed.
Deployment to production
The deployment process can be automated as well. The pipeline to production can contain checks and tests that the program must pass to ensure model accuracy is as good as (or better than) the current one, the program does not contain any major errors, and all features work as intended. Automated testing ensures that all the code is properly tested and meets the required quality standards.
Automated deployment process also gives us the opportunity to easily execute something called a canary deployment. This means the model is initially only exposed to a limited number of users to ensure correct functionality, and then later on – if things work as expected – to all users, which ensures the solution stays functional with the highest quality at all times.
Maintenance
Machine learning models are initially trained during the development phase. But what is obtained at that point should not be considered the final model – a better approach would be to think of it as the first deployment model in a continuous improvement process, where a new model can be retrained in regular intervals as more data is obtained. This ensures that the model is always at its best. Automating the process allows for effortless re-training, as the model is automatically re-trained for example at a set interval or when new data is presented.
What are the components of MLOps architecture?
Now, let’s quickly introduce the building blocks of a general level architecture for applying MLOps in a project. The steps of the process are the same as in traditional projects, but here they are augmented with automation. The image below presents a general architecture for implementing MLOps in project work.
MLOps architecture diagram
The pipeline begins with data being fetched from storage. Afterwards, the data preparation process starts, which includes for example data cleaning, validation and formatting. After the preprocessing step, the data is in a format that can be utilized in model training. An automated pipeline will trigger the model training process once the data is present and the model code is validated through a continuous integration & development (CI/CD) pipeline. The resulting model is moved to the validation & evaluation step, where automated tests and checks are run to ensure the model is adequate for deployment and the performance matches at least a minimum threshold, but preferably is at least equal or better than the current model.
During the evaluation step the model is assessed for highly successful and well performing features. Those features are moved to feature storage, from where they can be utilized when training the next generation model.
If the validation and evaluation of the model has been successful, the model automatically moves to deployment, where the model is taken into production use. Finally, the created model is then moved to version control to keep track of the different model versions and provide the possibility to revert to an earlier version if necessary.
Finally, during the application use, automated monitoring will ensure everything works as intended and allow for swift maintenance in case any issues occur.
Though all the steps of the process are automated and run by default, the user can always for example stop the process at any step or run any additional steps if deemed necessary. The loop structure of the pipeline allows for easy maintenance, model updates and mode re-training, as the pipeline can be triggered for example in case new data is applied to the storage or model code is updated.
Summary: MLOps in a nutshell
MLOps, short for Machine Learning Operations, is an umbrella term for activities around lifecycle management of machine learning applications. It covers everything from automated data processing and model training pipelines to automated code and model deployments and quality assurance.
The ML approach does not include a single set of tools or technologies, but is rather a collection of best practices utilizing automated tools, processes, and ways of working. The aim is to create machine learning applications that are safer, more efficient, more reliable, and easy to reproduce and maintain, but require less manual work from experts. The methodology allows machine learning engineers and data scientists to focus on the core development of the model rather than spending time on tasks unrelated to data science, such as preprocessing data, configuring environments, or monitoring models. The faster deliveries and better performing models provide faster value to the end-user and customer bringing benefits also to the business owners.
MLOps can be applied to projects of a variety of sizes and will provide benefits from small proofs-of-concept to large application development, but the method especially shines in big, long-term projects where the amount of manual work can often grow.
- Noora RaitanenSoftware Developer