Machine Learning Best Practices

Artificial intelligence (AI), machine learning (ML), and data science are together transforming how modern organizations operate. These technologies, along with a long list of related ones, have reshaped organizations of all shapes and sizes in recent years. Their cumulative success in business has been so great that the remaining organizations are now playing “catch-up” to survive or stay relevant.

The distinguishing feature of ML technology is that it allows a computer to learn on its own, by using models (algorithms) and studying sample data to determine the correct answer. In ML, machines do not need to be explicitly programmed with rules to produce results; they study sample datasets and train themselves to provide the desired results when needed. The algorithm contains the logic for testing the data at every step of processing to arrive at the correct conclusion.

Machine Learning Is Currently Very Popular in Enterprises

With the widespread penetration of AI across industry sectors and business practices, ML technology is now used in manufacturing, finance, healthcare, insurance, education, and more. It is used primarily to handle routine processes and tasks at high speed, and also to automate many human and machine tasks in order to improve efficiency and reduce errors. With the rising popularity of AI and machine learning technologies and tools, businesses need a set of “best practices” to get the most benefit from their technology investments.

Machine Learning Best Practices

After spending almost a decade testing and refining the technology practices in use, ML researchers, field experts, and business leaders have collectively contributed to what may be defined as machine learning best practices. The major steps in implementing ML practices can be summed up as:

  • Statement of a business problem
  • Presenting a case to the C-Suite for buy-in
  • Convincing business users and building an ML culture
  • Tapping business data for use in ML use cases
  • Building resources for machine learning best practices
  • Starting small and growing gradually
  • Measuring and optimizing at every stage

Here is a detailed article about 18 ML best practices from Rubik’s Code, which explains all the important steps and the associated best practices involved in implementing an ML model from start to finish. The best practices explained in that article cover infrastructure, data, models, and code. This handy guide can serve as a checklist for ML implementers.

Defining and Developing a Business Problem

In the first step, the business problem is defined and then developed by training and testing models on large data sets.

To complete this step, a designated set of “training data” is used to train the ML model or algorithm. The training data (information collated from a variety of internal and external sources) is fed into the model to teach the machine.

Another set of data, known as the “test data,” is used to evaluate the results of the training process. Usually, the training and test data files are kept distinct through naming conventions. The test data helps determine whether the model is doing what it is expected to do, and how well it is doing it. The model can be gradually refined by using different test data sets. Model accuracy and performance are the two main criteria to measure.
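The separation of training and test data described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a prescribed method: the toy customer data, the `train_test_split` and `accuracy` helpers, and the threshold "model" are all assumptions made for the example.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(predict, rows):
    """Fraction of (features, label) rows where predict(features) equals label."""
    correct = sum(1 for features, label in rows if predict(features) == label)
    return correct / len(rows)


# Hypothetical toy data: ((monthly_spend,), churned) pairs, purely illustrative.
data = [((spend,), spend < 50) for spend in range(0, 100, 5)]
train, test = train_test_split(data, test_fraction=0.25, seed=42)

# A deliberately simple "model": a spend threshold learned from the training set only.
churn_spends = [features[0] for features, label in train if label]
threshold = max(churn_spends) if churn_spends else 0


def predict(features):
    """Predict churn when spend is at or below the learned threshold."""
    return features[0] <= threshold


print("training rows:", len(train), "test rows:", len(test))
print("accuracy on held-out test data:", accuracy(predict, test))
```

The point of the sketch is that the threshold is learned from `train` alone, so the score on `test` reflects data the model has not seen.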

Presenting a Use Case to the C-Suite

The best way to sell a use case to top executives is by using the “show and tell” method. An operational model may be presented in a meeting or presentation, which can serve as a test case for the project.

This practical approach can demonstrate the performance metrics of the ML model. If the purpose of the use case is to enhance a specific business process or task, then demonstrated performance metrics will help to convince the C-Suite of the usefulness of the model under discussion. Generally, a formal demonstration should also include a clear cost-benefit analysis (CBA), which helps to reveal the return on investment (ROI).

Here are the best practice rules for executive buy-in:

  • Define the value that this ML model will offer to the organization
  • Share the costs vs benefits of implementing this ML model
  • Estimate the timeframe for realizing the projected ROI
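The cost-benefit items in the list above reduce to simple arithmetic. The sketch below assumes one common definition of ROI (net gain divided by cost) and a simple payback period; the function names and all monetary figures are hypothetical and chosen only to make the calculation easy to follow.

```python
def simple_roi(total_benefit, total_cost):
    """ROI as a fraction: net gain divided by total cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost


def payback_months(upfront_cost, monthly_net_benefit):
    """Months until cumulative net benefit covers the upfront cost."""
    if monthly_net_benefit <= 0:
        return None  # at this rate the investment never pays back
    return upfront_cost / monthly_net_benefit


# Hypothetical figures for an ML project, for illustration only.
upfront_cost = 120_000         # one-time build cost
monthly_benefit = 15_000       # estimated savings or extra revenue per month
monthly_running_cost = 5_000   # hosting, monitoring, maintenance per month
horizon_months = 24

total_cost = upfront_cost + monthly_running_cost * horizon_months
total_benefit = monthly_benefit * horizon_months

roi = simple_roi(total_benefit, total_cost)
payback = payback_months(upfront_cost, monthly_benefit - monthly_running_cost)

print(f"ROI over {horizon_months} months: {roi:.0%}")  # prints "ROI over 24 months: 50%"
print(f"Payback period: {payback:.0f} months")         # prints "Payback period: 12 months"
```

With these assumed numbers, total cost is 240,000 and total benefit is 360,000, so the ROI is 50% and the upfront cost is recovered after 12 months.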

Although estimating the ROI is a time-consuming and iterative process that requires actual field testing over a long period, metrics from past projects can be used to present an ROI estimate in an executive meeting.

Gaining the Confidence of Business Users and Building an ML Culture

The best way to convince ordinary business users and slowly develop an enterprise machine learning culture is by discussing practical, function-specific use cases with the relevant team. For example, an ML model designed to predict the credit rating of banking customers will convince the bank’s credit-card team only if it can explain its decision when a credit-card application is rejected. So, a good model will not only optimize a metric but also be free of bias, offer “explainability,” and give the customer further advice on improving their credit score.
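One simple way to picture “explainability” in the credit example is a linear scorecard that reports which inputs pulled the score down. The sketch below is an assumption-laden illustration: the feature names, weights, baseline values, and approval threshold are all invented, and it addresses only explanation, not bias testing.

```python
# Hypothetical scorecard: weights, baselines, and threshold are invented for illustration.
WEIGHTS = {"payment_history": 0.5, "utilization": -0.3, "account_age_years": 0.2}
BASELINE = {"payment_history": 0.9, "utilization": 0.3, "account_age_years": 5.0}
APPROVAL_THRESHOLD = 0.0


def contributions(applicant):
    """Per-feature contribution to the score, relative to a baseline applicant."""
    return {
        name: WEIGHTS[name] * (applicant[name] - BASELINE[name])
        for name in WEIGHTS
    }


def decide(applicant, max_reasons=2):
    """Return (approved, score, reasons); reasons are the most negative features."""
    contrib = contributions(applicant)
    score = sum(contrib.values())
    approved = score >= APPROVAL_THRESHOLD
    # Sort ascending so the features that hurt the score most come first.
    negative = sorted((value, name) for name, value in contrib.items() if value < 0)
    reasons = [name for _, name in negative[:max_reasons]]
    return approved, score, reasons


applicant = {"payment_history": 0.6, "utilization": 0.9, "account_age_years": 2.0}
approved, score, reasons = decide(applicant)
print("approved:", approved)
print("main reasons:", reasons)
```

For this assumed applicant the decision is a rejection, and the two reasons returned are the short account age and the high utilization, which is the kind of feedback a credit-card team could pass on to the customer.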

Tapping Business Data for Use in Machine Learning Use Cases

The data gathering and assessment stage begins with determining the “fitness” of specific data sets for a given use case. This fitness check usually determines whether the data is right for the particular ML models, whether the data is available fast enough for predictive analysis, and whether the data storage and access methods are suitable for the ML operation.
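Part of such a fitness check can be automated. The sketch below covers only two assumed criteria, data volume and the share of missing values per column; the function names, thresholds, and sample records are hypothetical, and checks on access speed or storage would need separate tooling.

```python
def missing_rates(records, columns):
    """Share of records in which each column is absent, None, or an empty string."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to assess")
    rates = {}
    for column in columns:
        missing = sum(1 for record in records if record.get(column) in (None, ""))
        rates[column] = missing / total
    return rates


def fitness_report(records, columns, min_rows=1000, max_missing=0.05):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if len(records) < min_rows:
        problems.append(f"only {len(records)} rows, need at least {min_rows}")
    for column, rate in missing_rates(records, columns).items():
        if rate > max_missing:
            problems.append(f"{column}: {rate:.0%} missing")
    return problems


# Hypothetical customer records, for illustration only.
records = [
    {"customer_id": 1, "visits": 4, "email": "a@example.com"},
    {"customer_id": 2, "visits": None, "email": ""},
    {"customer_id": 3, "visits": 0, "email": "c@example.com"},
    {"customer_id": 4, "visits": 7},
]
columns = ["customer_id", "visits", "email"]

print(missing_rates(records, columns))
print(fitness_report(records, columns, min_rows=3, max_missing=0.3))
```

Note that a legitimate value of `0` in `visits` is not counted as missing; only absent keys, `None`, and empty strings are.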

Consider a restaurant chain with millions of customer data records. The data volume combined with the data variety warrants the use of a data lake environment, where a wide variety of customer data can be accessed easily and quickly without any bureaucratic hassles. This KDnuggets post discusses not only best practices related to data assessment but also best practices for the pre- and post-implementation stages of ML model deployment.

Building Resources for Machine Learning Best Practices

Building resources for machine learning best practices in an enterprise happens over time, by assessing data gathering and storage methods, building an ML expert team, collecting and storing data sets, allocating a data science workforce, and selecting a combination of advanced technologies for ML implementation.

Organizations today realize that developing an end-to-end ML infrastructure that supports the “seamless training, testing, and deployment of models” is a long-term goal that is not easily achieved. The author of the linked article discusses how most businesses handle this challenge.

A proverb said to be African goes: “If you want to go fast, go alone; but if you want to go far, go together.” This proverb is highly relevant for ML practices. A collaborative development spirit, free communication, and team decision-making are inherent goals for all successful ML implementation teams.

Starting Small and Growing Gradually

Starting small is sound advice for any information-technology venture, but especially for ML practices. The best approach is to begin with one or two use cases, then develop, implement, and monitor them in the real world. If they succeed, then explore some more. According to a VentureBeat report, around 87% of ML models “never make it to production.” So being slow and cautious is perhaps the best way to go.

Measuring and Optimizing at Every Stage

Here are some best practice steps for measuring an ML model:

  • Review the initial hypotheses on the model or models (algorithms) and the dataset
  • Test if the model is overfitting or underfitting
  • Monitor the errors made by the model
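The last two checks in the list above can be sketched with a pair of small helpers. The thresholds in `diagnose_fit` are arbitrary assumptions rather than standard values, and both function names are hypothetical; in practice the cut-offs depend on the use case.

```python
from collections import Counter


def diagnose_fit(train_score, validation_score, min_score=0.8, max_gap=0.1):
    """Rough rule of thumb using assumed thresholds.

    A low training score suggests underfitting; a training score well above
    the validation score suggests overfitting.
    """
    if train_score < min_score:
        return "underfitting"
    if train_score - validation_score > max_gap:
        return "overfitting"
    return "acceptable"


def error_breakdown(y_true, y_pred):
    """Count the model's mistakes as (true_label, predicted_label) pairs."""
    return Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)


print(diagnose_fit(0.99, 0.70))  # prints "overfitting"
print(diagnose_fit(0.60, 0.58))  # prints "underfitting"
print(diagnose_fit(0.90, 0.88))  # prints "acceptable"

# Hypothetical labels: which kinds of errors does the model make most often?
y_true = ["approve", "reject", "reject", "approve", "reject"]
y_pred = ["approve", "approve", "reject", "reject", "approve"]
print(error_breakdown(y_true, y_pred).most_common())
```

Tracking the breakdown over time shows whether the model's errors are concentrated in one direction, here rejections that were predicted as approvals.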

Model improvement can be a tricky subject and often requires highly specific data science expertise, depending on the use case. That expertise can be gathered from across the enterprise or sometimes by tapping external resources. This article explains the secrets behind refining an ML model. The quick list of best practices discussed there includes adding data samples, viewing the problem from other angles, adding context to data, tuning hyperparameters, using cross-validation in model training, and experimenting with algorithms.
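Two items from that list, cross-validation and hyperparameter tuning, are often combined: each candidate setting is scored on several train/validation splits and the best average wins. The following is a minimal standard-library sketch under stated assumptions; the one-dimensional nearest-neighbour classifier, the toy data, and the candidate values of `k` are all invented for illustration and are not taken from the article referred to above.

```python
import random


def k_fold_indices(n_samples, n_folds):
    """Yield (train_idx, val_idx) pairs; each sample lands in exactly one validation fold."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    start = 0
    for fold in range(n_folds):
        size = n_samples // n_folds + (1 if fold < n_samples % n_folds else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def knn_predict(train_points, x, k):
    """Majority vote among the k training points nearest to x (1-D distance)."""
    nearest = sorted(train_points, key=lambda point: abs(point[0] - x))[:k]
    votes = sum(1 for _, label in nearest if label)
    return votes * 2 > len(nearest)


def cross_val_accuracy(points, k, n_folds=5):
    """Mean validation accuracy of the k-NN classifier across the folds."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(points), n_folds):
        train = [points[i] for i in train_idx]
        correct = sum(
            1 for i in val_idx if knn_predict(train, points[i][0], k) == points[i][1]
        )
        scores.append(correct / len(val_idx))
    return sum(scores) / len(scores)


# Hypothetical toy data: (value, label) pairs, shuffled so folds are not ordered.
points = [(x, x >= 10) for x in range(20)]
random.Random(0).shuffle(points)

candidate_ks = [1, 3, 5]  # the hyperparameter values being compared
best_k = max(candidate_ks, key=lambda k: cross_val_accuracy(points, k))
print("best k:", best_k, "mean accuracy:", cross_val_accuracy(points, best_k))
```

Because every sample is used for validation exactly once, the averaged score is less dependent on one lucky or unlucky split than a single train/test division.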

Here are some best practices dedicated to fine-tuning ML and deep learning (DL) models. In this article, the ML models selected are built on structured data (tabular, categorical, or time series) and the DL models selected are built on unstructured data (image, text, video, audio). The article warns that while no ML implementation can guarantee success, the suggested best practices can help minimize the chances of failure.

Best Practices for ML Engineering: The Final Words

Here is another guide that includes flow lists of engineering best practices used to develop software systems with ML elements embedded in them. Global surveys are being conducted to measure the current adoption rates of these best practices. Another article from Google provides a complete rule book for ML engineering.

Words of Wisdom for ML Implementers

  • Data access is the key to optimized ML implementation
  • There are benefits to virtual storage over physical data storage
  • Embrace the hybrid-cloud or multi-cloud for ML workloads
