Practical Ways To Improve The Robustness of Machine Learning Models
With the wide adoption of artificial intelligence (AI), it is easy to see why machine learning models are being incorporated into businesses at such a pace. The reason is simple: AI-powered systems make processes more efficient. The fear many people held for years that artificial intelligence would take over human jobs has gradually subsided, as there is now a growing understanding of how AI improves processes and augments human capabilities in repetitive or strenuous tasks that would otherwise take much longer to complete. Commercial AI products like OpenAI's ChatGPT have also reinforced the value of systems that drive interactions, automate processes, improve efficiency, increase profitability, enable rapid iteration, and shorten response times, among many other advantages.
From healthcare to education, e-commerce to payment processing, retail marketing to food, artificial intelligence is becoming the fulcrum of innovation. Businesses want to solve problems and make a profit at the same time, but achieving both is often challenging. This is where artificial intelligence, data science, and machine learning come into play. The reason is simple: machine learning models are trained on data to learn patterns and complex relationships, then use that knowledge to deliver new insights, solve complex problems, and improve processes that would be challenging and laborious, if not impossible, for humans to handle. As a result, many startups are springing up with the aim of building solutions on AI.
Despite the effectiveness of machine learning, implementing it successfully still involves a great deal of nuance and complexity. A model can perform well in development yet fail in practice because of problems such as biased data, class imbalance, data leakage, or adversarial inputs. These problems, and their solutions, are what this article addresses.
Practical Methods for Improving the Robustness of Models
There are several practical methods for improving the robustness of machine learning models. The first step is to make the training data itself robust during preprocessing; the process then continues through model development and inference. Methods for improving robustness can therefore be explored at the following two levels:
1. The Data Level
Sampling: This can take place at any point in the machine learning workflow, whether during data preprocessing or during model experimentation and development. It is simply the process of using statistical techniques to gather a subset of the real-world data of interest, usually because it is impractical or impossible to access every possible data point for the problem. The real-world data of interest is called the population, and the selected subset is called a sample. Proper sampling not only lets you accomplish a task faster and more cheaply, but also helps avoid potential biases in data collection and keeps the sample representative of the population.
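To make this concrete, here is a minimal sketch of simple random sampling versus stratified sampling with pandas and scikit-learn. The synthetic 100,000-record population, the column names, and the 10% sample size are all illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Hypothetical population of 100,000 records with a rare positive label.
population = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, 100_000),
    "defaulted": rng.choice([0, 1], size=100_000, p=[0.95, 0.05]),
})

# Simple random sample: every record has the same chance of selection.
simple_sample = population.sample(frac=0.1, random_state=42)

# Stratified sample: preserves the 95/5 label ratio in the subset, which
# protects the rare class from being under-represented by chance.
stratified_sample, _ = train_test_split(
    population,
    train_size=0.1,
    stratify=population["defaulted"],
    random_state=42,
)

print(population["defaulted"].mean(), stratified_sample["defaulted"].mean())
```

Stratified sampling is usually the safer default whenever a label or segment of interest is rare, for exactly the class-imbalance reasons discussed under Data Balancing below.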
Bias Mitigation (Preprocessing): There are many sources of bias in machine learning, the primary one being the data collection method. Bias can also result from a distorted truth in the real world that the data represents, such as systemic and structural biases that reflect prejudice and discrimination. Lastly, biases can exist in the insights we derive from data or models, such as conservatism bias, salience bias, and the fundamental attribution error. Bias mitigation is a suite of techniques for detecting and eliminating (or reducing) different types of bias at any of three levels: the data level (preprocessing), the model training level (in-processing), and the inference level (post-processing). At the preprocessing level, the methods detect and remove bias from the data before the model is trained. This ensures that bias is tackled at the source, since any bias left undetected in the data will be amplified by the model.
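As one concrete preprocessing approach, the sketch below illustrates the well-known reweighing idea (due to Kamiran and Calders): each training instance gets a weight chosen so that the protected attribute and the label look statistically independent to the model. The tiny DataFrame and the column names "group" and "label" are illustrative assumptions.

```python
import pandas as pd

# Toy training data: group membership (protected attribute) and label.
df = pd.DataFrame({
    "group": ["A", "A", "A", "B", "B", "B", "B", "B"],
    "label": [1, 1, 0, 1, 0, 0, 0, 0],
})

p_group = df["group"].value_counts(normalize=True)         # P(A = a)
p_label = df["label"].value_counts(normalize=True)         # P(Y = y)
p_joint = df.groupby(["group", "label"]).size() / len(df)  # P(A = a, Y = y)

# w(a, y) = P(A = a) * P(Y = y) / P(A = a, Y = y): combinations that are
# under-represented relative to independence receive weights above 1.
df["weight"] = df.apply(
    lambda r: p_group[r["group"]] * p_label[r["label"]]
    / p_joint[(r["group"], r["label"])],
    axis=1,
)
print(df)
```

The resulting weights can then be passed as sample_weight to most scikit-learn estimators, so the model effectively trains on a debiased view of the data.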
Data Balancing: This refers to methods for handling a problem, found predominantly in classification tasks, known as class imbalance. A dataset is class-imbalanced if there is a substantial difference in the number of samples in each class of the training data. For example, in a training dataset for a credit default prediction task, 95% of customers might be non-delinquent and only 5% delinquent or in default. Since most ML models learn best when the class distribution is reasonably balanced, treating class imbalance is essential to giving the model sufficient signal to detect every class effectively. Data balancing can be achieved with a variety of techniques, such as resampling (undersampling or oversampling), selecting suitable evaluation metrics (such as precision, recall, and F1), and algorithm-level loss functions such as class-balanced loss and focal loss.
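A minimal sketch of one such technique, random oversampling of the minority class with scikit-learn, is shown below; the toy 95/5 DataFrame mirrors the credit-default example above and is an illustrative assumption.

```python
import pandas as pd
from sklearn.utils import resample

# Toy imbalanced dataset: 95 non-delinquent rows, 5 delinquent rows.
df = pd.DataFrame({"feature": range(100), "defaulted": [0] * 95 + [1] * 5})

majority = df[df["defaulted"] == 0]
minority = df[df["defaulted"] == 1]

# Sample the minority class with replacement until it matches the majority.
minority_upsampled = resample(
    minority, replace=True, n_samples=len(majority), random_state=42
)

balanced = pd.concat([majority, minority_upsampled])
print(balanced["defaulted"].value_counts())
```

For smarter resampling than duplicating rows, libraries such as imbalanced-learn offer methods like SMOTE, which synthesizes new minority-class samples instead of repeating existing ones.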
Data Augmentation: This is a suite of techniques for expanding the training data in ways that make it more representative and suitable for modeling. Even when the training set is already large, augmenting it can make models more robust to noise and to model incidents such as adversarial attacks. The choice of technique depends on the data format you are working with, but generally augmentation is done either by perturbing existing samples (for example, adding noise) or by generating synthetic data.
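For numeric data, the simplest form of augmentation is noise injection, sketched below. The noise scale (1% of each feature's standard deviation) and the synthetic feature matrix are illustrative assumptions; image or text data would instead use format-specific transforms such as random flips and crops or synonym replacement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in training features: 1,000 samples with 20 numeric features.
X_train = rng.normal(size=(1_000, 20))

# Gaussian jitter scaled to 1% of each feature's standard deviation.
noise = rng.normal(scale=0.01 * X_train.std(axis=0), size=X_train.shape)

# Augmented set: originals plus jittered copies, doubling the data.
# The labels for the jittered copies are the same as for the originals.
X_augmented = np.vstack([X_train, X_train + noise])
print(X_augmented.shape)  # (2000, 20)
```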
Feature Engineering: This is a general name for the subset of data preprocessing that involves transforming or creating features to make them suitable for modeling. Feature engineering operations include handling missing values, normalization and scaling, discretization, encoding categorical features, and even creating new features. Another important concern is preventing data leakage, which occurs when information that would not be available at prediction time, such as statistics computed on the test set, "leaks" into the training process. Together, these methods ensure that the data passed to the models is robust, since models are only as good as the features they are trained on.
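One leakage-safe pattern worth showing is a scikit-learn Pipeline, sketched below on synthetic data: because the scaler is fitted inside the pipeline, it learns its statistics from the training split only, and the test set never influences training.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

pipe = Pipeline([
    ("scale", StandardScaler()),    # fitted on the training data only
    ("model", LogisticRegression()),
])
pipe.fit(X_train, y_train)          # the scaler never sees X_test
print(pipe.score(X_test, y_test))
```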
Utilize Diverse Training Data: Training on more diverse data makes machine learning models more robust, because the model sees a wider range of the conditions it will face in production instead of overfitting to a narrow slice of the data. Practical ways to achieve this include collecting data from multiple sources, augmenting the data with synthetic samples, or simply using a larger dataset.
2. The Model Level
Cross-Validation: When building machine learning models, it is often necessary to experiment with different algorithms to find the one that best fits the training data and generalizes well to unseen data. Cross-validation is a powerful technique for this: the training data is split into folds, each candidate algorithm is trained and evaluated across the folds in turn, and the results are compared on every evaluation metric of interest. This lets you determine which algorithm, and at which hyperparameter settings, works best on the data and solves the problem efficiently.
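A minimal sketch combining 5-fold cross-validation with a small hyperparameter grid in scikit-learn is shown below; the dataset, grid values, and choice of F1 as the scoring metric are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_breast_cancer(return_X_y=True)

search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 5]},
    cv=5,           # 5-fold cross-validation on the training data
    scoring="f1",   # any evaluation metric of interest can be plugged in
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```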
Ensembles: When building machine learning projects, it is necessary to think about how to keep improving model performance. Combining multiple models is a proven way to provide this performance boost, and the resulting model is simply referred to as an ensemble. The most common ensembling techniques are bootstrap aggregation (or bagging) and boosting, both designed to increase the training stability and accuracy of models while reducing variance on noisy data. Both bagging and boosting involve training multiple models and then combining their predictions into a final prediction. The Random Forest algorithm is a popular implementation of bagging, while XGBoost is a popular implementation of boosting.
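The sketch below compares a bagging ensemble (a random forest) with a boosting ensemble (scikit-learn's gradient boosting, the same family of method that XGBoost implements); the dataset and settings are illustrative assumptions.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

bagging = RandomForestClassifier(n_estimators=200, random_state=42)
boosting = GradientBoostingClassifier(n_estimators=200, random_state=42)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold CV accuracy
    print(name, round(scores.mean(), 3))
```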
Bias Mitigation (In-processing): This involves detecting and eliminating bias while the model is training. These methods are therefore highly dependent on the particular model being trained, unlike preprocessing and post-processing methods, which are model-agnostic. Methods include cost-sensitive training and adversarial debiasing.
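As a simple instance of cost-sensitive training, the sketch below uses per-class weights so that misclassifying the rare class is more expensive during training; the synthetic data and the choice of logistic regression are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1_000, 4))
y = (X[:, 0] + rng.normal(size=1_000) > 1.6).astype(int)  # rare positive class

# class_weight="balanced" reweights each class inversely to its frequency;
# an explicit dict such as {0: 1, 1: 10} encodes custom misclassification costs.
model = LogisticRegression(class_weight="balanced")
model.fit(X, y)
print(model.score(X, y))
```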
Bias Mitigation (Post-processing): These methods mitigate bias during model inference by detecting and correcting unfairness directly in the predictions. Their advantage is that they tackle unfair outcomes where they can have the greatest impact. Methods include prediction abstention and equalized-odds post-processing.
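A hand-rolled sketch of the group-specific-threshold idea behind equalized-odds post-processing is shown below. The scores, group labels, and thresholds are illustrative assumptions; in practice the thresholds would be tuned on a held-out validation set to equalize error rates across groups.

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.uniform(size=10)       # model-predicted probabilities
groups = np.array(["A", "B"] * 5)   # protected attribute per instance

# Hypothetical per-group decision thresholds chosen on validation data.
thresholds = {"A": 0.5, "B": 0.4}

predictions = np.array(
    [int(s >= thresholds[g]) for s, g in zip(scores, groups)]
)
print(predictions)
```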
Transfer Learning: Transfer learning can also be used to improve model robustness. It is a technique in which a model trained on one task is used to initialize a model for a new, related task. Because the pretrained model has already learned general-purpose representations from large amounts of data, the new model typically performs better and is more robust to changes in the data.
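A minimal sketch with PyTorch and torchvision (assuming torchvision 0.13 or later for the weights argument): start from a ResNet-18 pretrained on ImageNet, freeze its feature extractor, and replace the final layer for a new task. The two-class head and the decision to freeze everything else are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a ResNet-18 with ImageNet-pretrained weights.
model = models.resnet18(weights="IMAGENET1K_V1")

# Freeze the pretrained feature extractor.
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head with a new, trainable 2-class layer.
model.fc = nn.Linear(model.fc.in_features, 2)

# Only the new head's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```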
Adversarial Training: Adversarial training is a technique for making models more resilient against external attacks and abuses designed to corrupt them with bad data. The model is trained on a dataset containing samples specially crafted to be difficult for it, so that it becomes more robust to similar examples in the real world.
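A minimal PyTorch sketch of adversarial training with the fast gradient sign method (FGSM) is shown below: each batch is perturbed in the direction that most increases the loss, and the model is trained on the perturbed inputs. The tiny model, random stand-in data, and epsilon value are all illustrative assumptions.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
epsilon = 0.1                       # attack strength

X = torch.randn(64, 20)             # stand-in training batch
y = torch.randint(0, 2, (64,))

for _ in range(10):                 # a few training steps
    # Build adversarial examples from the current model's input gradients.
    X_adv = X.clone().requires_grad_(True)
    loss_fn(model(X_adv), y).backward()
    X_adv = (X_adv + epsilon * X_adv.grad.sign()).detach()

    # Train on the adversarially perturbed batch.
    optimizer.zero_grad()
    loss = loss_fn(model(X_adv), y)
    loss.backward()
    optimizer.step()

print(loss.item())
```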
The techniques above are critical for successfully implementing machine learning systems that are not only highly accurate but also resilient to model incidents post-deployment. At Periculum, we build high-performing, resilient machine learning models, optimized for quality and tuned for fairness, to offer you robust, complete solutions customized for your business. Send us a mail at sales@periculum.io to get a free demo of how we can help your business make informed decisions.