Artificial intelligence has a probability problem

AWS has just announced Amazon SageMaker Ground Truth to help companies build machine learning data sets.

This is a powerful new service for people who have access to a lot of data that has not been annotated consistently. In the past, people had to manually label a massive body of images or video frames to train a computer vision model. Ground Truth uses machine learning, alongside human labelers, to label a training data set automatically.

This is an example of a trend that has emerged over the past year: machine learning for machine learning. Machine learning data catalogs (MLDCs), probabilistic or fuzzy matching, automated annotation of training data and synthetic data creation all use machine learning to generate or prepare data for subsequent machine learning downstream, often solving data scarcity or dispersion problems. This is all well and good until we consider that machine learning is itself based on inductive reasoning and is therefore probabilistic.

Let’s consider how this can play out in the real world: a healthcare provider wants to use computer vision to diagnose a rare condition. Because the data is sparse, an automated annotator is used to create more training data (more labeled images). The developer sets a propensity threshold of 90 percent, meaning that only records with at least a 90 percent chance of accurate classification are used as training data. Once the model has been trained and deployed, it is used for patients whose data is linked from multiple databases using fuzzy text matching. Entities from different data sets are matched if they have a 90 percent chance of being the same. Finally, the model flags images with a 90 percent or higher probability of showing the disease.

The problem is that data scientists and machine learning experts traditionally focus only on this final propensity score as a representation of overall prediction accuracy. This worked well in a world in which data preparation was deductive and deterministic. But once you layer probabilities on top of probabilities, the final propensity score is no longer accurate. In the above case, there is an argument that the probability of an accurate diagnosis decreases from 90 percent to 73 percent, since three successive 90 percent steps multiply out to 0.9 × 0.9 × 0.9 ≈ 0.73.
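To make the arithmetic concrete, here is a minimal Python sketch of how the three 90 percent steps in the healthcare example could compound. It assumes the stages are independent and that an error at any stage invalidates the final result; that is a simplification made for illustration, not a claim about how any real pipeline behaves. The function name and the stage labels are hypothetical.

```python
from math import prod


def compound_confidence(stage_probabilities):
    """Multiply per-stage probabilities into one end-to-end figure.

    Assumption (for illustration only): the stages are independent and
    an error at any stage makes the final result wrong.
    """
    probabilities = list(stage_probabilities)
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
    return prod(probabilities)


# Hypothetical labels for the three steps in the healthcare example.
stages = {
    "automated annotation": 0.90,
    "fuzzy entity matching": 0.90,
    "model diagnosis": 0.90,
}

overall = compound_confidence(stages.values())
print(f"End-to-end confidence: {overall:.1%}")
```

Under these assumptions the end-to-end figure comes out at roughly 73 percent rather than the 90 percent reported by the final model alone. If the stages' errors are correlated, the true figure could be higher or lower, which is one reason each probability needs to be tracked explicitly.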

As the emphasis on explainability in AI grows, a new framework for analytical governance needs to be created that incorporates all the probabilities involved in the machine learning process, from data creation to data preparation to training to inference. Without it, erroneously inflated propensity scores will misdiagnose patients, mistreat clients and mislead companies and governments as they make critical decisions.

Next week, my colleague Kjell Carlsson will be holding a deep dive session entitled “Drive Business Value Today: A Practical Approach To AI” at Forrester’s Data Strategy & Insights Forum in Orlando. Please join us next Tuesday and Wednesday, December 4 and 5, to discuss this topic and learn best practices for turning data into actions that drive measurable business outcomes.