The objective of the AI determines what kind of data needs to be collected to build useful data sets. Unfortunately, too much emphasis has so far been placed on building the neural network and the model that processes the data. It is time to refocus and improve the learning itself.

But how do you target something when the outcome is still very much a work in progress? The bottom line: whatever capability the AI ends up with once the learning process is complete, we cannot anticipate it in every detail. We can only prepare the learning curve to be focused, so let’s focus on the right data.

Three things to watch out for:

#1 Make sure you select only what matters. For example: if you want to improve the AI in a robot that works as a burger flipper, in order to increase safety when it operates alongside humans in a small space, then you wouldn’t want to collect data from the conversations with the person taking the guests’ orders. Instead, you would want to use as many input signals as possible that tell the AI about proximity, speed, movements, etc.
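To make the burger-flipper example concrete, here is a minimal sketch of filtering incoming records down to the signals that matter for the safety objective. The signal names (`proximity_m`, `arm_speed_mps`, `human_motion`, `order_conversation`) and the sample values are made up for illustration; a real robot would expose its own.

```python
# Hypothetical signal names for the kitchen-robot example; none come from a real system.
RELEVANT_SIGNALS = {"proximity_m", "arm_speed_mps", "human_motion"}


def select_relevant(record: dict) -> dict:
    """Keep only the signals that bear on the safety objective."""
    return {key: value for key, value in record.items() if key in RELEVANT_SIGNALS}


# One made-up raw record, including a signal that is irrelevant to safety.
raw = {
    "proximity_m": 0.4,
    "arm_speed_mps": 1.2,
    "human_motion": "approaching",
    "order_conversation": "two cheeseburgers, no onions",
}

sample = select_relevant(raw)
print(sample)
```

The allow-list is the point here: the objective decides what gets in, rather than whatever happens to be recorded.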

#2 Cut your AI some slack and make it easier to learn from the data sets you provide. Sort them so it is clear from the very beginning what kind of differences the categories entail. Eliminate useless outliers. Provide data that represents only relevant differences, and don’t bother with aspects you don’t want your AI to see as part of the learning. Basically, don’t throw data at your AI just because it is available.
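One possible way to do this sorting and cleaning is sketched below: samples are grouped by category, and values that sit far from the rest of their group are dropped. The rule used (more than `k` median absolute deviations from the median) is just one common choice and an assumption here, as are the labels and distance readings. Whether an outlier is really "useless" remains a judgment call that no formula makes for you.

```python
from collections import defaultdict
from statistics import median


def drop_outliers(values, k=3.0):
    """Drop values farther than k median absolute deviations from the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values are (nearly) identical, so nothing stands out.
        return list(values)
    return [v for v in values if abs(v - med) <= k * mad]


def group_and_clean(samples):
    """Group (label, value) pairs by label, then clean each group separately."""
    groups = defaultdict(list)
    for label, value in samples:
        groups[label].append(value)
    return {label: drop_outliers(vals) for label, vals in sorted(groups.items())}


# Made-up distance readings in metres, labelled by hand for illustration.
samples = [
    ("safe", 1.9), ("unsafe", 0.3), ("safe", 2.1), ("safe", 2.0),
    ("safe", 40.0), ("unsafe", 0.2), ("unsafe", 0.4), ("safe", 2.2),
]

cleaned = group_and_clean(samples)
print(cleaned)
```

Cleaning each category on its own matters: a reading that is normal for one category can be an outlier in another.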

#3 Give the AI enough time to learn one step after the other. Plan these steps carefully. If you are a parent, you wouldn’t teach your two-year-old to juggle before they have learned to control the speed and mass of the juggling balls. Yes, some parents want their kids to start with Mozart when teaching them how to play the piano. But is that really the best way to instill learning? Granted, with a kid, motivation plays a critical role. But the effect is the same: neither the kid nor the AI will get to step two cleanly if step one hasn’t been taken.
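The step-by-step idea can be sketched as a small loop that only moves on once the current step has been mastered. Everything here is illustrative: the stage names borrow from the juggling analogy, `pass_score` and `max_attempts` are arbitrary thresholds, and the scores are canned placeholders standing in for a real train-and-evaluate call.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a function that trains once and returns a score in [0, 1].
Stage = Tuple[str, Callable[[], float]]


def run_curriculum(stages: List[Stage], pass_score: float = 0.9,
                   max_attempts: int = 3) -> List[str]:
    """Run stages in order and stop at the first one that never reaches pass_score."""
    completed = []
    for name, train_and_score in stages:
        for _ in range(max_attempts):
            if train_and_score() >= pass_score:
                completed.append(name)
                break
        else:
            # Step one has not been taken, so step two is not attempted.
            return completed
    return completed


def canned_stage(scores):
    """Placeholder for real training: returns the given scores one call at a time."""
    remaining = iter(scores)
    return lambda: next(remaining)


stages = [
    ("control speed and mass", canned_stage([0.6, 0.95])),
    ("toss one ball", canned_stage([0.5, 0.7, 0.8])),
    ("juggle three balls", canned_stage([0.99])),
]

done = run_curriculum(stages)
print(done)
```

In this run the second stage never reaches the threshold, so the third stage is never started, even though it would have scored well on its own.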

Learn more about Data-Centric AI and Controlled Application Spaces.