AI does not have to be complicated (Part I/II)

AI can be accessible and simple to implement across many industries and applications, thanks to a range of open-source tools and cloud services. How simple it is ultimately depends on the complexity of the problem you are trying to solve and the resources you have available.

Despite the vast potential of artificial intelligence (AI), it hasn’t taken hold in most industries. Sure, it has transformed consumer internet companies such as Google, Baidu, and Amazon, but these are all massive, data-rich businesses with hundreds of millions of users.

To unleash AI’s full potential, executives in all industries should adopt a new, data-centric approach to building AI. Specifically, they should aim to build AI systems carefully to ensure that the data clearly conveys what they need the AI to learn. This requires focusing on data that covers important cases and is consistently labeled so that the AI can learn what it is supposed to do from this data. In other words, the key to creating these valuable AI systems is having teams that can program with data rather than with code.

Why adopting AI outside of tech can be so difficult

Small datasets: In a consumer internet company with huge numbers of users, engineers have millions of data points that their AI can learn from. But in other industries, the dataset sizes are much smaller. For example, can you build an AI system that learns to detect a defective automotive component after seeing only 50 examples? Or to detect a rare disease after learning from just 100 diagnoses? Techniques built for 50 million data points don’t work when you have only 50 data points.

Lack of technical expertise: Non-technical industries may struggle with understanding the technical details and complexities of AI.

Cost of customization: The big internet companies employ teams of skilled engineers to build and maintain AI systems that create value in one specific area. Other industries, by contrast, have many smaller projects, each of which needs its own custom AI system. The combined value of these projects may be huge, but the budget for any single one often cannot support a large, dedicated AI team. The shortage of AI talent drives the cost up further.

Gaps in the process: Even when an AI system works in the lab, a great deal of engineering is still needed to deploy it in production and to monitor it once it is running. Deployment is not an easy task; it can take 12-24 months before a system is ready for production use.

There is a need for a systematic approach to solving these problems across all industries. One answer is a data-centric approach to AI, supported by tools designed for building, deploying, and maintaining AI applications, known as machine learning operations (MLOps) platforms. With these in place, AI can reach its full potential, and the companies that adopt this approach can gain an upper hand in the market.

Building data-centric AI

Data-centric AI (DCAI) is an approach to AI that focuses on understanding, utilizing, and making decisions based on data. Before data-centric AI, many systems relied largely on rules and heuristics. While these could be useful in some cases, they often led to suboptimal results or even errors when applied to new datasets. Data-centric AI changes this by incorporating techniques from machine learning and big data analytics, allowing a system to learn from data instead of relying only on hand-written rules.

At AI’s current level of sophistication, the bottleneck for many applications is getting the right data to feed to the software. We’ve heard about the benefits of big data, but we now know that for many applications, it is more fruitful to focus on making sure we have good data: data that clearly illustrates the concepts we need AI to learn. This means, for example, that the data should be reasonably comprehensive in its coverage of important cases and labeled consistently. Data is food for AI, and modern AI systems need not only calories but also high-quality nutrition.
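To make "covers important cases" and "labeled consistently" a little more concrete, here is a minimal sketch of two checks a team might run on a small labeled dataset. Everything in it is an illustrative assumption rather than a prescribed method: the function names, the `(item_id, label)` record layout, and the example labels are all hypothetical.

```python
from collections import Counter, defaultdict


def find_inconsistent_labels(annotations):
    """Return items that received more than one distinct label.

    `annotations` is a list of (item_id, label) pairs; the same item may
    appear several times if several people labeled it (assumed layout).
    """
    labels_by_item = defaultdict(set)
    for item_id, label in annotations:
        labels_by_item[item_id].add(label)
    return {
        item_id: sorted(labels)
        for item_id, labels in labels_by_item.items()
        if len(labels) > 1
    }


def coverage_gaps(annotations, required_labels, min_count=1):
    """Return required labels that have fewer than `min_count` examples."""
    counts = Counter(label for _, label in annotations)
    return {
        label: counts[label]
        for label in required_labels
        if counts[label] < min_count
    }


# Hypothetical annotations for a visual inspection task.
annotations = [
    ("img_001", "scratch"),
    ("img_001", "ok"),       # two annotators disagree on img_001
    ("img_002", "dent"),
    ("img_002", "dent"),
    ("img_003", "ok"),
]

conflicts = find_inconsistent_labels(annotations)
gaps = coverage_gaps(annotations, ["scratch", "dent", "discoloration"], min_count=2)
```

Items listed in `conflicts` are candidates for review against a shared labeling guideline, and labels listed in `gaps` point to cases where more examples may be worth collecting.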

Shifting your focus from software to data offers an important advantage: it relies on the people you already have on staff. In a time of great AI talent shortage, a data-centric approach allows many subject matter experts who have a vast knowledge of their respective industries to contribute to AI system development.

For example, most factories have workers who are highly skilled at defining and identifying what counts as a defect (is a 0.2mm scratch a defect, or is it so small that it doesn’t matter?). If we expect each factory to ask its workers to invent new AI software as a way to get that factory the bespoke solution it needs, progress will be slow. But if we instead build and provide tools that empower these domain experts to engineer the data, allowing them to express their knowledge about manufacturing by providing data to the AI, their odds of success will be much higher.
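One way to picture how a domain expert's judgment can be expressed through data is to write the decision down once as an explicit labeling guideline and apply it to every example. The sketch below assumes a single scratch-length threshold; the class name `ScratchGuideline` and the 0.3mm value are hypothetical, chosen only for illustration, and a real factory would set its own rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScratchGuideline:
    """A labeling rule agreed on by the factory's inspectors (hypothetical)."""

    min_defect_length_mm: float  # threshold chosen by the domain expert

    def label(self, scratch_length_mm: float) -> str:
        # Scratches at or above the threshold count as defects.
        if scratch_length_mm >= self.min_defect_length_mm:
            return "defect"
        return "ok"


# Assumed threshold: under this rule a 0.2mm scratch does not count as a defect.
guideline = ScratchGuideline(min_defect_length_mm=0.3)

measured_lengths_mm = [0.1, 0.2, 0.5]
labels = [guideline.label(length) for length in measured_lengths_mm]
```

Because every example is labeled by the same rule, the resulting dataset stays consistent even when several inspectors contribute, and changing the expert's decision means changing one number rather than rewriting software.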