The Rise of Data-Centric AI

Read more about author David Willingham.

Data-centric AI is gaining momentum among engineers. Traditionally, engineers have taken a model-centric approach to improving accuracy across a variety of applications, but the growing volume of data available today and the benefits of using reliable data are leading them to reevaluate their priorities and workflows. Because a model’s performance depends so heavily on the quality of the data it is trained with, this focus on data lets engineers improve model accuracy without the circular process of constantly tweaking parameters.

By improving data quality and model accuracy, data-centric AI allows for new areas of application and opens new opportunities in the field of engineering – from 5G communications to LiDAR, medical device imaging, state of charge estimations, and many more. 

While careful data examination has always proven critical to successful modeling, the modern challenge lies in determining how data-centric AI should advance to solve specific application problems, and what techniques and tools are available to do so. Data-centric AI gives engineers access to new capabilities both in terms of the answers that can be found and the issues that can be addressed.

Best Practices for Data-Centric AI 

To achieve accurate results, engineers increasingly emphasize improving the quality of the data fed into a model. But as data-centric AI continues to drive improved model outcomes, it’s important to note that there are currently no universal standards for how much data is needed to maintain a successful AI model. Engineers must therefore remember that data-centric AI is dynamic: needs vary by application, desired result, and industry.

Ultimately, this calls for a multifaceted approach to data optimization. As more engineers implement data-centric AI in their operations, the industry is settling on several best practices to ensure accuracy. These include reduced-order modeling, data synchronization, digital predistortion, and image object detection.

With a renewed emphasis on the data being fed into a model, engineers are increasingly turning to reduced-order modeling. This allows a model to run faster and use fewer computational resources. Data quality is maintained, at the cost of some minor loss of fidelity. For image classification, engineers can close gaps in training data by retaking images or augmenting the originals to produce new copies, ensuring there is enough volume to train models sufficiently. Data synchronization, in turn, ensures that the data used aligns with the specific needs of the application. If engineers build an AI model that makes hourly predictions, it will require hourly data inputs to train and guide its performance.
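The hourly example above can be sketched in a few lines. This is a minimal illustration, assuming the raw data arrives as irregularly spaced (timestamp, value) pairs; the function name `resample_hourly` and the sample readings are hypothetical, and averaging within each hour is just one possible alignment strategy.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def resample_hourly(readings):
    """Average irregular (timestamp, value) readings into one value per hour.

    Returns a list of (hour_start, mean_value) tuples sorted by time.
    Hours with no readings are omitted rather than filled in.
    """
    buckets = defaultdict(list)
    for timestamp, value in readings:
        # Truncate each timestamp to the start of its hour.
        hour_start = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return [(hour, mean(values)) for hour, values in sorted(buckets.items())]


# Hypothetical sensor log sampled at uneven intervals (values are made up).
raw_readings = [
    (datetime(2023, 1, 1, 9, 5), 20.0),
    (datetime(2023, 1, 1, 9, 40), 22.0),
    (datetime(2023, 1, 1, 10, 15), 25.0),
    (datetime(2023, 1, 1, 12, 30), 19.0),
]

hourly = resample_hourly(raw_readings)
for hour, value in hourly:
    print(hour.isoformat(), value)
```

Note that the 11:00 hour has no readings and is left out of the result. Whether to interpolate, carry the last value forward, or drop such gaps depends on the application.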

As data quality improves, so too will engineers’ ability to tackle bias. Improved data makes it easier to recognize bias, giving engineers the insight needed to collect enough data to produce a representative outcome in vital fields like health care.
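One simple way to start looking for this kind of bias is to check how well each group is represented in the training set. The sketch below assumes each sample carries a group label; the function name `underrepresented_groups`, the 20% threshold, and the age bands are hypothetical choices for illustration, not an established standard.

```python
from collections import Counter


def underrepresented_groups(labels, min_share=0.2):
    """Return groups whose share of the samples falls below min_share."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {
        group: count / total
        for group, count in counts.items()
        if count / total < min_share
    }


# Hypothetical age-band labels attached to a medical imaging dataset.
sample_groups = ["18-40"] * 50 + ["41-65"] * 40 + ["65+"] * 10
flagged = underrepresented_groups(sample_groups, min_share=0.2)
print(flagged)
```

A flagged group is a prompt to collect or augment more data for it. A balanced count alone does not guarantee an unbiased model.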

New Areas for Innovation 

This increased focus on data, and the improved model outcomes that it brings, has carried data-centric AI into niche applications across industries. Wireless is one such example. Here, data optimization techniques have brought a new approach to designing digital predistortion filters, which proactively modify signals to reach an acceptable noise level in the presence of competing signals. In LiDAR, use cases show how data-centric AI can evaluate and clean error-prone sensor data, bringing sensors closer to their intended function and performance levels. In doing so, engineers can correct data that is live and operational but crucially incorrect.
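To make the idea of cleaning error-prone sensor data concrete, here is a minimal sketch that replaces spurious range readings with a local median. The rolling median filter is a stand-in chosen for illustration, not a method described above, and the function name `clean_ranges`, its thresholds, and the sample scan are all hypothetical.

```python
from statistics import median


def clean_ranges(ranges, window=5, max_deviation=1.0):
    """Replace range readings that deviate sharply from their local median.

    ranges: distances in meters from consecutive beams or time steps.
    window: odd number of neighboring readings used for the local median.
    max_deviation: largest allowed difference (meters) from the local median.
    """
    half = window // 2
    cleaned = []
    for i, value in enumerate(ranges):
        # Read from the original list so one spike cannot contaminate neighbors.
        neighborhood = ranges[max(0, i - half): i + half + 1]
        local_median = median(neighborhood)
        if abs(value - local_median) > max_deviation:
            cleaned.append(local_median)
        else:
            cleaned.append(value)
    return cleaned


# Hypothetical scan with one spurious return (42.0 m) among ~10 m readings.
scan = [10.0, 10.1, 10.2, 42.0, 10.3, 10.4, 10.5]
cleaned_scan = clean_ranges(scan)
print(cleaned_scan)
```

Only the reading that departs from its neighbors by more than `max_deviation` is changed; the rest of the scan passes through untouched.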

In health care, medical device imaging is also embracing this approach. By pairing image and signal data, engineers can adjust 3D imaging machines to drive more tailored and accurate tumor analysis and lung health measurement, with additional applications for COVID-19 screening.

Another example is automotive engineering, where data-centric AI is applied to build a clearer picture of battery sensor data such as voltage and average temperature. This enables a better state of charge estimation, which constitutes a vital component in the design and improvement of electric car batteries.

A number of experiment-based and data preparation tools can help engineers bring a data-centric approach to their AI models. Data-centric AI brings data and data modification to the forefront of the design process, as model code remains mostly constant. Engineers find value in data preparation apps that enable quick and automated data labeling, along with pre-processing libraries often used in applications relying on signal data.

Fueling a Data-Centric Future 

As research into data-centric AI continues, another factor of this evolution is the need for greater levels of collaboration between multidisciplinary teams. 

With an added focus on technique, engineers should be cognizant that efficient modeling still requires close engagement between the data scientists leading modeling efforts and the engineers who supply the data needed to make them work. By showing how data can be enriched to support a model that the engineers themselves may not be building, data-centric AI provides a route to collaboration for multidisciplinary teams.

Engineers across industries are accelerating their use of data-centric AI, leading to improved data quality and model accuracy. In addition to this increased accuracy across an expanded range of applications, data-centric AI has the potential to drive a greater impact on society through its increased use and push for collaboration. 
