Moving from AI Science Experiment to Operationalized Pipeline

By Dipti Borkar

Organizations have spent the past few years educating themselves on various Artificial Intelligence frameworks and tools. But for AI to become mainstream, it will need to move beyond the small-scale and often ad hoc experiments run by Data Scientists to an automated pipeline whose inferences and results are directly usable by business analysts. Technology advances on related fronts have now made that possible.

The first generation of technologies used for data-driven Machine Learning and Deep Learning on structured datasets forced Data Scientists to spend too much time configuring and administering databases and Data Management systems, allowing less time for building and coding algorithms. Business analysts attempting to incorporate AI findings into their preferred platforms, tools and languages have been experiencing similar difficulties, especially when running AI frameworks on larger datasets.

These challenges have created a fairly wide chasm that continues to divide these two user domains, perpetuating the need for a pipeline that starts with Business Intelligence but also incorporates Data Science, along with the need to share and manage data between them. Working in siloed environments leads to duplicated data and inefficiencies, and makes it difficult to derive insights from both human and artificial intelligence.

To integrate data-driven AI into operationalized pipelines, organizations will need a single platform capable of streamlining, automating and managing the entire Machine Learning and Deep Learning lifecycle. The platform will also need to support all of the custom, open source and packaged software now being used by both Data Scientists and business analysts, such as Python and SQL, respectively.

One approach that is capable of satisfying these needs is GPU-optimizing the analytical database to support both AI and BI workloads. The massive parallel processing power of the GPU makes it ideal for the compute-intensive vector and matrix operations found in Machine Learning and Deep Learning algorithms. Some of these solutions support running popular Machine Learning frameworks like TensorFlow and Caffe directly within the database in a distributed manner. The better platforms also support user-defined functions that make it easy to run custom frameworks as a part of the pipeline. UDFs provide direct access to the NVIDIA CUDA® API, enabling them to take full advantage of the GPU’s power.
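To make the GPU's advantage concrete, the sketch below shows the kind of vector and matrix operation that dominates Machine Learning workloads: a fully connected layer's forward pass. This is illustrative only; a GPU-optimized database or a framework like TensorFlow would parallelize this product across thousands of cores, while NumPy on the CPU is used here simply so the example is self-contained and runnable.

```python
import numpy as np

def dense_forward(x, weights, bias):
    """One fully connected layer: a matrix product plus a bias, then ReLU.

    The matrix multiply (x @ weights) is the compute-intensive step that
    GPUs accelerate by executing many multiply-accumulates in parallel.
    """
    z = x @ weights + bias
    return np.maximum(z, 0.0)  # element-wise activation

rng = np.random.default_rng(0)
batch = rng.standard_normal((32, 128))    # 32 input rows, 128 features each
w = rng.standard_normal((128, 64)) * 0.1  # layer weights (128 -> 64)
b = np.zeros(64)                          # layer bias

out = dense_forward(batch, w, b)
print(out.shape)  # (32, 64)
```

In a Deep Learning model this operation repeats for every layer and every batch, which is why moving it onto the GPU, close to where the data lives, pays off.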

The GPU-optimized analytics database is capable of delivering what is needed to automate and operationalize all processes in the pipeline in ways that enable Data Scientists to focus on the algorithms, and business analysts to apply the full potential of AI to benefit the organization.

Getting ROI from AI

Given AI's enormous potential, investments in it can be expected to increase in 2018. The ability to operationalize the entire pipeline with GPU-optimized analytics databases now makes it possible to bring AI to Business Intelligence cost-effectively. And this will enable the organization to begin realizing a satisfactory ROI on these and prior investments.

Examples of high-ROI applications include customer experience and relationship optimization, product recommendations in retail sales, and value-at-risk analysis for financial services. AI is well suited to analyzing the tsunami of data currently available for these and other applications. But to get the best results, it is important to operationalize the AI framework on the entire data corpus, rather than just a small subset, and to converge or integrate the results into the BI framework. Doing so ensures you capture the potential value hidden deep within your data—value that may otherwise go unnoticed.
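As a concrete instance of the financial-services use case mentioned above, the sketch below computes a historical value-at-risk figure from a series of daily returns. The synthetic returns and the 95% confidence level are assumptions for illustration; in an operationalized pipeline the same percentile computation would run in-database over the full return history rather than over a sample.

```python
import numpy as np

def historical_var(returns, confidence=0.95):
    """Historical value-at-risk: the loss threshold exceeded on only
    (1 - confidence) of observed days, reported as a positive number."""
    return -np.percentile(returns, 100 * (1 - confidence))

# Synthetic daily returns standing in for a real portfolio history.
rng = np.random.default_rng(42)
daily_returns = rng.normal(loc=0.0005, scale=0.01, size=10_000)

var_95 = historical_var(daily_returns, confidence=0.95)
print(f"1-day 95% VaR: {var_95:.4f}")
```

The point of running this on the entire corpus rather than a subset is that tail percentiles like VaR are exactly the statistics a small sample estimates worst.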
