AI-Enabled Business: A Data-Centric View

September 4, 2018 | Debiprasad Banerjee

A quick recap: the previous discussion argued that data-driven decision-making is the future of all successful enterprises. We also touched upon the importance of data quality for enterprises that want to adopt AI successfully as part of their strategy. In this section, we discuss data and the challenges associated with using it for AI applications.

Today, businesses worldwide are generating unprecedented amounts of data, and because storage is cheap and abundant, most of it is available online or in real time. However, can AI applications use the available data in its current form? And if not, what does it take to get there? But first, a little more background for those who are newer to the AI world.

Understanding the Need for AI Applications in Business

While deploying an AI application to solve a business problem, it is essential to start with three basic questions. Like all important things in life, they may sound simplistic, but they are crucial for the project’s success.

  1. Define the problem – What do you want the AI application to do?
  2. Output to Outcome – How will you use the AI output to improve your business outcome?
  3. Check the Data – Do you have the appropriate data set to solve this problem?

While the first two can be challenging to answer, let us focus on the third one.
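
As a rough illustration of what "checking the data" can mean in practice, the sketch below measures how complete a small set of records is. The field names (customer_id, region, monthly_spend) and the sample records are hypothetical and invented purely for this example; a real check would also look at volume, accuracy, and relevance to the problem.

```python
# Hypothetical records; the field names are made up for illustration.
REQUIRED_FIELDS = ("customer_id", "region", "monthly_spend")

records = [
    {"customer_id": 1, "region": "EU", "monthly_spend": 120.0},
    {"customer_id": 2, "region": None, "monthly_spend": 80.5},
    {"customer_id": 3, "region": "US", "monthly_spend": None},
    {"customer_id": 4, "region": "US", "monthly_spend": 45.0},
]


def completeness_report(rows, required_fields):
    """Return, per field, the fraction of rows with a non-missing value,
    plus the fraction of rows that are complete across all fields."""
    total = len(rows)
    if total == 0:
        raise ValueError("no records to check")
    report = {}
    for field in required_fields:
        present = sum(1 for row in rows if row.get(field) is not None)
        report[field] = present / total
    # A row counts as complete only if every required field is present.
    complete = sum(
        1 for row in rows
        if all(row.get(f) is not None for f in required_fields)
    )
    report["all_fields"] = complete / total
    return report


report = completeness_report(records, REQUIRED_FIELDS)
for name, share in report.items():
    print(f"{name}: {share:.0%} present")
```

In this toy data set, only half of the records are usable as-is, which is the kind of finding that should surface before any algorithm is chosen.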

At this stage, it is necessary to distinguish between two specific types of outcomes (there are others) that a business may want from an AI application. The first is to augment humans, and the second is to replace them.

A simple example demonstrates the difference. Consider large sets of data from all possible business areas brought together in one place to look for hidden insights on customer behavior, sales projections, and so on. A human being looking at this would probably do an OK job at best, whereas a well-chosen AI algorithm would typically be far faster and more effective. Now consider an existing process where humans do routine tasks that are repetitive and boring but rely on some distinctly human faculty, such as looking at an image to derive meaning or listening to a conversation to respond appropriately. Using an AI application to replace the human in this workflow, partially or totally, is not a simple task. It is far more complicated than the prior example of parsing data.

As a result, selecting the correct AI algorithm, picking and curating the appropriate data set to train it, deploying the AI, and putting a continuous learning loop in place all differ greatly in scale and complexity between the two examples. The AI used in the first example is generally called Machine Learning (ML), and the second typically relies on a specific subset of ML called Deep Learning (DL). In both cases, the challenges related to the data are probably the trickiest to handle, and even more so for DL.

Concept of Supervised and Unsupervised Learning

Before we look at the actual data and the challenges it poses, there is one more important concept to discuss: supervised and unsupervised learning. A large proportion of the AI applications used in business today require definitive outputs, which are best produced by tasks such as regression or classification. These use the supervised learning method. Unsupervised learning algorithms, by contrast, solve problems such as clustering, where no labels are provided.

In supervised learning, as the name suggests, we tell the AI algorithm what each input means, that is, which meaning (prediction/inference/class) should be derived from it. For example, if the input is an image of a cat, we also need to provide an additional input data field (a label) along with the image that says this image is a cat. If you extend this to the large and complex data sets found in real-life business situations, it is easy to see how quickly this becomes an enormous exercise in data tagging/labeling that must be completed before we can feed the data to any AI application. This labeling then becomes part of the whole data cleansing and preparation exercise.
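
To make the distinction concrete, here is a minimal sketch in plain Python. The 2-D feature values and the "cat"/"dog" labels are invented for illustration, and the two methods shown (a nearest-centroid classifier for the supervised case, a basic k-means for the unsupervised case) are simple stand-ins chosen for brevity, not a recommendation for real workloads. The point to notice is that the supervised function needs a label for every example, while the unsupervised one never sees a label.

```python
import math

# Toy 2-D feature vectors; values and labels are invented for illustration.
labeled = [
    ((1.0, 1.2), "cat"),
    ((0.8, 1.0), "cat"),
    ((1.1, 0.9), "cat"),
    ((4.0, 4.2), "dog"),
    ((4.1, 3.9), "dog"),
    ((3.8, 4.0), "dog"),
]


def mean(points):
    """Component-wise mean of a non-empty list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def nearest(point, centers):
    """Key of the center closest to `point` (Euclidean distance)."""
    return min(centers, key=lambda k: math.dist(point, centers[k]))


def fit_supervised(examples):
    """Nearest-centroid classifier: one centroid per provided label."""
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {label: mean(pts) for label, pts in by_label.items()}


def fit_unsupervised(points, k=2, iterations=10):
    """Basic k-means: groups points without ever seeing a label."""
    # Deterministic start: pick k evenly spaced points as initial centers.
    step = len(points) // k
    centers = {i: points[i * step] for i in range(k)}
    for _ in range(iterations):
        groups = {i: [] for i in centers}
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a group ends up empty.
        centers = {i: mean(g) if g else centers[i] for i, g in groups.items()}
    return centers


# Supervised: labels go in, and a label comes out for a new input.
classifier = fit_supervised(labeled)
print(nearest((1.0, 1.0), classifier))

# Unsupervised: only features go in; the output is an anonymous group id.
unlabeled = [features for features, _ in labeled]
clusters = fit_unsupervised(unlabeled, k=2)
assignments = [nearest(p, clusters) for p in unlabeled]
print(assignments)
```

The unsupervised run can separate the two groups here, but it can only call them group 0 and group 1. Attaching business meaning to those groups, or training a model that outputs "cat" directly, still requires someone to supply labels.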


We will take a closer look at the data pipeline and its management in the upcoming discussion. Based on what we have discussed so far, it seems likely that we should devote considerable resources to preparing the data if we want a technology-led transformation in which AI supports data-driven decisions.
