The Challenge with ‘AI for Business’
There is a recent trend of business leaders who say “we need to get AI in our business” to remain competitive, but this realization is only a first step. This post cuts through some of the hype and buzz to help technology decision makers adopt a healthy data strategy and thrive in a digital world.
I will show you a realistic approach to applying AI. AI can provide enormous competitive value, but treating it as a holy grail can lead a business to miss alternative analysis options.
Problem 1 — Understand the Language
I see a lot of buzz-wordy language thrown around by people who may not understand it. We can’t talk about anything if we don’t first know what the words mean.
- Data Science — the overall discipline of extracting insight from data sets, often large ones, using statistical, mathematical, and AI methods. People in this discipline are referred to as data scientists, and are often PhDs with a background in mathematics, statistics, and even physics.
- AI (artificial intelligence) — refers to the broad science of intelligence demonstrated by machines (as opposed to natural intelligence demonstrated by humans).
- Machine Learning — a subset of AI in which algorithms learn from data to become progressively better at a specific task, without being explicitly programmed for it. I highly recommend this fun 9-minute animation to understand this key concept, as well as its 2-minute footnote.
- Supervised vs. Unsupervised Learning — in machine learning, supervised learning is where labelled examples (training data for which the correct output is already known) are provided to the algorithm so that it learns to predict outputs for new inputs. Unsupervised learning is where the data carries no labels and the algorithm looks for structure on its own, which makes it excellent at finding previously unseen categories within data sets.
- Artificial Neural Networks — a popular subset of machine learning algorithms that are vaguely inspired by the structure of biological neural networks.
- Deep Learning — neural networks with many hidden layers between input and output (the “deep” refers to the number of layers). Popular real-life applications are computer vision and speech recognition.
- Big Data — refers to data sets that are so big and complex that traditional data processing approaches become inadequate. Machine learning may still be effectively used to analyze them, and has successfully been used to learn from these previously overwhelming data sets.
These are just the most popular terms in tech media at the moment; there are many other machine learning algorithms.
You don’t need to remember or understand all of these terms. Most importantly, what people tend to mean when they refer to AI is the use of machine learning algorithms to analyze large data sets to get predictions about their business.
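To make the supervised vs. unsupervised distinction concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the customer spend figures are made up, and the function names (`train_supervised`, `predict`, `cluster_unsupervised`) are hypothetical. The supervised half learns from examples that already carry a label; the unsupervised half gets the same numbers without labels and has to find the two groups by itself.

```python
from statistics import mean

# Hypothetical toy data: monthly spend per customer (made-up numbers).
labelled = [(12.0, "low"), (15.0, "low"), (18.0, "low"),
            (80.0, "high"), (95.0, "high"), (110.0, "high")]
unlabelled = [value for value, _ in labelled]  # same values, labels removed


def train_supervised(examples):
    """Learn one average value (a centroid) per known label."""
    by_label = {}
    for value, label in examples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Return the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def cluster_unsupervised(values, rounds=10):
    """Split values into two groups (1-D k-means with k=2); no labels used."""
    low, high = min(values), max(values)  # starting guesses for the centres
    for _ in range(rounds):
        group_a = [v for v in values if abs(v - low) <= abs(v - high)]
        group_b = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(group_a), mean(group_b)
    return sorted(group_a), sorted(group_b)


centroids = train_supervised(labelled)
prediction = predict(centroids, 90.0)       # label for a new customer
groups = cluster_unsupervised(unlabelled)   # groups found without labels
print(prediction)
print(groups)
```

Note that the unsupervised version finds the groups but cannot name them. Deciding that one group means "high-value customers" is still a human, business-side judgment.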
Problem 2 — Data Should Be First
Business leaders may think they need to develop machine learning capabilities, but really they first need to adopt a data-driven approach to their business activities. This means collecting, cleaning, organizing and labelling their most valuable data. This is usually the most effort- and cost-intensive activity in data science, as most data is a mess.
Once a business gets a handle on its data sets, many opportunities for data scientists become available:
- reporting and dashboards (tracking what’s going on in the business)
- visualizations (more complex representations of what is happening in a specific area; like graphs on steroids)
- statistics and probability (what is happening across a data set, and what is the probability of an event happening in certain conditions?)
- modelling (taking a series of hypotheses, and predicting future results based on large inputs of data)
- linear programming / optimization (mathematical models that find the best outcome subject to a set of linear constraints)
- machine learning solutions (including supervised and unsupervised learning, neural nets, deep learning, etc.)
Note that machine learning is only one of the areas listed, and the only one that can be considered AI.
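As an illustration of how far the non-AI options can go, here is a rough sketch of the "statistics and probability" item: a plain conditional probability computed from clean records. The order records and the `probability` helper are made up for this example and are not taken from any real data set.

```python
# Hypothetical order records (made-up data): was the delivery late,
# and did the customer complain?
orders = [
    {"late": True, "complained": True},
    {"late": True, "complained": True},
    {"late": True, "complained": False},
    {"late": True, "complained": True},
    {"late": False, "complained": False},
    {"late": False, "complained": False},
    {"late": False, "complained": True},
    {"late": False, "complained": False},
    {"late": False, "complained": False},
    {"late": False, "complained": False},
]


def probability(records, event, given=None):
    """Share of records where `event` is true, optionally restricted
    to the records where the `given` condition is true."""
    pool = [r for r in records if given is None or r[given]]
    if not pool:
        raise ValueError("no records match the condition")
    return sum(1 for r in pool if r[event]) / len(pool)


overall = probability(orders, "complained")
when_late = probability(orders, "complained", given="late")
print(f"P(complaint) = {overall:.2f}")
print(f"P(complaint | late delivery) = {when_late:.2f}")
```

No machine learning is involved here, yet the gap between the two numbers is already something a business could act on. It only works because the records were collected and labelled consistently in the first place.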
Problem 3 — Where is the Business Value?
Not all problems are worth solving. Applying machine learning to a data set for which there is no way to drive better business decisions is a waste of time. This is why it’s important to teach data scientists the ins and outs of the business they are analyzing. Discussions of machine learning opportunities should include all of the technical staff (scientists, engineers, operations, administrators) AND all of the business staff (management, financial, leadership).
A good data scientist will start by asking “What is the purpose of this analysis?” If we find the answer we are looking for, what will we do differently? And if we do change something, how much real impact will that have? Once these points are understood, the team can decide whether it’s worthwhile to perform the heavy lifting of the analysis in the first place.
The technical approach should be chosen only once all involved data sets (business, operational, technical, etc.) are understood. This means knowing what sort of data they contain, how reliable that data is, how it is structured, and so on. The final approach may use traditional methods, one or more machine learning algorithms, or a combination. It should be expected and welcomed that each situation will be best suited to a different approach.
Problem 4 — Difficulty to Prototype
Machine learning by nature is very hard for a human to understand. It represents a type of learning from data that machines are good at, by repeating millions of math problems very quickly. By extension, humans are NOT good at this type of task (otherwise, we’d just call it “learning”). This can make it difficult to know when a problem is well suited to a machine learning approach. It also means both humans and machines can be good at something (like image classification) but approach it completely differently.
One of the key differences between machine learning and traditional approaches is the ease of prototyping. Almost everyone is familiar with Excel, and ambitious members of your business may hack together data-driven Excel prototypes that end up driving real business value and saving costs. They can make reports and dashboards, perform statistical analysis, build models, and connect them together.
However, machine learning solutions cannot realistically be prototyped in Excel. They require specialized data science knowledge and specialized programming languages and toolsets such as Python, R, Matlab or Octave. Many people do not have this knowledge, and many companies don’t have these tools or don’t allow them to be installed. As a result, AI solutions will not develop organically in an organization, even one that naturally develops other traditional solutions.
Problem 5 — Some Models are Bad
One risk in machine learning is that all models will give answers, sometimes with very high mathematical confidence, even if the underlying assumptions are poor. This can lead to damaging and even fatal business decisions being made in the real world. The larger the business impact of a decision, the more certain management should be that the assumptions in the machine learning models represent the reality of their business world.
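Here is a small sketch of that risk, using a deliberately simple model rather than a full machine learning pipeline. All of it is assumed for illustration: the "true" process is invented (growth that follows a square law), and `fit_line` is a hypothetical helper that fits a straight line by ordinary least squares. The fit looks excellent on the six months we observed, and the forecast is still badly wrong, because the straight-line assumption does not match reality.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns slope, intercept, R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def true_process(month):
    """Made-up 'reality' for this example: growth is curved, not linear."""
    return month * month


months = [1, 2, 3, 4, 5, 6]
observed = [true_process(m) for m in months]

slope, intercept, r_squared = fit_line(months, observed)

forecast_month = 20
predicted = slope * forecast_month + intercept
actual = true_process(forecast_month)

print(f"R^2 on observed months: {r_squared:.2f}")
print(f"Forecast for month {forecast_month}: {predicted:.0f}")
print(f"What actually happens:  {actual}")
```

The goodness-of-fit score only says how well the model matches the data it has seen. It says nothing about whether the assumption behind the model holds outside that range, which is exactly where the business decision is usually made.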
Goal: A Data Driven Culture
I’ll leave you with the analysis below, which provides a frank comparison of how digitally mature businesses are in various industries. I challenge you to find your industry and think about what you could do to “go green”.
In my view, your ultimate goal here is to develop a truly data-driven organization, starting from the top. You cannot simply hire a data consultant and be done with it. Everyone from your leadership and management to the scientists and admin folks must be aware of the value of data, know how to keep it structured, clean and organized, and notice which situations call for a data science analysis.
This is the true holy grail, and what informed business leaders must push for in the future.
Bonus: If you want to get deeper on AI, waitbutwhy’s brilliant, hilarious and extremely detailed explainer posts (part 1 and part 2) are the best bang for buck.