AI's Light Bulb Moment and the Opportunities
What happened in (Gen) AI this year and the opportunities ahead
2023 was the year that Artificial Intelligence (AI) took over as the dominant tech trend. It has never been more fashionable for companies to add AI features to their products and then claim to be an AI company, or to remind us that they’ve been using AI all along.
But AI has always been with us; it only just had its light bulb moment this year.
As I close out the year, in this piece I look at what happened in 2023 and the opportunities ahead.
Thomas Edison always liked to go after big things. The American inventor and businessman had sold the quadruplex telegraph for $10,000 ($259,000 in 2023) to his former employer Western Union in 1874 and was finalising his work on the phonograph – a device that recorded sound and could play it back.
But he had his sights on something bigger. His longtime friend had piqued his interest in the newest form of artificial illumination by sending numerous reports and dragging him to see it live in 1876. Straightaway Edison was captivated.
Jill Jonnes narrates this encounter in Empires of Light:
“Edison was now afire with excitement. Ever the competitor, he turned to his host, William Wallace, and said, “I believe I can beat you making the electric light. I do not think you are working in the right direction.” William Wallace, who had been working on arc lights for several years and had his system up and going, was a good sport. He accepted the bet and shook hands on it.
Then Edison rushed back to quiet, bucolic Menlo Park, his research workshop in backwater New Jersey, to throw himself into creating a better and more practical electric light. He worked feverishly, thrilled at the possibilities of this new field. “It was all before me. I saw the thing had not gone so far but that I had a chance. I saw that what had been done had never been made practically useful. The intense light had not been subdivided so that it could be brought into private houses.”
Over the next decade, Edison would work tirelessly, often on the verge of bankruptcy, to win the bet and prove that electricity could be made practically useful and brought into every home.
Thomas Edison didn’t discover electricity, which was already used via battery to power telegraphs and telephones, but his invention of the light bulb — and everything else that made it possible to light a house — made electricity more useful to the layman.
AI: The New Electricity
AI has often been likened to electricity because it changed how the world operates. Just like electricity, AI has applications in every field. The introduction of the light bulb into the home brought about electrical appliances, as brilliant scientists tinkered with how to keep foodstuff cold or the radio playing for longer.
Similarly, American psychologist Frank Rosenblatt's experiment in 1959 on whether a computer programme could learn the difference between pictures of a man and a woman kicked off further enquiries into whether computers can learn to perform intelligent behaviours, which is what AI is all about.
The development of AI was stifled by a lack of computing power and training data, causing it to move slowly over the years until the 2010s, which featured milestone moments such as AlphaGo – an AI programme by Google DeepMind that plays the board game Go – beating a human professional for the first time.
Despite all this progress, AI’s defining moment happened last year with the launch of OpenAI’s ChatGPT.
Within five days of its launch in November 2022, the chatbot had a million users. In two months, a hundred million monthly users were using the platform—a number that has now nearly doubled. Where did this astonishingly successful product come from?
In 2015, Sam Altman, Greg Brockman, Reid Hoffman, Jessica Livingston, Peter Thiel, Elon Musk, Amazon Web Services (AWS), Infosys, and YC Research announced the formation of OpenAI and pledged over $1 billion to the venture.
The company’s initial mission was to create "safe and beneficial" artificial general intelligence, which it defines as "highly autonomous systems that outperform humans at most economically valuable work."
Attention is all you need
Unlike most tech start-ups, OpenAI was established as a nonprofit, with a board responsible for making sure it fulfilled that mission. Efforts in generative AI, a branch of AI dedicated to creating new content, got renewed vigour with the publication of the 2017 research paper “Attention Is All You Need” by Google researchers.
This paper introduced a new architecture called the Transformer and a new way for computers to process information, called attention. Before this paper, computers used something called Recurrent Neural Networks (RNNs) to understand sequences of information. Imagine RNNs as a slow train chugging through the sentence, reading one word at a time. While it works, it's not very efficient and can forget important details at the beginning by the time it reaches the end.
With attention, instead of plodding through the sequence one word at a time, the computer takes in all of it at once and shines its spotlight on different parts of the information, depending on what’s most relevant to the task at hand.
The attention mechanism is powerful because it allows computers to understand information in context, just like humans do. The computer is paying attention to the most important parts of the information and ignoring the rest. By focusing on the key points, the Transformer can understand the meaning of information more accurately and generate better outputs.
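For the curious, the “spotlight” can be written down in a few lines. Below is a minimal NumPy sketch of the scaled dot-product attention at the heart of the Transformer; it is an illustration of the core idea, not the paper’s full multi-head implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Turn raw scores into weights that sum to 1."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each query scores every key,
    and the scores become weights over the values."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # how relevant each word is to each other word
    weights = softmax(scores, axis=-1)  # the "spotlight": each row sums to 1
    return weights @ V                  # a weighted mix of the values

# Toy example: 3 "words", each represented as a 4-dimensional vector
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = attention(x, x, x)  # self-attention: queries, keys, values from the same input
print(out.shape)          # one context-aware vector per word
```

The key property is that every word attends to every other word in a single step, rather than information trickling through the sequence as it does in an RNN.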
Early results from this Transformer architecture were encouraging but not satisfactory. Generative Pre-Trained Transformer (GPT) AI models required more data and better computing hardware.
By 2019, funding from OpenAI’s donors had dried up – major backer Elon Musk left in 2018 – so Altman changed course, spinning off a for-profit entity and raising $1 billion from Microsoft. Successive iterations of GPT models from 2018 (GPT-1) onwards finally bore fruit with the launch of GPT-3.5, which powered the first version of ChatGPT in November 2022.
The AI Craze
The launch of ChatGPT was so impressive that every major company had to modify its plans for 2023 to include some semblance of an AI strategy, or even go as far as hiring a Chief AI Officer to steer it down the right path. Investors responded by pouring over $70 billion into AI startups, up from $24 billion in 2022.
In February 2023, Google responded with the release of its chatbot, Bard; Microsoft integrated OpenAI’s model into its Bing search engine; Meta, which had slowed down on its plans to take us to the Metaverse, gifted the world a free AI foundational model (Llama); and OpenAI rival Anthropic released Claude, a “next generation AI assistant”, in March.
Later in the year, IBM, Oracle and Amazon rolled out their generative AI solutions targeted at enterprise clients.
Apple has notably been absent from the AI craze, choosing instead to remind its customers that it has always been using AI. "I think the first thing to know is that if you're an Apple customer today, AI is in all of the products that we produce," Apple CEO Tim Cook said in a recent interview.
“In a very significant way. We don't label it as such. If you're composing a message or an email on the phone, you'll see predictive typing tries to predict your next word so you can quickly choose the word, that's AI….What has gathered people's imagination, I think more recently, is generative AI.”
Rightly so: AI itself isn’t new. It was already such a routine part of our lives that we hardly recognised or paid attention to it. Popular examples include YouTube’s insanely good algorithm that recommends videos to us, delightful conversations with Siri or Alexa, and the automatic detection of fraudulent activity by financial service companies.
What was different was that for the first time, everyone felt like they could see a tangible benefit of AI, like the light bulb.
Why Gen AI Matters
The rise of generative AI meant that this was the year we learned to “communicate, create, cheat, and collaborate with robots,” according to the New Yorker.
Generative AI models are created by a conceptually simple process: large portions of the world’s information on the Internet are gathered and analysed with a kind of algorithm that loosely mimics the human brain.
Based on the statistical patterns in the dataset analysed, the system can compose words, images or audio according to which words, phrases, images, and sounds typically belong together. Think of it as a sophisticated version of the auto-complete you see when typing. Some parts of this process remain a bit mysterious, and as such the AI system is known to make things up on its own, a behaviour called hallucination.
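To make the auto-complete analogy concrete, here is a toy sketch of composing text statistically from which words typically follow which. It uses a bigram model, vastly simpler than a real large language model, but the principle of sampling a likely next word is the same.

```python
from collections import Counter, defaultdict
import random

# A tiny corpus standing in for "large portions of the world's information"
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word typically follows which: the simplest
# statistical version of auto-complete
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def complete(word, steps=4, seed=0):
    """Extend a prompt by repeatedly sampling a statistically likely next word."""
    rng = random.Random(seed)
    out = [word]
    for _ in range(steps):
        counts = following.get(out[-1])
        if not counts:
            break  # nothing in the data ever followed this word
        words, weights = zip(*counts.items())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)

print(complete("the"))
```

A real model replaces the word-count table with a neural network over billions of tokens, but it is still, at bottom, choosing continuations that are statistically plausible, which is also why it can confidently produce plausible-sounding falsehoods.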
Why does Gen AI matter? I’ll let tech analyst Ben Thompson explain:
“The evolution of human communication has been about removing whatever bottleneck is in this value chain. Before humans could write, information could only be conveyed orally; that meant that the creation, vocalization, delivery, and consumption of an idea were all one-and-the-same. Writing, though, unbundled consumption, increasing the number of people who could consume an idea.
Now the new bottleneck was duplication: to reach more people whatever was written had to be painstakingly duplicated by hand, which dramatically limited what ideas were recorded and preserved. The printing press removed this bottleneck, dramatically increasing the number of ideas that could be economically distributed:
The new bottleneck was distribution, which is to say this was the new place to make money; thus the aforementioned profitability of newspapers. That bottleneck, though, was removed by the Internet, which made distribution free and available to anyone.
What remains is one final bundle: the creation and substantiation of an idea. To use myself as an example, I have plenty of ideas, and thanks to the Internet, the ability to distribute them around the globe; however, I still need to write them down, just as an artist needs to create an image, or a musician needs to write a song. What is becoming increasingly clear, though, is that this too is a bottleneck that is on the verge of being removed.”
The cost of content creation has been effectively lowered to zero. You might snicker at the quality of content produced but remember it only keeps getting better.
Opportunities in GenAI
While some look at this wave through the lens of a doom scenario in which AI takes people’s jobs, it’s inevitable; as such, we’re better off looking into how best to use it to improve our lives.
An obvious way to approach this is to look at how AI is used in different industries. In healthcare, AI algorithms can analyse medical images and data to identify diseases with greater accuracy and speed, aiding early diagnosis and treatment. In manufacturing, machines don’t have to shut down unexpectedly: AI can analyse sensor data from machinery to predict potential failures and schedule maintenance before breakdowns occur, reducing downtime and costs.
I could go on and on, but I’d rather take a different view: the generative AI tech stack. Building generative AI applications involves a complex interplay of different components, often visualised as a "tech stack." Let’s dive into its three core layers: data, infrastructure, and AI models/platforms.
Data
Data is the lifeblood of AI models, just like petrol is to a car. Imagine an AI model as a newborn baby. It has the potential to learn and grow, but it needs the right environment and experiences to do so. Data is that environment. By feeding an AI model massive amounts of data, we're essentially showing it the world and teaching it how to function within it. The more data it sees, the better it can understand patterns, make connections, and ultimately perform its intended task.
The quality of the data also directly impacts the quality of the AI model. If you feed a model with inaccurate or incomplete data, it will learn inaccurate or incomplete things.
While it's not clear exactly which data sources proprietary AI models such as OpenAI’s use, most companies have admitted to using data from openly accessible websites such as Wikipedia and social media platforms like Reddit. For context, GPT-2 was trained partly on slightly over 8 million documents totalling 40 gigabytes of text, while Meta’s Llama 2 AI model was trained on trillions of tokens (roughly: word fragments and punctuation).
Creators and online platforms are pushing back against their content being scraped by AI models for free. But it will be difficult to prove whether their works were used (in a sea of abundant information) and what truly counts as copyright infringement. Even if they eventually win, it won’t stop the movement, as AI chatbots can simply get information by asking their users questions.
As New York-based writer and programmer James Somers forecasts:
If it can’t find the right kind of training data, a chatbot might solicit it. I imagine a conversation with some future version of ChatGPT in which, after a period of inactivity, it starts asking me questions. Perhaps, having observed my own questions and follow-ups, it will have developed an idea of what I know about. “You’re a programmer and a writer, aren’t you?” it might say to me. Sure, I’ll respond. “I thought so! I’m trying to get better at technical writing. I wonder if you could help me decide which of the following sentences is best?” Such an A.I. might ask my sister, who works at a construction company, about what’s going on in the local lumber market; it could ask my doctor friend, who does research on cancer, whether he could clear up something in a recent Nature paper.
The opportunity here is in safeguarding proprietary information and building a niche AI model on it. For example, most AI models are trained almost solely on English-language text. What happens if you could train yours on a local language? Or train a niche AI model on content not available on the web? Archivi.ng, which is archiving old Nigerian newspapers and making them accessible to everyone, is better placed than most to create such a niche AI model.
Infrastructure
Training complex AI models requires immense computational power to crunch through gigabytes, terabytes, or even petabytes of training datasets.
This is where powerhouses like Nvidia, AMD and others come in with their superior data centres and chips. When it comes to AI chips, Nvidia is ahead of the pack and it’s difficult to see anyone beating them anytime soon.
Founded in 1993 by Jensen Huang (Nvidia’s CEO) and two others, Nvidia’s claim to fame goes back to its GeForce series, launched in 1999. Stephen Witt of the New Yorker explains:
In 1999, the company, shortly after going public, introduced a graphics card called GeForce, which Dan Vivoli, the company’s head of marketing, called a “graphics-processing unit.” (“We invented the category so we could be the leader in it,” Vivoli said.)
Unlike general-purpose C.P.U.s, the G.P.U. breaks complex mathematical tasks apart into small calculations, then processes them all at once, in a method known as parallel computing. A C.P.U. functions like a delivery truck, dropping off one package at a time; a G.P.U. is more like a fleet of motorcycles spreading across a city.
What truly prepared Nvidia for the AI revolution was CUDA, its software platform for general-purpose computing on GPUs, which made its GeForce chips useful far beyond graphics. But the general public didn’t seem interested until AI researchers picked it up. In 2009, a research group led by Geoffrey Hinton, a professor at the University of Toronto, used Nvidia’s CUDA platform to train a neural network to recognise human speech, and it performed surprisingly well.
Over the next decade, word about Nvidia’s superior chips spread – OpenAI received Nvidia’s first AI supercomputer – and Nvidia doubled down on positioning itself to lead the AI space whenever it came of age.
As the AI era emerged in 2023, the stock market rewarded Nvidia: its share price shot up by about 230%. Its data centre GPU sales rose from $3.6 billion in Q4 2022 to an expected $16 billion in Q4 2023. AMD comes in a distant second, barely appearing to be competition. Notably, Apple has been performing its AI functions on its own chips for years now. This AI-specific hardware is tuned to Apple’s needs and could help Apple emerge as a strong contender in the AI infrastructure space, although it’s unlikely that it’ll sell chips to other players.
Several startups, such as Cerebras, are building chips optimised for generative AI, but Nvidia’s CEO doesn’t think it’s solely a chip problem.
“You can’t solve this new way of doing computing by just redesigning a chip. Every aspect of the computer has fundamentally changed,” Huang said at a recent event. “That’s everything from networking to switching, to the way the computers are designed, to the chips and the software that sits on top of it…It includes a chip but it’s not about that chip.”
There’s a significant barrier in terms of investment cost and technical knowledge preventing new entrants. But keep in mind that IBM used to rule the (personal) computer space and Nokia was once the predominant phone-making company.
AI Models/Platforms
Generative AI platforms typically offer user-friendly interfaces and drag-and-drop features that make it easier to build and deploy AI models (algorithms). For example, ChatGPT is the AI platform, while GPT-4 is the AI model.
This space is led by OpenAI and its largest shareholder Microsoft, which dominate with 39% and 30% market share respectively, according to IoT Analytics. Microsoft’s platform, Azure AI, offers Azure OpenAI, which uses OpenAI’s LLMs but goes beyond the public ChatGPT offering by promising greater data security and custom AI apps. AWS’ Bedrock service, publicly released in September 2023, provides access to models from several AI companies, such as Anthropic, AI21 Labs, and Cohere.
Generally considered a leader in the AI space, Google is at best playing catch-up right now. A few weeks ago, Google released a preview of its new multimodal flagship model, Gemini, which blew our minds until it was revealed that the demo video exaggerated the model’s capabilities.
The AI model/platform space is crowded, and the abundance of open-sourced (free-to-use) foundational models has only made it more so. Given the availability of many general-purpose models, an obvious opportunity is to focus on AI models that perform well on niche information in a particular domain. Think of an AI model that assists with patient diagnosis in hospitals, or one that predicts customer behaviour, identifies cross-selling and upselling opportunities, and automates repetitive tasks for sales teams. This is where the industry-focused use cases of AI come alive. Business-to-business (B2B) models are most prominent among enterprise companies like Amazon, Microsoft, Oracle and IBM, which are better positioned to shine where general-purpose models are found lacking.
However, with OpenAI’s introduction of GPTs and the GPT Store, which let anyone create a tailored version of its chatbot, it’s clear that the general-purpose AI models aren’t resting on their oars.
The biggest opportunity of all, though, is probably off our radar completely. AI’s light bulb moment has been all about the improvement in the usefulness of AI to more people. It’ll take another decade to see how this plays out.