The Failed Promise of Corporate AI Productivity
What to Expect Next on the One-Year Anniversary of GPT-4
The announcement of Microsoft Copilot, merely two days after GPT-4's groundbreaking launch on March 14, 2023, made me bullish. Evidence indicated that developers could double their productivity using AI tools like GitHub Copilot. This led me to foresee a future where AI assistance could similarly boost the efficiency of every office worker. I boldly projected that within 18 to 24 months, we could maintain current output levels with half the workforce. However, a year into this journey, reality has proven otherwise.
Unrealized Expectations
Two significant obstacles have surfaced. The first was access: as a small business owner, I waited an astonishing 10 months to get Microsoft Copilot. The second was the difficulty of seamlessly integrating advanced AI into Microsoft’s decades-old office tools. After 35-40 years of evolution, these tools resist easy enhancement with added intelligence.
According to Microsoft, Copilot users save an average of 14 minutes daily, translating to roughly 1.2 hours per week. These savings are attributed to tasks such as writing, summarizing meetings, and retrieving information:
Write a first draft: from 14 minutes to 8 minutes with Copilot = 6 minutes saved
Summarize a missed meeting: from 43 minutes to 11 minutes = 32 minutes saved
Search for information: from 24 minutes to 18 minutes = 6 minutes saved
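As a rough check on the figures above, the arithmetic can be sketched in a few lines of Python. The 5-day, 40-hour work week is my own assumption, not something Microsoft states, and the function names are purely illustrative.

```python
# Figures quoted from Microsoft in the text above.
MINUTES_SAVED_PER_DAY = 14
TASK_TIMES = {  # task: (minutes without Copilot, minutes with Copilot)
    "Write a first draft": (14, 8),
    "Summarize a missed meeting": (43, 11),
    "Search for information": (24, 18),
}

# Assumptions (not from the source): a 5-day, 40-hour work week.
WORKDAYS_PER_WEEK = 5
HOURS_PER_WEEK = 40


def weekly_hours_saved(minutes_per_day, workdays=WORKDAYS_PER_WEEK):
    """Convert a daily saving in minutes into hours per work week."""
    return minutes_per_day * workdays / 60


def share_of_week(hours_saved, hours_per_week=HOURS_PER_WEEK):
    """Express the weekly saving as a fraction of total working hours."""
    return hours_saved / hours_per_week


def task_savings(task_times):
    """Minutes saved per task: time without Copilot minus time with it."""
    return {task: before - after for task, (before, after) in task_times.items()}


if __name__ == "__main__":
    hours = weekly_hours_saved(MINUTES_SAVED_PER_DAY)
    print(f"Saved per week: {hours:.2f} h "
          f"({share_of_week(hours):.1%} of a {HOURS_PER_WEEK}-hour week)")
    for task, saved in task_savings(TASK_TIMES).items():
        print(f"{task}: {saved} min saved")
```

Under these assumptions, 14 minutes a day comes to about 1.17 hours, or roughly 3% of a working week. That is a long way from the halving of the workforce I had projected.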
Microsoft's calculations appear to assume that users lack access to alternative AI tools. In contrast, my setup includes the small business ChatGPT Team plan, based on GPT-4 Turbo, alongside Google’s Gemini Advanced (built on Gemini Ultra) and the European contender, Le Chat by Mistral. These tools have become integral to my daily operations. My experience with Microsoft Copilot has so far yielded no additional value. Despite having access to all my private files and data, it does not use this information effectively. Admittedly, I have not dedicated substantial effort to maximizing its potential. But when I requested, for example, a summary of my activities on January 17th, 2023, Copilot’s response was incoherent, omitted crucial data sources, and highlighted irrelevant content from random newsletters. Such performance leads me to question whether Copilot truly harnesses GPT-4; at times, it does not even seem to keep up with GPT-3.5.
Although it's too soon to discount Microsoft, the company faces a monumental challenge in unlocking Copilot's full potential, let alone matching standalone GPT-4. So I will be watching for emerging contenders that build office applications, and possibly even operating systems and hardware, from scratch with an AI-first approach.
The Journey Toward General Intelligence
Many organizations, still on the sidelines of fully embracing generative AI, may find solace in the gradual pace of corporate AI integration. Yet, the principle of exponential growth suggests this comfort may be misplaced. Despite the slower-than-anticipated adoption of AI-powered office tools, the landscape is rapidly evolving with daily advancements and updates in AI technology. The march toward Artificial General Intelligence (AGI) is accelerating.
Defining AGI remains a subject of debate, with the threshold for its capabilities continuously rising. Currently, AGI is imagined as having the capability to perform any cognitive task that has economic value, similar to what humans can do. This implies that AI agents could potentially replace every remote worker. Once AGI becomes commercially available, its impact on the market and society is expected to be swift and profound.
On September 18, 2023, an anonymous social media user, "Jimmy Apples," claimed on X that OpenAI had achieved AGI internally. While one might usually be skeptical of information from anonymous online sources, this particular account has accurately predicted numerous OpenAI events, release dates, and product names. This leads me to consider the possibility that AGI could indeed be closer than we think, at least in the lab. Though I wouldn't wager all my savings on the veracity of this claim, I'm also hesitant to dismiss it outright.
This contrasts with a survey of almost 3,000 AI researchers, who predicted a 50% chance of AGI being achieved by 2047. However, the leaders of the major AI companies, including Sam Altman of OpenAI, Demis Hassabis of Google DeepMind, and Dario Amodei of Anthropic, all believe AGI could be realized much sooner, within this decade. Amodei is particularly optimistic, suggesting AGI could emerge in just two years.
In my opinion, the primary obstacle currently is compute. Sam Altman, the head of OpenAI, is reportedly seeking to secure $7 trillion, approximately 6-7% of global GDP, to enhance chip-manufacturing capabilities. If my guess about the architecture of future AI models is correct, it seems plausible that AI firms might initially develop AGI technologies that, due to their immense computational demands, are not feasible for commercial application, potentially delaying their market introduction by years.
A significant milestone on the path to AGI was the unveiling of the text-to-video model Sora by OpenAI on February 15, 2024. Sora has garnered considerable interest for its ability to generate photorealistic, coherent, 60-second videos from just a text prompt. This is a capability I anticipated in my 2024 predictions:
“Advanced Text-to-Video Capabilities: I anticipate an AI model capable of generating 60-second high-quality videos, complete with story, speech, music, and coherent scenes, all from a single text prompt.”
What makes Sora truly remarkable is its underlying grasp of how the world operates. For such a model to generate realistic videos, it must comprehend the laws of physics, the behavior of humans and animals, and even the growth of plants. This depth of understanding appears to be an emergent property of the model, one that only improves with increased scale and computing power. Although Sora is still in the research phase and requires more compute than is currently viable for commercial use, OpenAI's CTO, Mira Murati, aims for a 2024 release date.
This capability suggests that the AI can "think" in a manner akin to human cognition. Presented with an image, for instance, it can infer and simulate potential precursors and outcomes of the captured moment. For example, if it observes a car approaching a puddle next to a sidewalk, the AI can predict the ensuing splash dynamics. Though not yet flawless, it possesses an instinctive understanding of various factors such as the car's mass, tire characteristics, fluid dynamics of the water, and pedestrian awareness, allowing it to make judgments akin to a human's. This intuitive process doesn't rely on breaking down the scenario into complex equations; rather, the AI inherently "knows" the most likely outcome.
AI Development: Replace vs. Enhance
The evolution of AI encompasses two parallel trajectories: the replacement of human roles by AI and the augmentation of human capabilities through AI. In the realm of commercial applications, especially as we edge closer to AGI, the replacement approach is likely to dominate in the long term. Yet defining "long term" in this context, whether it spans months or extends over years, remains difficult. In the interim, AI firms are focusing on enhancement, a strategy evidently reflected in Microsoft branding its AI initiatives under the "Copilot" moniker. Although OpenAI has hinted that its “GPTs” are precursors to fully autonomous agents, the roadmap for this transition remains unclear, which makes a near-term emphasis on enhancement and copilot functionalities pragmatic.
However, a significant development in this area occurred on March 12th, 2024, when Cognition unveiled "Devin," touted as the first AI software engineer. This AI agent is designed to perform the duties of a software engineer and is said to have advanced capabilities in long-term reasoning and planning. These enable Devin to navigate and execute complex engineering projects that involve making thousands of decisions. With the ability to remember pertinent information throughout its tasks, learn progressively, and rectify errors, Devin represents a significant stride towards the AI-driven transformation of the workforce.
The launch of the first commercially useful AI agent was also among my 2024 predictions, and I expect we will see more agents like this during the year:
“Launch of the First Commercially Useful AI Agent: I predict the debut of an AI agent based on a foundational model, capable of performing economically valuable tasks independently within a specialized domain.”
Recommendations for Corporate AI Adoption
At this point, I would advocate for a balanced and pragmatic strategy towards corporate AI adoption. In the face of existing uncertainties, companies could focus on developing a diverse portfolio of initiatives that leverages AI technology. This involves combining "no regret" moves with a series of higher-risk, potentially high-reward ventures.
As detailed in my article "Thinking About Developing an Artificial Intelligence Strategy?", there are four critical areas to focus on:
1. Build the foundation for using AI
2. Strengthen key capabilities with AI
3. Improve everyday efficiency with AI
4. Improve executive decision-making with AI
The "no regret" initiatives primarily involve formulating a comprehensive AI strategy and concentrating efforts on foundational development (1) and enhancing operational efficiency (3). Following this, organizations can venture into more speculative areas (2 and 4) that carry a higher risk but also the potential for significant impact.
In practical terms, this means ensuring widespread access to, and training on, large language models (LLMs) for employees, aiming to streamline meetings, expedite email and report writing, and overall, enhance productivity. Such strategies are likely to yield tangible benefits.
For ventures with higher risk, engage in initiatives that have a 50-50 chance of success, or possibly lower, but offer the potential for significant impact. This might involve exploring the creation of an AI-supported decision-making framework for the executive leadership team or board of directors. It's crucial to establish clear objectives, timelines, and budgets upfront, and to be ready to terminate these projects if they fall short of expectations. Nonetheless, it's essential to systematically capture and repurpose the insights gained from these ventures, ensuring that knowledge is not lost but instead applied to future efforts, regardless of the project's outcome.