
Four Futures for Large Language Models

June 2022 (original Twitter thread)

Almost two years after GPT-3, we've seen continued scaling of large language models, with multiple firms joining the fray

Here are four possible futures for how these competitive dynamics might shake out over the next few years:

1) Commoditization

Current LLMs are mostly trained on publicly available sources, like Wikipedia, blog posts, GitHub, and online books.

GPT-3 was trained on 570GB of such text (for perspective, that's just over half of a 1TB microSD card).

If this continues, LLMs may end up commoditized, with mostly interchangeable models available from multiple providers

Firms might gain a temporary edge by scaling up their model size, training data, context length, or retrieval bank

(The challenges here are nothing to sneeze at, and will likely pose real barriers to entry)

But if others can quickly follow suit, this wouldn't fundamentally alter the competitive landscape

2) Market specialization via private data

To stave off commoditization, firms might focus on building LLMs for specific applications where private data gives a competitive edge

For example, a software company with a large, private codebase might build superior code LLMs

Similarly, a hospital system with a large database of electronic health records (EHRs) may have an edge when building a medical LLM

And a company with a messaging app may be able to build a better LLM chatbot

Unique sources of unlabeled data are likely to become increasingly important for differentiation

3) Dominance through Data Flywheels (aka "Neural Network Effects")

Another way to prevent commoditization is to build data flywheels, where user behavior creates unique training data not accessible to competitors

For example, when developers use an LLM-powered tool like GitHub Copilot, they can accept or reject its proposed completions

This produces training data that creates a powerful feedback loop:

More people use the model -> the model gets better -> more people use the model
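To make this concrete, here's a minimal sketch (in Python, with hypothetical names; this is not GitHub Copilot's actual pipeline) of how accept/reject signals could be logged and distilled into fine-tuning data:

```python
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class CompletionEvent:
    """One user interaction with a proposed completion."""
    prompt: str        # the context the model saw
    completion: str    # the text the model proposed
    accepted: bool     # whether the user kept the suggestion
    timestamp: float

def log_event(event: CompletionEvent, path: str = "feedback.jsonl") -> None:
    """Append one interaction to a JSONL log for later training runs."""
    with open(path, "a") as f:
        f.write(json.dumps(asdict(event)) + "\n")

def build_finetuning_set(path: str = "feedback.jsonl") -> list[dict]:
    """Keep accepted completions as positive fine-tuning examples.

    Rejected completions could instead feed a preference/reward model.
    """
    examples = []
    with open(path) as f:
        for line in f:
            event = json.loads(line)
            if event["accepted"]:
                examples.append(
                    {"prompt": event["prompt"], "completion": event["completion"]}
                )
    return examples

# Example usage:
log_event(CompletionEvent(
    prompt="def add(a, b):",
    completion="    return a + b",
    accepted=True,
    timestamp=time.time(),
))
```

The key point is that this log only exists for the provider whose model users are actually interacting with; competitors can't reconstruct it from public data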

These data flywheels can make it challenging for later entrants to catch up:

New users will gravitate towards the best existing models, further strengthening them at the expense of the newcomers

4) Disillusionment

LLMs (and neural networks in general) face formidable problems, and there is no guarantee they will be solved soon

For example, even with today's best mitigations, LLMs still sometimes output false/toxic text and insecure/incorrect code

If these flaws remain unsolved within the next few years, we might see a series of high-profile offensive outputs or security vulnerabilities.

This could lead to a loss of confidence in LLMs and stall widespread adoption.

––

These are four possible futures for the competitive dynamics of LLMs, but of course there are tons of others!

I'm curious to hear which of these futures people think are most likely, and whether I've missed any important ones