Thanks for the questions.
There was an internal model at Google (I think it was called "Meena"), built about a year before ChatGPT. It attracted a lot of discussion internally because it could say offensive things and wasn't very guarded. But when you interacted with it, asking it to do or explain things, you could see clear magic. It was never released publicly because of safety concerns and the risk of bad publicity.
Then ChatGPT came, and everything shifted. I have to give credit to Sundar, who pushed the idea of being bold and responsible at the same time. Google took some risky bets, knowing they could have backfired: serving AI queries raised our costs, and fast launches risked negative PR. But we made it through that period.
When the ChatGPT moment came in late 2022, we were way behind, and it was a scary time for all of us. But there was also a strong belief in our technical strength, and everyone knew the game had just started; it wasn't over yet. I clearly remember a Google Cloud meeting where our CEO rallied us with a war cry that we would fight it out, and that was the start of something big.
Once our OKRs and team focus were set, it felt like we were part of a fast, well-oiled machine racing ahead. Then came the vision of multimodal models, and it was clear they were the future. Google was one of the first to put a stake in the ground and say we would build a multimodal agent, and that turned out to be a great goal. Just look at the video and audio capabilities of our models: they're among the best in the world.
Another key point was context length. Anthropic raised it first, but Google came in with a one-million-token context window, and internally we even had a roughly ten-million-token window. That was a clear advantage. On the technical side, especially in the cloud, some things ran a lot better on Google because of our design choices, such as TPUs and how they work with storage. Some vendors even preferred Google Cloud for building these models.
It was really interesting to be in the middle of that race and experience it firsthand. Overall, my thinking has changed a lot. I now believe the model layer will be dominated by just a few players with tight margins, leaving little room for many others. The application layer, on the other hand, will be the big winner, much like what we saw with the cloud and the internet, opening up many new opportunities. We're also still very early in building great UX for AI. A few years from now, we might look back and see this as a magical new way of doing things, something we haven't yet fully seen across most app categories.
Those are my thoughts—feel free to reach out if you want to talk more.