Groq’s founder on why AI’s next big shift isn’t about Nvidia

You can catch Opening Bid on Apple Podcasts, Spotify, YouTube, or wherever you get your podcasts.

The race to compete with AI chip darling Nvidia (NVDA) is well underway. Enter Groq (GROQ.PVT). The company makes what it calls language processing units (LPUs), chips designed to make large language models run faster and more cheaply, in contrast to Nvidia GPUs, which target training models.

Groq's last capital raise was in August 2024, when it raised $640 million from the likes of BlackRock (BLK) and Cisco (CSCO). The company's valuation at the time stood at $2.8 billion, a fraction of Nvidia's more than $3 trillion market cap. It currently clocks in at $3.5 billion, according to Yahoo Finance private markets data.

At the helm of Groq is founder and CEO Jonathan Ross. While at Google (GOOG), Ross designed the custom chips the tech giant would go on to train its AI models on. Yahoo Finance Executive Editor Brian Sozzi sits down with Ross on the Opening Bid podcast to discuss his future plans for Groq. Ross is fresh off a trip to Saudi Arabia with other key tech executives, joining President Donald Trump in forging a deeper tech relationship with the country. Ross begins by sharing with Sozzi the conversations he had on the ground in Saudi Arabia.

0:04 spk_0

Welcome to a new episode of the Opening Bid podcast. I'm Yahoo Finance Executive Editor Brian Sozzi. Like I always say, it's the podcast that will make you a smarter investor, period. And of course, Opening Bid is sponsored by our friends over at Vanguard. Really exciting guest here for this particular episode: Groq founder and CEO Jonathan Ross. Jonathan, good to see you. I've been following you from afar, all the work you've been doing, so welcome to the podcast desk. For those not familiar with your company, because I think you get looped in a lot with Nvidia: what do you do, and, I guess, what made you start the company?

0:42 spk_1

All right, so, thanks for having me. Amazing to be here. We started the company back in 2016, back before people really knew what AI was. And my background: I actually started the Google AI chip called the TPU. In 2016 I left, realized that the rest of the world needed fast inference, and so what we do is we build chips called LPUs. And LPUs are a little different than GPUs. GPUs are really good for training models. Training models is where you take all that data and you turn that data into something that's useful, that can interpret the world and say things. But inference is when you actually run that model, when it's useful. So whenever you go and you type into your favorite chatbot and you ask your questions, that's inference. We accelerate inference, and we make it really low cost.

1:25 spk_0

You were part of the moonshots factory, right?

1:27 spk_1

I was, on Google X's Rapid Eval team. It was about 12 of us, and I was the person who was the expert in computing and AI. And there were people who were experts in biology, and we kept trying all these really interesting things. Yeah.

1:41 spk_0

Did you realize at the time what you had created, with the, what was it, the TPU you said? Yeah,

1:45 spk_1

I think the first time I realized the scale of it was when Google did an earnings report and they mentioned that it had been responsible for over a billion dollars of revenue. And I actually started that as a 20%, or side, project.

1:56 spk_0

Can you believe what that has now become, what they're training on? I mean, you still of course have to Google what they're working on.

2:01 spk_1

Absolutely, and they're still killing it. They're ramping massive scale; like, the data centers that they're deploying are massive.

2:10 spk_0

Was it hard to, or to get more into what your company does, was it hard to go from being inside that Google environment, that startup mentality but also obviously a very large company, to founding your own company?

2:21 spk_1

So at the time I thought that I was doing a startup. I would be like, I'm doing a startup in Google. And then when you go outside and you actually do a startup and you hear other people say that, you almost cringe, because you're like, oh, you have no idea. Like the near-death experiences, the having to get money at the last minute because you're about to run out of cash. There was a point where we were 3 weeks from running out of cash and we got a fundraising just in time at Groq. Yeah, yeah,

2:48 spk_0

Have you had to, you know, how did you, I guess, develop your leadership style? I mean, it's one thing to lead a team inside of a giant tech company like Google, but I mean, you're the face of the company now.

2:58 spk_1

I am. And actually, part of the deal when we started was I would never have to be CEO. I was the CTO for the first 2 years, and then I became the CEO afterwards, and it wasn't natural for me. And I think this is a lesson for people: if you really focus on getting good at almost anything and you actually spend your time on it, you can do it. But it's very much about being open to feedback, right? This is the number one thing that prevents someone from crossing the gap from being a technical founder to being a CEO. No one knows how to manage people. Every single thing that you do wrong is wrong because it's counterintuitive. And so you have to be able to listen to other people who know what they're doing in order to fix things that are counterintuitive problems.

3:42 spk_0

So you sound like a completely different leader today than you were 2 years ago.

3:45 spk_1

Absolutely, yeah, 6 years ago

3:48 spk_0

Or even 6 years ago. So let's, you know, on Groq, LPUs: when you say they're used not to train but for inference, how is that different than the stuff that Nvidia sells, the stuff it gets a lot of hype over?

4:02 spk_1

Yeah, so think of it this way. Suppose you're a car company. You design cars, you manufacture cars. Training is the designing; inference is the manufacturing, it's the production. And the way that you should think about this as a financial investor, if you want to understand what the market is so crazy about right now: you wanna think in tokens, megatokens of capacity. So when you buy electricity, you buy 1 kilowatt-hour, and it costs a couple of cents. When you buy intelligence, you buy it in megatokens, and a token is kind of like the syllables that the AI uses to construct words. Every word that comes out, on average, is about 1.4 tokens. Those 1.4 tokens, you put a million of them together, that's a megatoken. That's what people sell, and that costs a couple of cents.
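To make the units concrete, here is a minimal back-of-envelope sketch, in Python, of the arithmetic Ross describes. The 1.4 tokens-per-word figure is his; the price per megatoken is a hypothetical placeholder standing in for his "couple of cents," not a quoted rate.

```python
# Back-of-envelope token economics, using Ross's ~1.4 tokens/word average.
TOKENS_PER_WORD = 1.4
PRICE_PER_MEGATOKEN_USD = 0.03  # hypothetical placeholder, not a quoted rate

def cost_of_words(num_words: int) -> float:
    """Approximate USD cost to generate num_words of model output."""
    tokens = num_words * TOKENS_PER_WORD          # words -> tokens
    megatokens = tokens / 1_000_000               # tokens -> megatokens
    return megatokens * PRICE_PER_MEGATOKEN_USD   # megatokens -> dollars

# A 500-word answer is ~700 tokens, i.e. 0.0007 megatokens:
print(f"${cost_of_words(500):.6f}")  # ~$0.000021 at this placeholder price
```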

4:47 spk_0

Is it wrong to compare you to Nvidia?

4:51 spk_1

It's a little bit hard to compare us to Nvidia, because we don't just sell the hardware, though we do sell it. We also sell a service on top. So if you're currently using OpenAI, Anthropic, or Google Gemini, you can also use all of the open-source models that are available on us, as tokens as a service. But we also deploy the hardware for customers. So for example, we just did an amazing deal with Bell Canada. And in that deal, what we're standing up is build-operate-transfer: we're building and operating the systems, they provide the data center, we provide the hardware, we provide the software, and then we work together to bring customers onto that system.

5:32 spk_0

So, you know, I've interviewed the CEOs of AMD, Intel, Qualcomm, a lot of the more traditional players. When investors come to you and ask you, you know, what's the pitch, and maybe you want to raise more money from them, what do you tell them? Like, what's your growth?

5:50 spk_1

Well, back in 2016, we first had to explain what inference even was. The market didn't have a division between training and inference, so we would explain that. Nowadays we don't have to explain that, and what we say is we've been focused on inference from the beginning. And here's the main difference between the two: when you train a model and you put it into production later, you get to amortize that cost over millions of users who are using that model over a long period of time. But inference, you're paying for every single query. So you spend money on training; you make your money on inference. If you're spending too much on inference, you can't make money. Most of the companies that are out there, most of them are actually burning VC dollars in order to build AI companies. We've seen a bunch of customers come to us recently who are spending more on the tokens, the intelligence, than they're making in revenue. For some of them, it's razor-thin close, and when they come to us, they want to be able to lower their spend in order to become profitable, or in order to get more money so that they can build a business.
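A rough way to see the "spend on training, make your money on inference" point: training is a one-time cost amortized across every future query, while inference is paid on each query. A minimal sketch with entirely hypothetical unit economics:

```python
# Hypothetical unit economics illustrating Ross's framing; none of these
# numbers come from Groq or any real AI company.
TRAINING_COST_USD = 1_000_000       # one-time model training spend
LIFETIME_QUERIES = 1_000_000_000    # queries served over the model's life
INFERENCE_COST_PER_QUERY = 0.008    # paid on every single query
REVENUE_PER_QUERY = 0.010

# Training amortizes away at scale; inference cost never does.
training_per_query = TRAINING_COST_USD / LIFETIME_QUERIES
margin_per_query = (REVENUE_PER_QUERY - INFERENCE_COST_PER_QUERY
                    - training_per_query)

print(f"amortized training cost per query: ${training_per_query:.4f}")  # $0.0010
print(f"margin per query: ${margin_per_query:.4f}")  # $0.0010, razor thin
# If inference cost per query rises above revenue per query, every query
# loses money -- the "burning VC dollars" situation Ross describes.
```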

6:52 spk_0

So, the next wave, I guess, with AI, are those tailwinds captured by a company like yours? I'm trying to figure out, you know, for the past 10 years of my life I've been talking about Nvidia, and like, I've watched this company grow significantly, but I hear something else happening within your company.

7:09 spk_1

So the main pitch to the customer is: build fast. With AI, it's about speed. You go to a room full of people and you ask, what's gonna happen in a year in AI? No one knows, right? And so, Jensen went on stage at GTC, their big yearly conference, and he said, if you want to spend $100 billion and have hardware in your data center in 2 years, there's only one company in the world that you can trust to do that, and that's Nvidia. And I agree with that statement. But where people come to us is they don't want to wait 2 years. They want to be able to spend $100 million and have a bunch of chips set up in 3 months, and then they want to be able to spend a billion dollars and have that 6 months later. Then spend $10 billion, have that 6 months later. And then spend $100 billion, 6 months later, and have that. And if you want to do that, there's only one company in the world you can go to, and that's Groq.

So we're build fast, and 51 days from signing the agreement in Saudi Arabia, we stood up 20,000 chips and we're serving production traffic. And with the Bell Canada deal as an example, the CEO went on stage and he said, there's a lot of people out there who are announcing things that are in the future. We're not; like, the chips are already stood up. We already have chips in production right now in their data center. That's very different from what you're hearing around the world. Most AI, it's like, it's gonna come. The problem with that is the economics.

So you also heard Jensen go on stage and call himself the chief revenue destroyer. Do you know what he meant by that? What he meant was, he was trying to get his sales team to stop selling H100 GPUs. They were still selling them to customers, but they were obsolete already, because they had the Blackwell GPU. And if it takes you, and we hear this all the time, we'll talk to customers who put in an order and 12 months later they still don't have their GPUs, let alone have them stood up and earning money. So if you're gonna buy a GPU and it's gonna take you 12 months, 24 months to get it, and you only have a 3-to-4-year life cycle on that before it's obsolete, that really only gives you 2, maybe 3 years where you can earn money off of it, and then it's obsolete. And so what you need is build fast. You need to be able to get those chips into production as quickly as possible, earning revenue as quickly as you can.
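The deployment math behind "build fast": delivery lead time is dead time inside a fixed obsolescence window, so it directly shrinks the years a chip can earn. A sketch using the illustrative numbers from the conversation, not vendor specifications:

```python
# Lead time eats into a fixed obsolescence window (Ross's illustrative
# numbers: 12-24 month GPU lead times, a 3-to-4-year useful life).
def earning_years(lead_time_months: float, lifecycle_years: float) -> float:
    """Years of revenue-earning service left after waiting for delivery."""
    return max(0.0, lifecycle_years - lead_time_months / 12)

print(earning_years(12, 3.5))   # ~2.5 years left after a 12-month wait
print(earning_years(24, 3.5))   # ~1.5 years left after a 24-month wait
print(earning_years(1.7, 3.5))  # ~3.4 years -- Groq's claimed 51-day stand-up
```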

9:27 spk_0

Is this the next Coke-and-Pepsi battle? Is it Groq versus Nvidia? Do you view it that way? Just the fact you're even mentioning their name, a lot of other CEOs at those big old companies wouldn't even mention their competitor.

9:39 spk_1

Well, you have to mention Nvidia. They are the 3-trillion-pound gorilla, right? But actually, from a shareholder point of view, I think Groq is gonna be one of the best things that's ever happened to Nvidia. And the reason I say that is Nvidia has a very special ingredient called HBM, high bandwidth memory. It's made by Samsung, it's made by SK Hynix, it's made by Micron. The GPU itself, in an Nvidia GPU or an AMD GPU, is made with the same process, the chip fabrication process, that's used to build mobile phone chips. They can make an infinite number of them, inexpensively. The problem is HBM, or high bandwidth memory, which is only built for servers. There's very limited capacity in the world, it's very expensive, and they cannot make more. Nvidia is a monopsony. That means, for your audience, it's not a monopoly, which is a single seller; a monopsony is a single buyer. Nvidia is purchasing most of the HBM. And that limits how many GPUs they can build. So they want to build 3 to 5 million this year. They can only build as many as they have memory for.

If they were to sell those GPUs for inference, well, inference is a lower-margin business. They would be forced to lower margins. They would have no choice. So what they would have to do is lower margin. What we do is lower cost. We have lower costs to begin with, and then on top of that, lower margin. So when you deploy a GPU, just the electricity cost to run that GPU is more than the total cost per token when you're running LPUs. On top of that, we lower the margin below that, so it's even more economical. Now, because every time you deploy more inference, there's demand for even better models, there's demand for more training, which means those 3 to 5 million GPUs that they're gonna produce are still going to be sold at a high margin for training, and they're not gonna have to bring their margin down. So for the Nvidia shareholders, and this may not be the way they look at it on the leadership team of Nvidia, but for the shareholders, we're one of the best things that's happened to Nvidia.
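The electricity comparison reduces to a simple cost-per-token formula: power draw times electricity price, divided by token throughput. The numbers below are hypothetical placeholders for illustration, not measured figures for any GPU or LPU:

```python
# Electricity cost per token = (kW * $/kWh) / (tokens generated per hour).
# All inputs are hypothetical placeholders, not measured vendor figures.
def electricity_usd_per_megatoken(watts: float, usd_per_kwh: float,
                                  tokens_per_sec: float) -> float:
    kwh_per_hour = watts / 1000.0
    tokens_per_hour = tokens_per_sec * 3600.0
    usd_per_token = (kwh_per_hour * usd_per_kwh) / tokens_per_hour
    return usd_per_token * 1_000_000

# Hypothetical: a 1,000 W accelerator serving 300 tokens/sec at $0.10/kWh.
print(f"${electricity_usd_per_megatoken(1000, 0.10, 300):.4f} per megatoken")
# Ross's claim is that this electricity-only figure for a GPU already
# exceeds Groq's *total* cost per token on LPUs.
```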

11:35 spk_0

Hang with us, we're gonna go off for a quick break. We'll be right back on Opening Bid. ... All right, welcome back to Opening Bid, of course sponsored by our friends at Vanguard, having a fun chat here with Groq founder and CEO Jonathan Ross. As someone, I mean, clearly you're in the trenches with this stuff: how powerful are some of these AI models getting, and what are some of the concerns you're starting to have?

12:03 spk_1

From a power point of view, let's first talk about it from a business point of view. What is it that you can actually accomplish with a modern LLM? There's a technological analogy here. I would compare it to the information age. The information age is about replicating data with high fidelity and distributing it to the world. That's what phones do, it's what the internet does, it's what the printing press did, right? Same concept. When we were first fundraising, I was asked, is AI the next internet? And the answer is absolutely not, because the internet is an information age technology. AI is not. AI is not about copying, it's not about replicating. It's about generating a contextually relevant, creative answer that is customized to the situation. And right now, we're in the printing press era of AI, the very beginning, the very primitive start. That's why most people, when they're trying to build their applications, find it can't do what they're trying to do, because they're thinking of it as, what could a human do?

What AI can do today is what I call a language user interface. So you know what a GUI is, a graphical user interface. When you use your phone, there's buttons. Back in the 1970s, computers didn't have buttons. You had to type all of these commands, and you had to memorize the commands; otherwise you couldn't use the computer. What the graphical user interface gave you was the ability to look at a screen, see a button, and get a sense of what you could do. It was a huge advancement. It made it much easier to use a computer. Language user interfaces are gonna revolutionize using computers, because now you just speak to it. You don't have to know what it can do. You talk to it. And you know how these models can translate from English to Spanish or French or whatever? Well, they can translate your voice into actions, into button pushes, into this sort of thing. They're very good at being a language user interface. So think of customer service. Customer service is a language user interface to the business. If you're doing programming, it's a language user interface for telling the computer what to do. Perfect fit.

But now you start asking it to be intelligent, and it's not quite there yet, right? If you ask it to invent something, it can't do that. Why can't it do that? Well, what the LLMs do is they give you the most probable next token, which is like the syllable of the LLM. The most probable is the most obvious, and if it's the most obvious, it's not gonna be good writing, it's not gonna be good science, it's not gonna invent something for you, it's not gonna invent a drug that isn't known. That's what we're working on next. That's next. But actually, there are some other steps too. The next thing you're gonna see is agentic. The reason agentic doesn't work so well yet, but by the end of the year it may, is you have to get the model good enough where it can recognize and fix its own mistakes. I write some code, I get an error, how do I fix it? It's a little bit like the difference between a ballistic missile and a guided missile. A ballistic missile, you have to get it right from the beginning to be able to hit your target, right? When you ask a query and you have to get an exact answer, that's like a ballistic missile: no corrections. Whereas when you're guided, it's sort of like it can see the error, it can fix it, and it can start correcting as it's writing the code, right? And then we're gonna start seeing hallucinations reduced over time. That's next. It's getting better. And none of these are binary. It's not like one day it'll stop hallucinating, or, as I prefer to say, making errors, as opposed to hallucinating.
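Ross's "most probable next token" corresponds to what is usually called greedy decoding: always take the highest-probability token, which is by definition the most obvious continuation. A toy sketch with an invented probability table, contrasting greedy selection with sampling, one common way to get less obvious output:

```python
import random

# Invented next-token distribution; a real LLM emits one per generation step.
next_token_probs = {"the": 0.40, "blue": 0.28, "a": 0.25,
                    "novel": 0.05, "quantum": 0.02}

# Greedy decoding: Ross's "most probable next token" -- always the most
# obvious choice, so never surprising writing or a new invention.
greedy_token = max(next_token_probs, key=next_token_probs.get)

# Sampling: draw from the distribution, occasionally picking a rarer,
# less obvious token instead.
tokens, weights = zip(*next_token_probs.items())
sampled_token = random.choices(tokens, weights=weights, k=1)[0]

print(greedy_token)   # always "the"
print(sampled_token)  # usually "the" or "blue", occasionally "quantum"
```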

15:27 spk_0

Hallucinations just sounds better.

15:29 spk_1

But it doesn't clearly articulate the problem. The problem is LLMs make mistakes, and they're always gonna make mistakes, but they're gonna make fewer and fewer. At some point they will make so few mistakes, you can use it in medicine, you can use it in law. But you can't now, right? That's gonna get better. So we're just gonna see them getting better and better, and I think over the next 5 years, you'll see hallucinations go down, you'll see agentic, you'll see inventiveness, and then lastly, you'll see alignment, enough where the models will be able to accept coming on to a podcast. So in our lifetime. Very well within our lifetime. But you need all those steps before you get to AGI. AGI is not around the corner. You'll see each of those solved before we get anywhere close to AGI.

16:10 spk_0

Do you share Okta co-founder Todd McKinnon's concerns that with agentic AI comes security problems?

16:20 spk_1

So there's already been some interesting results, one of them from Google. So, you know what a zero-day exploit is? A zero-day exploit is an exploit that you find in software that you can immediately use to get around the security, to get into the system or to manipulate the system in some undesired way. And Google, using an LLM, found for the very first time a security vulnerability that wasn't yet known. People didn't know it existed. Security has always been a cat-and-mouse game, always. The people doing the security do a really great job securing the systems. You find one vulnerability, you're in; the security people figure that out and they fix it. What's gonna happen now is you're gonna see LLMs that are trying to find security vulnerabilities, and they're gonna get better at it. But this is the inventive part, right? This is why it's not happening very widely yet: they're not very good at this yet. But then you're also gonna need to see AI, LLMs, patching the vulnerabilities, detecting and patching as quickly as possible. And so it's gonna be what it's always been: technology advances and people use the tools.

17:34 spk_0

You were on that trip to Saudi Arabia with a lot of other big tech executives: President Trump, Elon, Jensen, AMD CEO Lisa Su. How big is the sovereign AI opportunity? I've talked to a few folks that were on that trip, and I haven't gotten a number. How big is it for your company?

17:52 spk_1

So, I would say that the demand for compute, and let's go back to what people are buying, right? We talked about tokens, we talked about megatokens as the unit of intelligence, what people are buying. Imagine trying to run a civilization or a society without having energy, without having power plants. And we have a stack that we don't really think about much in civilization. First there were materials: the Iron Age, the Stone Age, right, material ages. Then we got into the industrial age. The industrial age was about energy. That's where oil became important, coal became important, wood became important, you know, for burning for heat, right? And then we got into the information age, where it was all about having information to run your society. Those were the three layers of a civilization, and you needed all three of those things. Now we've added another: compute. Compute used to only be provided by human beings. Now we can do compute artificially. We can artificially think and add intelligence to society. You're not gonna be able to be competitive without it.

Now, if you were trying to build a society that could manufacture, then you needed energy. The more energy you had, the more manufacturing plants you could build, the more you could produce, the bigger your GDP. That's what we're gonna see with intelligence. You're gonna need more intelligence to be able to be part of the AI, or the generative, age that's coming, right? And sovereign entities want that. For example, with Bell Canada, we're going after the Canadian sovereign push, right? I think they've already announced $2 billion for that. Then you see massive pushes in the Middle East. We're seeing like hundreds of billions of dollars of funding going towards that, and their ambition isn't just for their own country; they actually want to power the region. So when we think of this, think of it as producing a barrel of oil, right? If you had 25 million tokens per second of capacity, that'd be like a barrel of oil every 4 seconds. If you're doing a billion tokens per second, that's the equivalent of about 10 barrels of oil per second, or about 7% of the revenue of the Kingdom of Saudi Arabia. They want to become net exporters of intelligence, the way that they're net exporters of energy at the moment.
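Ross's oil equivalence is a unit conversion: at roughly comparable dollar values per megatoken and per barrel, token throughput maps directly onto barrels per unit time. A sketch assuming hypothetical prices of about $0.70 per megatoken and $70 per barrel, chosen only because they reproduce his ratio of one barrel per 100 megatokens:

```python
# Hypothetical prices chosen to reproduce Ross's ratio (~100 megatokens of
# token revenue per barrel of oil); neither is a quoted market price.
USD_PER_MEGATOKEN = 0.70
USD_PER_BARREL = 70.0
MEGATOKENS_PER_BARREL = USD_PER_BARREL / USD_PER_MEGATOKEN  # 100.0

def seconds_per_barrel(tokens_per_sec: float) -> float:
    """How often a token stream earns one barrel's worth of revenue."""
    megatokens_per_sec = tokens_per_sec / 1_000_000
    return MEGATOKENS_PER_BARREL / megatokens_per_sec

print(seconds_per_barrel(25_000_000))         # 4.0 -> a barrel every 4 seconds
print(1 / seconds_per_barrel(1_000_000_000))  # 10.0 -> 10 barrels per second
```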

20:06 spk_0

It feels like my IQ is going up in real time listening to you. This is fascinating stuff. But before I let you go, our private markets data at Yahoo Finance has your company valued at $3.5 billion. Just listening to your story and what you're working on, investors have to be giving you money hand over fist. Are you out there trying to raise more money to achieve what you want to do?

20:26 spk_1

So we're not trying to raise any money. We had a very oversubscribed round, so we raised, I think publicly it's $640 million. It's actually more than that, but we haven't announced what it was, and we're good for now. We might take some strategic money from people that we're working with, but not just as a purely financial investment.

20:49 spk_0

So you don't... Like, what are you building with the capital that you raised? Are you building new plants, structures, hiring, or all of it?

20:56 spk_1

So, the way that we operate, we have a very capital-intensive business. You heard Microsoft announcing, what was it, $75 billion a year, I forget; Google, what are they, $80 billion; Meta's $65 billion, $60 billion; and Amazon, $100 billion a year. That also includes the data center side of it. Now, for us, everyone's opex is someone else's capex. They're treating data centers as capex because they're paying for it, because they're doing so much. We use about one-third of the energy per token we produce, and our chips are air-cooled rather than liquid-cooled. So we can use the existing data centers; we don't have to build new ones. So for us, we don't need anywhere near as much money. The data centers are the majority of that spend. We don't have to do that; we're using other people's. We're using Bell Canada's data centers. We're using Humain's data centers in Saudi Arabia, right? We're using Equinix's data centers. We're using all sorts of data centers all over the place. That's their business; they can do that. Then we partner with someone like Bell Canada, Humain, or any of these other partners, who then put up the capex for us to deploy our hardware.

And one of the reasons telcos like working with us is they're very good at a geography. They're amazing at building out a geography, amazing infrastructure. What we do is we work with them: they build out a geography, and then we connect that into our global network, because inference is a global problem. You can't do it locally. Training you can do locally; inference has to be around the world, because you build an app, a service, and everyone has to be able to access it with low latency. So we work with them, and now we're building out this global inference platform. And they each bring a part of it; we run it, we bring the software, we bring the models onto it, and then we share that revenue. So it's a very different model, but for us it's quite capex-light, and there's plenty of other people who want to invest in the capex, and we work with them.

22:48 spk_0

Folks, pay attention to Jonathan, pay attention to Groq. I can't tell you how many of you have messaged me over the years on X and LinkedIn saying, we missed Nvidia, we missed all these great next AI upstarts. This guy's clearly working on something very, very fascinating. Groq founder and CEO Jonathan Ross, it was good to see you. I appreciate it. Good to see you as well. That's it for the latest episode of Opening Bid, of course sponsored by our friends at Vanguard. Continue to hit us with all those thumbs up and likes on all the podcast platforms and YouTube. Love all the feedback. Talk to you soon.

For full episodes of Opening Bid, listen on your favorite podcast platform or watch on our website.

Yahoo Finance's Opening Bid is produced by Langston Sessoms