More

villgax · 2025-12-21T18:09:06 1766340546

Skill problem not an LLM problem

villgax · 2025-12-21T00:09:25 1766275765

Vibe code this…lol

villgax · 2025-12-19T11:14:30 1766142870

it's just people looking to do experiments locally on the main machine rather than just get a dedicated spark, which can be used properly as a headless box than a Mac of which you are at the mercy of system shenanigans albiet still bearable compared to windows

villgax · 2025-12-11T18:21:15 1765477275

Marginal gains for exorbitantly pricey and closed model…..

villgax · 2025-12-10T16:23:09 1765383789

As evident by recent HN coverage, SemiAnalysis is just becoming another shi*posting publication. Not one person in the industry consider them reliable/technically sound.

villgax · 2025-12-10T02:27:46 1765333666

Donate?! Pshawh………more like vibe manage it yourself lol

villgax · 2025-12-09T18:30:55 1765305055

Modified MIT?????

Just call it Mistral License & flush it down

villgax · 2025-11-27T15:55:48 1764258948

Exactly, ChatGPT pretty much ate away ad volume & retention if th already garbage search results weren't enough. Don't even get me started on Android & Android TV as an ecosystem.

IncreasePosts · 2025-11-27T17:50:46 1764265846

That's not the story that GOOGs quarterly earning reports tell(ad revenue up 12% YoY)

pzo · 2025-11-27T19:44:16 1764272656

most likely because they got more aggressive with campaign against adblock in chrome and more ads in youtube.

villgax · 2025-11-27T15:54:03 1764258843

Fuschia or me?

villgax · 2025-11-27T15:53:11 1764258791

100 times more chips for equivalent memory, sure.

m4r1k · 2025-11-27T16:17:26 1764260246

Check the specs again. Per chip, TPU 7x has 192GB of HBM3e, whereas the NVIDIA B200 has 186GB.

While the B200 wins on raw FP8 throughput (~9000 vs 4614 TFLOPs), that makes sense given NVIDIA has optimized for the single-chip game for over 20 years. But the bottleneck here isn't the chip—it's the domain size.

NVIDIA's top-tier NVL72 tops out at an NVLink domain of 72 Blackwell GPUs. Meanwhile, Google is connecting 9216 chips at 9.6Tbps to deliver nearly 43 ExaFlops. NVIDIA has the ecosystem (CUDA, community, etc.), but until they can match that interconnect scale, they simply don't compete in this weight class.

cwzwarich · 2025-11-27T18:43:17 1764268997

Isn’t the 9000 TFLOP/s number Nvidia’s relatively useless sparse FLOP count that is 2x the actual dense FLOP count?

smilekzs · 2025-11-27T23:07:55 1764284875

Correct --- found a remark on Twitter calling this "Jenson Math".

Same logic when NVidia quote the "bidirectional bandwidth" of high speed interconnects to make the numbers look big, instead of the more common BW per direction, forcing everyone else to adopt the same metric in marketing materials.

7e · 2025-11-27T22:34:51 1764282891

Wow, no, not at all. It’s better to have a set of smaller, faster cliques connected by a slow network than a slower-than-clique flat network that connects everything. The cliques connected by a slow DCN can scale to arbitrary size. Even Google has had to resort to that for its biggest clusters.

markhahn · 2025-11-28T06:30:32 1764311432

Is this claim based on observed comm patterns in some particular AI architecture?

oivey · 2025-11-28T00:01:25 1764288085

I guess “this weight class” is some theoretical class divorced from any application? Almost all players are running Nvidia other than Google. The other players are certainly more than just competing with Google.

overfeed · 2025-11-28T04:01:28 1764302488

> Almost all players are running Nvidia other than Google.

No surprises there, Google is not the greatest company at productizing their tech for external consumption.

> The other players are certainly more than just competing with Google.

TBF, its easy to stay in the game when you're flush with cash, and for the past N-quarters, investors have been throwing money at AI companies, Nvidia's margins have greatly benefited from this largesse. There will be blood on the floor once investors start demanding returns to their investments.

oivey · 2025-11-28T04:54:10 1764305650

Ok? The person I was replying to was saying that Google’s compute offering is substantially superior to Nvidia’s. What do your comments about market positioning have to do with that?

If Google’s TPUs were really substantially superior, don’t you think that would result in at least short term market advantages for Gemini? Where are they?

ZiiS · 2025-11-28T06:18:23 1764310703

They are suggesting it is easier for others to buy buy more NVidia chips and feed them more power. Whilst operating costs are covered by investors. If they move on to competing on having to do inference the cheepest then the TPUs will shine.

oivey · 2025-11-28T06:52:01 1764312721

The original post made no comments about inference or training or even cost in any way. It said you could hook up more TPUs together with more memory and higher average bandwidth than you could with a datacenter of Nvidia GPUs. From an architectural point of view, it isn’t clear (and is not explained) what that enables. It clearly hasn’t led to a business outcome for Google where they are the clear market leader.

Seemingly fast interconnects benefit training more than inference since training can have more parallel communication between nodes. Inference for users is more embarrassingly parallel (requires less communication) than updating and merging network weights.

My point: cool benchmark, what does it matter? The original post says Nvidia doesn’t have anything to compete with massively interconnected TPUs. It didn’t merely say Google’s TPUs were better. It said that Nvidia can’t compete. That’s clearly bullshit and wishful thinking, right? There is no evidence in the market to support that, and no actual technical points have been presented in this thread either. OpenAI, Anthropic, etc are certainly competing with Google, right?

Dylan16807 · 2025-11-30T04:10:51 1764475851

> My point: cool benchmark, what does it matter?

And then people explained why the effects are smoothed over right now but will matter eventually and you rejected them as if they didn't understand your question. They answered it, take the answer.

> It didn’t merely say Google’s TPUs were better. It said that Nvidia can’t compete.

Can't compete at clusters of a certain size. The argument is that anyone on nVidia simply isn't building clusters that big.

ZiiS · 2025-11-28T08:29:30 1764318570

The fact that NVidia are currently winning is undisputed.

PunchyHamster · 2025-11-27T19:52:58 1764273178

Yet everyone uses NVIDIA and Google is at catchup position.

Ecosystem is MASSIVE factor and will be a massive factor for all but the biggest models

epolanski · 2025-11-27T20:11:50 1764274310

Catch-up in what exactly? Google isn't building hardware to sell, they aren't in the same market.

Also I feel you completely misunderstand that the problem isn't how fast is ONE gpu vs ONE tpu, what matters is the costs for the same output. If I can fill a datacenter at half the cost for the same output, does it matters I've used twice the TPUs and that a single Nvidia Blackwell was faster? No...

And hardware cost isn't even the biggest problem, operational costs, mostly power and cooling are another huge one.

So if you design a solution that fits your stack (designed for it) and optimize for your operational costs you're light years ahead of your competition using the more powerful solution, that costs 5 times more in hardware and twice in operational costs.

All I say is more or less true for inference economics, have no clue about training.

butvacuum · 2025-11-27T20:37:35 1764275855

Also, isn't memory a bit moot? At scale I thought that the ASICs frequently sat idle waiting for memory.

pests · 2025-11-27T21:31:15 1764279075

You're doing operations on the memory once it's been transferred to gpu memory. Either shuffling it around various caches or processors or feeding it into tensor cores or other matrix operations. You don't want to be sitting idle.

NaomiLehman · 2025-11-27T15:55:46 1764258946

I think it's not about the cost but the limits of quickly accessible RAM

croon · 2025-11-27T16:08:39 1764259719

Ironwood is 192GB, Blackwell is 96GB, right? Or am i missing something?

villgax · 2025-11-28T10:42:51 1764326571

182GB and B300 is 288GB. IIRC