The TPUv4 and TPUv6 docs were stolen by a Chinese national in 2022/2023: https://www.cyberhaven.com/blog/lessons-learned-from-the-goo... https://www.justice.gov/opa/pr/superseding-indictment-charge...
And that's just one guy who got caught. Who knows how many other cases there were.
A Chinese startup is already making clusters of TPUs and has revenue https://www.scmp.com/tech/tech-war/article/3334244/ai-start-...
There is a dark art to semiconductor manufacturing that pretty much only TSMC really has the wizards for. Maybe Intel and Samsung a bit too.
Knowing how to make 2008-era chips is not the gating factor for getting a handful of atoms to function as a transistor in current SOTA chips. There are probably 100 people on earth who know how to do this, and the majority of them are in Taiwan.
Again, China has literally stolen the plans for EUV lithography, years ago, and still cannot get it to work. Even Samsung and Intel, using the same machines as TSMC, cannot match what they are doing.
It's a dark art in the most literal sense.
Never mind that these new cutting-edge fabs cost ~$50 billion each.
The question is when? Does that come in time to deflate the US tech stock bubble? Or will the bubble start to level out and reality catch up, or will the market crash for another reason beforehand?
I don't understand this part. What has a nuclear base got to do with chip manufacturing? And surely not all 600k students are learning chip design or stealing plans?
About students, have you seen the microelectronics labs in American universities lately? A huge chunk of the people in them are Chinese already. Same with some of the top AI labs.
This is like that funny idea people had in the early 2000s that China would continue to manufacture most US technology but could never design its own competitive tech. Why would anyone think that?
Wrt invading Taiwan, I don't think there is any way China can get TSMC intact. If they do invade Taiwan (please God no), it would be a horrible bloodbath. Deaths in the hundreds of thousands and probably relentless bombing. Taiwan would likely destroy its own fabs to avoid them being taken. It would be sad and horrible.
The killer really is training, which is insanely compute-intensive and has only recently become practical in hardware at the scale needed.
The work that XLA & schedulers are doing here is wildly impressive.
This feels drastically harder to work with than Itanium must have been. ~400-bit VLIW, across extremely diverse execution units. The workload is different, it's not general purpose, but it's still awe-inspiring to know not just that they built the chip but that the software folks can actually use such a wildly weird beast.
I wish we saw more industry uptake for XLA. Uptake's not bad, per se: there's a bunch of different hardware it can target! But what amazing secret sauce, and it's open source, and it doesn't feel like it has the industry rally behind it that it deserves. It feels like Nvidia is only barely beginning to catch up, to dig a new moat, with the just-announced Nvidia Tiles. Such huge overlap. Afaik (please correct me if I'm wrong) XLA isn't at present particularly useful at scheduling across machines, is it? https://github.com/openxla/xla
They’ll just catch the next wave of tech or eventually break into EUV.
> XLA isn't at present particularly useful at scheduling across machines,
I'm not sure if you mean compiler-based distributed optimizations, but JAX does this with XLA: https://docs.jax.dev/en/latest/notebooks/Distributed_arrays_...
JAX/XLA does offer some really nice tools for doing automated sharding of models across devices, but for really large performance-optimized models we often handle the comms stuff manually, similar in spirit to MPI.
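To make "handling the comms manually" a bit more concrete, here's a rough sketch in plain Python. To be clear, this is not JAX or XLA API: the "devices" are just list entries, and `shard_contraction` and `all_reduce_sum` are hypothetical names I made up. The idea is to split the contraction dimension of a matmul across devices, let each one compute a partial product, and then sum the partials with an explicit collective, which is roughly the MPI-style pattern.

```python
def matmul(A, B):
    """Plain dense matmul on lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def shard_contraction(A, B, n_devices):
    """Split the shared (inner) dimension into contiguous chunks, one per device."""
    inner = len(B)
    step = -(-inner // n_devices)  # ceiling division
    shards = []
    for start in range(0, inner, step):
        a_part = [row[start:start + step] for row in A]  # column slice of A
        b_part = B[start:start + step]                   # matching row slice of B
        shards.append((a_part, b_part))
    return shards


def all_reduce_sum(partials):
    """Stand-in for a collective: element-wise sum, and every device gets the total."""
    total = [[sum(vals) for vals in zip(*rows)] for rows in zip(*partials)]
    return [total for _ in partials]


A = [[1, 2, 3, 4], [5, 6, 7, 8]]
B = [[1, 0], [0, 1], [1, 0], [0, 1]]

# Each "device" multiplies only its own shard...
partials = [matmul(a, b) for a, b in shard_contraction(A, B, n_devices=2)]
# ...and the explicit communication step combines the partial results.
results = all_reduce_sum(partials)
```

In real code the all-reduce is where the interconnect time goes, which is why people hand-place it instead of trusting the automatic sharding for the biggest models.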
China has fabs. Most are older nodes and are used to manufacture chips for cars and consumer electronics. They have companies that design chips (manufactured by TSMC), like the Ascend 910, which are purpose-built for AI. They may be behind, but they're not standing still.
We desperately need more open frameworks for competition to work
There are so many trade and manufacturing links between China and Taiwan that an outright war would be economically disastrous for both countries.
How would this be a deadly blow to Google? Google makes TPUs for their own services and products, avoiding paying the expensive Nvidia tax. If other people make similar products, this has effectively zero impact on Google.
Nvidia knew their days were numbered, at least in their ownership of the whole market. And China hardly had to steal the great plans for a TPU to make one: an FMA/MAC unit is actually a surprisingly simple bit of hardware to design. Everyone is adding "TPUs" to their chips - Apple, Qualcomm, Google, AMD, Amazon, Huawei, Nvidia (that's what tensor cores are) and everyone else.
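To show how little there is to the core operation, here's a minimal software sketch of a multiply-accumulate and a matmul built from nothing else. The function names are mine and purely illustrative. Real hardware does many of these in parallel with fixed-width arithmetic, and none of the hard parts (data movement, scheduling, manufacturing) show up here.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: acc + a * b."""
    return acc + a * b


def matmul_with_mac(A, B):
    """Matrix multiply expressed purely as repeated MAC steps."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0
            for k in range(inner):
                acc = mac(acc, A[i][k], B[k][j])  # the only arithmetic in the loop
            C[i][j] = acc
    return C


C = matmul_with_mac([[1, 2], [3, 4]], [[5, 6], [7, 8]])
# C == [[19, 22], [43, 50]]
```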
And that startup isn't the big secret. Huawei already has solutions matching the H20. Once the specific need that can be serviced by an ASIC is clear, everyone starts building it.
>America will train 600k Chinese students as Trump agreed to
What great advantage do you think this is?
America isn't remotely the great gatekeeper on this. If anything, Taiwan + the Netherlands (ASML) are. China would gain far more value from learning manufacturing and fabrication secrets than from cloning some specific ASIC.
That'd be the belief in good old American exceptionalism. Up until recently, a common meme on HN was that "freedom" is fundamental to innovation, and naturally the country with the most Freedom(TM) wins. This even persisted after it was clear that DJI was kicking all kinds of ass, outcompeting multiple western drone companies.
There are things about China not to be celebrated but one cannot help but admire the way that they invest in their country as a whole. The US is all about "what's in it for me".
But if you make it 2900 words through this 9000 word document, to the "Sample VLIW Instructions" and "Simplified TPU Instruction Overlay" diagrams, trying to map the VLIW slots ("They contain slots for 2 scalar, 4 vector, 2 matrix, 1 miscellaneous, and 6 immediate instructions") to useful work seems incredibly challenging, given the vast disparity in functionality and style of the attached units those slots govern, and given the extreme complexity of keeping the MXU constantly fed under very tight timing so that it stays well utilized.
> Subsystems operate with different latencies: scalar arithmetic might take single digit cycles, vector arithmetic 10s, and matrix multiplies 100s. DMAs, VMEM loads/stores, FIFO buffer fill/drain, etc. all must be coordinated with precise timing.
Whereas Itanium's compilers needed to pack parallel work into a single instruction, there's maybe less need for that here. But that quote feels like an incredible heart-of-the-machine challenge: writing instruction bundles that feed a variety of systems all at once, when those systems have such drastically different performance profiles / pipeline depths. Truly an awe-some system, IMO.
Still though, yes: Itanium's software teams did have an incredibly hard challenge finding enough work at compile time to pack into instructions. Maybe it was a harder task. What a marvel modern cores are, having almost a dozen execution units that CPU control can juggle and keep utilized, analyzing incoming instructions on the fly, with deep out-of-order dependency-tracking insight. Trying to figure it all out ahead of time & packing it into the instructions a priori was a wildly hard task.
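For a feel of the problem, here's a toy greedy list scheduler. It's a sketch under loud assumptions: the slot counts are the ones quoted above (immediates left out), the latencies are made-up numbers in the spirit of "single digits, 10s, 100s", and `Op` and `schedule` are hypothetical names. It has nothing to do with how XLA actually schedules. Each cycle it fills a bundle with whatever ops have their inputs ready, up to the slot limits.

```python
from dataclasses import dataclass

# Slot counts from the quoted doc; the 6 immediate slots are ignored in this toy.
SLOTS = {"scalar": 2, "vector": 4, "matrix": 2, "misc": 1}
# Assumed, illustrative latencies in cycles (not real TPU numbers).
LATENCY = {"scalar": 2, "vector": 20, "matrix": 200, "misc": 5}


@dataclass(frozen=True)
class Op:
    name: str
    kind: str         # which slot type the op occupies
    deps: tuple = ()  # names of ops whose results this op consumes


def schedule(ops):
    """Greedy list scheduling; returns {op name: issue cycle}.

    Assumes the dependencies form a DAG over the given ops.
    """
    kind = {op.name: op.kind for op in ops}
    issue = {}
    pending = list(ops)
    cycle = 0
    while pending:
        free = dict(SLOTS)  # a fresh bundle every cycle
        still_waiting = []
        for op in pending:
            # An input is ready once its producer's latency has fully elapsed.
            ready = all(
                d in issue and issue[d] + LATENCY[kind[d]] <= cycle
                for d in op.deps
            )
            if ready and free[op.kind] > 0:
                free[op.kind] -= 1
                issue[op.name] = cycle
            else:
                still_waiting.append(op)
        pending = still_waiting
        cycle += 1
    return issue


ops = [
    Op("load_a", "misc"),
    Op("load_b", "misc"),
    Op("matmul", "matrix", ("load_a", "load_b")),
    Op("bias", "vector", ("matmul",)),
    Op("loop_counter", "scalar"),
]
issue = schedule(ops)
```

Even in this toy, the two loads fight over the single misc slot and the vector op sits idle for a couple hundred cycles waiting on the matmul. A real compiler has to find other work to overlap into those bundles, and it has to get the timing exactly right.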
Is all that construction really worth it when we could be protecting neighborhoods and historic views?
Everyone is still dependent on a single American manufacturer for this tech after decades of development. This strongly suggests that it is considerably more difficult than just "funding a second source".
And it's not an entirely binary choice on protecting neighborhoods and views; for example what's happening in south Memphis with the power plant that's powering the Grok center there is a classic case of environmental racism -- they are cutting costs on pollution regulation because they have a community that they can dump the externalized costs on via their emissions.
Nobody's saying Grok shouldn't have the power, it's just a small detail on how that impact is managed.