GTC This week’s GPU tech conference saw Nvidia do something we haven’t seen much of lately from the chip designer: a consumer product update.
For the increasingly enterprise-obsessed tech giant, GTC has become less about GPUs for gamers and more about capitalizing on new and emerging markets, such as artificial intelligence, robotics, self-driving vehicles, and the ever-present metaverse. By metaverse, in this context, we mean 3D virtual-reality worlds in which you can interact and collaborate with simulations and applications, and with each other.
Wearing his signature leather jacket, Nvidia CEO Jensen Huang took to the stage (or was it a hologram? We're not sure) to reveal three RTX 40-series graphics cards powered by the company's new Ada Lovelace architecture.
For many following along, that reveal may have been the only announcement that really mattered in Huang's nearly hour-and-45-minute keynote at this fall's event.
Using a selection of benchmarks, Huang touted the performance gains of the RTX 4090 and 4080 graphics cards over their predecessors. The chip designer said the RTX 4090 will offer 2x-4x higher performance than the company's previous flagship, the RTX 3090 Ti, which launched this spring.
Then there's the price of these new RTX units. The cards are some of the most expensive Nvidia has sold to date. At $899 for the 12GB 4080 and $1,199 for the 16GB version, the cards are $200-$500 more than the 3080 was when it launched two years ago. The price creep on the 4090 is less severe: at $1,599, it's about $100 more than the 3090 when it debuted in 2020.
Huang, speaking during a press conference on Wednesday, defended the increases, arguing that the performance gains and feature set more than compensate for the higher prices. He also claimed the prices were justified by higher manufacturing and material costs.
"A 12-inch wafer is a lot more expensive today than it was yesterday, and it's not a little bit more expensive, it is a ton more expensive," he said, adding that "our performance with Ada Lovelace is dramatically better."
But beyond the new cards, which Huang spent less than two minutes detailing, it was back to business as usual. Here's a rundown of Nvidia's biggest announcements at this GTC.
Back to the dual architecture model
Before the RTX announcement, roughly 15 minutes were devoted to Nvidia's new Ada Lovelace architecture, which sees the chip designer return to a dual-architecture model.
Nvidia's previously announced Hopper architecture will power its HPC- and AI-focused processors, such as the H100, while the Ada Lovelace architecture will power its graphics-focused chips.
Named after the 19th-century mathematician, the Ada Lovelace architecture is built on TSMC's 4N process and features third-generation real-time ray-tracing cores and fourth-generation Tensor cores.
So there's the split: Hopper is aimed primarily at high-performance computing and large AI workloads, while Lovelace is aimed at everything else, from cloud GPUs to gaming cards.
This isn't the first time Nvidia has used a dual-architecture model. Two generations back, Nvidia's datacenter chips, such as the V100, used its Volta architecture, while its consumer- and graphics-focused chips, the RTX 2000 series and the Quadro RTX family for example, used the Turing microarchitecture.
In addition to Nvidia's RTX 40-series parts, Ada Lovelace will also power Nvidia's RTX 6000-series workstation cards and L40 datacenter GPUs. Unlike Hopper, Huang says, the new architecture is designed to meet a new generation of graphics-focused challenges, including the rise of cloud gaming and the metaverse. Both will need graphics chips somewhere to render those environments in real time. In cloud gaming, the game is rendered in a remote datacenter and streamed live over the internet to a screen in front of the user, such as a laptop or phone. That spares players from buying and upgrading gaming rigs, and/or carrying them everywhere.
"In China, cloud gaming is going to be very big, and the reason for that is because there are a billion phones that game developers don't know how to serve anymore," he said. "The best way to solve that is with cloud gaming. You can access integrated graphics, you can access mobile devices."
The metaverse, but as a service
Ada Lovelace isn't limited to cloud gaming, however. Nvidia is positioning the architecture as the workhorse of its first software-as-a-service offering, which it says will let customers access its Omniverse hardware and software stack from the cloud.
Omniverse Cloud provides the remote compute and software resources needed to run metaverse applications on demand, from the cloud. The idea is that not every organization wants, or even has the budget, to spend millions of dollars on one of Nvidia's OVX SuperPods to get this level of simulation and rendering, if the metaverse even takes off. Instead, they can build their metaverses in Omniverse Cloud.
Right now, Nvidia appears to be courting a slew of logistics, manufacturing, and industrial partners, promising to help them build and operate digital twins. These twins are full-scale simulations that mirror the real world, using real data and modeling, and are pitched as a way to test and validate designs, processes, and systems in a virtual world before they're rolled out in the real one.
Yes, it's glorified modeling and simulation, but with new silicon, interactivity, virtual reality, and billing.
While Omniverse Cloud is Nvidia’s first foray into managed cloud services, it won’t be the last, according to Huang, who noted that his company is evaluating a similar model for its other software platforms.
Smarter cars, robots
Nvidia doesn’t just want to run digital twins for customer warehouses and manufacturing plants. During the keynote, Huang also detailed a slew of devices designed to power everything from autonomous robots to cars.
Huang talked about Drive Thor, Nvidia’s all-in-one computing platform designed to replace the multiple computer systems used in vehicles today.
The technology will debut in China, where Nvidia says it will power Zeekr's and Xpeng's 2025 vehicle lineups, and QCraft's autonomous taxi service. That is, of course, if US export restrictions aren't tightened to the point where Nvidia can no longer offer the chips, a prospect Huang played down during Wednesday's press conference.
Meanwhile, to power the robotic minions that roam alongside human workers, Nvidia has offered its IGX and Orin Nano platforms.
IGX is based on Nvidia's previously announced Orin AGX system-on-module, but adds high-speed networking. According to Nvidia, one of the platform's first uses will be in surgical robotics. Nvidia's Jetson Orin Nano modules, meanwhile, are designed to handle less demanding applications.
Large language models for the masses
As at previous GTCs, software took up a large part of the keynote. Two of the biggest releases at this fall's event were Nvidia's large language model (LLM) services, NeMo and BioNeMo.
The services aim to make LLM adoption easier for AI researchers and biologists seeking to extract insights from complex datasets. They let customers tune existing foundation models with their own data with minimal effort. BioNeMo, for example, could be used to speed up protein-folding research, Huang suggested.
Looking beyond the medical field, however, Huang expects LLMs to have broad applications for the vast majority of companies. "My sense is that every company, in every country, speaking every single language has probably dozens of different skills that their company could adapt our large language model to perform," he said.
“I’m not quite sure how big this opportunity is, but it’s probably one of the biggest software opportunities ever.”
Hopper in production
Finally, Nvidia has provided an update on the availability of its long-awaited Hopper H100 GPUs, which it says have entered mass production and will begin shipping to OEM system builders next month.
Announced at Nvidia's spring GTC event, the 700W GPUs promise 6x higher AI performance than the A100, thanks to support for 8-bit floating-point (FP8) calculations. For HPC applications, meanwhile, Nvidia says the chip will deliver 3x the performance in double-precision FP64 calculations.
However, those hoping to get hold of Nvidia's in-house DGX H100 servers, complete with their dedicated interconnect technology, will have to wait until sometime in the first quarter of 2023, a full quarter later than expected.
While Nvidia has blamed the greater complexity of the DGX system, the likely culprit is Intel's Sapphire Rapids processors used in the systems, which have reportedly been delayed until late in the first quarter. ®