Nvidia Puts On Graphic Power Display With Fermi
Nvidia, a company often associated with the graphics processors that go into high-end gaming computers, has revealed a few details on Fermi, its latest GPU architecture. Fermi is designed for general purpose parallel computing that can scale up to supercomputing levels.
Graphics processor vendor Nvidia on Wednesday announced its next-generation CUDA graphics processing unit (GPU) architecture, code-named "Fermi."
The new architecture has garnered support from Cray, IBM, HP, Dell and other companies.
Enter the Fermi
Nvidia announced the Fermi architecture at its inaugural GPU Technology Conference in San Jose, Calif., on Wednesday.
In unveiling the architecture, Nvidia CEO and cofounder Jen-Hsun Huang said GPUs have gone beyond being just graphics chips and are now general-purpose parallel computing processors with "amazing" graphics.
Fermi will be the foundation for Nvidia's next generation of GeForce, Quadro and Tesla processors.
"Fermi differs from ordinary GPUs, as it's the first to be designed from the ground up for general-purpose computation with features like ECC, support for C++, a true cache hierarchy and concurrent kernel execution that are critical requirements for the computing space," Nvidia spokesperson Andrew Humber told TechNewsWorld.
ECC stands for error-correcting code. It's used to reduce soft errors in computing. A soft error is an error in the stored data, such as a disturbed charge in a storage circuit, and not a fault in the physical circuit itself. Once the data is rewritten, the circuit works perfectly again. Highly reliable systems use ECC to correct soft errors on the fly.
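To make the idea of correcting a soft error on the fly concrete, here is a minimal sketch of a textbook Hamming(7,4) code, which stores three parity bits alongside four data bits and can repair any single flipped bit. This is an illustration only: the article does not say which code Fermi uses, and the function names below are hypothetical.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (an int 0..15) as a list of 7 bits, positions 1..7."""
    d = [(nibble >> i) & 1 for i in range(4)]    # d[0] is the least significant bit
    code = [0] * 8                               # index 0 unused so indexes match positions
    code[3], code[5], code[6], code[7] = d       # data bits sit at non-power-of-two positions
    code[1] = code[3] ^ code[5] ^ code[7]        # parity over positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]        # parity over positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]        # parity over positions 4, 5, 6, 7
    return code[1:]


def hamming74_decode(bits):
    """Return (nibble, error_position); error_position is 0 if no error was seen."""
    code = [0] + list(bits)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                      # XOR of set positions is 0 for a valid word
    if syndrome:
        code[syndrome] ^= 1                      # the syndrome names the flipped position
    d = [code[3], code[5], code[6], code[7]]
    return sum(bit << i for i, bit in enumerate(d)), syndrome


if __name__ == "__main__":
    word = 0b1011
    stored = hamming74_encode(word)
    stored[4] ^= 1                               # simulate a soft error at position 5
    recovered, position = hamming74_decode(stored)
    print("recovered correctly:", recovered == word, "| flipped position:", position)
```

The stored bits are never physically damaged in this model; rewriting the corrected word restores a valid codeword, which matches the description of a soft error above.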
Fermi offers ECC support in the processor register file, the L1 data cache, the L2 data cache, and the device memory, Nvidia's Humber said.
Fermi has more than 3 billion transistors and up to 512 CUDA cores. These cores implement the IEEE 754-2008 floating-point standard, which supersedes IEEE 754-1985 and also incorporates IEEE 854-1987, the IEEE Standard for Radix-Independent Floating-Point Arithmetic.
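One operation that IEEE 754-2008 standardizes is the fused multiply-add, which computes a*b + c with a single rounding at the end instead of rounding the product first. The article does not say which parts of the standard matter most for Fermi, so the sketch below only illustrates the standard itself. It emulates the single rounding with exact rational arithmetic, and the function names are hypothetical.

```python
from fractions import Fraction


def mul_add_two_roundings(a, b, c):
    """Ordinary double-precision a*b + c: the product is rounded, then the sum."""
    return a * b + c


def mul_add_one_rounding(a, b, c):
    """Emulated fused multiply-add: compute exactly, then round once to a double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


if __name__ == "__main__":
    a = 1.0 + 2.0 ** -30
    b = 1.0 - 2.0 ** -30
    c = -1.0
    # The exact product is 1 - 2**-60, which rounds to 1.0 as a double,
    # so the two-step version loses the small term entirely.
    print("two roundings:", mul_add_two_roundings(a, b, c))
    print("one rounding: ", mul_add_one_rounding(a, b, c))
```

With these inputs the two-step result is exactly zero, while the single-rounding result keeps the small negative term.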
Nvidia says the architecture delivers supercomputing features and performance at one-tenth the price and one-twentieth the power of traditional CPU-only servers.
Fermi is designed for C++ and is available with a Visual Studio development environment, Nexus, which Nvidia announced on Tuesday.
Nexus will let users debug, profile and analyze GPU code using standard workflow and tools. It supports CUDA C, OpenCL, DirectCompute, Direct3D, and OpenGL. The Nexus beta consists of a debugger, an analyzer, and a graphics inspector. All of them are integrated into Microsoft Visual Studio. Nvidia is sending out invitations to beta test Nexus now.
Fermi uses Nvidia Parallel DataCache technology and the Nvidia GigaThread Engine. The GigaThread Engine lets multiple kernels execute concurrently.
"Independent kernels within the same app can execute in parallel," Nvidia's Humber explained. "Kernels are issued in the order they were submitted by the app. New kernels don't have to wait for older kernels that are still running to finish up."
Fermi also supports C, Fortran, Java, Python, OpenCL and DirectCompute.
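Humber's description of concurrent kernel execution (kernels issued in submission order, with newer kernels not waiting for older ones to finish) can be pictured with a toy scheduling model. The sketch below is not Nvidia's scheduler. The kernel names, the unit counts and the durations are all made-up assumptions, chosen only to compare one-kernel-at-a-time execution with in-order concurrent issue.

```python
import heapq


def simulate(kernels, total_units, concurrent):
    """Return (name, start, end) per kernel, issued strictly in submission order.

    kernels: list of (name, units_needed, duration) tuples, all hypothetical.
    concurrent=False models a device that runs only one kernel at a time.
    """
    running = []            # min-heap of (end_time, units) for kernels in flight
    free = total_units      # execution units that are currently idle
    now = 0                 # earliest time the next kernel may be issued
    timeline = []
    for name, units, duration in kernels:
        if units > total_units:
            raise ValueError(f"{name} needs more units than the device has")
        # Reclaim units from kernels that have already finished.
        while running and running[0][0] <= now:
            free += heapq.heappop(running)[1]
        # Serial mode waits for every older kernel; concurrent mode only for room.
        while running and (not concurrent or free < units):
            end, released = heapq.heappop(running)
            now = max(now, end)
            free += released
        timeline.append((name, now, now + duration))
        heapq.heappush(running, (now + duration, units))
        free -= units
    return timeline


if __name__ == "__main__":
    # (name, units needed, duration): illustrative numbers only.
    kernels = [("A", 4, 10), ("B", 4, 10), ("C", 8, 5), ("D", 16, 2)]
    for mode in (False, True):
        plan = simulate(kernels, total_units=16, concurrent=mode)
        finish = max(end for _, _, end in plan)
        print("concurrent" if mode else "serial", plan, "finishes at", finish)
```

In this model the small kernels A, B and C share the device from time zero, and only D, which needs every unit, has to wait for them to drain.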
A New World Order
Fermi will change the face of computing, contends Rob Enderle, principal analyst at the Enderle Group. Over the years, GPUs have been used more often to run applications that require massive amounts of parallel processing and have been redesigned and retuned for this, he told TechNewsWorld.
"While subtle, this is actually a rather significant move in the industry because for certain tasks, it is creating the opportunity for more than 100x performance increases," Enderle said. "As the first GPU which has been designed from the ground up to address this processing opportunity, Fermi, in a way, it represents a rethinking of what a GPU can do."
Oak Ridge National Laboratory's plan to build a supercomputer based on Fermi is possibly the first step toward a sweeping change in the computer industry. GPU-based supercomputers will get their performance from the GPU, not the CPU, Enderle said. "They are changing the landscape so much they are trivializing the CPU in the supercomputer space while driving low-cost supercomputing into markets that are starved for computing power like this," he added.
Fermi is also a swipe at Intel's recently announced Larrabee multi-core microarchitecture, developed for compute- and memory-intensive applications such as PC games and high-performance computing. "They're taking a run at Larrabee," Laura DiDio, principal at ITIC, said. "And why not?"
Nvidia will probably retain its presence in the video game industry while entering new markets with Fermi, DiDio told TechNewsWorld.
Oak Ridge and Other Supporters
Oak Ridge National Laboratory, the U.S. Department of Energy's largest science and energy laboratory, has announced plans to build a supercomputer that will use the Fermi architecture. This will be used for research in areas such as energy and climate change.
The new supercomputer is expected to be 10 times more powerful than today's fastest supercomputer, according to the lab. The world's fastest supercomputer is currently IBM's Roadrunner, at the Los Alamos National Laboratory. With a US$133 million price tag, the Roadrunner is built from off-the-shelf parts, with many novel design features, and is designed for a peak performance of 1.7 petaflops. It achieved 1.026 petaflops in May 2008 and reached a top performance of 1.456 petaflops in November 2008. The Roadrunner is the fourth-most energy-efficient supercomputer in the world on the Supermicro Green500 list, with an operational rate of 444.94 megaflops per watt of power used.
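For a rough sense of scale, the Roadrunner figures above can be turned into back-of-the-envelope numbers. The lab's statement does not say whether "10 times" refers to peak or measured performance, and the Green500 efficiency figure may have been measured on a different run than the 1.456 petaflops result, so the sketch below is an estimate under those assumptions rather than a published specification.

```python
PETA = 1e15
MEGA = 1e6

# Figures quoted in the article for IBM's Roadrunner.
ROADRUNNER_PEAK_FLOPS = 1.7 * PETA
ROADRUNNER_MEASURED_FLOPS = 1.456 * PETA
ROADRUNNER_FLOPS_PER_WATT = 444.94 * MEGA


def implied_power_watts(flops, flops_per_watt):
    """Power implied by a performance figure and an efficiency figure."""
    return flops / flops_per_watt


def scaled(flops, factor=10):
    """Performance of a hypothetical machine 'factor' times more powerful."""
    return flops * factor


if __name__ == "__main__":
    power = implied_power_watts(ROADRUNNER_MEASURED_FLOPS, ROADRUNNER_FLOPS_PER_WATT)
    print(f"Implied Roadrunner power draw: {power / MEGA:.2f} megawatts")
    print(f"10x measured performance: {scaled(ROADRUNNER_MEASURED_FLOPS) / PETA:.2f} petaflops")
    print(f"10x peak performance: {scaled(ROADRUNNER_PEAK_FLOPS) / PETA:.2f} petaflops")
```

Under these assumptions, a machine 10 times as powerful would land somewhere between roughly 14.6 and 17 petaflops, depending on which Roadrunner figure is used as the baseline.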