Breaking the Speed Barrier: The Frontside Bus Bottleneck
Processors continue to get faster and faster, but the frontside bus (FSB) remains one of the biggest bottlenecks in system performance. Most major manufacturers, including Intel, Motorola and AMD, are trying to develop methods to sidestep FSB limitations.
As one of the most important processing channels, the frontside bus is the primary interface that connects a microprocessor to other system devices. Typically, the FSB allows the processor to communicate with main memory (RAM), the system chipset, PCI devices, the graphics card and other peripheral buses.
What most computer users do not realize is that the speed of the frontside bus -- in addition to the speed of RAM -- determines computer speed more than the absolute clock speed of the CPU. But because FSB speeds do not increase as often or as dramatically as CPU speeds, processor manufacturers do not typically call attention to them, instead focusing on the raw speed of the processor.
FSB speeds traditionally have followed fairly predictable migration paths. AMD's Athlon processors, for example, have migrated from 200 MHz to 333 MHz. Meanwhile, Intel has pushed the FSB speeds of its Pentium 4 processors more rapidly than AMD, recently doubling them from 400 MHz to 800 MHz.
"The main problem with FSB migration is that it really cannot occur in a vacuum," Greg Fawson, an analyst at Semico Research, told TechNewsWorld. "System designers really prefer to see a balanced platform where the FSB and the memory bus are in balance; AMD has been able to achieve this with their platforms." However, until recently, he said, Intel platforms were largely out of balance.
For example, Intel's 845 chipset shipped with a 533-MHz FSB but had a memory bus that supported only 266-MHz memory, which supplies half the bandwidth the processor required. Meanwhile, the upcoming Athlon 64 processors will use a HyperTransport bus, which will run at 800 MHz in each direction.
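The 845 mismatch can be checked with simple arithmetic. The sketch below assumes both the FSB and the memory bus are 64 bits wide and treats the quoted MHz figures as effective transfer rates; neither assumption is stated in this article, and the function name is illustrative only.

```python
def peak_bandwidth_mb_s(transfers_mhz: float, width_bits: int = 64) -> float:
    """Peak bandwidth in MB/s: millions of transfers per second times bytes per transfer.

    The 64-bit default width is an assumption made for this illustration.
    """
    return transfers_mhz * width_bits / 8


fsb = peak_bandwidth_mb_s(533)   # processor side of the 845 chipset
mem = peak_bandwidth_mb_s(266)   # memory side of the 845 chipset
balance = mem / fsb              # 1.0 would be a balanced platform

print(f"FSB peak:    {fsb:.0f} MB/s")
print(f"Memory peak: {mem:.0f} MB/s")
print(f"Memory can feed about {balance:.0%} of what the FSB can carry")
```

A ratio near 1.0 corresponds to the balanced platform Fawson describes; a ratio near 0.5 means the processor bus can sit idle waiting on memory.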
"The big change [with AMD's HyperTransport bus] is that the memory controller is integrated into the processor," said Fawson. "This gives you much better latency characteristics, which in many applications is more important than bandwidth to achieve better performance."
Part of the slowdown in FSB speed migration has been due to an industry-wide delay in the introduction of dual-channel memory, which eventually will help manufacturers sidestep certain bus limitations.
Apple in the News
Apple's Power Mac G5 has been in the news lately, especially as a result of the company's claim that it is the fastest personal computer on the planet. Apple's answer to the frontside bus problem is to use a 64-bit, bidirectional frontside bus running at half the processor's speed.
Because Apple's bus is bidirectional, the G5 processor -- developed by IBM -- can send and receive information continuously, unlike most other processor interfaces that can carry data in only one direction at a time.[*correction] The bidirectional technique -- which also will be employed by AMD's 64-bit processors -- lets data move in opposite directions simultaneously with no other demands on the data stream and no wait time while the processor and system controller compete for use of the bus.
In Apple's G5 configuration, the data streams integrate clock signals along with the data, allowing the frontside bus to work at speeds of up to 1 GHz for a total of 8 GBps (gigabytes per second) of aggregate bandwidth.
Also, in the G5's dual-processor configuration, each processor is connected to its own dedicated 64-bit bidirectional bus, allowing each CPU to send and receive data without competing with the other processor for bandwidth. According to Apple, this technique allows larger chunks of data to be processed per clock cycle.
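A rough check of Apple's bandwidth figure follows. It assumes the 64-bit bus is made up of two 32-bit one-way links, each moving one transfer per 1-GHz clock; that split and the helper name are assumptions for illustration, not details given here. Under those assumptions the peak is 4 gigabytes per second each way and 8 gigabytes per second combined, and twice that across both buses of a dual-processor machine.

```python
def link_bandwidth_gb_s(clock_ghz: float, width_bits: int) -> float:
    """Peak one-way bandwidth in GB/s for a link of the given width and clock."""
    return clock_ghz * width_bits / 8


BUS_CLOCK_GHZ = 1.0   # top bus speed quoted for the G5
LINK_WIDTH_BITS = 32  # assumption: the 64-bit bus is two 32-bit one-way links

per_direction = link_bandwidth_gb_s(BUS_CLOCK_GHZ, LINK_WIDTH_BITS)
aggregate = 2 * per_direction   # both directions active at once
dual_cpu_total = 2 * aggregate  # each processor has its own dedicated bus

print(f"Per direction:        {per_direction:.0f} GB/s")
print(f"Aggregate per bus:    {aggregate:.0f} GB/s")
print(f"Dual-processor total: {dual_cpu_total:.0f} GB/s")
```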
David Heisey, manager of advanced technology initiatives at HP's Industry Standard Servers division, said FSBs are not the only bottleneck hampering the speed of today's computers. "The reality is that every interface that sits between where the data is and where the data needs to be will impact computer performance," he noted.
"The CPU FSB is just one of these interfaces that needs to be continually updated to increase bandwidth, decrease latency and improve efficiency," Heisey told TechNewsWorld.
Certain aspects of today's CPU FSB have in fact been keeping up fairly well with the requirements, he said. For example, according to Heisey, the FSBs of Intel's CPUs over the past 10 years have kept pace reasonably well with the trend toward higher CPU core frequencies.
Some CPUs, like HP's EV7 Alpha processor, have multiple point-to-point interfaces that allow data to travel in and out of memory and to and from CPUs across multiple paths. This technique can help data move from where it is to where it needs to be as quickly as possible without slowing other data transfers.
"As CPU core frequencies increase, the need for overall lower memory access latency becomes more critical," said Heisey. "Unfortunately, FSB latencies for some industry CPUs have not experienced an overall decrease in latency commensurate with the increase in CPU core frequencies." Latency, he said, has become the main way in which the FSB limits CPU performance.
In other words, the speed with which the processor can shuttle data to and from various components is key -- regardless of how many paths the processor uses. Many CPUs currently break this latency bottleneck by providing multiple paths for data to flow in and out of and between processors and system memory, he said.
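Heisey's latency point can be put in numbers. The sketch below uses a purely hypothetical memory access latency of 100 nanoseconds, a figure chosen for illustration and not taken from any product mentioned here, and shows how many core clock cycles that same wait costs as the core frequency rises.

```python
def stall_cycles(latency_ns: float, core_ghz: float) -> float:
    """Core clock cycles that elapse during one memory access.

    One GHz is one cycle per nanosecond, so cycles = nanoseconds * GHz.
    """
    return latency_ns * core_ghz


LATENCY_NS = 100.0  # hypothetical, fixed memory access latency

for ghz in (1.0, 2.0, 3.0):
    cycles = stall_cycles(LATENCY_NS, ghz)
    print(f"{ghz:.1f} GHz core: {cycles:.0f} cycles per memory access")
```

If latency stays flat while the core clock climbs, each memory access wastes proportionally more cycles, which is why lower latency matters more as frequencies increase.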
"CPUs and systems that do so," he noted, "are keeping up with the increasing demands for rapid data movement and keeping up with the demand of today's and future generations of powerful microprocessors."
Smart Memory Next
Researchers from the University of Southern California's Information Sciences Institute (ISI) recently demonstrated a new memory chip -- called "smart memory" or processor-in-memory (PIM) -- that can accelerate certain highly complicated calculations. This kind of technology might be the breakthrough that will help processor manufacturers sidestep frontside bus limitations.
ISI computer scientist John Granacki, one of the leaders of the project that created the new chip, said it addresses a longstanding and growing mismatch in computer components. "We at ISI have worked on PIM technology and have demonstrated in the lab that a large part of the bus congestion can be eliminated by moving certain processing to memory," Granacki told TechNewsWorld. "However, this then requires new memory devices that will cost more.
"I believe that most people's performance requirements are being met by the conventional technology," he added, referring to standard RAM and processor technologies. "You have to have a real need for using PIM technology, since you will pay more and the demand is low, so prices for solutions like PIM will probably not be a success in the marketplace."
Whether or not PIM technology will be too expensive for mass adoption remains to be seen. Meanwhile, AMD, Intel, Apple and Motorola seem headed down the right path -- at least in terms of frontside bus technology. After all, in a market hungry for speed, the disparity between processor and bus speed is certain, at least eventually, to foster major breakthroughs.
*Editor's Correction Note: The original version of this article incorrectly identified Motorola as the developer of Apple's G5 processor. While Motorola has historically been involved with Apple's desktop processors, it was IBM that ultimately developed the G5 chip with Apple. We apologize for this error.