PCI: past, present and future

Outline:
 - History
   - Talk about ISA/EISA/VESA/MCA
   - 32-bit, 33MHz, 5V card
   - 64-bit, 66MHz, 3.3V
 - Features
   - Configuration space
   - IO space
   - Memory space
   - Topology
   - Ordering rules
   - Busmastering
 - Linux implementation
   - pci_{read,write}_config_{byte,word,dword}()
   - {in,out}[bwl]()
   - {read,write}[bwlq]()
   - pci_dev, pci_bus
 - Related Tech
   - PCI-X
   - AGP
   - CompactPCI
   - MiniPCI
   - Cardbus
 - Present state of PCI
   - Hotplug
   - 133MHz
   - MSI
 - Future
   - 266 / 533MHz
   - PCI-E

Talk:

Introduction
~~~~~~~~~~~~

PCI is probably the most successful bus technology in computing
history. The PCI SIG was founded in June 1992 and currently has over
860 members. PCI is available for almost every architecture, from your
x86 desktop to embedded designs (ARM, MIPS, SuperH, V850) to
workstations and supercomputers (Alpha, IA64, PA-RISC, PowerPC, SPARC).
You can even buy an S/390 on a PCI card.

At the time it was introduced, it was viewed in the PC press as being a
competitor to the VESA Local Bus. It was certainly similar -- both were
32-bit, 33MHz bus systems -- but because most Pentium chipsets only
supported PCI, it won in the PC marketplace. PCI cards could be
automatically configured by the BIOS, unlike most ISA cards. They
didn't require description files on a floppy, unlike EISA and MCA
cards. And unlike MCA cards, you didn't need to pay royalties to IBM.

Over time, PCI was expanded to 64-bit and 66MHz. It has been revised in
evolutionary ways over the past 11 years, clarifying details and
providing support for new features. It has spawned a host of successor,
competitor and complementary technologies such as CardBus, AGP,
MiniPCI, CompactPCI, PCI-X and PCI-Express. It has proved to be
sufficiently flexible to meet the low-margin needs of the sound card
manufacturers as well as the high-performance needs of Ultra 320 SCSI
cards.

Features
~~~~~~~~

In order to keep the pin count down (which reduces costs), the PCI bus
uses the same pins for both addresses and data. This makes it necessary
to have a PCI bus protocol which all devices must understand. In
addition to the Address/Data pins, there is a group of pins known as
"Interface Control". These pins are used to communicate what state the
PCI bus is in.

PCI supports three different address spaces -- Configuration, Memory
and I/O.

I/O space is also commonly known as port space. Under Linux, you can
examine how it is assigned by looking at /proc/ioports. It's a 16-bit
address space which exists to provide compatibility with ISA cards.
Linux provides the functions inb(), inw(), inl(), outb(), outw() and
outl() for reading and writing byte, 2-byte and 4-byte quantities.

Memory space is memory-mapped I/O. It's up to 64 bits in size and
offers significant performance benefits over port I/O space. To access
it in a portable manner, first ioremap() it, then call readb(),
readw(), readl(), readq(), writeb(), writew(), writel() or writeq().

Configuration space is a mere 256 bytes in size. It is the mechanism
for plug-and-play configuration and reports much useful information
about the device. Linux provides the functions pci_read_config_byte(),
pci_read_config_word(), pci_read_config_dword(),
pci_write_config_byte(), pci_write_config_word() and
pci_write_config_dword() to access this space. It's not normally
necessary to call them, as Linux caches much of the useful information
from this space in the pci_dev structure.

Before a PCI device can be accessed in any other way, it must be
configured. Platform-dependent code tells the Linux PCI code which root
busses exist. The PCI code scans each bus for devices, then configures
the ones it finds.

Each device on a bus is uniquely identified by its device and function
number. There can be up to 32 devices on each bus, though physical
constraints normally limit the number of devices to around 5.
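
As a rough sketch of how a device number and function number can be
packed together: Linux keeps the pair in a single 8-bit value, usually
called devfn, with the device number in the top five bits and the
function number in the bottom three. The helper names below
(devfn_pack() and friends) are hypothetical and exist only for
illustration; the kernel's own equivalents are the PCI_DEVFN(),
PCI_SLOT() and PCI_FUNC() macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, modelled on the kernel's PCI_DEVFN(),
 * PCI_SLOT() and PCI_FUNC() macros. */

/* Pack a device number (0-31) and function number (0-7) into one byte. */
uint8_t devfn_pack(unsigned int device, unsigned int function)
{
    assert(device < 32 && function < 8);
    return (uint8_t)((device << 3) | function);
}

/* Extract the device number from the top five bits. */
unsigned int devfn_device(uint8_t devfn)
{
    return devfn >> 3;
}

/* Extract the function number from the bottom three bits. */
unsigned int devfn_function(uint8_t devfn)
{
    return devfn & 0x07;
}
```

With these helpers, devfn_pack(9, 0) yields 0x48, and devfn_device()
and devfn_function() recover 9 and 0 from it.
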
Each device has a function 0 and may have up to 7 additional functions,
though it is rare to see more than 2.

Each PCI bus has a number, ranging from 0 to 255. Some of the devices
may be PCI-to-PCI bridges allowing for expansion to many secondary
busses. To uniquely identify a device, you must know the number of the
bus it is on, its device number and its function number. This is
normally written out (for example by lspci) as 01:09.0. That represents
bus 1, device 9, function 0.

Even this is not sufficient for some manufacturers, so Linux 2.6
supports PCI domains (aka PCI segments). This adds yet another layer of
hierarchy so a configuration address is now written in the form
0003:01:09.0 for domain 3, bus 1, device 9, function 0.

Once a device has been found, it can be configured. This involves
assigning ranges of port and memory space to it, setting up interrupts,
configuring DMA, error reporting and so on.

Bandwidth
~~~~~~~~~

The first systems had a 33MHz, 32-bit PCI bus. This has a raw bandwidth
of 132MB/s. The PCI protocol restricts that somewhat. Every PCI
transaction starts with an address phase which is then followed by one
or more data phases. If the device is doing bulk data transfers, the
effective data bandwidth may be over 100MB/s, but if this is mixed in
with a lot of single register accesses, data bandwidth can be as low as
60MB/s.

Several complementary approaches were taken to increase the effective
data bandwidth available. One was to increase the width of the bus to
64 bits, doubling the amount of data that could be transferred in each
cycle. PCI 2.1 allowed the bus speed to double to 66MHz. Combining both
of these approaches was known as PCI 4x.

Each of these approaches has its downsides. Doubling the clock speed to
66MHz means that when a 33MHz card is plugged into the bus, every card
on the bus has to go at the slower speed.
Expanding the bus width to 64 bits doesn't give quite as much bandwidth
improvement as 66MHz, because a wasted cycle wastes twice as much
bandwidth on a 64-bit 33MHz bus as it would on a 32-bit 66MHz bus.

A more subtle approach was to introduce systems with multiple
independent PCI busses. Since devices on different busses could not
interfere with each other, transfers would tend to be longer. Several
top-end manufacturers took this approach to the extreme with each slot
being on its own PCI bus. This approach also combines well with
expanding the bus width and doubling the clock speed. It allows the bus
to approach 450MB/s of data bandwidth.

Related Technologies
~~~~~~~~~~~~~~~~~~~~

CardBus is basically PCI in a different format suitable for the low pin
count 32-bit PC Cards. Cards are PCI devices in all but shape, having
configuration, memory and port space.

MiniPCI is a fairly new standard form factor for laptop expansion. Many
laptops have their modem & internal network card on a MiniPCI expansion
card.

CompactPCI is an industrial form of PCI, similar to the VME bus. It's
managed by PICMG rather than the PCI SIG. Some of the blade
architectures are based around a CompactPCI backplane, and the standard
is also popular with telecom companies.

PCI-X is currently deployed on the higher-end workstations and servers.
It's an evolution of PCI, refining the bus protocol in some subtle ways
and introducing faster clock rates. PCI-X 1.0 goes up to 133MHz and
PCI-X 2.0 standardises 266 and 533MHz. The committee are currently
working on 1066 and 2133MHz variants.

AGP is related to PCI but is not developed by the PCI SIG. The protocol
and connector are optimised for graphics. It can only have one card on
the bus, there is no parity checking and there are special AGP
transactions. For any given nX rating, AGP has double the raw bandwidth
of PCI -- for example, PCI-4X is 528MB/s and AGP-4X is 1GB/s.

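The raw bandwidth figures quoted so far all come from the same
arithmetic: bytes per transfer multiplied by transfers per second. A
minimal sketch in C, using the nominal 33 and 66MHz clock figures from
the text. The function name raw_bandwidth_mb() and its
transfers_per_clock parameter are hypothetical; the parameter is there
on the assumption that AGP-4X is modelled as a 32-bit, 66MHz bus moving
data four times per clock.

```c
#include <assert.h>

/* Raw (theoretical) bandwidth in MB/s: bytes per transfer times
 * millions of transfers per second. clock_mhz is the nominal clock
 * rate; transfers_per_clock is 1 for conventional PCI. */
unsigned int raw_bandwidth_mb(unsigned int width_bits,
                              unsigned int clock_mhz,
                              unsigned int transfers_per_clock)
{
    /* Bus widths are whole numbers of bytes. */
    assert(width_bits % 8 == 0);
    return (width_bits / 8) * clock_mhz * transfers_per_clock;
}
```

raw_bandwidth_mb(32, 33, 1) gives 132 and raw_bandwidth_mb(64, 66, 1)
gives 528, matching the PCI-1x and PCI-4X figures, while
raw_bandwidth_mb(32, 66, 4) gives 1056, the roughly 1GB/s of AGP-4X.
These are ceilings only; as the Bandwidth section notes, address phases
and short transfers eat into them.
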
Present
~~~~~~~

Most desktop systems have not evolved beyond the original PCI-1x
specification. Server and workstation chipsets have support for PCI-2x,
PCI-4x and PCI-X 133, but these are a much smaller market.

The reason for this is simply a matter of demand. Except for graphics,
there are no devices that demand anything even close to PCI's
bandwidth. Graphics cards are almost exclusively handled through the
AGP bus.

Putting graphics on a separate bus is quite an astute decision -- if
you place a 133MHz PCI-X card and a 33MHz PCI card on the same bus, the
bus is configured to the lowest common denominator. To avoid this,
machines need multiple busses -- one for high performance cards and
another for low performance cards. But the average desktop has only one
high-performance card in it. By using a different bus, you prevent
customers from putting their cards in the wrong slots and getting
abysmal performance.

Future
~~~~~~

The 1GB/s offered by PCI-X 133 isn't enough for the top end cards. AGP
has already moved beyond it to the 2GB/s AGP-8x. 10Gbps Ethernet cards
require 2GB/s of data bandwidth. Serial ATA and Serial Attached SCSI
will both require upwards of 2GB/s bandwidth per card in the next few
years.

PCI-X has faster speeds specified, taking it to 4GB/s and probably
beyond, but PCI Express aims to be the technology for the future. Since
Intel has announced plans to kill future AGP development in favour of
PCI Express and the graphics cards tend to lead the market in terms of
bandwidth consumption, it seems like a pretty safe bet that PCI Express
will become prevalent.

PCI Express was originally called 3GIO. It is a serial, point-to-point
protocol, not entirely dissimilar to Serial ATA or USB. The first
generation of products is intended to achieve 16GB/s -- 8 times as much
as AGP-8x can achieve today and 16 times as much as PCI-X 133.
The design is also supposed to be cheap to manufacture, though as the
PCI SIG wryly note on their website, "market forces will ultimately
determine the cost of PCI Express Architecture systems".

From a software point of view, PCI Express changes very little. The
configuration space is expanded from 256 bytes to 4096. There will be
more bridges involved.

On a hardware level, the changes are extensive. A lot of sideband
signals in PCI have been converted into data packets in PCI Express.
For example, interrupts are now sent as data packets rather than being
separate lines. This is not the same thing as MSI -- it cannot carry
additional data, but rather it is a replacement for having an
additional set of interrupt lines per controller. The PCI Express Root
Port is expected to convert these packets back into standard PCI
interrupts.

A new feature in PCI Express is Quality of Service. The bus is
partitioned into multiple channels and the card can specify whether the
data is low-latency or isochronous. (XXX: more here)

Credits
~~~~~~~

My long-suffering wife for proof-reading.
Ottawa Canada Linux User's Group for listening to an earlier version of
this talk and providing feedback.