DATA COMMUNICATIONS WITH ENHANCED SPEED MODE
An interconnect controller for a data processing platform includes a data link layer controller for selectively receiving data packets from and sending data packets to a higher protocol layer, and a physical layer controller coupled to the data link layer controller and adapted to be coupled to a communication link. The physical layer controller operates according to a predetermined protocol selectively at one of a plurality of enhanced speeds that are not specified by any published standard and are separated from each other by the same predetermined amount. In response to performing a link initialization, the interconnect controller performs at least one setup operation to select a speed, and subsequently operates the communication link using a selected speed.
Data communications systems are conventionally designed to adhere to published communications standards so components of one manufacturer can interoperate with components from different manufacturers. For example, many modern computing devices make use of input/output (I/O) adapters and buses that utilize some version or implementation of the Peripheral Component Interconnect (PCI) or PCI Express (PCIe) interconnect standards. The PCIe standard specifies a computer communication interconnect for attaching peripheral devices to a host computer. PCIe is an extension of the earlier PCI standard that uses existing PCI programming concepts, but bases the computer interconnect on a faster physical-layer communications protocol. The PCIe physical layer consists of dual uni-directional links between upstream and downstream devices.
The PCIe standard is published by the Peripheral Component Interconnect Special Interest Group (PCI-SIG). The PCI-SIG revises the standard from time to time to reflect enhanced speed and capabilities. For example, PCIe 1.0 was published in 2003 and specified a transfer rate of 2.5 gigatransfers per second (GT/s). PCIe 2.0 was introduced in 2007 and provided a 5.0 GT/s transfer rate, and was followed by PCIe 3.0 in 2010 with an 8.0 GT/s transfer rate, and PCIe 4.0 in 2017 with a 16.0 GT/s transfer rate. Thus the standard has increased the transfer rate in large, discrete steps with new versions that are published in 3-7 year cycles.
At the same time, semiconductor manufacturing technology has advanced rapidly. Advances such as deep sub-micron photolithography and low voltage complementary metal-oxide-semiconductor (CMOS) transistors have advanced processing speeds, making it difficult for standards-setting bodies such as the PCI-SIG to keep pace.
In the following description, the use of the same reference numbers in different drawings indicates similar or identical items. Unless otherwise noted, the word “coupled” and its associated verb forms include both direct connection and indirect electrical connection by means known in the art, and unless otherwise noted any description of direct connection implies alternate embodiments using suitable forms of indirect electrical connection as well.
DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS
An interconnect controller for a data processing platform includes a data link layer controller for selectively receiving data packets from and sending data packets to a higher protocol layer, and a physical layer controller coupled to the data link layer controller and adapted to be coupled to a communication link. The physical layer controller operates according to a predetermined protocol selectively at one of a plurality of enhanced speeds that are not specified by any published standard and are separated from each other by the same predetermined amount. In response to performing a link initialization, the interconnect controller performs at least one setup operation to select a speed, and subsequently operates the communication link using a selected speed.
A data processing platform includes a basic input/output system (BIOS) and a data processor. The data processor includes a central processing unit coupled to and responsive to the BIOS to execute an initialization procedure, and an interconnect controller. The interconnect controller is coupled to the central processing unit and is adapted to be coupled to a communication link that operates according to a predetermined protocol selectively at one of a plurality of enhanced speeds that are not specified by any published standard and are separated from each other by the same predetermined amount. The BIOS comprises instructions that when executed by the central processing unit cause the interconnect controller to perform at least one setup operation to select a speed, and subsequently operate the communication link using a selected speed.
A method for use in a data processing platform having an interconnect controller that operates a communication link according to a published standard includes querying the data processing platform to determine whether an enhanced speed mode is permitted. At least one setup operation is performed to select an operating speed for operating according to a predetermined protocol from among a plurality of enhanced speeds that are not specified by the published standard and are separated from each other by the same predetermined amount. The communication link is subsequently operated using the operating speed.
CPU 110 includes several components including a CPU core 112, a root complex 114, and several other components not relevant to the present disclosure and that are omitted from
CPU 110 is also bidirectionally connected to memory 120. While
In operation, system BIOS 122 is used to boot up and initialize data processing platform 100. As part of the initialization, system BIOS 122 determines the input/output (I/O) devices present in data processing platform 100, and continues through a process of configuring the PCIe fabric known as enumeration. System BIOS 122 reads configuration registers associated with each I/O device present in the system to determine their respective characteristics and capabilities. Once system BIOS 122 finishes enumerating the system, initializing root complex 114, and performing various other startup tasks, it transfers control to operating system 124, which forms the environment in which application programs are run.
Over the years, the PCIe standard has changed as the capabilities of integrated circuit technology have advanced to allow new and faster speeds and enhanced capabilities. However, the standards setting process is relatively slow, lagging behind improvements in the capabilities of integrated circuit fabrication technology. Thus, PCI and PCIe systems have not been able to adapt seamlessly to take advantage of advances in integrated circuit technology.
Data processing platform 200 differs from data processing platform 100 of
The speeds supported by different generations of published PCIe standards are shown in TABLE I below:
In one example of an implementation of ESM, an enhanced speed can be a speed between the 8.0 GT/s speed specified by the published PCIe 3.0 standard and the 16.0 GT/s speed specified by the published PCIe 4.0 standard. In another example, the enhanced speed can be any of a number of discrete speeds between 8.0 GT/s and 16.0 GT/s. In yet another example, the enhanced speed can be a single discrete speed higher than the 16.0 GT/s specified by the published PCIe 4.0 standard, such as 25.0 GT/s. In still another example, the enhanced speed can be any of a number of discrete speeds between 16.0 GT/s and 25.0 GT/s.
By supporting speeds above the published PCIe speeds, data processing platform 200 allows the performance of the data processing platform to be improved on PCIe links where both link partners support the same enhanced speed capabilities. The speed can be set, for example, to the highest speed supported by both link partners. Thus, ESM allows scalable performance improvements that are not limited by the discrete speeds specified by the standards, but only by the capabilities of the semiconductor manufacturing processes used by the upstream and downstream ports and the controllers associated with them.
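The selection rule mentioned above, choosing the highest speed supported by both link partners, can be sketched as follows. This is a minimal illustration under stated assumptions and not the disclosed implementation: the function name `highest_common_speed`, the example capability lists, and the representation of speeds as integers in MT/s are all hypothetical.

```python
from typing import Iterable, Optional


def highest_common_speed(local_mts: Iterable[int],
                         partner_mts: Iterable[int]) -> Optional[int]:
    """Return the highest speed (in MT/s) that both partners list, or None.

    Speeds are integers in MT/s (an assumption made for this sketch) so
    that 100 MT/s steps compare exactly, without floating-point rounding.
    """
    common = set(local_mts) & set(partner_mts)
    return max(common) if common else None


# Hypothetical capability lists for the two ends of one link.
local = [8000, 10000, 12000, 16000, 20000, 25000]
partner = [8000, 12000, 16000, 20000]
selected = highest_common_speed(local, partner)
```

With these example lists the link would be set to 20.0 GT/s, since 25.0 GT/s is supported by only one partner. When the lists share no speed, the function returns None and the caller would stay at a published speed.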
PCIe transaction layer controller 310 has an upstream bidirectional connection for receiving data accesses and providing data responses, collectively labeled “ACCESSES” in
PCIe data link layer controller 320 has a bidirectional upstream port connected to the downstream port of PCIe transaction layer controller 310, and a bidirectional downstream port. It performs link setup, packet sequencing, flow control, retry, and other features normally associated with a data link layer according to the Open Systems Interconnection (OSI) model. PCIe data link layer controller 320 adds headers, control information, frame check sequences, and the like to form a data link layer packet that it provides to PCIe physical layer controller 330 for access requests, and processes the headers, control information, and frame check sequences of data link layer packets received from PCIe physical layer controller 330 to form PCIe packets for access responses.
PCIe physical layer controller 330 has an upstream bidirectional port connected to the downstream bidirectional port of PCIe data link layer controller 320, and a downstream port connected to a medium 340 that includes a unidirectional transmit port labeled “TX” and a unidirectional receive port labeled “RX”. PCIe physical layer controller 330 supports one or more enhanced speeds as will be described further below.
In the exemplary embodiment, PCIe transaction layer controller 310, PCIe data link layer controller 320, and PCIe physical layer controller 330 are all circuit blocks on a CPU chip. However, the blocks of interconnect controller 300 can be implemented with various combinations of hardware and software (e.g., operating system drivers). For example, PCIe physical layer controller 330 can be implemented in hardware and PCIe transaction layer controller 310 can be implemented in software, while PCIe data link layer controller 320 can be implemented partially in hardware and partially in software.
In action box 410, the system BIOS controls interconnect controller 300 to query data processing platform 200 to determine whether ESM is permitted. Querying the platform involves enumerating the bus hierarchy structure to detect that the platform is of the type that permits ESM to be used. This permission involves first determining that the platform is enabled for ESM operation, and second that a port exists at the other end of the link to which interconnect controller 300 is connected that is capable of ESM operation.
In action box 420, the system BIOS controls interconnect controller 300 to perform a setup procedure to use ESM. Once it is determined that ESM will be run, one or the other, both, or neither of the components may require some setup before actually running in ESM. Interconnect controller 300 begins the setup phase of the process in response to software, such as the system BIOS, writing its configuration registers to enable ESM in both the upstream port (USP) and downstream port (DSP). For an ESM-aware port, the setup may be initiated when an “ESM Enable” bit of an ESM Control register is written with a “1”. For a non-ESM-aware port, a vendor-specific initiator register may be used. The initiator register in the DSP must be written prior to the initiator register in the USP, since writing the register in the USP also initiates a transition to link state L1 and an electrical idle (EI) bus state as defined in the PCIe standard.
Writing the initiator register triggers the following sequence. First, interconnect controller 300 sets a variable that causes it to perform the setup necessary to prepare for entering the ESM, on the next occurrence only of the port's transmitters going to the EI state, and its receivers detecting an EIOS, or detecting or inferring EI, while the Link is in the LinkUp state. Second, for an ESM-aware USP, the link controller is directed to initiate an entry to the L1 state. A non-ESM aware USP will be put into the D3hot state to cause it to transition to the software directed L1 link state. Third, interconnect controller 300 performs required environmental changes, e.g., changing the voltage, recalibrating the physical layer controllers (PHYs) and other hardware, and the like.
Subsequently, each and every time that the initiator bit is written with a “1”, this sequence is triggered and interconnect controller 300 performs the setup. The setup is only performed on the first transition to the EI state following the writing of the initiator register, and all subsequent EI occurrences behave normally and do not trigger this setup procedure.
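The one-shot behavior described above, in which writing the initiator arms the setup for the next EI occurrence only, can be modeled with a small state holder. This is a behavioral sketch, not the disclosed hardware: the class name `EsmSetupArm`, its method names, and the choice to stay armed when EI occurs outside the LinkUp state are assumptions made for illustration.

```python
class EsmSetupArm:
    """Hypothetical model of the one-shot ESM setup trigger."""

    def __init__(self) -> None:
        self.armed = False      # set by writing the initiator with 1
        self.setups_run = 0     # how many times the setup was performed

    def write_initiator(self, value: int) -> None:
        # Writing a 1 arms the setup for the next electrical idle (EI).
        if value == 1:
            self.armed = True

    def on_electrical_idle(self, link_up: bool) -> bool:
        """Return True if this EI occurrence triggered the ESM setup."""
        if self.armed and link_up:
            self.armed = False  # disarm: later EI occurrences behave normally
            self.setups_run += 1
            return True
        # Assumption: an EI seen while not in LinkUp leaves the arm state as is.
        return False


port = EsmSetupArm()
port.write_initiator(1)
first = port.on_electrical_idle(link_up=True)    # triggers the setup
second = port.on_electrical_idle(link_up=True)   # normal EI, no setup
port.write_initiator(1)                          # re-arm by writing 1 again
third = port.on_electrical_idle(link_up=True)    # triggers the setup again
```

The model only captures the arming rule. The ordering constraint that the DSP initiator is written before the USP initiator would be enforced by the software performing the writes.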
In action box 430, interconnect controller 300 operates the link at the selected ESM data rate. Thus the link is able to achieve performance beyond the performance specified in the published standard, limited only by the capabilities of the process technology used for the circuitry at both ends of the link but not by the arbitrary speed definitions necessitated by the process of publishing a technical standard by a standards setting organization like the PCI-SIG.
Exemplary PCIe Implementation
Details of an implementation of ESM in PCIe will now be explained with reference to a specific example.
ESM Capability Revision 820 is a version number that indicates the version of the extended capability. The system BIOS qualifies both the ESM Vendor ID and the ESM Capability ID (see below) before interpreting this field. ESM Capability Length 830 indicates the number of bytes in the entire extended capability data structure, including the PCI Express Extended Capability header, the ESM Header, and the ESM registers.
Although not shown, ESM Capability 2 Register 552, ESM Capability 3 Register 553, ESM Capability 4 Register 554, ESM Capability 5 Register 555, and ESM Capability 6 Register 556 have bit assignments generally corresponding to those of ESM Capability 1 Register 551. ESM Capability 2 Register 552 indicates support for transfer rates between 11.0 GT/s and 13.9 GT/s in 100 MT/s increments. ESM Capability 3 Register 553 indicates support for transfer rates between 14.0 GT/s and 15.9 GT/s in 100 MT/s increments, and unlike the other registers, has reserved bits in bit positions 20-31. ESM Capability 4 Register 554 indicates support for transfer rates between 16.0 GT/s and 18.9 GT/s in 100 MT/s increments. ESM Capability 5 Register 555 indicates support for transfer rates between 19.0 GT/s and 21.9 GT/s in 100 MT/s increments. ESM Capability 6 Register 556 indicates support for transfer rates between 22.0 GT/s and 24.9 GT/s in 100 MT/s increments.
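The register ranges listed above can be turned into a short decoder that maps a capability register value to a list of supported speeds. This is a sketch under a stated assumption: that bit i of a register, when set, indicates support for the register's lowest speed plus i times 100 MT/s. That mapping is consistent with bits 20-31 of ESM Capability 3 Register 553 being reserved, but the text here does not confirm it. The names `ESM_CAP_RANGES` and `decode_esm_capability` are hypothetical, and ESM Capability 1 Register 551 is left out because its range is not stated in this passage.

```python
from typing import Dict, List, Tuple

STEP_MTS = 100  # increment between adjacent speeds, per the text

# register number -> (lowest speed in MT/s, number of defined bits)
ESM_CAP_RANGES: Dict[int, Tuple[int, int]] = {
    2: (11000, 30),  # 11.0 to 13.9 GT/s
    3: (14000, 20),  # 14.0 to 15.9 GT/s; bit positions 20-31 are reserved
    4: (16000, 30),  # 16.0 to 18.9 GT/s
    5: (19000, 30),  # 19.0 to 21.9 GT/s
    6: (22000, 30),  # 22.0 to 24.9 GT/s
}


def decode_esm_capability(reg_number: int, value: int) -> List[int]:
    """List the speeds (in MT/s) flagged as supported in one register value.

    Assumption: bit i set means (lowest speed + i * 100 MT/s) is supported.
    Bits above the defined range are ignored.
    """
    base, nbits = ESM_CAP_RANGES[reg_number]
    return [base + i * STEP_MTS for i in range(nbits) if (value >> i) & 1]


# Example: register 3 with bits 0, 19, and (reserved) 20 set.
speeds = decode_esm_capability(3, (1 << 0) | (1 << 19) | (1 << 20))
```

In this example the reserved bit 20 is ignored, so only 14.0 GT/s and 15.9 GT/s are reported.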
During discovery and enumeration, root complex 214 discovers the existence and capabilities of all PCIe devices in data processing platform 200 by examining the registers in PCIe configuration space, and determining a speed of operation between each pair of link partners. Root complex 214 first determines whether extended speed mode is supported, and if it is, further examines the ESM capability descriptor of each device and port between the root complex and the endpoint. Generally, the determined speed will be the highest supported speed in common with the link controller in root complex 214 and all PCI links to the endpoints in the tree.
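The enumeration rule above can be sketched as a two-step check over every device and port between the root complex and an endpoint: first confirm that each one exposes an ESM capability descriptor, then take the highest speed common to all of them. This is an illustrative sketch only. The function name `esm_speed_for_path` and the representation of a descriptor as a set of speeds in MT/s, with None meaning that no ESM capability was found, are assumptions.

```python
from typing import List, Optional, Set


def esm_speed_for_path(descriptors: List[Optional[Set[int]]]) -> Optional[int]:
    """Return the ESM speed (MT/s) usable along a path, or None.

    descriptors holds one entry per device or port on the path, starting
    with the root complex. An entry of None (hypothetical convention)
    means that device has no ESM capability descriptor.
    """
    # Step 1: ESM is usable only if every device on the path supports it.
    if not descriptors or any(d is None for d in descriptors):
        return None
    # Step 2: keep only the speeds that every device supports.
    common = set.intersection(*descriptors)
    return max(common) if common else None


# Hypothetical path: root complex -> switch port -> endpoint.
path = [{12000, 16000, 20000, 25000}, {12000, 16000, 20000}, {12000, 16000}]
path_speed = esm_speed_for_path(path)

# The same path with a switch port that lacks the ESM capability.
legacy_path = [{12000, 16000}, None, {12000, 16000}]
legacy_speed = esm_speed_for_path(legacy_path)
```

When the function returns None, the caller would leave the links at a speed defined by the published standard.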
Between t0 and t1, the Procedure Phase is in the query phase. During this time the link is being used to read the registers in the PCIe configuration space, including the vendor ID (VID), device ID (DID), and the ESM Developer Designated Vendor Specific Extended Capability identified in extended capability descriptor 500. Moreover, the speed is set to the initial boot speed for Gen 3, namely 8.0 GT/s. The Link State is in the L0 (fully on) state, and the Link Attributes are Data Stream.
Between t1 and t4, the Procedure Phase is in the setup phase. Between t1 and t2, there is no data transfer operation on the link. The speed changes to the Gen 1 (2.5 GT/s) link speed, and the Link State enters the Recovery state before transitioning back to L0 at the new Link Speed. During this time, the link transmits training sets labeled “TSx” separated by an EI condition. Between t2 and t3, root complex 214 writes to registers, e.g., ESM Control Register 542, to enable ESM on all supported links. The Link Speed remains at Gen 1, and the Link State remains L0. The Link Attributes are Data Stream. Between t3 and t4, the Link State changes from L0 to L1 and Recovery before returning to L0. The Link Attributes are L1 DLLPs, followed by EI, followed by training set TSx, followed by a Data Stream.
Between t4 and t5, the Procedure Phase is to Execute the Speed Change according to the highest commonly supported speed by all DSPs and USPs on a supported link. The Link Speed is changed to the selected ESM link speed, which can be different than the Gen 3 link speed. The link enters the Recovery link state before returning to the L0 state. The link transmits training sets TSx interrupted by the EI state during the speed change, before transmitting further training sets TSx at the new ESM link speed.
Following the end of training at t5, the link operates in ESM and performs data accesses in the L0 link state at the enhanced rate. Once the necessary setup has been completed and the link is back in the L0 state, any subsequent speed changes that negotiate to the Gen 3 or Gen 4 data rates will run at the ESM data rate programmed into the ESM Control register. Any speed changes that negotiate to the Gen 1 or Gen 2 data rates will run at their ‘normal’ data rates (2.5 GT/s for Gen 1, 5.0 GT/s for Gen 2).
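The rate mapping that applies once ESM setup is complete can be written as a small lookup. This sketch follows the rule stated above: a negotiation to Gen 3 or Gen 4 runs at the programmed ESM rate, while Gen 1 and Gen 2 keep their published rates. The function name `effective_rate_mts`, its parameters, and the use of integer MT/s values are assumptions made for illustration.

```python
# Published rates in MT/s, as given in the background discussion.
STANDARD_RATES_MTS = {1: 2500, 2: 5000, 3: 8000, 4: 16000}


def effective_rate_mts(negotiated_gen: int,
                       esm_enabled: bool,
                       esm_rate_mts: int) -> int:
    """Return the rate (MT/s) the link actually runs at.

    negotiated_gen: generation (1-4) that the speed change negotiated to.
    esm_rate_mts:   rate programmed into the ESM Control register
                    (hypothetical parameter standing in for that field).
    """
    if esm_enabled and negotiated_gen in (3, 4):
        return esm_rate_mts
    return STANDARD_RATES_MTS[negotiated_gen]


gen3_rate = effective_rate_mts(3, esm_enabled=True, esm_rate_mts=12000)
gen1_rate = effective_rate_mts(1, esm_enabled=True, esm_rate_mts=12000)
no_esm_rate = effective_rate_mts(4, esm_enabled=False, esm_rate_mts=20000)
```

The `esm_enabled` flag covers links on which the setup was never performed, which continue to use the published rates for every generation.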
CCIX protocol layer block 1502 is responsible for the coherency protocol, including memory read and write flows. Cache states defined in this layer allow the determination of the state of the memory, for example whether the data is unique and clean or if it is shared and dirty. CCIX protocol layer block 1502 is bidirectionally connected to CCIX port with CCIX link layer block 1504. The CCIX Protocol Layer is responsible for formatting CCIX traffic and forming and decoding CCIX.
PCIe port 1510 includes a CCIX transaction layer block 1512, a PCIe transaction layer block 1514, a PCIe data link layer block 1520, and a CCIX/PCIe physical layer block 1530. CCIX transaction layer block 1512 is responsible for handling CCIX packets, while PCIe transaction layer block 1514 is responsible for handling PCIe packets. PCIe port 1510 supports virtual channels to allow different data streams to travel across a single PCIe link. By splitting CCIX traffic into one virtual channel and PCIe traffic into a second virtual channel, PCIe port 1510 allows both CCIX and PCIe traffic to share the same PCIe medium 1540. PCIe data link layer block 1520 performs all of the normal functions of the data link layer, including CRC error checking, packet acknowledgment and timeout checking, and credit initialization and exchange. CCIX/PCIe physical layer block 1530 is built on a standard PCIe physical layer. CCIX extends PCIe to support a 25 GT/s ESM (Extended Speed Mode), which extends beyond the 16.0 GT/s speed first introduced in the PCI Express 4.0 standard. In addition, it supports extended speeds between the PCI Express 4.0 standard speed (16 GT/s) and a higher speed (such as 25 GT/s). Thus it supports greater granularity and a more robust migration path to higher speeds.
Thus, the interconnect controller disclosed herein provides a robust controller that provides a seamless upgrade path to higher performance without the necessity of waiting for periodic revisions to a published standard. The higher performance can take one of two forms. First, it can extend beyond a highest data transfer rate and/or highest clock speed yet specified by the standard, avoiding the need for corresponding communication components to be developed after a new revision of a standard has been published. Second, it adds intermediate data transfer rates and/or clock speeds that can be supported by the semiconductor manufacturing technology without making a whole step to a next published rate. The interconnect controller and data processors incorporating such an interconnect controller leverage advances in semiconductor manufacturing technology that may lead development of published standards, and allow enhanced performance without the constraints of large, discrete speed steps. The techniques are useful in a variety of data communication protocols, including PCIe and CCIX (which operates using PCIe data link and physical layers).
PCIe port controller 216 and CPU 210 or any portions thereof may be described or represented by a computer accessible data structure in the form of a database or other data structure which can be read by a program and used, directly or indirectly, to fabricate integrated circuits. For example, this data structure may be a behavioral-level description or register-transfer level (RTL) description of the hardware functionality in a high-level design language (HDL) such as Verilog or VHDL. The description may be read by a synthesis tool which may synthesize the description to produce a netlist including a list of gates from a synthesis library. The netlist includes a set of gates that also represent the functionality of the hardware including integrated circuits. The netlist may then be placed and routed to produce a data set describing geometric shapes to be applied to masks. The masks may then be used in various semiconductor fabrication steps to produce the integrated circuits. Alternatively, the database on the computer accessible storage medium may be the netlist (with or without the synthesis library) or the data set, as desired, or Graphic Data System (GDS) II data.
While particular embodiments have been described, various modifications to these embodiments will be apparent to those skilled in the art. For example, various high-frequency oscillators can be used for the reference oscillator in a time-to-digital converter. These include a CMOS ring oscillator, a series-resonant LC oscillator, a parallel-resonant LC oscillator, and an RC oscillator. Moreover various current controlled oscillator circuits can be used. In current controlled oscillator circuits that are based on a resistor, the resistor can take various forms such as a polysilicon resistor, a thin-film metal alloy resistor, and a thin-film metal mixture resistor. Moreover the divider can use a variety of fixed numbers for the numerator.
Accordingly, it is intended by the appended claims to cover all modifications of the disclosed embodiments that fall within the scope of the disclosed embodiments.
Claims
1. An interconnect controller for a data processing platform comprising:
- a data link layer controller for selectively receiving data packets from and sending data packets to a higher protocol layer; and
- a physical layer controller coupled to said data link layer controller and adapted to be coupled to a communication link, said physical layer controller operating according to a predetermined protocol selectively at one of a plurality of enhanced speeds that are not specified by any published standard and are separated from each other by the same predetermined amount,
- wherein in response to performing a link initialization, the interconnect controller: performs at least one setup operation to select a selected speed; and subsequently operates the communication link using said selected speed.
2. The interconnect controller of claim 1, wherein:
- each of said plurality of enhanced speeds is between a first predetermined link speed and a second predetermined link speed specified by a published standard.
3. The interconnect controller of claim 1, wherein:
- each of said plurality of enhanced speeds is higher than a highest published speed supported by a published standard.
4. The interconnect controller of claim 1, wherein:
- the interconnect controller subsequently operates the communication link by programming a loop divider of a phase locked loop.
5. The interconnect controller of claim 4, wherein:
- the interconnect controller determines said selected speed by accessing at least one enhanced speed mode capability register.
6. The interconnect controller of claim 1, wherein:
- said physical layer controller is compliant with the Peripheral Component Interconnect Express (PCIe) Base Specification;
- the interconnect controller queries the data processing platform to determine whether an enhanced speed mode is permitted; and
- the interconnect controller is part of a PCIe root complex that determines an enhanced speed for the interconnect controller based on capabilities of an upstream port of an endpoint as a highest mutually supported speed by the interconnect controller and said upstream port of said endpoint.
7. The interconnect controller of claim 1, wherein:
- said higher protocol layer comprises one of a PCIe transaction layer and a cache coherent interconnect for accelerators (CCIX) transaction layer.
8. The interconnect controller of claim 1, wherein:
- said physical layer controller further operating according to said predetermined protocol selectively at one of a first predetermined link speed specified by a published standard and said plurality of enhanced speeds.
9. A data processing platform comprising:
- a basic input/output system (BIOS);
- a data processor comprising: a central processing unit coupled to and responsive to said BIOS to execute an initialization procedure; and an interconnect controller coupled to said central processing unit and adapted to be coupled to a communication link that operates according to a predetermined protocol selectively at one of a plurality of enhanced speeds that are not specified by any published standard and are separated from each other by the same predetermined amount;
- wherein said BIOS comprises instructions that when executed by said central processing unit cause said interconnect controller to: perform at least one setup operation to select a speed; and subsequently operate the communication link using a selected speed.
10. The data processing platform of claim 9, wherein:
- each of said plurality of enhanced speeds is between a first predetermined link speed and a second predetermined speed specified by a published standard.
11. The data processing platform of claim 9, wherein:
- each of said plurality of enhanced speeds is higher than a highest published speed supported by a published standard.
12. The data processing platform of claim 9, wherein:
- the interconnect controller subsequently operates the communication link by programming a loop divider of a phase locked loop.
13. The data processing platform of claim 12, wherein:
- the interconnect controller determines said selected speed by accessing at least one enhanced speed mode capability register.
14. The data processing platform of claim 9, wherein:
- the interconnect controller is compliant with the Peripheral Component Interconnect Express (PCIe) Base Specification; and
- the interconnect controller queries the data processing platform to determine whether an enhanced speed mode is permitted.
15. The data processing platform of claim 9, wherein:
- the interconnect controller is part of a PCIe root complex that determines an enhanced speed for the interconnect controller based on capabilities of an upstream port of an endpoint as a highest mutually supported speed by the interconnect controller and said upstream port of said endpoint.
16. A method for use in a data processing platform having an interconnect controller that operates a communication link according to a published standard, comprising:
- querying the data processing platform to determine whether an enhanced speed mode is permitted;
- performing at least one setup operation to select an operating speed for operating according to a predetermined protocol from among a plurality of enhanced speeds that are not specified by the published standard and are separated from each other by the same predetermined amount; and
- subsequently operating the communication link using said operating speed.
17. The method of claim 16, wherein:
- the published standard supports a first predetermined link speed and a second predetermined link speed greater than said first predetermined link speed; and
- said operating speed is between said first predetermined link speed and said second predetermined link speed.
18. The method of claim 16, wherein:
- said plurality of enhanced speeds are greater than a first predetermined link speed, wherein said first predetermined link speed is a highest published speed supported by said published standard.
19. The method of claim 18, wherein:
- subsequently operating the communication link using said operating speed comprises programming a loop divider of a phase locked loop.
20. The method of claim 19, further comprising:
- determining said operating speed by accessing at least one enhanced speed mode capability register.
Type: Application
Filed: Oct 18, 2021
Publication Date: Feb 3, 2022
Applicants: ATI Technologies ULC (Markham, ON), Advanced Micro Devices, Inc. (Santa Clara, CA)
Inventors: Gordon Caruk (Brampton), Gerald R. Talbot (Concord, MA)
Application Number: 17/503,959