Patents Assigned to Advanced Micro Devices, Inc.
  • Publication number: 20200202605
    Abstract: A technique for determining the centroid for fragments generated using variable rate shading is provided. Because the barycentric interpolation used to determine texture coordinates for pixels is based on the premise that the point being interpolated is within the triangle, centroids that are outside of the triangle can produce undesirable visual artifacts. Another concern, however, is that the further the centroid is from the center of a pixel, the less accurate quad-based pixel derivatives become for attributes of that pixel. To address these concerns, the position of the sample that is both covered by the triangle and the closest to the center of the pixel, out of all covered samples of the pixel, is used as the centroid for a partially covered pixel. For a fully covered pixel (all samples in a pixel are covered by a triangle), the center of that pixel is used as the centroid.
    Type: Application
    Filed: December 19, 2018
    Publication date: June 25, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Pazhani Pillai
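The centroid rule in the abstract above can be sketched as follows. This is a minimal illustration and not the patented implementation: the function name `select_centroid`, the sample positions, and the use of squared Euclidean distance are all assumptions made for the example.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def select_centroid(pixel_center: Point,
                    samples: List[Point],
                    covered: List[bool]) -> Point:
    """Return the centroid for one pixel (hypothetical helper).

    Fully covered pixel: the pixel center.
    Partially covered pixel: the covered sample closest to the center.
    """
    if all(covered):
        return pixel_center

    covered_samples = [s for s, c in zip(samples, covered) if c]
    if not covered_samples:
        raise ValueError("pixel has no covered samples")

    cx, cy = pixel_center
    # Squared distance is enough for choosing the nearest sample.
    return min(covered_samples,
               key=lambda s: (s[0] - cx) ** 2 + (s[1] - cy) ** 2)


# Illustrative sample positions inside a unit pixel (assumed, not from the source).
samples = [(0.4, 0.4), (0.9, 0.1), (0.1, 0.9), (0.8, 0.8)]
center = (0.5, 0.5)

full = select_centroid(center, samples, [True, True, True, True])
partial = select_centroid(center, samples, [False, True, False, True])
```

Because the result is always either the pixel center of a fully covered pixel or a covered sample, it lies inside the triangle, which is the property the abstract relies on for barycentric interpolation.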
  • Publication number: 20200202594
    Abstract: A technique for performing rasterization and pixel shading with decoupled resolution is provided herein. The technique involves performing rasterization as normal to generate quads. The quads are accumulated into a tile buffer. A shading rate is determined for the contents of the tile buffer. If the shading rate is a sub-sampling shading rate, then the quads in the tile buffer are down-sampled, which reduces the amount of work to be performed by a pixel shader. The shaded down-sampled quads are then restored to the resolution of the render target. If the shading rate is a super-sampling shading rate, then the quads in the tile buffer are up-sampled. The results of the shaded down-sampled or up-sampled quads are written to the render target.
    Type: Application
    Filed: December 20, 2018
    Publication date: June 25, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Andrew S. Pomianowski
  • Publication number: 20200201763
    Abstract: Improvements to traditional schemes for storing data for processing tasks and for executing those processing tasks are disclosed. A set of data for which processing tasks are to be executed is processed through a hierarchy to distribute the data through various elements of a computer system. Levels of the hierarchy represent different types of memory or storage elements. Higher levels represent coarser portions of memory or storage elements and lower levels represent finer portions of memory or storage elements. Data proceeds through the hierarchy as “tasks” at different levels. Tasks at non-leaf nodes comprise tasks to subdivide data for storage in the finer granularity memories or storage units associated with a lower hierarchy level. Tasks at leaf nodes comprise processing work, such as a portion of a calculation. Two techniques for organizing the tasks in the hierarchy presented herein include a queue-based technique and a graph-based technique.
    Type: Application
    Filed: March 2, 2020
    Publication date: June 25, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Shuai Che
  • Patent number: 10692545
    Abstract: Systems, apparatuses, and methods for performing efficient data transfer in a computing system are disclosed. A termination voltage generator includes an inverter-based chopper circuit, which uses a first group of an even number of serially connected inverters coupled between the output node of the chopper circuit and the gate terminal of an output pmos transistor. Additionally, a second group of an even number of serially connected inverters is coupled between the output node and the gate terminal of an output nmos transistor. A replica inverter includes two serially connected pmos transistors and two serially connected nmos transistors. Each of one pmos transistor and one nmos transistor receives a generated voltage set as the expected value of the termination voltage. Each of the other pmos transistor and nmos transistor receives an output based on a comparison between the expected value and the output of the replica inverter.
    Type: Grant
    Filed: September 24, 2018
    Date of Patent: June 23, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Milam Paraschou, Balwinder Singh, Gerald R. Talbot, Alushulla Jack Ambundo, Edoardo Prete, Thomas H. Likens, III, Michael A. Margules
  • Patent number: 10691772
    Abstract: A method includes storing a sparse triangular matrix as a compressed sparse row (CSR) dataset. For each factor of a plurality of factors in a first vector, a value of the factor is calculated by identifying for the factor a set of one or more antecedent factors in the first vector, where the value of the factor is dependent on each of the one or more antecedent factors. In response to a completion array indicating that all of the one or more antecedent factor values are solved, the value of the factor is calculated based on one or more elements in a row of the matrix and a product value corresponding to the row. In the completion array, a first completion flag for the factor is asserted, indicating that the factor is solved.
    Type: Grant
    Filed: April 20, 2018
    Date of Patent: June 23, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Joseph Lee Greathouse
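The completion-array idea in the abstract above can be sketched sequentially. This is a hedged illustration only: the name `sptrsv_csr`, the lower-triangular assumption, the interpretation of the "product value" as the right-hand-side entry for the row, and the sweep loop that stands in for parallel threads waiting on flags are all assumptions, not details taken from the patent.

```python
def sptrsv_csr(row_ptr, col_idx, vals, b):
    """Solve L x = b for a lower-triangular L stored in CSR (hypothetical helper).

    A row is solved only once the completion array says that every
    antecedent factor (off-diagonal column in that row) is solved.
    """
    n = len(b)
    x = [0.0] * n
    done = [False] * n                       # completion array
    pending = list(range(n - 1, -1, -1))     # deliberately unfavorable order

    while pending:
        still_pending = []
        for i in pending:
            start, end = row_ptr[i], row_ptr[i + 1]
            deps = [col_idx[k] for k in range(start, end) if col_idx[k] != i]
            if not all(done[j] for j in deps):
                still_pending.append(i)      # antecedents not ready yet
                continue

            acc = b[i]                       # right-hand-side value for this row
            diag = None
            for k in range(start, end):
                j = col_idx[k]
                if j == i:
                    diag = vals[k]
                else:
                    acc -= vals[k] * x[j]
            if diag is None:
                raise ValueError(f"row {i} has no diagonal entry")

            x[i] = acc / diag
            done[i] = True                   # assert the completion flag

        if len(still_pending) == len(pending):
            raise ValueError("no progress: matrix is not triangular")
        pending = still_pending
    return x


# L = [[2, 0, 0],
#      [1, 4, 0],
#      [0, 3, 5]]
row_ptr = [0, 1, 3, 5]
col_idx = [0, 0, 1, 1, 2]
vals = [2.0, 1.0, 4.0, 3.0, 5.0]
x = sptrsv_csr(row_ptr, col_idx, vals, [2.0, 9.0, 21.0])
```

Rows are visited in reverse order on purpose, so the example shows the completion flags, rather than the iteration order, deciding when each factor can be computed.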
  • Patent number: 10692271
    Abstract: A technique for classifying a ray tracing intersection with a triangle edge or vertex avoids either rendering holes or multiple hits of the same ray for different triangles. The technique employs a tie-breaking scheme in which certain types of edges are classified as hits and certain types of edges are classified as misses. The test is performed in a coordinate space that comprises a projection into the viewspace of the ray, and thus where the ray direction has a non-zero magnitude in one axis (e.g., z) but a zero magnitude in the two other axes. In this coordinate space, edges are classified as one of top, bottom, left, and right, and an intersection on an edge counts as a hit if the intersection hits a top or left edge, but a miss if the intersection hits a bottom or right edge. Vertices are processed in a related manner.
    Type: Grant
    Filed: December 13, 2018
    Date of Patent: June 23, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Skyler Jonathon Saleh
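The tie-breaking scheme in the abstract above resembles the conventional top-left fill rule used in 2D rasterization. The sketch below applies that conventional rule to points already projected into a 2D space; the y-up axis convention, the counter-clockwise normalization, the edge classification tests, and all function names are assumptions for illustration, and the patent's exact definitions may differ.

```python
def edge_fn(a, b, p):
    """Signed area term: positive when p is to the left of the edge a->b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def is_top_or_left(a, b):
    """Classify edge a->b of a counter-clockwise triangle (y axis pointing up).

    Left edge: the edge runs downward, so the interior lies to its right in x.
    Top edge: the edge is horizontal and the interior lies below it.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    return dy < 0 or (dy == 0 and dx < 0)


def hits(tri, p):
    """Return True if 2D point p counts as a hit on triangle tri."""
    a, b, c = tri
    area = edge_fn(a, b, c)
    if area == 0:
        return False                 # degenerate triangle
    if area < 0:
        b, c = c, b                  # normalize to counter-clockwise winding

    for s, e in ((a, b), (b, c), (c, a)):
        w = edge_fn(s, e, p)
        if w < 0:
            return False             # strictly outside this edge
        if w == 0 and not is_top_or_left(s, e):
            return False             # on a bottom or right edge: a miss
    return True


# Two triangles that share the diagonal of the unit square.
lower_right = ((0, 0), (1, 0), (1, 1))
upper_left = ((0, 0), (1, 1), (0, 1))
on_shared_edge = (0.5, 0.5)
```

A point on the shared diagonal is a hit for exactly one of the two triangles, so there is neither a hole nor a double hit along that edge.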
  • Publication number: 20200193684
    Abstract: Described herein is a technique for performing ray-triangle intersection without a floating point division unit. A division unit would be useful for a straightforward implementation of a certain type of ray-triangle intersection test that is useful in ray tracing operations. This certain type of ray-triangle intersection test includes a step that transforms the coordinate system into the viewspace of the ray, thereby reducing the problem of intersection to one of 2D triangle rasterization. However, a straightforward implementation of this transformation requires floating point division, as the transformation utilizes a shear operation to set the coordinate system such that the magnitudes of the ray direction on two of the axes are zero. Instead of using the most straightforward implementation of this transform, the technique described herein scales the entire coordinate system by the magnitude of the ray direction in the axis that is the denominator of the shear ratio, removing division.
    Type: Application
    Filed: December 13, 2018
    Publication date: June 18, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Ruijin Wu
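The effect of scaling instead of dividing can be checked with a small sketch. It assumes the ray direction component `d[2]` is the nonzero axis used as the denominator of the shear ratio, and it only performs the 2D containment test (no distance or behind-the-origin check). The function names are hypothetical.

```python
def shear_xy_with_division(v, d):
    """Straightforward shear: needs the ratios d.x/d.z and d.y/d.z."""
    sx, sy = d[0] / d[2], d[1] / d[2]
    return (v[0] - sx * v[2], v[1] - sy * v[2])


def shear_xy_scaled(v, d):
    """Same shear multiplied through by d.z, so no division is needed."""
    return (v[0] * d[2] - d[0] * v[2], v[1] * d[2] - d[1] * v[2])


def edge_terms(a, b, c):
    """Unnormalized 2D barycentric terms relative to the origin (the ray)."""
    u = c[0] * b[1] - c[1] * b[0]
    v = a[0] * c[1] - a[1] * c[0]
    w = b[0] * a[1] - b[1] * a[0]
    return u, v, w


def ray_covers_triangle(tri, origin, d, shear):
    """2D containment test after translating to the ray origin and shearing."""
    pts = [shear(tuple(p[i] - origin[i] for i in range(3)), d) for p in tri]
    u, v, w = edge_terms(*pts)
    all_non_negative = u >= 0 and v >= 0 and w >= 0
    all_non_positive = u <= 0 and v <= 0 and w <= 0
    return all_non_negative or all_non_positive


tri = ((-1.0, -1.0, 5.0), (1.0, -1.0, 5.0), (0.0, 1.0, 5.0))
origin = (0.0, 0.0, 0.0)
directions = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (1.0, 1.0, 20.0)]
results = [(ray_covers_triangle(tri, origin, d, shear_xy_with_division),
            ray_covers_triangle(tri, origin, d, shear_xy_scaled))
           for d in directions]
```

Each sheared coordinate is multiplied by `d[2]`, so every product in `edge_terms` is multiplied by `d[2]` squared, which is positive. The signs that decide hit or miss are therefore unchanged.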
  • Publication number: 20200195273
    Abstract: Described are systems and methods for lossy compression and restoration of data. The raw data is first truncated. Then the truncated data is compressed. The compressed truncated data can then be efficiently stored and/or transmitted using fewer bits. To restore the data, the compressed data is then decompressed and restoration bits are concatenated. The restoration bits are selected to compensate for statistical biasing introduced by the truncation.
    Type: Application
    Filed: December 14, 2018
    Publication date: June 18, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Gabriel H. Loh
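A minimal sketch of the truncate, compress, decompress, and restore pipeline follows. The abstract does not say how the restoration bits are chosen; using the midpoint pattern (a 1 followed by zeros) is an assumption made here because plain truncation always rounds down. The use of `zlib`, the 8-bit value width, and the function names are likewise illustrative choices.

```python
import zlib


def compress_lossy(values, drop_bits):
    """Truncate the low bits of each 8-bit value, then compress losslessly."""
    truncated = bytes(v >> drop_bits for v in values)
    return zlib.compress(truncated)


def restore(blob, drop_bits):
    """Decompress and concatenate restoration bits in place of the dropped bits.

    Assumed choice: the midpoint of the dropped range, which offsets the
    downward bias that truncation introduces.
    """
    fill = (1 << (drop_bits - 1)) if drop_bits > 0 else 0
    return [(t << drop_bits) | fill for t in zlib.decompress(blob)]


values = list(range(256))
drop_bits = 2
restored = restore(compress_lossy(values, drop_bits), drop_bits)

# Compare against restoring with zeros to show the bias difference.
zero_filled = [(v >> drop_bits) << drop_bits for v in values]
bias_midpoint = sum(r - v for r, v in zip(restored, values)) / len(values)
bias_zero = sum(z - v for z, v in zip(zero_filled, values)) / len(values)
```

For this input the mean error with zero fill is -1.5, while the midpoint fill brings it to 0.5, at the cost of no extra stored bits.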
  • Publication number: 20200193682
    Abstract: Described herein is a merged data path unit that has elements that are configurable to switch between different instruction types. The merged data path unit is a pipelined unit that has multiple stages. Between different stages lie multiplexor layers that are configurable to route data from functional blocks of a prior stage to a subsequent stage. The manner in which the multiplexor layers are configured for a particular stage is based on the instruction type executed at that stage. In some implementations, the functional blocks in different stages are also configurable by the control unit to change the operations performed. Further, in some implementations, the control unit has sideband storage that stores data that “skips stages.” An example of a merged data path used for performing a ray-triangle intersection test and a ray-box intersection test is also described herein.
    Type: Application
    Filed: December 13, 2018
    Publication date: June 18, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Jian Mao
  • Publication number: 20200193673
    Abstract: A technique for executing pixel shader programs is provided. The pixel shader programs are executed in workgroups, which allows access by work-items to a local data store and also allows program synchronization at barrier points. Utilizing workgroups allows for more flexible and efficient execution than previous implementations in the pixel shader stage. Several techniques for assigning fragments to wavefronts and workgroups are also provided. The techniques differ in the degree of geometric locality of fragments within wavefronts and/or workgroups. In some techniques, a greater degree of locality is enforced, which reduces processing unit occupancy but also reduces program complexity. In other techniques, a lower degree of locality is enforced, which increases processing unit occupancy.
    Type: Application
    Filed: December 13, 2018
    Publication date: June 18, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Skyler Jonathon Saleh
  • Publication number: 20200193683
    Abstract: A technique for classifying a ray tracing intersection with a triangle edge or vertex avoids either rendering holes or multiple hits of the same ray for different triangles. The technique employs a tie-breaking scheme in which certain types of edges are classified as hits and certain types of edges are classified as misses. The test is performed in a coordinate space that comprises a projection into the viewspace of the ray, and thus where the ray direction has a non-zero magnitude in one axis (e.g., z) but a zero magnitude in the two other axes. In this coordinate space, edges are classified as one of top, bottom, left, and right, and an intersection on an edge counts as a hit if the intersection hits a top or left edge, but a miss if the intersection hits a bottom or right edge. Vertices are processed in a related manner.
    Type: Application
    Filed: December 13, 2018
    Publication date: June 18, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Skyler Jonathon Saleh
  • Publication number: 20200192852
    Abstract: An interconnect controller includes a data link layer controller coupled to a transaction layer, wherein the data link layer controller selectively receives data packets from and sends data packets to the transaction layer, and a physical layer controller coupled to the data link layer controller and to a communication link. The physical layer controller selectively operates at a first predetermined link speed. The physical layer controller has an enhanced speed mode, wherein in response to performing a link initialization, the interconnect controller queries a data processing platform to determine whether the enhanced speed mode is permitted, performs at least one setup operation to select an enhanced speed, wherein the enhanced speed is greater than the first predetermined link speed, and subsequently operates the communication link using the enhanced speed.
    Type: Application
    Filed: December 14, 2018
    Publication date: June 18, 2020
    Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Gordon Caruk, Gerald R. Talbot
  • Publication number: 20200192842
    Abstract: Bus protocol features are provided for chaining memory access requests on a high speed interconnect bus, allowing for reduced signaling overhead. Multiple memory request messages are received over a bus. A first message has a source identifier, a target identifier, a first address, and first payload data. The first payload data is stored in a memory at locations indicated by the first address. Within a selected second one of the request messages, a chaining indicator is received associated with the first request message and second payload data. The second request message does not include an address. Based on the chaining indicator, a second address for which memory access is requested is calculated based on the first address. The second payload data is stored in the memory at locations indicated by the second address.
    Type: Application
    Filed: December 14, 2018
    Publication date: June 18, 2020
    Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Philip Ng, Vydhyanathan Kalyanasundharam
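The chaining behavior can be sketched with a toy memory model. The abstract says only that the second address is calculated from the first; the rule used here (the chained write starts immediately after the previous payload) is an assumption, as are the `Request` class and `apply_requests` function. Source and target identifiers are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    """Hypothetical write request: chained requests carry no address."""
    payload: bytes
    address: Optional[int] = None
    chained: bool = False


def apply_requests(memory: bytearray, requests: List[Request]) -> bytearray:
    next_addr: Optional[int] = None
    for req in requests:
        if req.chained:
            if next_addr is None:
                raise ValueError("chained request has no preceding request")
            addr = next_addr          # derived from the earlier address
        else:
            if req.address is None:
                raise ValueError("unchained request needs an address")
            addr = req.address

        end = addr + len(req.payload)
        if end > len(memory):
            raise ValueError("write past end of memory")
        memory[addr:end] = req.payload
        next_addr = end               # assumed rule: contiguous follow-on write
    return memory


mem = apply_requests(
    bytearray(8),
    [Request(b"ab", address=2), Request(b"cd", chained=True)],
)
```

Only the first message carries an address, which is where the reduced signaling overhead described in the abstract would come from.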
  • Publication number: 20200192671
    Abstract: A processing device is provided which includes memory and at least one processor. The memory includes main memory and cache memory in communication with the main memory via a link. The at least one processor is configured to receive a request for a cache line and read the cache line from main memory. The at least one processor is also configured to compress the cache line according to a compression algorithm and, when the compressed cache line includes at least one byte predicted not to be accessed, drop the at least one byte from the compressed cache line based on whether the compression algorithm is determined to successfully compress the cache line according to a compression parameter.
    Type: Application
    Filed: December 14, 2018
    Publication date: June 18, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Shomit N. Das, Kishore Punniyamurthy, Matthew Tomei, Bradford M. Beckmann
  • Publication number: 20200193681
    Abstract: Described herein is a technique for performing ray tracing. According to this technique, instead of executing intersection and/or any hit shaders during traversal of an acceleration structure to determine the closest hit for a ray, an acceleration structure is fully traversed in an invocation of a shader program, and the closest intersection with a triangle is recorded in a data structure associated with the material of the triangle. Later, a scheduler launches waves by grouping together multiple data items associated with the same material. The rays processed by that wave are processed with a continuation ray, rather than the full original ray. A continuation ray starts from the previous point of intersection and extends in the direction of the original ray. These steps help counter divergence that would occur if a single shader program that inlined the intersection and any hit shaders were executed.
    Type: Application
    Filed: December 13, 2018
    Publication date: June 18, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventor: Skyler Jonathon Saleh
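The scheduling step, in which recorded closest hits are grouped by material before waves are launched, can be sketched as simple binning. The tuple layout, the `wave_size` parameter, and the function name are assumptions for illustration; continuation rays and the traversal itself are not modeled.

```python
from collections import defaultdict


def group_hits_into_waves(hits, wave_size):
    """Bin (ray_id, material) records by material, then cut bins into waves."""
    by_material = defaultdict(list)
    for ray_id, material in hits:
        by_material[material].append(ray_id)

    waves = []
    for material, ray_ids in by_material.items():
        for i in range(0, len(ray_ids), wave_size):
            waves.append((material, ray_ids[i:i + wave_size]))
    return waves


hits = [(0, "glass"), (1, "metal"), (2, "glass"), (3, "glass"), (4, "metal")]
waves = group_hits_into_waves(hits, wave_size=2)
```

Every wave produced this way contains rays that hit the same material, so all work-items in the wave would run the same material shader, which is the divergence benefit the abstract describes.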
  • Publication number: 20200192853
    Abstract: A link controller includes a Peripheral Component Interconnect Express (PCIe) physical layer circuit for coupling to a communication link and providing a data path over the communication link, a first data link layer controller which operates according to a PCIe protocol, and a second data link layer controller which operates according to a Gen-Z protocol. A multiplexer-demultiplexer selectively connects both data link layer controllers to the PCIe physical layer circuit. A protocol translation circuit is coupled between the multiplexer-demultiplexer and the second data link layer controller, the protocol translation circuit receiving traffic data from the second data link layer controller in a Gen-Z format, encapsulating the Gen-Z format in a PCIe format, and passing traffic data to the multiplexer-demultiplexer circuit.
    Type: Application
    Filed: May 30, 2019
    Publication date: June 18, 2020
    Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Gordon Caruk, Maurice B. Steinman, Gerald R. Talbot, Joseph D. Macri
  • Publication number: 20200193685
    Abstract: Described herein is a technique for performing a ray-triangle intersection test in a manner that produces watertight results. The technique involves translating the coordinates of the triangle such that the origin is at the origin of the ray. The technique involves projecting the coordinate system into the viewspace of the ray. The technique then involves calculating barycentric coordinates and interpolating the barycentric coordinates to get a time of intersection. The signs of the barycentric coordinates indicate whether a hit occurs. The above calculations are performed with a non-directed floating point rounding mode to provide watertightness. A non-directed rounding mode is one in which the mantissa of a rounded number is rounded in a manner that is not dependent on the sign of the number.
    Type: Application
    Filed: December 13, 2018
    Publication date: June 18, 2020
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Skyler Jonathon Saleh, Ruijin Wu
  • Publication number: 20200192850
    Abstract: A link controller, method, and data processing platform are provided with dual-protocol capability. The link controller includes a physical layer circuit for providing a data lane over a communication link, a first data link layer controller which operates according to a first protocol, and a second data link layer controller which operates according to a second protocol. A multiplexer/demultiplexer selectively connects both data link layer controllers to the physical layer circuit. A link training and status state machine (LTSSM) selectively controls the physical layer circuit to transmit and receive first training ordered sets over the data lane, and inside the training ordered sets, transmit and receive alternative protocol negotiation information over the data lane. In response to receiving the alternative protocol negotiation information, the LTSSM causes the multiplexer/demultiplexer to selectively connect the physical layer circuit to the second data link layer controller.
    Type: Application
    Filed: December 18, 2018
    Publication date: June 18, 2020
    Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.
    Inventors: Gordon Caruk, Gerald R. Talbot
  • Patent number: 10684957
    Abstract: An apparatus and method performs neighborhood-aware virtual to physical address translations. A coalescing opportunity for a first virtual address is determined, based on completing a memory access corresponding to a page walk for a second virtual address. Metadata corresponding to the first virtual address is provided to a page table walk buffer based on the coalescing opportunity and a page walk for the first virtual address is performed based on the metadata corresponding to the first virtual address.
    Type: Grant
    Filed: August 23, 2018
    Date of Patent: June 16, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Michael W. Lebeane, Seunghee Shin
  • Patent number: 10684969
    Abstract: In one form, a memory controller includes a command queue and an arbiter. The command queue receives and stores memory access requests. The arbiter includes a plurality of sub-arbiters for providing a corresponding plurality of sub-arbitration winners from among the memory access requests during a controller cycle, and for selecting among the plurality of sub-arbitration winners to provide a plurality of memory commands in a corresponding controller cycle. In another form, a data processing system includes a memory accessing agent for providing memory access requests, a memory system, and the memory controller coupled to the memory accessing agent and the memory system.
    Type: Grant
    Filed: July 15, 2016
    Date of Patent: June 16, 2020
    Assignee: Advanced Micro Devices, Inc.
    Inventors: James R. Magro, Kedarnath Balakrishnan, Jackson Peng, Hideki Kanayama