Patents Assigned to Advanced Micro Devices
-
Publication number: 20200193681Abstract: Described herein is a technique for performing ray tracing. According to this technique, instead of executing intersection and/or any hit shaders during traversal of an acceleration structure to determine the closest hit for a ray, an acceleration structure is fully traversed in an invocation of a shader program, and the closest intersection with a triangle is recorded in a data structure associated with the material of the triangle. Later, a scheduler launches waves by grouping together multiple data items associated with the same material. The rays processed by that wave are processed with a continuation ray, rather than the full original ray. A continuation ray starts from the previous point of intersection and extends in the direction of the original ray. These steps help counter divergence that would occur if a single shader program that inlined the intersection and any hit shaders were executed.Type: ApplicationFiled: December 13, 2018Publication date: June 18, 2020Applicant: Advanced Micro Devices, Inc.Inventor: Skyler Jonathon Saleh
-
Publication number: 20200193683Abstract: A technique for classifying a ray tracing intersection with a triangle edge or vertex avoids either rendering holes or multiple hits of the same ray for different triangles. The technique employs a tie-breaking scheme in which certain types of edges are classified as hits and certain types of edges are classified as misses. The test is performed in a coordinate space that comprises a projection into the viewspace of the ray, and thus where the ray direction has a non-zero magnitude in one axis (e.g., z) but a zero magnitude in the two other axes. In this coordinate space, edges are classified as one of top, bottom, left, and right, and an intersection on an edge counts as a hit if the intersection hits a top or left edge, but a miss if the intersection hits a bottom or right edge. Vertices are processed in a related manner.Type: ApplicationFiled: December 13, 2018Publication date: June 18, 2020Applicant: Advanced Micro Devices, Inc.Inventor: Skyler Jonathon Saleh
-
Publication number: 20200195273Abstract: Described are systems and methods for lossy compression and restoration of data. The raw data is first truncated. Then the truncated data is compressed. The compressed truncated data can then be efficiently stored and/or transmitted using fewer bits. To restore the data, the compressed data is then decompressed and restoration bits are concatenated. The restoration bits are selected to compensate for statistical biasing introduced by the truncation.Type: ApplicationFiled: December 14, 2018Publication date: June 18, 2020Applicant: Advanced Micro Devices, Inc.Inventor: Gabriel H. Loh
-
Publication number: 20200192853Abstract: A link controller includes a Peripheral Component Interconnect Express (PCIe) physical layer circuit for coupling to a communication link and providing a data path over the communication link, a first data link layer controller which operates according to a PCIe protocol, and a second data link layer controller which operates according to a Gen-Z protocol. A multiplexer-demultiplexer selectively connects both data link layer controllers to the PCIe physical layer circuit. A protocol translation circuit is coupled between the multiplexer-demultiplexer and the second data link layer controller, the protocol translation circuit receiving traffic data from the second data link layer controller in a Gen-Z format, encapsulating the Gen-Z format in a PCIe format, and passing traffic data to the multiplexer-demultiplexer circuit.Type: ApplicationFiled: May 30, 2019Publication date: June 18, 2020Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.Inventors: Gordon Caruk, Maurice B. Steinman, Gerald R. Talbot, Joseph D. Macri
-
Publication number: 20200193682Abstract: Described herein is a merged data path unit that has elements that are configurable to switch between different instruction types. The merged data path unit is a pipelined unit that has multiple stages. Between different stages lie multiplexor layers that are configurable to route data from functional blocks of a prior stage to a subsequent stage. The manner in which the multiplexor layers are configured for a particular stage is based on the instruction type executed at that stage. In some implementations, the functional blocks in different stages are also configurable by the control unit to change the operations performed. Further, in some implementations, the control unit has sideband storage that stores data that “skips stages.” An example of a merged data path used for performing a ray-triangle intersection test and a ray-box intersection test is also described herein.Type: ApplicationFiled: December 13, 2018Publication date: June 18, 2020Applicant: Advanced Micro Devices, Inc.Inventors: Skyler Jonathon Saleh, Jian Mao
-
Publication number: 20200193684Abstract: Described herein is a technique for performing ray-triangle intersection without a floating point division unit. A division unit would be useful for a straightforward implementation of a certain type of ray-triangle intersection test that is useful in ray tracing operations. This certain type of ray-triangle intersection test includes a step that transforms the coordinate system into the viewspace of the ray, thereby reducing the problem of intersection to one of 2D triangle rasterization. However, a straightforward implementation of this transformation requires floating point division, as the transformation utilizes a shear operation to set the coordinate system such that the magnitudes of the ray direction on two of the axes are zero. Instead of using the most straightforward implementation of this transform, the technique described herein scales the entire coordinate system by the magnitude of the ray direction in the axis that is the denominator of the shear ratio, removing division.Type: ApplicationFiled: December 13, 2018Publication date: June 18, 2020Applicant: Advanced Micro Devices, Inc.Inventors: Skyler Jonathon Saleh, Ruijin Wu
-
Publication number: 20200192671Abstract: A processing device is provided which includes memory and at least one processor. The memory includes main memory and cache memory in communication with the main memory via a link. The at least one processor is configured to receive a request for a cache line and read the cache line from main memory. The at least one processor is also configured to compress the cache line according to a compression algorithm and, when the compressed cache line includes at least one byte predicted not to be accessed, drop the at least one byte from the compressed cache line based on whether the compression algorithm is determined to successfully compress the cache line according to a compression parameter.Type: ApplicationFiled: December 14, 2018Publication date: June 18, 2020Applicant: Advanced Micro Devices, Inc.Inventors: Shomit N. Das, Kishore Punniyamurthy, Matthew Tomei, Bradford M. Beckmann
-
Publication number: 20200192852Abstract: An interconnect controller includes a data link layer controller coupled to a transaction layer, wherein the data link layer controller selectively receives data packets from and sends data packets to the transaction layer, and a physical layer controller coupled to the data link layer controller and to a communication link. The physical layer controller selectively operates at a first predetermined link speed. The physical layer controller has an enhanced speed mode, wherein in response to performing a link initialization, the interconnect controller queries a data processing platform to determine whether the enhanced speed mode is permitted, performs at least one setup operation to select an enhanced speed, wherein the enhanced speed is greater than the first predetermined link speed, and subsequently operates the communication link using the enhanced speed.Type: ApplicationFiled: December 14, 2018Publication date: June 18, 2020Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.Inventors: Gordon Caruk, Gerald R. Talbot
-
Publication number: 20200193673Abstract: A technique for executing pixel shader programs is provided. The pixel shader programs are executed in workgroups, which allows access by work-items to a local data store and also allows program synchronization at barrier points. Utilizing workgroups allows for more flexible and efficient execution than previous implementations in the pixel shader stage. Several techniques for assigning fragments to wavefronts and workgroups are also provided. The techniques differ in the degree of geometric locality of fragments within wavefronts and/or workgroups. In some techniques, a greater degree of locality is enforced, which reduces processing unit occupancy but also reduces program complexity. In other techniques, a lower degree of locality is enforced, which increases processing unit occupancy.Type: ApplicationFiled: December 13, 2018Publication date: June 18, 2020Applicant: Advanced Micro Devices, Inc.Inventor: Skyler Jonathon Saleh
-
Publication number: 20200192842Abstract: Bus protocol features are provided for chaining memory access requests on a high speed interconnect bus, allowing for reduced signaling overhead. Multiple memory request messages are received over a bus. A first message has a source identifier, a target identifier, a first address, and first payload data. The first payload data is stored in a memory at locations indicated by the first address. Within a selected second one of the request messages, a chaining indicator is received associated with the first request message and second payload data. The second request message does not include an address. Based on the chaining indicator, a second address for which memory access is requested is calculated based on the first address. The second payload data is stored in the memory at locations indicated by the second address.Type: ApplicationFiled: December 14, 2018Publication date: June 18, 2020Applicants: ATI Technologies ULC, Advanced Micro Devices, Inc.Inventors: Philip Ng, Vydhyanathan Kalyanasundharam
-
Patent number: 10684965Abstract: Systems, apparatuses, and methods for routing traffic between clients and system memory are disclosed. A computing system includes system memory and one or more clients, each capable of generating memory access requests. The computing system also includes a communication fabric for transferring traffic between the clients and the system memory. The fabric includes master units for interfacing with clients and grouping write requests with a same target together. The fabric also includes slave units for interfacing with memory controllers and for sending a single write response when each write request in a group has been serviced. When the master unit receives the single write response for the group, it sends a respective acknowledgment response for each of the multiple write requests in the group to clients that generated the multiple write requests.Type: GrantFiled: November 8, 2017Date of Patent: June 16, 2020Assignee: Advanced Micro Devices, Inc.Inventors: Vydhyanathan Kalyanasundharam, Amit P. Apte, Chen-Ping Yang
-
Patent number: 10684902Abstract: Described herein are a method and apparatus for memory vulnerability prediction. A memory vulnerability predictor predicts the reliability of a memory region when it is first accessed, based on past program history. The memory vulnerability predictor uses a table to store reliability predictions and predicts reliability needs of a new memory region. A memory management module uses the reliability information to make decisions, (such as to guide memory placement policies in a heterogeneous memory system).Type: GrantFiled: July 28, 2017Date of Patent: June 16, 2020Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Vilas Sridharan, David A. Roberts
-
Patent number: 10684957Abstract: An apparatus and method performs neighborhood-aware virtual to physical address translations. A coalescing opportunity for a first virtual address is determined, based on completing a memory access corresponding to a page walk for a second virtual address. Metadata corresponding to the first virtual address is provided to a page table walk buffer based on the coalescing opportunity and a page walk for the first virtual address is performed based on the metadata corresponding to the first virtual address.Type: GrantFiled: August 23, 2018Date of Patent: June 16, 2020Assignee: Advanced Micro Devices, Inc.Inventors: Michael W. Lebeane, Seunghee Shin
-
Patent number: 10684969Abstract: In one form, a memory controller includes a command queue and an arbiter. The command queue receives and stores memory access requests. The arbiter includes a plurality of sub-arbiters for providing a corresponding plurality of sub-arbitration winners from among the memory access requests during a controller cycle, and for selecting among the plurality of sub-arbitration winners to provide a plurality of memory commands in a corresponding controller cycle. In another form, a data processing system includes a memory accessing agent for providing memory accesses requests, a memory system, and the memory controller coupled to the memory accessing agent and the memory system.Type: GrantFiled: July 15, 2016Date of Patent: June 16, 2020Assignee: Advanced Micro Devices, Inc.Inventors: James R. Magro, Kedarnath Balakrishnan, Jackson Peng, Hideki Kanayama
-
Publication number: 20200184002Abstract: A processing device is provided which includes memory configured to store data and a processor configured to determine, based on convolutional parameters associated with an image, a virtual general matrix-matrix multiplication (GEMM) space of a virtual GEMM space output matrix and generate, in the virtual GEMM space output matrix, a convolution result by matrix multiplying the data corresponding to a virtual GEMM space input matrix with the data corresponding to a virtual GEMM space filter matrix. The processing device also includes convolutional mapping hardware configured to map, based on the convolutional parameters, positions of the virtual GEMM space input matrix to positions of an image space of the image.Type: ApplicationFiled: August 30, 2019Publication date: June 11, 2020Applicant: Advanced Micro Devices, Inc.Inventors: Swapnil P. Sakharshete, Samuel Lawrence Wasmundt, Maxim V. Kazakov, Vineet Goel
-
Patent number: 10679316Abstract: Systems, apparatuses, and methods for implementing a single pass stipple pattern generation process are disclosed. A processor initiates parallel execution of a first and second plurality of wavefronts. A first wavefront of the first plurality of wavefronts converts a first local coordinate into a first global coordinate, wherein the first local coordinate corresponds to a first portion of a primitive. Also, a first wavefront of the second plurality of wavefronts applies a first attribute to the first global coordinate prior to a second wavefront, of the first plurality of wavefronts, converting a second local coordinate of a second portion of the primitive into a second global coordinate. The second plurality of wavefronts generate image data based on applying the first attribute to global coordinates generated by the first plurality of wavefronts, and the image data is conveyed for display on a display device.Type: GrantFiled: June 13, 2018Date of Patent: June 9, 2020Assignee: Advanced Micro Devices, Inc.Inventor: Sean M. O'Connell
-
Patent number: 10681125Abstract: A method of message-based communication is provided which includes executing, on one or more accelerated processing units, a plurality of groups of work items, receiving a first message from a first group of work items of the plurality of groups of work items executing on the one or more accelerated processing units and storing the first message at a first segment of memory allocated to a second group of work items of the plurality of groups of work items executing on the accelerated processing unit.Type: GrantFiled: March 29, 2016Date of Patent: June 9, 2020Assignee: ADVANCED MICRO DEVICES, INC.Inventor: Shuai Che
-
Patent number: 10680927Abstract: Systems, apparatuses, and methods for dynamically adjusting data bandwidth to fit a predicted available bandwidth of a link are disclosed. A system includes a transmitter and a receiver communicating wirelessly over a wireless link. The system monitors a status of the wireless link at a plurality of points in time. The system trains a predictive model using a plurality of indicators of the status of the link at the plurality of points in time. The model generates a prediction of the available bandwidth of the link based on the plurality of indicators. Then, the transmitter dynamically adjusts an amount of data sent on the link to fit the predicted available bandwidth of the link.Type: GrantFiled: August 25, 2017Date of Patent: June 9, 2020Assignee: Advanced Micro Devices, Inc.Inventor: Ngoc Vinh Vu
-
Patent number: 10678702Abstract: The described embodiments include an input-output memory management unit (IOMMU) with two or more memory elements and a controller. The controller is configured to select, based on one or more factors, one or more selected memory elements from among the two or more memory elements for performing virtual address to physical address translations in the IOMMU. The controller then performs the virtual address to physical address translations using the one or more selected memory elements.Type: GrantFiled: May 27, 2016Date of Patent: June 9, 2020Assignee: ADVANCED MICRO DEVICES, INC.Inventors: Sergey Blagodurov, Andrew G. Kegel
-
Publication number: 20200175329Abstract: A generator for generating artificial data, and training for the same. Data corresponding to a first label is altered within a reference labeled data set. A discriminator is trained based on the reference labeled data set to create a selectively poisoned discriminator. A generator is trained based on the selectively poisoned discriminator to create a selectively poisoned generator. The selectively poisoned generator is tested for the first label and tested for the second label to determine whether the generator is sufficiently poisoned for the first label and sufficiently accurate for the second label. If it is not, the generator is retrained based on the data set including the further altered data. The generator includes a first ANN to input first information and output a set of artificial data that is classifiable using a first label and not classifiable using a second label of the set of labeled data.Type: ApplicationFiled: December 3, 2018Publication date: June 4, 2020Applicant: Advanced Micro Devices, Inc.Inventor: Nicholas Malaya