Patents Assigned to ATI Technologies ULC

Integrated video codec and inference engine

Patent number: 10582250

Abstract: Systems, apparatuses, and methods for integrating a video codec with an inference engine are disclosed. A system is configured to implement an inference engine and a video codec while sharing at least a portion of its processing elements between the inference engine and the video codec. By sharing processing elements when combining the inference engine and the video codec, the silicon area of the combination is reduced. In one embodiment, the shared portion of the processing elements includes a motion prediction/motion estimation/MACs engine with a plurality of multiplier-accumulator (MAC) units, an internal memory, and peripherals. The peripherals include a memory interface, a direct memory access (DMA) engine, and a microprocessor. The system is configured to perform a context switch to reprogram the processing elements to switch between operating modes. The context switch can occur at a frame boundary or at a sub-frame boundary.

Type: Grant

Filed: July 24, 2017

Date of Patent: March 3, 2020

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Lei Zhang, Sateesh Lagudu, Allen Rush, Razvan Dan-Dobre
Deskewing method for a physical layer interface on a multi-chip module

Patent number: 10581587

Abstract: Systems, apparatuses, and methods for implementing a deskewing method for a physical layer interface on a multi-chip module are disclosed. A circuit connected to a plurality of communication lanes trains each lane to synchronize a local clock of the lane with a corresponding global clock at a beginning of a timing window. Next, the circuit symbol rotates each lane by a single step responsive to determining that all of the plurality of lanes have an incorrect symbol alignment. Responsive to determining that some but not all of the plurality of lanes have a correct symbol alignment, the circuit symbol rotates lanes which have an incorrect symbol alignment by a single step. When the end of the timing window has been reached, the circuit symbol rotates lanes which have a correct symbol alignment and adjusts a phase of a corresponding global clock to compensate for missed symbol rotations.

Type: Grant

Filed: April 29, 2019

Date of Patent: March 3, 2020

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Varun Gupta, Milam Paraschou, Gerald R. Talbot, Gurunath Dollin, Damon Tohidi, Eric Ian Carpenter, Chad S. Gallun, Jeffrey Cooper, Hanwoo Cho, Thomas H. Likens, III, Scott F. Dow, Michael J. Tresidder
Live update of a kernel device module

Patent number: 10572246

Abstract: Systems, apparatuses, and methods for implementing live device driver updates are disclosed. When a processor loads a given version of a device driver, the given version registers with a proxy module rather than registering with the operating system. If a previous version of the device driver is already running, the proxy module provides the given version with a pointer to the previous version. The given version uses the pointer to retrieve static data from the previous version. After the previous version is quiesced, the given version retrieves transient data from the previous version and then takes over as the running version of the device driver. Subsequent versions of the device driver are able to replace previous versions in a similar manner. Also, previous versions of the device driver are able to replace subsequent versions in a similar manner in the case of downgrading.

Type: Grant

Filed: August 30, 2018

Date of Patent: February 25, 2020

Assignee: ATI Technologies ULC

Inventor: Kelly Donald Clark Zytaruk
VARYING FIRMWARE FOR VIRTUALIZED DEVICE

Publication number: 20200034183

Abstract: A technique for varying firmware for different virtual functions in a virtualized device is provided. The virtualized device includes a hardware accelerator and a microcontroller that executes firmware. The virtualized device is virtualized in that the virtualized device performs work for different virtual functions (with different virtual functions associated with different virtual machines), each function getting a “time-slice” during which work is performed for that function. To vary the firmware, each time the virtualized device switches from performing work for a current virtual function to work for a subsequent virtual function, one or more microcontrollers of the virtualized device examines memory storing addresses for firmware for the subsequent virtual function and begins executing the firmware for that subsequent virtual function. The addresses for the firmware are provided by a corresponding virtual machine at configuration time.

Type: Application

Filed: October 2, 2019

Publication date: January 30, 2020

Applicant: ATI Technologies ULC

Inventors: Yinan JIANG, Ahmed M. ABDELKHALEK, Guopei QIAO, Andy SUNG, Haibo LIU, Dezhi MING, Zhidong XU
Single pass flexible screen/scale rasterization

Patent number: 10546365

Abstract: An apparatus, such as a head mounted device (HMD), includes one or more processors configured to implement a graphics pipeline that renders pixels in window space with a nonuniform pixel spacing. The apparatus also includes a first distortion function that maps the non-uniformly spaced pixels in window space to uniformly spaced pixels in raster space. The apparatus further includes a scan converter configured to sample the pixels in window space through the first distortion function. The scan converter is configured to render display pixels used to generate an image for display to a user based on the uniformly spaced pixels in raster space. In some cases, the pixels in the window space are rendered such that a pixel density per subtended area is constant across the user's field of view.

Type: Grant

Filed: December 15, 2017

Date of Patent: January 28, 2020

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Michael Mantor, Laurent Lefebvre, Mika Tuomi, Kiia Kallio
Multiple linked list data structure

Patent number: 10545887

Abstract: A system and method for maintaining information of pending operations are described. A buffer uses multiple linked lists implementing a single logical queue for a single requestor. The buffer maintains multiple head pointers and multiple tail pointers for the single requestor. Data entries of the single logical queue are stored in an alternating pattern among the multiple linked lists. During the allocation of buffer entries, the tail pointers are selected in the same alternating manner, and during the deallocation of buffer entries, the multiple head pointers are selected in the same manner.
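
The alternating allocation and deallocation described in this abstract can be illustrated with a minimal software sketch. This is an illustration only, not the patented hardware design: the class name `MultiListQueue`, the `Node` type, and the default of two lists are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Node:
    value: Any
    next: Optional["Node"] = None


class MultiListQueue:
    """One logical FIFO spread over several linked lists (hypothetical sketch)."""

    def __init__(self, num_lists: int = 2) -> None:
        # One head pointer and one tail pointer per underlying linked list.
        self.heads: List[Optional[Node]] = [None] * num_lists
        self.tails: List[Optional[Node]] = [None] * num_lists
        self.alloc_idx = 0    # list whose tail the next allocation uses
        self.dealloc_idx = 0  # list whose head the next deallocation uses
        self.size = 0

    def push(self, value: Any) -> None:
        """Allocate an entry, appending at the currently selected tail."""
        node = Node(value)
        i = self.alloc_idx
        if self.tails[i] is None:
            self.heads[i] = node
        else:
            self.tails[i].next = node
        self.tails[i] = node
        # Rotate so consecutive entries land in alternating lists.
        self.alloc_idx = (i + 1) % len(self.tails)
        self.size += 1

    def pop(self) -> Any:
        """Deallocate the oldest entry, taken from the currently selected head."""
        if self.size == 0:
            raise IndexError("pop from empty queue")
        i = self.dealloc_idx
        node = self.heads[i]
        self.heads[i] = node.next
        if self.heads[i] is None:
            self.tails[i] = None
        # Rotate in the same order as allocation to preserve FIFO order.
        self.dealloc_idx = (i + 1) % len(self.heads)
        self.size -= 1
        return node.value
```

Because heads are visited in the same rotation as tails, entries leave the logical queue in the order they entered, even though neighbouring entries live in different lists.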

Type: Grant

Filed: February 24, 2017

Date of Patent: January 28, 2020

Assignee: ATI Technologies ULC

Inventors: Jimshed Mirza, Qian Ma
Direct doorbell ring in virtualized processing device

Patent number: 10545800

Abstract: A technique for facilitating direct doorbell rings in a virtualized system is provided. A first device is configured to “ring” a “doorbell” of a second device, where both the first and second devices are not a host processor such as a central processing unit and are coupled to an interconnect fabric such as peripheral component interconnect express (“PCIe”). The first device is configured to ring the doorbell of the second device by writing to a doorbell address in a guest physical address space. For security reasons, a check block checks an offset portion of the doorbell address against a set of allowed doorbell addresses for doorbells specified in the guest physical address space, allowing the doorbell to be written if the doorbell is included in the set of allowed doorbell addresses.
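
The check described in this abstract might look roughly like the following sketch. The 12-bit offset width, the function name `doorbell_write_allowed`, and the representation of the allowed set as a `frozenset` of offsets are all assumptions for illustration; the abstract does not specify them.

```python
# Assumption: the "offset portion" is the low 12 bits of the guest physical address.
OFFSET_BITS = 12
OFFSET_MASK = (1 << OFFSET_BITS) - 1


def doorbell_write_allowed(doorbell_addr: int, allowed_offsets: frozenset) -> bool:
    """Return True if the write may proceed (hypothetical check block).

    Only the offset portion of the address is compared against the set of
    allowed doorbell offsets; any other offset is rejected.
    """
    offset = doorbell_addr & OFFSET_MASK
    return offset in allowed_offsets
```

For example, with `allowed_offsets = frozenset({0x010, 0x020})`, a write to address `0xABC010` would be permitted and a write to `0xABC030` would be rejected.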

Type: Grant

Filed: May 31, 2017

Date of Patent: January 28, 2020

Assignee: ATI Technologies ULC

Inventors: Anthony Asaro, Gongxian Jeffrey Cheng
Hardware transmit equalization for high speed

Patent number: 10541841

Abstract: Systems, apparatuses, and methods for performing transmit equalization at a target high speed are disclosed. A computing system includes at least a transmitter, a receiver, and a communication channel connecting the transmitter and the receiver. The communication channel includes a plurality of lanes which are subdivided into a first subset of lanes and a second subset of lanes. During equalization training, the first subset of lanes operates at a first speed while the second subset of lanes operates at a second speed. The first speed is the desired target speed for operating the communication link while the second speed is a relatively low speed capable of reliably carrying data over a given lane prior to equalization training. The first subset of lanes is trained at the first speed while feedback is conveyed from the receiver to the transmitter using the second subset of lanes operating at the second speed.

Type: Grant

Filed: September 13, 2018

Date of Patent: January 21, 2020

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Shiqi Sun, Michael J. Tresidder, Yanfeng Wang
System for video compression

Patent number: 10542268

Abstract: A system and method for providing video compression that includes encoding a YUV stream using an encoding engine, wherein Y, U and V color values are encoded in parallel, and patching together the Y, U and V color streams to form a compressed YUV output stream. The encoding engine further includes encoding each color value of the YUV stream in parallel using parallel encoding engines and a control engine for controlling operation of all of the encoding engines in parallel. The YUV stream has an average bits per pixel value that varies from a first value to a second value that is double the first value. The encoding engine includes encoding the YUV stream in generally the same amount of time regardless of the average bits per pixel value.

Type: Grant

Filed: April 19, 2017

Date of Patent: January 21, 2020

Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC

Inventors: Haibin Li, Zhen Chen, Lei Zhang, Ji Zhou, Zhong Cai
Method and apparatus for translation lookaside buffer with multiple compressed encodings

Patent number: 10540290

Abstract: Methods and apparatus obtain one or more system page table entries that represent virtual system (e.g., memory) page to physical system page translations. A number of the obtained system page table entries that can be encoded in each of a plurality of translation lookaside buffer (TLB) entry encoding formats are determined. The method and apparatus may select one of the TLB entry encoding formats that encode a number of the obtained system page table entries. The method and apparatus may encode a number of obtained system page table entries in the TLB entry encoding format selected into a compressed encoding format TLB entry. The method and apparatus may associate the compressed encoding format TLB entry with an encoding format indication of the encoding format selected. The method and apparatus may decode a compressed encoding format TLB entry based on a determined TLB entry encoding format.

Type: Grant

Filed: April 27, 2016

Date of Patent: January 21, 2020

Assignees: ATI Technologies ULC, Advanced Micro Devices, Inc.

Inventors: Gabriel H Loh, Jimshed Mirza
High-speed selective cache invalidates and write-backs on GPUS

Patent number: 10540280

Abstract: Techniques for performing cache invalidates and write-backs in an accelerated processing device (e.g., a graphics processing device that renders three-dimensional graphics) are disclosed. The techniques involve receiving requests from a “master” (e.g., the central processing unit). The techniques involve invalidating virtual-to-physical address translations in an address translation request. The techniques include splitting up the requests based on whether the requests target virtually or physically tagged caches. Addresses for the portions of a request that target physically tagged caches are translated using invalidated virtual-to-physical address translations for speed. The split up request is processed to generate micro-transactions for individual caches targeted by the request. Micro-transactions for physically and virtually tagged caches are processed in parallel. Once all micro-transactions for a request have been processed, the unit that made the request is notified.

Type: Grant

Filed: December 23, 2016

Date of Patent: January 21, 2020

Assignees: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC

Inventors: Mark Fowler, Jimshed Mirza, Anthony Asaro
METHOD AND SYSTEM FOR PARTIAL WAVEFRONT MERGER

Publication number: 20200019530

Abstract: A method and system for partial wavefront merger is described. Vector processing machines employ the partial wavefront merger to merge partial wavefronts into one or more wavefronts. The system includes a partial wavefront manager and unified registers. The partial wavefront manager detects wavefronts in different single-instruction-multiple-data (“SIMD”) units which contain inactive work items and active work items (hereinafter referred to as “partial wavefronts”), moves the partial wavefronts into one or more SIMD unit(s) and merges the partial wavefronts into one or more wavefront(s). The unified registers allow each active work item in the one or more merged wavefront(s) to access the previously allocated registers in the originating SIMD units. Consequently, the contents of the unified registers do not have to be copied to the SIMD unit(s) executing the one or more merged wavefront(s).
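
As a rough software analogy for the merge step, the sketch below packs the active work items of several partial wavefronts into full wavefronts while remembering where each item came from. The wavefront size of 4, the function name `merge_partial_wavefronts`, and the `(simd_id, lane)` tuples standing in for references to previously allocated registers are assumptions for illustration, not details from the application.

```python
from typing import Dict, List, Tuple

# Assumption: a tiny wavefront size, chosen only to keep the example readable.
WAVEFRONT_SIZE = 4


def merge_partial_wavefronts(
    partials: Dict[int, List[bool]],
) -> List[List[Tuple[int, int]]]:
    """Pack active work items from partial wavefronts into merged wavefronts.

    `partials` maps a SIMD unit id to that unit's active mask. Each merged
    slot keeps (simd_id, lane) so the work item can still be associated with
    the registers allocated in its originating SIMD unit, rather than
    copying register contents.
    """
    active = [
        (simd_id, lane)
        for simd_id, mask in partials.items()
        for lane, is_active in enumerate(mask)
        if is_active
    ]
    # Chunk the active items into wavefronts of at most WAVEFRONT_SIZE.
    return [
        active[i:i + WAVEFRONT_SIZE]
        for i in range(0, len(active), WAVEFRONT_SIZE)
    ]
```

Two half-full wavefronts on two SIMD units would merge into one full wavefront under this sketch, freeing one unit's execution slot.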

Type: Application

Filed: July 23, 2018

Publication date: January 16, 2020

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Yunpeng Zhu, Jimshed Mirza
Shader writes to compressed resources

Patent number: 10535178

Abstract: Systems, apparatuses, and methods for performing shader writes to compressed surfaces are disclosed. In one embodiment, a processor includes at least a memory and one or more shader units. In one embodiment, a shader unit of the processor is configured to receive a write request targeted to a compressed surface. The shader unit is configured to identify a first block of the compressed surface targeted by the write request. Responsive to determining the data of the write request targets less than the entirety of the first block, the shader unit reads the first block from the cache and decompresses the first block. Next, the shader unit merges the data of the write request with the decompressed first block. Then, the shader unit compresses the merged data and writes the merged data to the cache.
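
The read, decompress, merge, recompress sequence for a partial write can be sketched in software as follows. This is only an analogy: `zlib` stands in for whatever hardware compression scheme is actually used, the 256-byte block size is arbitrary, the dictionary `cache` and the function name are hypothetical, and the full-block shortcut is an assumption that the abstract does not describe.

```python
import zlib

# Assumption: fixed-size blocks of 256 uncompressed bytes.
BLOCK_SIZE = 256


def write_to_compressed_block(cache: dict, block_id: int, offset: int, data: bytes) -> None:
    """Apply a write to one compressed block held in `cache` (hypothetical sketch)."""
    if offset < 0 or offset + len(data) > BLOCK_SIZE:
        raise ValueError("write does not fit inside the block")

    if len(data) == BLOCK_SIZE:
        # Assumption: a write covering the whole block needs no read of old data.
        merged = data
    else:
        # Partial write: read and decompress the existing block first...
        block = bytearray(zlib.decompress(cache[block_id]))
        # ...merge the new data into the decompressed contents...
        block[offset:offset + len(data)] = data
        merged = bytes(block)

    # ...then compress the merged data and write it back to the cache.
    cache[block_id] = zlib.compress(merged)
```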

Type: Grant

Filed: December 22, 2016

Date of Patent: January 14, 2020

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Jimshed Mirza, Christopher J. Brennan, Anthony Chan, Leon Lai
Storing microcode for a virtual function in a trusted memory region

Patent number: 10534730

Abstract: A first processor has a trusted relationship with a trusted memory region (TMR) that includes a first region for storing microcode used to execute a microcontroller on a second processor and a second region for storing data associated with the microcontroller. The microcontroller supports a virtual function that is executed on the second processor. An access controller is configured by the first processor to selectively provide the microcontroller with access to the TMR based on whether the request is to write in the first region. The access controller grants read requests from the microcontroller to read from the first region and denies write requests from the microcontroller to write to the first region. The access controller grants requests from the microcontroller to read from the second region or write to the second region.
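
The access policy stated in this abstract reduces to a small decision table, sketched below. The enum names and the function `microcontroller_access_granted` are hypothetical; only the grant and deny outcomes are taken from the abstract.

```python
from enum import Enum


class Region(Enum):
    MICROCODE = "first region (microcode)"
    DATA = "second region (data)"


class Access(Enum):
    READ = "read"
    WRITE = "write"


def microcontroller_access_granted(region: Region, access: Access) -> bool:
    """Decide whether a microcontroller request to the TMR is granted.

    The microcode region is read-only for the microcontroller; the data
    region may be both read and written.
    """
    if region is Region.MICROCODE:
        return access is Access.READ
    return True
```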

Type: Grant

Filed: December 20, 2018

Date of Patent: January 14, 2020

Assignee: ATI Technologies ULC

Inventors: Kathirkamanathan Nadarajah, Anthony Asaro
Page table management for differing virtual and physical address page alignment

Patent number: 10528478

Abstract: Techniques for managing page tables for an accelerated processing device are provided. The page tables for the accelerated processing device include a primary page table and secondary page tables. The page size selected for any particular secondary page table is dependent on characteristics of the memory allocations for which translations are stored in the secondary page table. Any particular memory allocation is associated with a particular “initial” page size. Translations for multiple allocations may be placed into a single secondary page table, and a particular page size is chosen for all such translations. The page size is the smallest of the natural page sizes for the allocations that are not using a translate further technique. The translate further technique is a technique wherein secondary page table entries do not themselves provide translations but instead point to an additional page table level referred to as the translate further page table level.
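
The page size selection rule can be written down as a short sketch. The `Allocation` type, its field names, and the choice to return `None` when every allocation uses the translate further technique are assumptions for illustration; the abstract states only the "smallest natural page size" rule.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Allocation:
    natural_page_size: int        # bytes, e.g. 4096 or 65536
    uses_translate_further: bool  # True if its entries point to a further level


def secondary_table_page_size(allocations: Iterable[Allocation]) -> Optional[int]:
    """Pick one page size for all translations in a secondary page table.

    Allocations that use the translate further technique are ignored; the
    result is the smallest natural page size among the rest, or None if no
    allocation remains (an assumption, not covered by the abstract).
    """
    sizes = [
        a.natural_page_size
        for a in allocations
        if not a.uses_translate_further
    ]
    return min(sizes) if sizes else None
```

For instance, a table holding a 2 MiB allocation, a 64 KiB allocation, and a 4 KiB allocation that uses translate further would get a 64 KiB page size under this rule.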

Type: Grant

Filed: May 30, 2017

Date of Patent: January 7, 2020

Assignee: ATI TECHNOLOGIES ULC

Inventor: Dhirendra Partap Singh Rana
Pixelation optimized delta color compression

Patent number: 10529118

Abstract: A technique for compressing an original image is disclosed. According to the technique, an original image is obtained and a delta-encoded image is generated based on the original image. Next, a segregated image is generated based on the delta-encoded image and then the segregated image is compressed to produce a compressed image. The segregated image is generated because the segregated image may be compressed more efficiently than the original image and the delta image.

Type: Grant

Filed: June 29, 2018

Date of Patent: January 7, 2020

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Ruijin Wu, Skyler Jonathon Saleh, Christopher J. Brennan, Kei Ming Kwong, Anthony Hung-Cheong Chan
Low loss T-coil configuration with frequency boost for an analog receiver front end

Patent number: 10530325

Abstract: Systems, apparatuses, and methods for performing efficient data transfer in a computing system are disclosed. A computing system includes multiple transmitters sending single-ended data signals to multiple receivers. A receiver includes multiple series inductors moved from a signal path to sampling circuitry to a termination path used for impedance matching. Removing the direct current (DC) resistances of the inductors from the signal path reduces signal attenuation. The termination path has alternating current (AC) reactances of the inductors, which provide a frequency-dependent termination impedance. This termination impedance provides a positive reflection coefficient for high operating frequencies, which boosts the input signal being received by the sampling circuitry.

Type: Grant

Filed: August 30, 2018

Date of Patent: January 7, 2020

Assignees: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Dean E. Gonzales, Xuan Chen, Jeffrey Cooper, Milam Paraschou
PIXELATION OPTIMIZED DELTA COLOR COMPRESSION

Publication number: 20200005514

Abstract: A technique for compressing an original image is disclosed. According to the technique, an original image is obtained and a delta-encoded image is generated based on the original image. Next, a segregated image is generated based on the delta-encoded image and then the segregated image is compressed to produce a compressed image. The segregated image is generated because the segregated image may be compressed more efficiently than the original image and the delta image.

Type: Application

Filed: June 29, 2018

Publication date: January 2, 2020

Applicants: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventors: Ruijin Wu, Skyler Jonathon Saleh, Christopher J. Brennan, Kei Ming Kwong, Anthony Hung-Cheong Chan
Server-based encoding of adjustable frame rate content

Patent number: 10523947

Abstract: Systems, apparatuses, and methods for encoding bitstreams of uniquely rendered video frames with variable frame rates are disclosed. A rendering unit and an encoder in a server are coupled via a network to a client with a decoder. The rendering unit dynamically adjusts the frame rate of uniquely rendered frames. Depending on the operating mode, the rendering unit conveys a constant frame rate to the encoder by repeating some frames or the rendering unit conveys a variable frame rate to the encoder by conveying only uniquely rendered frames to the encoder. Depending on the operating mode, the encoder conveys a constant frame rate bitstream to the decoder by encoding repeated frames as skip frames, or the encoder conveys a variable frame rate bitstream to the decoder by dropping repeated frames from the bitstream.
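
The two encoder behaviours described in this abstract can be contrasted with a toy sketch. Frames are modelled as `(frame_id, is_repeat)` tuples and the output as a list of strings; these representations and the function name `encode_frames` are assumptions for illustration and say nothing about a real bitstream format.

```python
from typing import List, Tuple


def encode_frames(frames: List[Tuple[str, bool]], constant_rate: bool) -> List[str]:
    """Toy model of the encoder's two operating modes.

    In constant frame rate mode, repeated frames are kept in the output as
    skip frames. In variable frame rate mode, repeated frames are dropped
    so that only uniquely rendered frames are conveyed.
    """
    bitstream = []
    for frame_id, is_repeat in frames:
        if not is_repeat:
            bitstream.append(f"coded:{frame_id}")
        elif constant_rate:
            bitstream.append("skip")
        # Variable rate mode: a repeated frame contributes nothing.
    return bitstream
```

Given three input frames where the second repeats the first, the constant rate mode yields three entries (one of them a skip frame) and the variable rate mode yields two.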

Type: Grant

Filed: September 29, 2017

Date of Patent: December 31, 2019

Assignee: ATI Technologies ULC

Inventors: Ihab Amer, Boris Ivanovic, Gabor Sines, Yang Liu, Ho Hin Lau, Haibo Liu, Kyle Plumadore
Method and apparatus for accessing non-volatile memory as byte addressable memory

Patent number: 10521389

Abstract: Described herein is a method and system for accessing a block addressable input/output (I/O) device, such as a non-volatile memory (NVM), as byte addressable memory. A front end processor connected to a Peripheral Component Interconnect Express (PCIe) switch performs as a front end interface to the block addressable I/O device to emulate byte addressability. A PCIe device, such as a graphics processing unit (GPU), can directly access the necessary bytes via the front end processor from the block addressable I/O device. The PCIe compatible devices can access data from the block I/O devices without having to go through system memory and a host processor. In an implementation, a system can include block addressable I/O, byte addressable I/O and hybrids thereof which support direct access to byte addressable memory by the host processor, GPU and any other PCIe compatible device.

Type: Grant

Filed: December 23, 2016

Date of Patent: December 31, 2019

Assignee: ATI Technologies ULC

Inventor: Gongxian Jeffrey Cheng
