Patents by Inventor Jeff Jiao
Jeff Jiao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210240543
Abstract: The present disclosure provides a method for syncing data of a computing task across a plurality of groups of computing nodes, each group comprising a set of computing nodes A-D, a set of intra-group interconnects that communicatively couple computing node A with computing nodes B and C and computing node D with computing nodes B and C, and a set of inter-group interconnects that communicatively couple a computing node A of a first group of the plurality of groups with a computing node A of a second group neighboring the first group, a computing node B of the first group with a computing node B of the second group, a computing node C of the first group with the computing node C of the second group, and a computing node D of the first group with a computing node D of the second group, the method comprising: syncing across a first dimension of computing nodes using a first set of ring connections, wherein the first set of ring connections are formed using inter-group and intra-group interconnects that communica
Type: Application
Filed: January 30, 2020
Publication date: August 5, 2021
Inventors: Liang HAN, Jeff JIAO
-
Patent number: 9214007
Abstract: Graphics processing units (GPUs) are used, for example, to process data related to three-dimensional objects or scenes and to render the three-dimensional data onto a two-dimensional display screen. One embodiment, among others, of a unified cache system used in a GPU comprises a data storage device and a storage device controller. The data storage device is configured to store graphics data processed by or to be processed by one or more shader units. The storage device controller is placed in communication with the data storage device. The storage device controller is configured to dynamically control a storage allocation of the graphics data within the data storage device.
Type: Grant
Filed: January 25, 2008
Date of Patent: December 15, 2015
Assignee: VIA TECHNOLOGIES, INC.
Inventors: Jeff Jiao, Timour Paltashev
-
Patent number: 8963930
Abstract: A system for integrating triangle setup and attribute setup operations into a programmable execution unit of a graphics processing unit is disclosed. A method for integrating triangle setup and attribute setup operations into a programmable execution unit graphics processing unit is also disclosed. In one embodiment, at least one execution unit is configured for multi-threaded operation. The at least one execution unit is configured to execute at least one thread for triangle setup operations and attribute setup operations as well as threads for pixel shader, geometry shader and vertex shader operations.
Type: Grant
Filed: December 12, 2007
Date of Patent: February 24, 2015
Assignee: Via Technologies, Inc.
Inventors: Yang (Jeff) Jiao, Mike Hong, Yin Li, Yunjie Xu
-
Patent number: 8769207
Abstract: Systems and methods for sharing a physical cache among one or more clients in a stream data processing pipeline are described. One embodiment is directed to a system for sharing caches between two or more clients. The system comprises a physical cache memory having a memory portion accessed through a cache index. The system further comprises at least two virtual cache spaces mapping to the memory portion, each of the virtual cache spaces has an active window which has a different size than the memory portion. Further, the system comprises at least one virtual cache controller configured to perform a hit-miss test on the active window of the virtual cache space in response to a request from one of the clients for accessing the physical cache memory. Furthermore, data is accessed from the corresponding location of the memory portion when the hit-miss test of the cache index returns a hit.
Type: Grant
Filed: January 16, 2008
Date of Patent: July 1, 2014
Assignee: Via Technologies, Inc.
Inventors: Jeff Jiao, Timour Paltashev
-
Patent number: 8681162
Abstract: A programmable graphics processing unit (GPU) includes a first shader stage configured to receive slice data from a frame buffer and perform variable length decoding (VLD), wherein the first shader stage outputs data to a first buffer within the frame buffer; a second shader stage configured to receive the output data from the first shader stage and perform transformation and motion compensation on the slice data, wherein the second shader stage outputs decoded slice data to a second buffer within the frame buffer; a third shader stage configured to receive the decoded slice data and perform in-loop deblocking filtering (IDF) on the frame buffer; a fourth shader stage configured to perform post-processing on the frame buffer; and a scheduler configured to schedule execution of the shader stages, the scheduler comprising a plurality of counter registers; wherein execution of the shader stages is synchronized utilizing the counter registers.
Type: Grant
Filed: October 15, 2010
Date of Patent: March 25, 2014
Assignee: VIA Technologies, Inc.
Inventors: Timour Paltashev, John Brothers, Yi-Jung Su, Yang (Jeff) Jiao
-
Patent number: 8564604
Abstract: Systems and methods for improving throughput of a graphics processing unit are disclosed. In one embodiment, a system includes a multithreaded execution unit capable of processing requests to access a constant cache, a vertex attribute cache, at least one common register file, and an execution unit data path substantially simultaneously.
Type: Grant
Filed: April 21, 2010
Date of Patent: October 22, 2013
Assignee: VIA Technologies, Inc.
Inventor: Yang (Jeff) Jiao
-
Patent number: 8547385
Abstract: Various systems and methods are described for accessing a shared memory in a graphics processing unit (GPU). One embodiment comprises determining whether data to be read from a shared memory aligns to a boundary of the shared memory, wherein the data comprises a plurality of data blocks, and wherein the shared memory comprises a plurality of banks and a plurality of offsets. A swizzle pattern in which the data blocks are to be arranged for processing is determined. Based on whether the data aligns with a boundary of the shared memory and based on the determined swizzle pattern, an order for performing one or more wrapping functions is determined. The shared memory is accessed by performing the one or more wrapping functions and reading the data blocks to construct the data according to the swizzle pattern.
Type: Grant
Filed: October 15, 2010
Date of Patent: October 1, 2013
Assignee: Via Technologies, Inc.
Inventor: Yang (Jeff) Jiao
-
Patent number: 8514235
Abstract: The present disclosure describes implementations for performing register accesses and operations in a graphics processing apparatus. In one implementation, a graphics processing apparatus comprises an execution unit for processing programmed shader operations, wherein the execution unit is configured for processing operations of a plurality of threads. The apparatus further comprises memory forming a register file that accommodates all register operations for all the threads executed by the execution unit, the memory being organized in a plurality of banks, with a first plurality of banks being allocated to a first plurality of the threads and a second plurality of banks being allocated to the remaining threads. In addition, the apparatus comprises address translation logic configured to translate logical register identifiers into physical register addresses.
Type: Grant
Filed: April 21, 2010
Date of Patent: August 20, 2013
Assignee: Via Technologies, Inc.
Inventor: Yang (Jeff) Jiao
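Illustrative note (not part of the patent record): the banked register file and address translation described in this abstract can be sketched roughly as follows. The bank count, the registers-per-thread figure, the even/odd split of threads, and the names `RegisterFileLayout` and `translate` are all assumptions made for this example; the abstract does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterFileLayout:
    """Hypothetical layout: half the banks serve even threads, half serve odd threads."""
    num_banks: int = 8          # assumed total bank count
    regs_per_thread: int = 16   # assumed logical registers per thread

    def translate(self, thread_id: int, logical_reg: int) -> tuple[int, int]:
        """Map (thread, logical register) to a physical (bank, row) pair."""
        half = self.num_banks // 2
        if self.regs_per_thread % half != 0:
            raise ValueError("regs_per_thread must be a multiple of num_banks // 2")
        if thread_id < 0 or not 0 <= logical_reg < self.regs_per_thread:
            raise ValueError("thread or register index out of range")
        group = thread_id % 2            # 0: first bank group, 1: second bank group
        slot = thread_id // 2            # position of the thread inside its group
        rows_per_thread = self.regs_per_thread // half
        bank = group * half + logical_reg % half
        row = slot * rows_per_thread + logical_reg // half
        return bank, row


layout = RegisterFileLayout()
print(layout.translate(0, 5))  # thread 0 stays in banks 0-3
print(layout.translate(3, 5))  # thread 3 stays in banks 4-7
```

Striping a thread's registers across the banks of its group is one possible design choice; it keeps each thread inside its own bank group while giving every (thread, register) pair a distinct physical location.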
-
Patent number: 8499305
Abstract: Systems and methods for thread group kickoff and thread synchronization are described. One method is directed to synchronizing a plurality of threads in a general purpose shader in a graphics processor. The method comprises determining an entry point for execution of the threads in the general purpose shader, performing a fork operation at the entry point, whereby the plurality of threads are dispatched, wherein the plurality of threads comprise a main thread and one or more sub-threads. The method further comprises performing a join operation whereby the plurality of threads are synchronized upon the main thread reaching a synchronization point. Upon completion of the join operation, a second fork operation is performed to resume parallel execution of the plurality of threads.
Type: Grant
Filed: October 15, 2010
Date of Patent: July 30, 2013
Assignee: VIA Technologies, Inc.
Inventor: Yang (Jeff) Jiao
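Illustrative note (not part of the patent record): the fork, join, and second-fork sequence in this abstract has a loose software analogue in CPU threads meeting at a barrier. The sketch below uses Python's standard `threading.Barrier`; it is an analogy only, not the GPU mechanism the patent claims. The function name `run_fork_join` and the phase labels are hypothetical, and the barrier releases when all threads arrive, whereas the abstract ties the join to the main thread reaching its synchronization point.

```python
import threading


def run_fork_join(num_threads: int = 4) -> list[tuple[str, int]]:
    """Dispatch threads, synchronize them once, then let them resume in parallel."""
    barrier = threading.Barrier(num_threads)
    log: list[tuple[str, int]] = []
    log_lock = threading.Lock()

    def worker(thread_id: int) -> None:
        # Work done after the first fork (thread 0 plays the "main thread").
        with log_lock:
            log.append(("phase1", thread_id))
        # Join: no thread continues until every thread has reached this point.
        barrier.wait()
        # Work done after the second fork, again in parallel.
        with log_lock:
            log.append(("phase2", thread_id))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:   # fork: dispatch all threads from the entry point
        t.start()
    for t in threads:   # wait for the whole group to finish
        t.join()
    return log


events = run_fork_join(4)
print([phase for phase, _ in events])
```

Because every thread records its first phase before waiting at the barrier, all `phase1` entries appear in the log before any `phase2` entry, regardless of scheduling order.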
-
Patent number: 8319774
Abstract: Embodiments of the present disclosure are directed to graphics processing systems, comprising: a plurality of execution units, wherein one of the execution units is configurable to process a thread corresponding to a rendering context, wherein the rendering context comprises a plurality of constants with a priority level; a constant buffer configurable to store the constants of the rendering context into a plurality of slot in a physical storage space; and an execution unit control unit configurable to assign the thread to one of the execution units; a constant buffer control unit providing a translation table for the rendering context to map the corresponding constants into the slots of the physical storage space. Comparable methods are also disclosed.
Type: Grant
Filed: November 29, 2011
Date of Patent: November 27, 2012
Assignee: Via Technologies, Inc.
Inventors: Yang (Jeff) Jiao, Yijung Su, John Brothers
-
Patent number: 8174534
Abstract: Various embodiments of shader processing systems and methods are disclosed. One method embodiment, among others, comprises a dependent texture read method executed using a multi-threaded, parallel computational core of a graphics processing unit (GPU). Such a method includes generating a dependent texture read request at logic configured to perform shader computations corresponding to a first thread, and sending shader-calculated, texture-sampling related parameters corresponding to the first thread to a texture pipeline while retaining at the logic all other shader processing related information corresponding to the first thread.
Type: Grant
Filed: December 6, 2007
Date of Patent: May 8, 2012
Assignee: Via Technologies, Inc.
Inventor: Yang (Jeff) Jiao
-
Publication number: 20120096474
Abstract: Systems and methods for thread group kickoff and thread synchronization are described. One method is directed to synchronizing a plurality of threads in a general purpose shader in a graphics processor. The method comprises determining an entry point for execution of the threads in the general purpose shader, performing a fork operation at the entry point, whereby the plurality of threads are dispatched, wherein the plurality of threads comprise a main thread and one or more sub-threads. The method further comprises performing a join operation whereby the plurality of threads are synchronized upon the main thread reaching a synchronization point. Upon completion of the join operation, a second fork operation is performed to resume parallel execution of the plurality of threads.
Type: Application
Filed: October 15, 2010
Publication date: April 19, 2012
Applicant: VIA TECHNOLOGIES, INC.
Inventor: Yang (Jeff) Jiao
-
Publication number: 20120092356
Abstract: Various systems and methods are described for accessing a shared memory in a graphics processing unit (GPU). One embodiment comprises determining whether data to be read from a shared memory aligns to a boundary of the shared memory, wherein the data comprises a plurality of data blocks, and wherein the shared memory comprises a plurality of banks and a plurality of offsets. A swizzle pattern in which the data blocks are to be arranged for processing is determined. Based on whether the data aligns with a boundary of the shared memory and based on the determined swizzle pattern, an order for performing one or more wrapping functions is determined. The shared memory is accessed by performing the one or more wrapping functions and reading the data blocks to construct the data according to the swizzle pattern.
Type: Application
Filed: October 15, 2010
Publication date: April 19, 2012
Applicant: VIA TECHNOLOGIES, INC.
Inventor: Yang (Jeff) Jiao
-
Publication number: 20120092353
Abstract: A multi-shader system in a programmable graphics processing unit (GPU) for processing video data, includes a first shader stage configured to receive slice data from a frame buffer and perform variable length decoding (VLD), wherein the first shader stage outputs data to a first buffer within the frame buffer; a second shader stage configured to receive the output data from the first shader stage and perform transformation and motion compensation on the slice data, wherein the second shader stage outputs decoded slice data to a second buffer within the frame buffer; a third shader stage configured to receive the decoded slice data and perform in-loop deblocking filtering (IDF) on the frame buffer; a fourth shader stage configured to perform post-processing on the frame buffer; and a scheduler configured to schedule execution of the shader stages, the scheduler comprising a plurality of counter registers; wherein execution of the shader stages is synchronized utilizing the counter registers.
Type: Application
Filed: October 15, 2010
Publication date: April 19, 2012
Applicant: VIA TECHNOLOGIES, INC.
Inventors: Timour Paltashev, John Brothers, Yi-Jung Su, Yang (Jeff) Jiao
-
Patent number: 8144149
Abstract: The present disclosure is directed to novel methods and apparatus for managing or performing the dynamic allocation or reallocation of processing resources among a vertex shader, a geometry shader, and pixel shader of a graphics processing unit. In one embodiment a method for graphics processing comprises assigning at least one execution unit to each of a plurality of shader units, the plurality of shader units comprising a vertex shader, a geometry shader, and a pixel shader, wherein an execution unit assigned to a given shader unit performs processing tasks for only that shader unit, determining that one of the plurality of shader units is bottlenecked, and reassigning at least one execution unit from a non-bottlenecked shader unit to the shader unit determined to be bottlenecked.
Type: Grant
Filed: April 19, 2006
Date of Patent: March 27, 2012
Assignee: Via Technologies, Inc.
Inventors: Yang (Jeff) Jiao, Yijung Su
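Illustrative note (not part of the patent record): the reassignment step in this abstract can be sketched as a small rebalancing function. How a bottleneck is detected is not stated in the abstract; the pending-work-per-execution-unit metric, the `threshold` value, and the name `rebalance` are assumptions made for this example.

```python
def rebalance(assigned: dict[str, int], pending: dict[str, int],
              threshold: float = 4.0) -> dict[str, int]:
    """Move one execution unit (EU) from the least-loaded shader to a bottlenecked one.

    assigned: EUs currently owned by each shader unit (at least one each).
    pending:  queued work items per shader unit (assumed bottleneck signal).
    """
    # Load per EU is the assumed bottleneck metric.
    load = {shader: pending[shader] / assigned[shader] for shader in assigned}
    bottlenecked = max(load, key=load.get)
    if load[bottlenecked] <= threshold:
        return dict(assigned)  # nothing is bottlenecked; keep the assignment

    # A donor must not be bottlenecked itself and must keep at least one EU.
    donors = [s for s in assigned
              if s != bottlenecked and assigned[s] > 1 and load[s] <= threshold]
    if not donors:
        return dict(assigned)

    donor = min(donors, key=load.get)
    result = dict(assigned)
    result[donor] -= 1
    result[bottlenecked] += 1
    return result


before = {"vertex": 2, "geometry": 2, "pixel": 4}
queue = {"vertex": 2, "geometry": 1, "pixel": 40}
print(rebalance(before, queue))  # one EU moves from geometry to pixel
```

Calling the function repeatedly, for example once per scheduling interval, would shift execution units gradually rather than all at once, which is one plausible way to avoid oscillation.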
-
Publication number: 20120069033
Abstract: Embodiments of the present disclosure are directed to graphics processing systems, comprising: a plurality of execution units, wherein one of the execution units is configurable to process a thread corresponding to a rendering context, wherein the rendering context comprises a plurality of constants with a priority level; a constant buffer configurable to store the constants of the rendering context into a plurality of slot in a physical storage space; and an execution unit control unit configurable to assign the thread to one of the execution units; a constant buffer control unit providing a translation table for the rendering context to map the corresponding constants into the slots of the physical storage space. Comparable methods are also disclosed.
Type: Application
Filed: November 29, 2011
Publication date: March 22, 2012
Applicant: VIA TECHNOLOGIES, INC.
Inventors: Yang (Jeff) Jiao, Yijung Su, John Brothers
-
Patent number: 8120608
Abstract: Embodiments of systems and methods for managing a constant buffer with rendering context specific data in multithreaded parallel computational GPU core are disclosed. Briefly described, one method embodiment, among others, comprises responsive to a first shader operation, receiving at a constant buffer a first group of constants corresponding to a first rendering context, and responsive to a second shader operation, receiving at the constant buffer a second group of constants corresponding to a second context without flushing the first group.
Type: Grant
Filed: April 4, 2008
Date of Patent: February 21, 2012
Assignee: Via Technologies, Inc.
Inventors: Yang (Jeff) Jiao, Yijung Su, John Brothers
-
Patent number: 8049760
Abstract: The present disclosure describes implementations for processing instructions and data across multiple Arithmetic Logic Units (ALUs). In one implementation, a graphics processing apparatus comprises a plurality of ALUs configured to process independent instructions in parallel. Pre-processing logic is configured to receive instructions and associated data to be directed to one of the plurality of ALUs for processing from a register file, the pre-processing logic being configured to selectively format received instructions for delivery to a plurality of the ALUs. In addition, post-processing logic is configured to receive data output from the plurality of the ALUs and deliver the received data to the register file for write-back, the post-processing logic being configured to selectively format data output from a plurality of the ALUs for delivery to the register file as though the data had been output by a single ALU.
Type: Grant
Filed: December 13, 2006
Date of Patent: November 1, 2011
Assignee: Via Technologies, Inc.
Inventors: Yang (Jeff) Jiao, Chien Te Ho
-
Publication number: 20110261063
Abstract: The present disclosure describes implementations for performing register accesses and operations in a graphics processing apparatus. In one implementation, a graphics processing apparatus comprises an execution unit for processing programmed shader operations, wherein the execution unit is configured for processing operations of a plurality of threads. The apparatus further comprises memory forming a register file that accommodates all register operations for all the threads executed by the execution unit, the memory being organized in a plurality of banks, with a first plurality of banks being allocated to a first plurality of the threads and a second plurality of banks being allocated to the remaining threads. In addition, the apparatus comprises address translation logic configured to translate logical register identifiers into physical register addresses.
Type: Application
Filed: April 21, 2010
Publication date: October 27, 2011
Applicant: VIA TECHNOLOGIES, INC.
Inventor: Yang (Jeff) Jiao
-
Patent number: 7876328
Abstract: Provided is a system for managing multiple contexts in a decentralized graphics processing unit. The system includes multiple control units that can include a context buffer, a context processor, and a context scheduler. Also included is logic to receive multiple contexts, logic to identify at least one of the contexts, and logic to facilitate communication among the control units.
Type: Grant
Filed: February 8, 2007
Date of Patent: January 25, 2011
Assignee: Via Technologies, Inc.
Inventors: Qunfeng (Fred) Liao, Yang (Jeff) Jiao, Yijung Su