Patents by Inventor Jeff Jiao

Jeff Jiao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

EFFICIENT AND MORE ADVANCED IMPLEMENTATION OF RING-ALLREDUCE ALGORITHM FOR DISTRIBUTED PARALLEL DEEP LEARNING

Publication number: 20210240543

Abstract: The present disclosure provides a method for syncing data of a computing task across a plurality of groups of computing nodes, each group comprising a set of computing nodes A-D, a set of intra-group interconnects that communicatively couple computing node A with computing nodes B and C and computing node D with computing nodes B and C, and a set of inter-group interconnects that communicatively couple a computing node A of a first group of the plurality of groups with a computing node A of a second group neighboring the first group, a computing node B of the first group with a computing node B of the second group, a computing node C of the first group with the computing node C of the second group, and a computing node D of the first group with a computing node D of the second group, the method comprising: syncing across a first dimension of computing nodes using a first set of ring connections, wherein the first set of ring connections are formed using inter-group and intra-group interconnects that communica
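The abstract is cut off in this listing, so the application's specific arrangement of groups and ring connections is not reproduced here. As background only, a generic single-ring all-reduce might be sketched in Python as follows. The function name `ring_allreduce`, the list-based buffers, and the choice of a sum reduction are assumptions made for illustration, not details taken from the application.

```python
def ring_allreduce(node_data):
    """Simulate a single-ring all-reduce (sum) over equal-length vectors.

    node_data[i] is the vector held by node i. Every node ends up with the
    element-wise sum. This is a textbook ring, not the grouped topology
    described in the abstract.
    """
    n = len(node_data)
    length = len(node_data[0])
    if length % n != 0:
        raise ValueError("vector length must be divisible by the node count")
    chunk = length // n
    bufs = [list(v) for v in node_data]  # copy so the inputs stay untouched

    def span(c):
        # Index range covered by chunk c.
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1, scatter-reduce: at step s, node i sends chunk (i - s) % n to
    # its ring neighbour, which adds it into its own copy of that chunk.
    for s in range(n - 1):
        sends = [(i, (i - s) % n) for i in range(n)]
        payloads = [[bufs[i][k] for k in span(c)] for i, c in sends]
        for (i, c), payload in zip(sends, payloads):
            dst = (i + 1) % n
            for k, v in zip(span(c), payload):
                bufs[dst][k] += v

    # Node i now holds the fully reduced chunk (i + 1) % n.
    # Phase 2, all-gather: reduced chunks travel once around the ring and
    # overwrite the stale copies on each receiving node.
    for s in range(n - 1):
        sends = [(i, (i + 1 - s) % n) for i in range(n)]
        payloads = [[bufs[i][k] for k in span(c)] for i, c in sends]
        for (i, c), payload in zip(sends, payloads):
            dst = (i + 1) % n
            for k, v in zip(span(c), payload):
                bufs[dst][k] = v

    return bufs
```

Each node sends and receives only one chunk per step, which is why the per-node traffic of a ring all-reduce stays roughly constant as nodes are added.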

Type: Application

Filed: January 30, 2020

Publication date: August 5, 2021

Inventors: Liang HAN, Jeff JIAO

Graphics processor having unified cache system

Patent number: 9214007

Abstract: Graphics processing units (GPUs) are used, for example, to process data related to three-dimensional objects or scenes and to render the three-dimensional data onto a two-dimensional display screen. One embodiment, among others, of a unified cache system used in a GPU comprises a data storage device and a storage device controller. The data storage device is configured to store graphics data processed by or to be processed by one or more shader units. The storage device controller is placed in communication with the data storage device. The storage device controller is configured to dynamically control a storage allocation of the graphics data within the data storage device.

Type: Grant

Filed: January 25, 2008

Date of Patent: December 15, 2015

Assignee: VIA TECHNOLOGIES, INC.

Inventors: Jeff Jiao, Timour Paltashev

Triangle setup and attribute setup integration with programmable execution unit

Patent number: 8963930

Abstract: A system for integrating triangle setup and attribute setup operations into a programmable execution unit of a graphics processing unit is disclosed. A method for integrating triangle setup and attribute setup operations into a programmable execution unit of a graphics processing unit is also disclosed. In one embodiment, at least one execution unit is configured for multi-threaded operation. The at least one execution unit is configured to execute at least one thread for triangle setup operations and attribute setup operations as well as threads for pixel shader, geometry shader and vertex shader operations.

Type: Grant

Filed: December 12, 2007

Date of Patent: February 24, 2015

Assignee: Via Technologies, Inc.

Inventors: Yang (Jeff) Jiao, Mike Hong, Yin Li, Yunjie Xu

Caching method and apparatus for a vertex shader and geometry shader

Patent number: 8769207

Abstract: Systems and methods for sharing a physical cache among one or more clients in a stream data processing pipeline are described. One embodiment is directed to a system for sharing caches between two or more clients. The system comprises a physical cache memory having a memory portion accessed through a cache index. The system further comprises at least two virtual cache spaces mapping to the memory portion, each of the virtual cache spaces has an active window which has a different size than the memory portion. Further, the system comprises at least one virtual cache controller configured to perform a hit-miss test on the active window of the virtual cache space in response to a request from one of the clients for accessing the physical cache memory. Furthermore, data is accessed from the corresponding location of the memory portion when the hit-miss test of the cache index returns a hit.

Type: Grant

Filed: January 16, 2008

Date of Patent: July 1, 2014

Assignee: Via Technologies, Inc.

Inventors: Jeff Jiao, Timour Paltashev

Systems and methods for video processing

Patent number: 8681162

Abstract: A programmable graphics processing unit (GPU) includes a first shader stage configured to receive slice data from a frame buffer and perform variable length decoding (VLD), wherein the first shader stage outputs data to a first buffer within the frame buffer; a second shader stage configured to receive the output data from the first shader stage and perform transformation and motion compensation on the slice data, wherein the second shader stage outputs decoded slice data to a second buffer within the frame buffer; a third shader stage configured to receive the decoded slice data and perform in-loop deblocking filtering (IDF) on the frame buffer; a fourth shader stage configured to perform post-processing on the frame buffer; and a scheduler configured to schedule execution of the shader stages, the scheduler comprising a plurality of counter registers; wherein execution of the shader stages is synchronized utilizing the counter registers.

Type: Grant

Filed: October 15, 2010

Date of Patent: March 25, 2014

Assignee: VIA Technologies, Inc.

Inventors: Timour Paltashev, John Brothers, Yi-Jung Su, Yang (Jeff) Jiao

Systems and methods for improving throughput of a graphics processing unit

Patent number: 8564604

Abstract: Systems and methods for improving throughput of a graphics processing unit are disclosed. In one embodiment, a system includes a multithreaded execution unit capable of processing requests to access a constant cache, a vertex attribute cache, at least one common register file, and an execution unit data path substantially simultaneously.

Type: Grant

Filed: April 21, 2010

Date of Patent: October 22, 2013

Assignee: VIA Technologies, Inc.

Inventor: Yang (Jeff) Jiao

Systems and methods for performing shared memory accesses

Patent number: 8547385

Abstract: Various systems and methods are described for accessing a shared memory in a graphics processing unit (GPU). One embodiment comprises determining whether data to be read from a shared memory aligns to a boundary of the shared memory, wherein the data comprises a plurality of data blocks, and wherein the shared memory comprises a plurality of banks and a plurality of offsets. A swizzle pattern in which the data blocks are to be arranged for processing is determined. Based on whether the data aligns with a boundary of the shared memory and based on the determined swizzle pattern, an order for performing one or more wrapping functions is determined. The shared memory is accessed by performing the one or more wrapping functions and reading the data blocks to construct the data according to the swizzle pattern.

Type: Grant

Filed: October 15, 2010

Date of Patent: October 1, 2013

Assignee: Via Technologies, Inc.

Inventor: Yang (Jeff) Jiao

System and method for managing the computation of graphics shading operations

Patent number: 8514235

Abstract: The present disclosure describes implementations for performing register accesses and operations in a graphics processing apparatus. In one implementation, a graphics processing apparatus comprises an execution unit for processing programmed shader operations, wherein the execution unit is configured for processing operations of a plurality of threads. The apparatus further comprises memory forming a register file that accommodates all register operations for all the threads executed by the execution unit, the memory being organized in a plurality of banks, with a first plurality of banks being allocated to a first plurality of the threads and a second plurality of banks being allocated to the remaining threads. In addition, the apparatus comprises address translation logic configured to translate logical register identifiers into physical register addresses.
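The banked register file and address translation described in this abstract can be illustrated with a small Python sketch. The abstract gives no sizes or mapping, so everything concrete here is an assumption: the class name `RegisterFileLayout`, the 8-bank by 64-row geometry, the 16 registers per thread, and the even/odd split of threads between the two halves of the banks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterFileLayout:
    """Hypothetical banked register file shared by all threads of one unit."""
    num_banks: int = 8         # assumed; must be even
    rows_per_bank: int = 64    # assumed
    regs_per_thread: int = 16  # assumed; must be a multiple of num_banks // 2

    def translate(self, thread_id, logical_reg):
        """Map (thread, logical register) to a physical (bank, row) pair."""
        half = self.num_banks // 2
        rows_per_thread = self.regs_per_thread // half
        slot = thread_id // 2  # position of the thread inside its group
        if not 0 <= logical_reg < self.regs_per_thread:
            raise ValueError("logical register out of range")
        if thread_id < 0 or (slot + 1) * rows_per_thread > self.rows_per_bank:
            raise ValueError("thread id out of range for this register file")
        # Even threads use the first half of the banks, odd threads the rest.
        base_bank = 0 if thread_id % 2 == 0 else half
        # Consecutive logical registers are striped across the group's banks.
        bank = base_bank + logical_reg % half
        row = slot * rows_per_thread + logical_reg // half
        return bank, row
```

For example, under these assumptions `RegisterFileLayout().translate(2, 5)` places logical register 5 of thread 2 in bank 1, row 5, and no two (thread, register) pairs share a physical location.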

Type: Grant

Filed: April 21, 2010

Date of Patent: August 20, 2013

Assignee: Via Technologies, Inc.

Inventor: Yang (Jeff) Jiao

Systems and methods for performing multi-program general purpose shader kickoff

Patent number: 8499305

Abstract: Systems and methods for thread group kickoff and thread synchronization are described. One method is directed to synchronizing a plurality of threads in a general purpose shader in a graphics processor. The method comprises determining an entry point for execution of the threads in the general purpose shader, performing a fork operation at the entry point, whereby the plurality of threads are dispatched, wherein the plurality of threads comprise a main thread and one or more sub-threads. The method further comprises performing a join operation whereby the plurality of threads are synchronized upon the main thread reaching a synchronization point. Upon completion of the join operation, a second fork operation is performed to resume parallel execution of the plurality of threads.
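The fork, join, second-fork sequence in this abstract has a loose software analogy, sketched below in Python with a thread pool. This is not the patented hardware mechanism: the function `run_fork_join`, its `phase1` and `phase2` callbacks, and the use of the calling thread as the "main thread" are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_fork_join(num_threads, phase1, phase2):
    """Run two parallel phases separated by a synchronization point.

    phase1(thread_index) runs in parallel after the first fork.
    phase2(thread_index, shared) runs in parallel after the second fork and
    receives a value combined at the join.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # First fork: dispatch one task per thread index.
        first = pool.map(phase1, range(num_threads))
        # Join: collecting every result blocks until all tasks have finished.
        partial = list(first)
        shared = sum(partial)  # work done only at the synchronization point
        # Second fork: resume parallel execution with the shared result.
        second = pool.map(lambda t: phase2(t, shared), range(num_threads))
        return list(second)
```

Calling `run_fork_join(4, lambda t: t + 1, lambda t, s: t * s)` runs the first phase on four workers, sums their results at the join, and then hands that sum to the second phase.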

Type: Grant

Filed: October 15, 2010

Date of Patent: July 30, 2013

Assignee: VIA Technologies, Inc.

Inventor: Yang (Jeff) Jiao

Constant buffering for a computational core of a programmable graphics processing unit

Patent number: 8319774

Abstract: Embodiments of the present disclosure are directed to graphics processing systems, comprising: a plurality of execution units, wherein one of the execution units is configurable to process a thread corresponding to a rendering context, wherein the rendering context comprises a plurality of constants with a priority level; a constant buffer configurable to store the constants of the rendering context into a plurality of slots in a physical storage space; and an execution unit control unit configurable to assign the thread to one of the execution units; a constant buffer control unit providing a translation table for the rendering context to map the corresponding constants into the slots of the physical storage space. Comparable methods are also disclosed.

Type: Grant

Filed: November 29, 2011

Date of Patent: November 27, 2012

Assignee: Via Technologies, Inc.

Inventors: Yang (Jeff) Jiao, Yijung Su, John Brothers

Shader processing systems and methods

Patent number: 8174534

Abstract: Various embodiments of shader processing systems and methods are disclosed. One method embodiment, among others, comprises a dependent texture read method executed using a multi-threaded, parallel computational core of a graphics processing unit (GPU). Such a method includes generating a dependent texture read request at logic configured to perform shader computations corresponding to a first thread, and sending shader-calculated, texture-sampling related parameters corresponding to the first thread to a texture pipeline while retaining at the logic all other shader processing related information corresponding to the first thread.

Type: Grant

Filed: December 6, 2007

Date of Patent: May 8, 2012

Assignee: Via Technologies, Inc.

Inventor: Yang (Jeff) Jiao

Systems and Methods for Performing Multi-Program General Purpose Shader Kickoff

Publication number: 20120096474

Abstract: Systems and methods for thread group kickoff and thread synchronization are described. One method is directed to synchronizing a plurality of threads in a general purpose shader in a graphics processor. The method comprises determining an entry point for execution of the threads in the general purpose shader, performing a fork operation at the entry point, whereby the plurality of threads are dispatched, wherein the plurality of threads comprise a main thread and one or more sub-threads. The method further comprises performing a join operation whereby the plurality of threads are synchronized upon the main thread reaching a synchronization point. Upon completion of the join operation, a second fork operation is performed to resume parallel execution of the plurality of threads.

Type: Application

Filed: October 15, 2010

Publication date: April 19, 2012

Applicant: VIA TECHNOLOGIES, INC.

Inventor: Yang (Jeff) Jiao

Systems and Methods for Performing Shared Memory Accesses

Publication number: 20120092356

Abstract: Various systems and methods are described for accessing a shared memory in a graphics processing unit (GPU). One embodiment comprises determining whether data to be read from a shared memory aligns to a boundary of the shared memory, wherein the data comprises a plurality of data blocks, and wherein the shared memory comprises a plurality of banks and a plurality of offsets. A swizzle pattern in which the data blocks are to be arranged for processing is determined. Based on whether the data aligns with a boundary of the shared memory and based on the determined swizzle pattern, an order for performing one or more wrapping functions is determined. The shared memory is accessed by performing the one or more wrapping functions and reading the data blocks to construct the data according to the swizzle pattern.

Type: Application

Filed: October 15, 2010

Publication date: April 19, 2012

Applicant: VIA TECHNOLOGIES, INC.

Inventor: Yang (Jeff) Jiao

Systems and Methods for Video Processing

Publication number: 20120092353

Abstract: A multi-shader system in a programmable graphics processing unit (GPU) for processing video data, includes a first shader stage configured to receive slice data from a frame buffer and perform variable length decoding (VLD), wherein the first shader stage outputs data to a first buffer within the frame buffer; a second shader stage configured to receive the output data from the first shader stage and perform transformation and motion compensation on the slice data, wherein the second shader stage outputs decoded slice data to a second buffer within the frame buffer; a third shader stage configured to receive the decoded slice data and perform in-loop deblocking filtering (IDF) on the frame buffer; a fourth shader stage configured to perform post-processing on the frame buffer; and a scheduler configured to schedule execution of the shader stages, the scheduler comprising a plurality of counter registers; wherein execution of the shader stages is synchronized utilizing the counter registers.

Type: Application

Filed: October 15, 2010

Publication date: April 19, 2012

Applicant: VIA TECHNOLOGIES, INC.

Inventors: Timour Paltashev, John Brothers, Yi-Jung Su, Yang (Jeff) Jiao

System and method for dynamically load balancing multiple shader stages in a shared pool of processing units

Patent number: 8144149

Abstract: The present disclosure is directed to novel methods and apparatus for managing or performing the dynamic allocation or reallocation of processing resources among a vertex shader, a geometry shader, and pixel shader of a graphics processing unit. In one embodiment a method for graphics processing comprises assigning at least one execution unit to each of a plurality of shader units, the plurality of shader units comprising a vertex shader, a geometry shader, and a pixel shader, wherein an execution unit assigned to a given shader unit performs processing tasks for only that shader unit, determining that one of the plurality of shader units is bottlenecked, and reassigning at least one execution unit from a non-bottlenecked shader unit to the shader unit determined to be bottlenecked.
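The assign, detect, reassign loop in this abstract can be sketched in Python. The abstract does not say how a bottleneck is detected, so the heuristic below (pending work divided by assigned execution units) is an assumption, as are the function name `rebalance`, its dictionary arguments, and the `min_units` floor.

```python
def rebalance(assignment, queue_depth, min_units=1):
    """Move one execution unit toward the most heavily loaded shader stage.

    assignment maps a stage name to its number of execution units;
    queue_depth maps a stage name to its pending work items. Returns a new
    assignment and leaves the inputs unchanged.
    """
    # Assumed bottleneck metric: pending work per assigned execution unit.
    load = {stage: queue_depth[stage] / assignment[stage] for stage in assignment}
    bottleneck = max(load, key=load.get)

    # A stage can donate only if it keeps at least min_units afterwards.
    donors = [s for s in assignment if s != bottleneck and assignment[s] > min_units]
    if not donors:
        return dict(assignment)

    donor = min(donors, key=load.get)  # least loaded eligible stage
    if load[donor] >= load[bottleneck]:
        return dict(assignment)        # nothing is actually bottlenecked

    updated = dict(assignment)
    updated[donor] -= 1
    updated[bottleneck] += 1
    return updated
```

With `{"vertex": 4, "geometry": 2, "pixel": 2}` execution units and queue depths of `{"vertex": 2, "geometry": 2, "pixel": 40}`, one unit moves from the vertex stage to the pixel stage and the total number of units stays the same.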

Type: Grant

Filed: April 19, 2006

Date of Patent: March 27, 2012

Assignee: Via Technologies, Inc.

Inventors: Yang (Jeff) Jiao, Yijung Su

Constant Buffering for a Computational Core of a Programmable Graphics Processing Unit

Publication number: 20120069033

Abstract: Embodiments of the present disclosure are directed to graphics processing systems, comprising: a plurality of execution units, wherein one of the execution units is configurable to process a thread corresponding to a rendering context, wherein the rendering context comprises a plurality of constants with a priority level; a constant buffer configurable to store the constants of the rendering context into a plurality of slots in a physical storage space; and an execution unit control unit configurable to assign the thread to one of the execution units; a constant buffer control unit providing a translation table for the rendering context to map the corresponding constants into the slots of the physical storage space. Comparable methods are also disclosed.

Type: Application

Filed: November 29, 2011

Publication date: March 22, 2012

Applicant: VIA TECHNOLOGIES, INC.

Inventors: Yang (Jeff) Jiao, Yijung Su, John Brothers

Constant buffering for a computational core of a programmable graphics processing unit

Patent number: 8120608

Abstract: Embodiments of systems and methods for managing a constant buffer with rendering context specific data in a multithreaded parallel computational GPU core are disclosed. Briefly described, one method embodiment, among others, comprises responsive to a first shader operation, receiving at a constant buffer a first group of constants corresponding to a first rendering context, and responsive to a second shader operation, receiving at the constant buffer a second group of constants corresponding to a second context without flushing the first group.

Type: Grant

Filed: April 4, 2008

Date of Patent: February 21, 2012

Assignee: Via Technologies, Inc.

Inventors: Yang (Jeff) Jiao, Yijung Su, John Brothers

System and method for vector computations in arithmetic logic units (ALUs)

Patent number: 8049760

Abstract: The present disclosure describes implementations for processing instructions and data across multiple Arithmetic Logic Units (ALUs). In one implementation, a graphics processing apparatus comprises a plurality of ALUs configured to process independent instructions in parallel. Pre-processing logic is configured to receive instructions and associated data to be directed to one of the plurality of ALUs for processing from a register file, the pre-processing logic being configured to selectively format received instructions for delivery to a plurality of the ALUs. In addition, post-processing logic is configured to receive data output from the plurality of the ALUs and deliver the received data to the register file for write-back, the post-processing logic being configured to selectively format data output from a plurality of the ALUs for delivery to the register file as though the data had been output by a single ALU.

Type: Grant

Filed: December 13, 2006

Date of Patent: November 1, 2011

Assignee: Via Technologies, Inc.

Inventors: Yang (Jeff) Jiao, Chien Te Ho

System and Method for Managing the Computation of Graphics Shading Operations

Publication number: 20110261063

Abstract: The present disclosure describes implementations for performing register accesses and operations in a graphics processing apparatus. In one implementation, a graphics processing apparatus comprises an execution unit for processing programmed shader operations, wherein the execution unit is configured for processing operations of a plurality of threads. The apparatus further comprises memory forming a register file that accommodates all register operations for all the threads executed by the execution unit, the memory being organized in a plurality of banks, with a first plurality of banks being allocated to a first plurality of the threads and a second plurality of banks being allocated to the remaining threads. In addition, the apparatus comprises address translation logic configured to translate logical register identifiers into physical register addresses.

Type: Application

Filed: April 21, 2010

Publication date: October 27, 2011

Applicant: VIA TECHNOLOGIES, INC.

Inventor: Yang (Jeff) Jiao

Managing multiple contexts in a decentralized graphics processing unit

Patent number: 7876328

Abstract: Provided is a system for managing multiple contexts in a decentralized graphics processing unit. The system includes multiple control units that can include a context buffer, a context processor, and a context scheduler. Also included is logic to receive multiple contexts, logic to identify at least one of the contexts, and logic to facilitate communication among the control units.

Type: Grant

Filed: February 8, 2007

Date of Patent: January 25, 2011

Assignee: Via Technologies, Inc.

Inventors: Qunfeng (Fred) Liao, Yang (Jeff) Jiao, Yijung Su
