Patents by Inventor Timour Paltashev

Timour Paltashev has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11625807
    Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
    Type: Grant
    Filed: February 22, 2021
    Date of Patent: April 11, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor
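
The split register file in patent 11625807 above can be pictured with a small scheduling sketch. Everything here — class names, port counts, the co-issue rule — is an illustrative assumption, not the patented design: a large single-ported VGPR file and a small multi-ported VGPR file, where a pair of instructions can issue in one cycle only if their operand traffic fits the two files.

```python
# Minimal sketch of a split VGPR file with per-cycle port budgets.
# All names and numbers are illustrative assumptions, not the patented design.

class RegisterFile:
    def __init__(self, name, read_ports, write_ports):
        self.name = name
        self.read_ports = read_ports
        self.write_ports = write_ports

    def can_serve(self, reads, writes):
        # True if this file can satisfy the request within one cycle.
        return reads <= self.read_ports and writes <= self.write_ports


LARGE_VGPR = RegisterFile("large", read_ports=1, write_ports=1)
SMALL_VGPR = RegisterFile("small", read_ports=3, write_ports=1)


def can_co_issue(instr_a, instr_b):
    """Co-issue two instructions in one clock if each instruction's
    operands can be served by the register file that holds them."""
    for instr in (instr_a, instr_b):
        rf = SMALL_VGPR if instr["file"] == "small" else LARGE_VGPR
        if not rf.can_serve(instr["reads"], instr["writes"]):
            return False
    # Two instructions hitting the same single-ported file cannot pair up.
    return not (instr_a["file"] == "large" and instr_b["file"] == "large")


# Example: a multiply reading the small file can pair with an add
# reading the large file, so the pair issues in a single cycle.
mul = {"file": "small", "reads": 2, "writes": 1}
add = {"file": "large", "reads": 1, "writes": 1}
print(can_co_issue(mul, add))   # True
print(can_co_issue(add, add))   # False: both need the single-ported file
```
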
  • Publication number: 20210201439
    Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
    Type: Application
    Filed: February 22, 2021
    Publication date: July 1, 2021
    Inventors: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor
  • Patent number: 10929944
    Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
    Type: Grant
    Filed: November 23, 2016
    Date of Patent: February 23, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor
  • Publication number: 20180144435
    Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
    Type: Application
    Filed: November 23, 2016
    Publication date: May 24, 2018
    Inventors: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor
  • Patent number: 9214007
    Abstract: Graphics processing units (GPUs) are used, for example, to process data related to three-dimensional objects or scenes and to render the three-dimensional data onto a two-dimensional display screen. One embodiment, among others, of a unified cache system used in a GPU comprises a data storage device and a storage device controller. The data storage device is configured to store graphics data processed by or to be processed by one or more shader units. The storage device controller is placed in communication with the data storage device. The storage device controller is configured to dynamically control a storage allocation of the graphics data within the data storage device.
    Type: Grant
    Filed: January 25, 2008
    Date of Patent: December 15, 2015
    Assignee: VIA TECHNOLOGIES, INC.
    Inventors: Jeff Jiao, Timour Paltashev
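
One rough way to picture the dynamic allocation described in patent 9214007 above: a single backing store whose capacity is re-partitioned among shader clients on demand. The controller class and its proportional policy below are purely illustrative assumptions, not the patented controller.

```python
# Sketch of a unified cache whose storage is dynamically partitioned
# among shader clients; the proportional policy is an assumption.

class UnifiedCacheController:
    def __init__(self, total_lines):
        self.total_lines = total_lines
        self.demand = {}          # client -> outstanding request count
        self.allocation = {}      # client -> lines currently granted

    def record_request(self, client):
        self.demand[client] = self.demand.get(client, 0) + 1
        self._rebalance()

    def _rebalance(self):
        # Re-split the shared storage in proportion to observed demand.
        total = sum(self.demand.values())
        self.allocation = {
            c: (d * self.total_lines) // total for c, d in self.demand.items()
        }


ctrl = UnifiedCacheController(total_lines=512)
for _ in range(3):
    ctrl.record_request("vertex_shader")
for _ in range(9):
    ctrl.record_request("pixel_shader")
print(ctrl.allocation)  # the pixel shader receives the larger share
```
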
  • Patent number: 8817029
    Abstract: A graphics pipeline configured to synchronize data processing according to signals and tokens has at least four components. The first component has one input and one output and communicates output tokens or wire signals after receiving tokens on the input, an internal event occurrence, or receipt of a signal on an input path. The second component has one input and a plurality of outputs and communicates tokens or wire signals on one of the outputs after receiving tokens on the input, an internal event occurrence, or receipt of a signal on an input path. The third component has a plurality of inputs and one output and communicates tokens or wire signals on the output after receiving tokens on one of the inputs, an internal event occurrence, or receipt of a signal on an input path. The fourth component has a plurality of inputs and a plurality of outputs and has the capabilities of both the second and third components.
    Type: Grant
    Filed: August 30, 2006
    Date of Patent: August 26, 2014
    Assignee: Via Technologies, Inc.
    Inventors: John Brothers, Timour Paltashev, Hsilin Huang, Qunfeng Liao
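
The four component shapes in the abstract of patent 8817029 above (one-to-one, one-to-many, many-to-one, many-to-many) can be mimicked with a tiny token-forwarding model. Everything below — the class, the token format, the routing rule — is an assumed illustration rather than the patented pipeline.

```python
# Token-forwarding sketch of pipeline components with differing
# input/output fan-in and fan-out; names and routing are assumptions.

from collections import deque

class Component:
    def __init__(self, name, outputs):
        self.name = name
        self.inbox = deque()
        self.outputs = outputs        # downstream components

    def receive(self, token):
        self.inbox.append(token)

    def step(self):
        # On receiving a token, forward a derived token to every output.
        while self.inbox:
            token = self.inbox.popleft()
            for out in self.outputs:
                out.receive(f"{token}->{self.name}")


sink_a = Component("sink_a", outputs=[])
sink_b = Component("sink_b", outputs=[])
fanout = Component("fanout", outputs=[sink_a, sink_b])   # one in, many out
source = Component("source", outputs=[fanout])           # one in, one out

source.receive("SYNC")
source.step()
fanout.step()
print(list(sink_a.inbox), list(sink_b.inbox))
```
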
  • Patent number: 8769207
    Abstract: Systems and methods for sharing a physical cache among one or more clients in a stream data processing pipeline are described. One embodiment is directed to a system for sharing caches between two or more clients. The system comprises a physical cache memory having a memory portion accessed through a cache index. The system further comprises at least two virtual cache spaces mapping to the memory portion, each of the virtual cache spaces has an active window which has a different size than the memory portion. Further, the system comprises at least one virtual cache controller configured to perform a hit-miss test on the active window of the virtual cache space in response to a request from one of the clients for accessing the physical cache memory. Furthermore, data is accessed from the corresponding location of the memory portion when the hit-miss test of the cache index returns a hit.
    Type: Grant
    Filed: January 16, 2008
    Date of Patent: July 1, 2014
    Assignee: Via Technologies, Inc.
    Inventors: Jeff Jiao, Timour Paltashev
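
The active-window idea in patent 8769207 above can be sketched as a virtual index range mapped onto a slice of the physical cache, with a hit-miss test against that window. The modulo mapping and the sizes below are assumptions chosen only to make the example concrete.

```python
# Sketch of two virtual cache spaces sharing one physical cache.
# The window/offset mapping is an illustrative assumption.

class VirtualCache:
    def __init__(self, physical, base, window):
        self.physical = physical   # shared backing list
        self.base = base           # where this client's slice starts
        self.window = window       # active window size (lines)
        self.tags = {}             # virtual index -> physical slot

    def access(self, virtual_index):
        slot = self.base + (virtual_index % self.window)
        if self.tags.get(virtual_index) == slot:
            return "hit", self.physical[slot]
        # Miss: install the line in the window-mapped physical slot.
        self.tags[virtual_index] = slot
        self.physical[slot] = f"line{virtual_index}"
        return "miss", self.physical[slot]


physical_cache = [None] * 64
client_a = VirtualCache(physical_cache, base=0,  window=32)
client_b = VirtualCache(physical_cache, base=32, window=32)

print(client_a.access(5))   # ('miss', 'line5')
print(client_a.access(5))   # ('hit', 'line5')
print(client_b.access(5))   # ('miss', 'line5') -- separate window
```
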
  • Patent number: 8681162
    Abstract: A programmable graphics processing unit (GPU) includes a first shader stage configured to receive slice data from a frame buffer and perform variable length decoding (VLD), wherein the first shader stage outputs data to a first buffer within the frame buffer; a second shader stage configured to receive the output data from the first shader stage and perform transformation and motion compensation on the slice data, wherein the second shader stage outputs decoded slice data to a second buffer within the frame buffer; a third shader stage configured to receive the decoded slice data and perform in-loop deblocking filtering (IDF) on the frame buffer; a fourth shader stage configured to perform post-processing on the frame buffer; and a scheduler configured to schedule execution of the shader stages, the scheduler comprising a plurality of counter registers; wherein execution of the shader stages is synchronized utilizing the counter registers.
    Type: Grant
    Filed: October 15, 2010
    Date of Patent: March 25, 2014
    Assignee: VIA Technologies, Inc.
    Inventors: Timour Paltashev, John Brothers, Yi-Jung Su, Yang (Jeff) Jiao
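
One way to picture the counter-register synchronization in patent 8681162 above is a scheduler that launches a stage only when its upstream stage has produced more slices than it has consumed. The counter names and the gating rule below are assumptions, not the patented scheduler.

```python
# Sketch of counter-register synchronization between shader stages:
# a stage runs only when its upstream counter is ahead of its own.
# Stage names and the gating rule are illustrative assumptions.

STAGES = ["vld", "mc_transform", "deblock", "postprocess"]

counters = {s: 0 for s in STAGES}   # slices completed per stage
counters["input"] = 4               # four slices written by the host

def upstream(stage):
    i = STAGES.index(stage)
    return "input" if i == 0 else STAGES[i - 1]

def try_run(stage):
    # The stage may consume a slice only if its upstream counter is ahead.
    if counters[upstream(stage)] > counters[stage]:
        counters[stage] += 1
        return True
    return False

# Round-robin over the stages until no stage can make progress.
progress = True
while progress:
    progress = False
    for s in STAGES:
        if try_run(s):
            progress = True

print(counters)   # every stage has processed all four slices in order
```
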
  • Patent number: 8368701
    Abstract: Included are embodiments of systems and methods for processing metacommands. In at least one exemplary embodiment a Graphics Processing Unit (GPU) includes a metaprocessor configured to process at least one context register, the metaprocessor including context management logic and a metaprocessor control register block coupled to the metaprocessor, the metaprocessor control register block configured to receive metaprocessor configuration data, the metaprocessor control register block further configured to define metacommand execution logic block behavior. Some embodiments include a Bus Interface Unit (BIU) configured to provide the access from a system processor to the metaprocessor and a GPU command stream processor configured to fetch a current context command stream and send commands for execution to a GPU pipeline and metaprocessor.
    Type: Grant
    Filed: November 6, 2008
    Date of Patent: February 5, 2013
    Assignee: Via Technologies, Inc.
    Inventors: Timour Paltashev, Boris Prokopenko, John Brothers
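
The metaprocessor/control-register split in patent 8368701 above can be caricatured as a small dispatcher whose behavior is parameterized by a configuration block. The fields and the two example behaviors below are hypothetical, intended only to show configuration data steering metacommand execution.

```python
# Sketch of a metacommand dispatcher whose behavior is configured by a
# control-register block; fields and behaviors are hypothetical.

from dataclasses import dataclass, field

@dataclass
class MetaControlBlock:
    # Configuration data that defines how metacommands are executed.
    auto_save_context: bool = True
    max_contexts: int = 4

@dataclass
class Metaprocessor:
    config: MetaControlBlock
    contexts: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def execute(self, metacommand, context_id):
        if context_id not in self.contexts:
            if len(self.contexts) >= self.config.max_contexts:
                raise RuntimeError("no free context register")
            self.contexts[context_id] = {"saved": False}
        if self.config.auto_save_context:
            self.contexts[context_id]["saved"] = True
        self.log.append((context_id, metacommand))


mp = Metaprocessor(MetaControlBlock(auto_save_context=True, max_contexts=2))
mp.execute("SWITCH", context_id=0)
mp.execute("RESUME", context_id=1)
print(mp.log, mp.contexts)
```
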
  • Publication number: 20120092353
    Abstract: A multi-shader system in a programmable graphics processing unit (GPU) for processing video data, includes a first shader stage configured to receive slice data from a frame buffer and perform variable length decoding (VLD), wherein the first shader stage outputs data to a first buffer within the frame buffer; a second shader stage configured to receive the output data from the first shader stage and perform transformation and motion compensation on the slice data, wherein the second shader stage outputs decoded slice data to a second buffer within the frame buffer; a third shader stage configured to receive the decoded slice data and perform in-loop deblocking filtering (IDF) on the frame buffer; a fourth shader stage configured to perform post-processing on the frame buffer; and a scheduler configured to schedule execution of the shader stages, the scheduler comprising a plurality of counter registers; wherein execution of the shader stages is synchronized utilizing the counter registers.
    Type: Application
    Filed: October 15, 2010
    Publication date: April 19, 2012
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Timour Paltashev, John Brothers, Yi-Jung Su, Yang (Jeff) Jiao
  • Patent number: 8082426
    Abstract: Included are systems and methods for supporting a plurality of Graphics Processing Units (GPUs). At least one embodiment of a system includes a context status register configured to send data related to a status of at least one context and a context switch configuration register configured to send instructions related to at least one event for the at least one context. At least one embodiment of a system includes a context status management component coupled to the context status register and the context switch configuration register.
    Type: Grant
    Filed: November 6, 2008
    Date of Patent: December 20, 2011
    Assignee: Via Technologies, Inc.
    Inventors: Timour Paltashev, Boris Prokopenko, John Brothers
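
The register pair in patent 8082426 above (a status register reporting out, a switch configuration register steering in) can be sketched as an event-driven context switch decision. The event names and the round-robin rule are assumed for illustration.

```python
# Sketch of a context status register plus a context switch
# configuration register driving switch decisions; event names and
# the decision rule are illustrative assumptions.

context_status = {0: "running", 1: "ready", 2: "empty"}

# Which events should trigger a switch away from the current context.
switch_config = {"page_fault": True, "timer_tick": True, "cache_miss": False}

def on_event(event, current):
    if not switch_config.get(event, False):
        return current                      # event does not force a switch
    context_status[current] = "ready"
    # Pick the next ready context (simple round-robin assumption).
    for ctx in sorted(context_status):
        if ctx != current and context_status[ctx] == "ready":
            context_status[ctx] = "running"
            return ctx
    context_status[current] = "running"     # nothing else ready; keep going
    return current

current = 0
current = on_event("cache_miss", current)   # stays on context 0
current = on_event("page_fault", current)   # switches to context 1
print(current, context_status)
```
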
  • Patent number: 8024394
    Abstract: Included are embodiments of a Multiply-Accumulate Unit to process multiple format floating point operands. For short format operands, embodiments of the Multiply-Accumulate Unit are configured to process data with twice the throughput of long and mixed format data. At least one embodiment can include a short exponent calculation component configured to receive short format data, a long exponent calculation component configured to receive long format data, and a mixed exponent calculation component configured to receive short exponent data, the mixed exponent calculation component further configured to receive long format data. Embodiments also include a mantissa datapath configured for implementation to accommodate processing of long, mixed, and short floating point operands.
    Type: Grant
    Filed: February 6, 2007
    Date of Patent: September 20, 2011
    Assignee: Via Technologies, Inc.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
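
The throughput claim in patent 8024394 above (two short-format operations in the time of one long-format operation) can be illustrated numerically. The bit widths (IEEE-754 half precision standing in for the "short" format) and the cycle accounting below are assumptions, not the patented datapath.

```python
# Numeric sketch of a multiply-accumulate unit handling short, long and
# mixed floating-point operands; widths and cycle costs are assumptions.

import struct

def to_half(x):
    # Round a Python float through IEEE-754 half precision ("short" format).
    return struct.unpack("e", struct.pack("e", x))[0]

def mac(acc, a, b, fmt):
    if fmt == "short":
        return to_half(acc + to_half(a) * to_half(b))
    return acc + a * b          # "long": keep full precision here

# Assumed cycle cost: one long/mixed MAC per cycle, two short MACs per cycle.
CYCLES = {"short": 0.5, "long": 1.0, "mixed": 1.0}

ops = [("short", 1.0005, 3.0), ("short", 2.0, 0.25), ("long", 1.0005, 3.0)]
acc, cycles = 0.0, 0.0
for fmt, a, b in ops:
    acc = mac(acc, a, b, fmt)
    cycles += CYCLES[fmt]

print(f"result={acc!r}, cycles={cycles}")  # short ops cost half a cycle each
```
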
  • Publication number: 20110208946
    Abstract: Disclosed are various embodiments of a stream processing unit for single instruction multiple data (SIMD) processing, wherein the stream processing unit executes a stage of a Multiply-Accumulate calculation. In one embodiment, the stream processing unit comprises a plurality of scalar arithmetic logic units (ALUs) configured to receive data having a plurality of data types. The number and type of scalar ALUs corresponds to an SIMD factor. In one embodiment, the scalar ALUs are executed sequentially with a delay being introduced in between execution of each of the scalar ALUs, wherein the delay corresponds to the SIMD factor.
    Type: Application
    Filed: May 4, 2011
    Publication date: August 25, 2011
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
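
The staggered execution in publication 20110208946 above (scalar ALUs started one after another, with the delay tied to the SIMD factor) can be sketched as a simple schedule. The SIMD factor, start delay, and latency numbers are invented for illustration.

```python
# Sketch of sequentially started scalar ALUs in one SIMD stream unit.
# SIMD factor, delays and latencies are illustrative assumptions.

SIMD_FACTOR = 4          # number of scalar ALUs
START_DELAY = 1          # cycles between starting consecutive ALUs
ALU_LATENCY = 4          # cycles for one multiply-accumulate stage

def schedule(num_elements):
    """Return (alu, start_cycle, finish_cycle) for each element."""
    plan = []
    for i in range(num_elements):
        alu = i % SIMD_FACTOR
        batch = i // SIMD_FACTOR
        start = batch * ALU_LATENCY + alu * START_DELAY
        plan.append((alu, start, start + ALU_LATENCY))
    return plan

for alu, start, finish in schedule(8):
    print(f"ALU{alu}: start cycle {start:2d}, finish cycle {finish:2d}")
```
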
  • Patent number: 7737983
    Abstract: A method for high level synchronization between an application and a graphics pipeline comprises receiving an application instruction in an input stream at a predetermined component, such as a command stream processor (CSP), as sent by a central processing unit. The CSP may have a first portion coupled to a next component in the graphics pipeline and a second portion coupled to a plurality of components of the graphics pipeline. A command associated with the application instruction may be forwarded from the first portion to the next component in the graphics pipeline or some other component coupled thereto. The command may be received and thereafter executed. A response may be communicated on a feedback path to the second portion of the CSP. Nonlimiting exemplary application instructions that may be received and executed by the CSP include check surface fault, trap, wait, signal, stall, flip, and trigger.
    Type: Grant
    Filed: October 25, 2006
    Date of Patent: June 15, 2010
    Assignee: Via Technologies, Inc.
    Inventors: John Brothers, Timour Paltashev, Hsilin Huang, Boris Prokopenko, Qunfeng (Fred) Liao
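
The wait/signal style instructions listed in patent 7737983 above can be mocked with a command stream loop that blocks on a "wait" until the matching "signal" value is visible, and reports results on a feedback list. The command encodings and the feedback format are assumptions, not the patented CSP interface.

```python
# Sketch of a command stream processor handling high-level sync
# commands (wait/signal); encodings and feedback format are assumptions.

sync_registers = {}      # named sync locations written by 'signal'
feedback = []            # responses sent back toward the host

def run_stream(commands):
    pc = 0
    while pc < len(commands):
        op, *args = commands[pc]
        if op == "signal":
            name, value = args
            sync_registers[name] = value
            feedback.append(("signaled", name, value))
        elif op == "wait":
            name, value = args
            if sync_registers.get(name, 0) < value:
                return pc            # stall: resume from here later
        else:
            feedback.append(("executed", op))
        pc += 1
    return pc

stream = [("draw", 1), ("wait", "fence", 1), ("draw", 2)]
stalled_at = run_stream(stream)
print("stalled at command", stalled_at)  # waits on the fence
sync_registers["fence"] = 1              # e.g. another engine signals
run_stream(stream[stalled_at:])          # resumes and finishes the stream
print(feedback)
```
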
  • Publication number: 20100110083
    Abstract: Included are embodiments of systems and methods for processing metacommands. In at least one exemplary embodiment a Graphics Processing Unit (GPU) includes a metaprocessor configured to process at least one context register, the metaprocessor including context management logic and a metaprocessor control register block coupled to the metaprocessor, the metaprocessor control register block configured to receive metaprocessor configuration data, the metaprocessor control register block further configured to define metacommand execution logic block behavior. Some embodiments include a Bus Interface Unit (BIU) configured to provide the access from a system processor to the metaprocessor and a GPU command stream processor configured to fetch a current context command stream and send commands for execution to a GPU pipeline and metaprocessor.
    Type: Application
    Filed: November 6, 2008
    Publication date: May 6, 2010
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Timour Paltashev, Boris Prokopenko, John Brothers
  • Publication number: 20100110089
    Abstract: Included are systems and methods for Graphics Processing Unit (GPU) synchronization. At least one embodiment of a system includes at least one producer GPU configured to receive data related to at least one context, the at least one producer GPU further configured to process at least a portion of the received data. Some embodiments include at least one consumer GPU configured to receive data from the producer GPU, the consumer GPU further configured to stall execution of the received data until a fence value is received.
    Type: Application
    Filed: November 6, 2008
    Publication date: May 6, 2010
    Applicant: VIA Technologies, Inc.
    Inventors: Timour Paltashev, Boris Prokopenko, John Brothers
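
The producer/consumer relationship in publication 20100110089 above resembles the familiar fence pattern: the consumer stalls on a fence location until the producer writes the expected value. The shared "fence memory" and the step-wise, single-process simulation below are deliberately simplified assumptions (no real GPUs involved).

```python
# Single-process sketch of producer/consumer GPU fence synchronization;
# the shared "fence memory" and step-wise interleaving are assumptions.

fence_memory = {"frame_fence": 0}

def producer(chunks):
    # The producer GPU processes data and bumps the fence after each chunk.
    for i, chunk in enumerate(chunks, start=1):
        yield f"produced {chunk}"
        fence_memory["frame_fence"] = i

def consumer(expected_fence):
    # The consumer GPU stalls until the fence reaches the expected value.
    while fence_memory["frame_fence"] < expected_fence:
        yield "consumer stalled"
    yield f"consumed data up to fence {expected_fence}"

p = producer(["tile0", "tile1"])
c = consumer(expected_fence=2)

# Interleave the two engines one step at a time.
for engine in (c, p, c, p, c, p, c):
    try:
        print(next(engine))
    except StopIteration:
        pass
```
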
  • Publication number: 20100115249
    Abstract: Included are systems and methods for supporting a plurality of Graphics Processing Units (GPUs). At least one embodiment of a system includes a context status register configured to send data related to a status of at least one context and a context switch configuration register configured to send instructions related to at least one event for the at least one context. At least one embodiment of a system includes a context status management component coupled to the context status register and the context switch configuration register.
    Type: Application
    Filed: November 6, 2008
    Publication date: May 6, 2010
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Timour Paltashev, Boris Prokopenko, John Brothers
  • Patent number: 7675521
    Abstract: Systems for performing rasterization are described. At least one embodiment includes a span generator for performing rasterization. In accordance with such embodiments, the span generator comprises functionals representing a scissoring box, loaders configured to convert the functionals from a general form to a special case form, edge generators configured to read the special case form of the scissoring box, whereby the special case form simplifies calculations by the edge generators. The span generator further comprises sorters configured to compute the intersection of half-planes, wherein edges of the intersection are generated by the edge generators and a span buffer configured to temporarily store spans before tiling.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: March 9, 2010
    Assignee: VIA Technologies, Inc.
    Inventors: Konstantine Iourcha, Boris Prokopenko, Timour Paltashev, Derek Gladding
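
The half-plane machinery in patent 7675521 above can be illustrated with edge functions: each triangle edge (and each scissor-box side) is a linear functional whose sign says which side a pixel lies on, and a span on a row is the run of x positions where all functionals are non-negative. The functional form and the per-row loop below are standard rasterization practice used as an assumed illustration, not the patented span generator.

```python
# Edge-function sketch of span generation: a span on row y is the run of
# x where all edge/scissor functionals are >= 0. Values are assumptions.

def edge(ax, ay, bx, by):
    # Linear functional E(x, y) >= 0 on the inside of segment a->b
    # for a counter-clockwise polygon.
    return lambda x, y: (bx - ax) * (y - ay) - (by - ay) * (x - ax)

# Triangle edges (counter-clockwise) plus a scissor box as four functionals.
triangle = [edge(1, 1, 8, 2), edge(8, 2, 4, 7), edge(4, 7, 1, 1)]
scissor = [
    lambda x, y: x - 2,          # x >= 2
    lambda x, y: 6 - x,          # x <= 6
    lambda x, y: y - 2,          # y >= 2
    lambda x, y: 6 - y,          # y <= 6
]
functionals = triangle + scissor

def spans(y_range, x_range):
    for y in y_range:
        xs = [x for x in x_range if all(f(x, y) >= 0 for f in functionals)]
        if xs:
            yield y, (min(xs), max(xs))     # one span per row

for y, (x0, x1) in spans(range(0, 10), range(0, 10)):
    print(f"row {y}: span [{x0}, {x1}]")
```
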
  • Patent number: 7659898
    Abstract: A dynamically scheduled parallel graphics processor comprises a spreader that creates graphic objects for processing and assigns and distributes the created objects for processing to one or more execution blocks. Each execution block is coupled to the spreader and receives an assignment for processing a graphics object. The execution block pushes the object through each processing stage by scheduling the processing of the graphics object and executing instruction operations on the graphics object. The dynamically scheduled parallel graphics processor includes one or more fixed function units coupled to the spreader that are configured to execute one or more predetermined operations on a graphics object. An input/output unit is coupled to the spreader, the one or more fixed function units, and the plurality of execution blocks and is configured to provide access to memory external to the dynamically scheduled parallel graphics processor.
    Type: Grant
    Filed: August 8, 2005
    Date of Patent: February 9, 2010
    Assignee: VIA Technologies, Inc.
    Inventors: Boris Prokopenko, Timour Paltashev
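
The spreader's role in patent 7659898 above, creating work and distributing it across execution blocks, can be sketched as a least-loaded dispatch loop. The load metric, block count, and class names are assumptions rather than the patented scheduler.

```python
# Sketch of a spreader assigning graphics objects to execution blocks
# by current load; the load metric and block count are assumptions.

class ExecutionBlock:
    def __init__(self, name):
        self.name = name
        self.queue = []

    def load(self):
        return len(self.queue)

    def assign(self, obj):
        self.queue.append(obj)

class Spreader:
    def __init__(self, blocks):
        self.blocks = blocks

    def distribute(self, objects):
        for obj in objects:
            # Pick the least-loaded block for each created object.
            target = min(self.blocks, key=lambda b: b.load())
            target.assign(obj)

blocks = [ExecutionBlock(f"EB{i}") for i in range(3)]
Spreader(blocks).distribute([f"triangle{i}" for i in range(7)])
for b in blocks:
    print(b.name, b.queue)
```
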
  • Patent number: 7659899
    Abstract: A system and method to manage data processing stages of a logical graphics pipeline comprises a number of execution blocks coupled together and to a global spreader that assigns graphics data entities for execution to the execution blocks. Each execution block has an entity descriptor table containing information about an assigned graphics data entity corresponding to allocation of the entity and a current processing stage associated with the entity. Each execution block includes a stage parser configured to establish pointers for the assigned graphics data entity to be processed on a next processing stage. A numerical processing unit is included and configured to execute floating point and integer instructions in association with the assigned graphics data entity. The execution blocks include a data move unit for data loads and moves within the execution block, with the global spreader, and with other execution blocks of the plurality of execution blocks.
    Type: Grant
    Filed: August 8, 2005
    Date of Patent: February 9, 2010
    Assignee: Via Technologies, Inc.
    Inventors: Timour Paltashev, Boris Prokopenko
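
The entity descriptor table and stage parser described in patent 7659899 above can be caricatured as a per-entity record whose current-stage pointer the parser advances through a fixed stage sequence. The stage names and table layout are assumptions introduced only for this sketch.

```python
# Sketch of an entity descriptor table whose stage parser advances each
# assigned entity through a logical pipeline; names are assumptions.

STAGE_SEQUENCE = ["vertex", "geometry", "rasterize", "pixel", "done"]

class ExecutionBlock:
    def __init__(self):
        # entity id -> descriptor (allocation info + current stage)
        self.descriptor_table = {}

    def accept(self, entity_id, registers_needed):
        self.descriptor_table[entity_id] = {
            "registers": registers_needed,
            "stage": STAGE_SEQUENCE[0],
        }

    def stage_parser_step(self):
        # Point every live entity at its next processing stage.
        for desc in self.descriptor_table.values():
            i = STAGE_SEQUENCE.index(desc["stage"])
            if i + 1 < len(STAGE_SEQUENCE):
                desc["stage"] = STAGE_SEQUENCE[i + 1]

eb = ExecutionBlock()
eb.accept(entity_id=7, registers_needed=16)
eb.accept(entity_id=8, registers_needed=8)
for _ in range(2):
    eb.stage_parser_step()
print(eb.descriptor_table)   # both entities now at the 'rasterize' stage
```
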