Patents by Inventor Timour Paltashev

Timour Paltashev has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11625807
    Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
    Type: Grant
    Filed: February 22, 2021
    Date of Patent: April 11, 2023
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor
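
The split register file in patent 11625807 above can be pictured with a small scheduling sketch. Everything here — class names, port counts, the co-issue rule — is an illustrative assumption, not the patented design: a large single-ported VGPR file and a small multi-ported VGPR file, where a pair of instructions can issue in one cycle only if their operand traffic fits the two files.

```python
# Minimal sketch of a split VGPR file with per-cycle port budgets.
# All names and numbers are illustrative assumptions, not the patented design.

class RegisterFile:
    def __init__(self, name, read_ports, write_ports):
        self.name = name
        self.read_ports = read_ports
        self.write_ports = write_ports

    def can_serve(self, reads, writes):
        # True if this file can satisfy the request within one cycle.
        return reads <= self.read_ports and writes <= self.write_ports


LARGE_VGPR = RegisterFile("large", read_ports=1, write_ports=1)
SMALL_VGPR = RegisterFile("small", read_ports=3, write_ports=1)


def can_co_issue(instr_a, instr_b):
    """Co-issue two instructions in one clock if each instruction's
    operands can be served by the register file that holds them."""
    for instr in (instr_a, instr_b):
        rf = SMALL_VGPR if instr["file"] == "small" else LARGE_VGPR
        if not rf.can_serve(instr["reads"], instr["writes"]):
            return False
    # Two instructions hitting the same single-ported file cannot pair up.
    return not (instr_a["file"] == "large" and instr_b["file"] == "large")


# Example: a multiply reading the small file can pair with an add
# reading the large file, so the pair issues in a single cycle.
mul = {"file": "small", "reads": 2, "writes": 1}
add = {"file": "large", "reads": 1, "writes": 1}
print(can_co_issue(mul, add))   # True
print(can_co_issue(add, add))   # False: both need the single-ported file
```
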
  • Publication number: 20210201439
    Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
    Type: Application
    Filed: February 22, 2021
    Publication date: July 1, 2021
    Inventors: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor
  • Patent number: 10929944
    Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
    Type: Grant
    Filed: November 23, 2016
    Date of Patent: February 23, 2021
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor
  • Publication number: 20180144435
    Abstract: Systems, apparatuses, and methods for implementing a graphics processing unit (GPU) coprocessor are disclosed. The GPU coprocessor includes a SIMD unit with the ability to self-schedule sub-wave procedures based on input data flow events. A host processor sends messages targeting the GPU coprocessor to a queue. In response to detecting a first message in the queue, the GPU coprocessor schedules a first sub-task for execution. The GPU coprocessor includes an inter-lane crossbar and intra-lane biased indexing mechanism for a vector general purpose register (VGPR) file. The VGPR file is split into two files. The first VGPR file is a larger register file with one read port and one write port. The second VGPR file is a smaller register file with multiple read ports and one write port. The second VGPR introduces the ability to co-issue more than one instruction per clock cycle.
    Type: Application
    Filed: November 23, 2016
    Publication date: May 24, 2018
    Inventors: Jiasheng Chen, Timour Paltashev, Alexander Lyashevsky, Carl Kittredge Wakeland, Michael J. Mantor
  • Patent number: 9214007
    Abstract: Graphics processing units (GPUs) are used, for example, to process data related to three-dimensional objects or scenes and to render the three-dimensional data onto a two-dimensional display screen. One embodiment, among others, of a unified cache system used in a GPU comprises a data storage device and a storage device controller. The data storage device is configured to store graphics data processed by or to be processed by one or more shader units. The storage device controller is placed in communication with the data storage device. The storage device controller is configured to dynamically control a storage allocation of the graphics data within the data storage device.
    Type: Grant
    Filed: January 25, 2008
    Date of Patent: December 15, 2015
    Assignee: VIA TECHNOLOGIES, INC.
    Inventors: Jeff Jiao, Timour Paltashev
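
One rough way to picture the dynamic allocation described in patent 9214007 above: a single backing store whose capacity is re-partitioned among shader clients on demand. The controller class and its proportional policy below are purely illustrative assumptions, not the patented controller.

```python
# Sketch of a unified cache whose storage is dynamically partitioned
# among shader clients; the proportional policy is an assumption.

class UnifiedCacheController:
    def __init__(self, total_lines):
        self.total_lines = total_lines
        self.demand = {}          # client -> outstanding request count
        self.allocation = {}      # client -> lines currently granted

    def record_request(self, client):
        self.demand[client] = self.demand.get(client, 0) + 1
        self._rebalance()

    def _rebalance(self):
        # Re-split the shared storage in proportion to observed demand.
        total = sum(self.demand.values())
        self.allocation = {
            c: (d * self.total_lines) // total for c, d in self.demand.items()
        }


ctrl = UnifiedCacheController(total_lines=512)
for _ in range(3):
    ctrl.record_request("vertex_shader")
for _ in range(9):
    ctrl.record_request("pixel_shader")
print(ctrl.allocation)  # the pixel shader receives the larger share
```
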
  • Patent number: 8817029
    Abstract: A graphics pipeline configured to synchronize data processing according to signals and tokens has at least four components. The first component has one input and one output and communicates output tokens or wire signals after receiving tokens on the input, an internal event occurrence, or receipt of a signal on an input path. The second component has one input and a plurality of outputs and communicates tokens or wire signals on one of the outputs after receiving tokens on the input, an internal event occurrence, or receipt of a signal on an input path. The third component has a plurality of inputs and one output and communicates tokens or wire signals on the output after receiving tokens on one of the inputs, an internal event occurrence, or receipt of a signal on an input path. The fourth component has a plurality of inputs and a plurality of outputs and has the capabilities of both the second and third components.
    Type: Grant
    Filed: August 30, 2006
    Date of Patent: August 26, 2014
    Assignee: Via Technologies, Inc.
    Inventors: John Brothers, Timour Paltashev, Hsilin Huang, Qunfeng Liao
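
The four component shapes in the abstract of patent 8817029 above (one-to-one, one-to-many, many-to-one, many-to-many) can be mimicked with a tiny token-forwarding model. Everything below — the class, the token format, the routing rule — is an assumed illustration rather than the patented pipeline.

```python
# Token-forwarding sketch of pipeline components with differing
# input/output fan-in and fan-out; names and routing are assumptions.

from collections import deque

class Component:
    def __init__(self, name, outputs):
        self.name = name
        self.inbox = deque()
        self.outputs = outputs        # downstream components

    def receive(self, token):
        self.inbox.append(token)

    def step(self):
        # On receiving a token, forward a derived token to every output.
        while self.inbox:
            token = self.inbox.popleft()
            for out in self.outputs:
                out.receive(f"{token}->{self.name}")


sink_a = Component("sink_a", outputs=[])
sink_b = Component("sink_b", outputs=[])
fanout = Component("fanout", outputs=[sink_a, sink_b])   # one in, many out
source = Component("source", outputs=[fanout])           # one in, one out

source.receive("SYNC")
source.step()
fanout.step()
print(list(sink_a.inbox), list(sink_b.inbox))
```
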
  • Patent number: 8769207
    Abstract: Systems and methods for sharing a physical cache among one or more clients in a stream data processing pipeline are described. One embodiment is directed to a system for sharing caches between two or more clients. The system comprises a physical cache memory having a memory portion accessed through a cache index. The system further comprises at least two virtual cache spaces mapping to the memory portion, each of the virtual cache spaces has an active window which has a different size than the memory portion. Further, the system comprises at least one virtual cache controller configured to perform a hit-miss test on the active window of the virtual cache space in response to a request from one of the clients for accessing the physical cache memory. Furthermore, data is accessed from the corresponding location of the memory portion when the hit-miss test of the cache index returns a hit.
    Type: Grant
    Filed: January 16, 2008
    Date of Patent: July 1, 2014
    Assignee: Via Technologies, Inc.
    Inventors: Jeff Jiao, Timour Paltashev
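
The active-window idea in patent 8769207 above can be sketched as a virtual index range mapped onto a slice of the physical cache, with a hit-miss test against that window. The modulo mapping and the sizes below are assumptions chosen only to make the example concrete.

```python
# Sketch of two virtual cache spaces sharing one physical cache.
# The window/offset mapping is an illustrative assumption.

class VirtualCache:
    def __init__(self, physical, base, window):
        self.physical = physical   # shared backing list
        self.base = base           # where this client's slice starts
        self.window = window       # active window size (lines)
        self.tags = {}             # virtual index -> physical slot

    def access(self, virtual_index):
        slot = self.base + (virtual_index % self.window)
        if self.tags.get(virtual_index) == slot:
            return "hit", self.physical[slot]
        # Miss: install the line in the window-mapped physical slot.
        self.tags[virtual_index] = slot
        self.physical[slot] = f"line{virtual_index}"
        return "miss", self.physical[slot]


physical_cache = [None] * 64
client_a = VirtualCache(physical_cache, base=0,  window=32)
client_b = VirtualCache(physical_cache, base=32, window=32)

print(client_a.access(5))   # ('miss', 'line5')
print(client_a.access(5))   # ('hit', 'line5')
print(client_b.access(5))   # ('miss', 'line5') -- separate window
```
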
  • Patent number: 8681162
    Abstract: A programmable graphics processing unit (GPU) includes a first shader stage configured to receive slice data from a frame buffer and perform variable length decoding (VLD), wherein the first shader stage outputs data to a first buffer within the frame buffer; a second shader stage configured to receive the output data from the first shader stage and perform transformation and motion compensation on the slice data, wherein the second shader stage outputs decoded slice data to a second buffer within the frame buffer; a third shader stage configured to receive the decoded slice data and perform in-loop deblocking filtering (IDF) on the frame buffer; a fourth shader stage configured to perform post-processing on the frame buffer; and a scheduler configured to schedule execution of the shader stages, the scheduler comprising a plurality of counter registers; wherein execution of the shader stages is synchronized utilizing the counter registers.
    Type: Grant
    Filed: October 15, 2010
    Date of Patent: March 25, 2014
    Assignee: VIA Technologies, Inc.
    Inventors: Timour Paltashev, John Brothers, Yi-Jung Su, Yang (Jeff) Jiao
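
One way to picture the counter-register synchronization in patent 8681162 above is a scheduler that launches a stage only when its upstream stage has produced more slices than it has consumed. The counter names and the gating rule below are assumptions, not the patented scheduler.

```python
# Sketch of counter-register synchronization between shader stages:
# a stage runs only when its upstream counter is ahead of its own.
# Stage names and the gating rule are illustrative assumptions.

STAGES = ["vld", "mc_transform", "deblock", "postprocess"]

counters = {s: 0 for s in STAGES}   # slices completed per stage
counters["input"] = 4               # four slices written by the host

def upstream(stage):
    i = STAGES.index(stage)
    return "input" if i == 0 else STAGES[i - 1]

def try_run(stage):
    # The stage may consume a slice only if its upstream counter is ahead.
    if counters[upstream(stage)] > counters[stage]:
        counters[stage] += 1
        return True
    return False

# Round-robin over the stages until no stage can make progress.
progress = True
while progress:
    progress = False
    for s in STAGES:
        if try_run(s):
            progress = True

print(counters)   # every stage has processed all four slices in order
```
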
  • Patent number: 8368701
    Abstract: Included are embodiments of systems and methods for processing metacommands. In at least one exemplary embodiment a Graphics Processing Unit (GPU) includes a metaprocessor configured to process at least one context register, the metaprocessor including context management logic and a metaprocessor control register block coupled to the metaprocessor, the metaprocessor control register block configured to receive metaprocessor configuration data, the metaprocessor control register block further configured to define metacommand execution logic block behavior. Some embodiments include a Bus Interface Unit (BIU) configured to provide the access from a system processor to the metaprocessor and a GPU command stream processor configured to fetch a current context command stream and send commands for execution to a GPU pipeline and metaprocessor.
    Type: Grant
    Filed: November 6, 2008
    Date of Patent: February 5, 2013
    Assignee: Via Technologies, Inc.
    Inventors: Timour Paltashev, Boris Prokopenko, John Brothers
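
The metaprocessor/control-register split in patent 8368701 above can be caricatured as a small dispatcher whose behavior is parameterized by a configuration block. The fields and the two example behaviors below are hypothetical, intended only to show configuration data steering metacommand execution.

```python
# Sketch of a metacommand dispatcher whose behavior is configured by a
# control-register block; fields and behaviors are hypothetical.

from dataclasses import dataclass, field

@dataclass
class MetaControlBlock:
    # Configuration data that defines how metacommands are executed.
    auto_save_context: bool = True
    max_contexts: int = 4

@dataclass
class Metaprocessor:
    config: MetaControlBlock
    contexts: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def execute(self, metacommand, context_id):
        if context_id not in self.contexts:
            if len(self.contexts) >= self.config.max_contexts:
                raise RuntimeError("no free context register")
            self.contexts[context_id] = {"saved": False}
        if self.config.auto_save_context:
            self.contexts[context_id]["saved"] = True
        self.log.append((context_id, metacommand))


mp = Metaprocessor(MetaControlBlock(auto_save_context=True, max_contexts=2))
mp.execute("SWITCH", context_id=0)
mp.execute("RESUME", context_id=1)
print(mp.log, mp.contexts)
```
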
  • Publication number: 20120092353
    Abstract: A multi-shader system in a programmable graphics processing unit (GPU) for processing video data, includes a first shader stage configured to receive slice data from a frame buffer and perform variable length decoding (VLD), wherein the first shader stage outputs data to a first buffer within the frame buffer; a second shader stage configured to receive the output data from the first shader stage and perform transformation and motion compensation on the slice data, wherein the second shader stage outputs decoded slice data to a second buffer within the frame buffer; a third shader stage configured to receive the decoded slice data and perform in-loop deblocking filtering (IDF) on the frame buffer; a fourth shader stage configured to perform post-processing on the frame buffer; and a scheduler configured to schedule execution of the shader stages, the scheduler comprising a plurality of counter registers; wherein execution of the shader stages is synchronized utilizing the counter registers.
    Type: Application
    Filed: October 15, 2010
    Publication date: April 19, 2012
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Timour Paltashev, John Brothers, Yi-Jung Su, Yang (Jeff) Jiao
  • Patent number: 8082426
    Abstract: Included are systems and methods for supporting a plurality of Graphics Processing Units (GPUs). At least one embodiment of a system includes a context status register configured to send data related to a status of at least one context and a context switch configuration register configured to send instructions related to at least one event for the at least one context. At least one embodiment of a system includes a context status management component coupled to the context status register and the context switch configuration register.
    Type: Grant
    Filed: November 6, 2008
    Date of Patent: December 20, 2011
    Assignee: Via Technologies, Inc.
    Inventors: Timour Paltashev, Boris Prokopenko, John Brothers
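
The register pair in patent 8082426 above (a status register reporting out, a switch configuration register steering in) can be sketched as an event-driven context switch decision. The event names and the round-robin rule are assumed for illustration.

```python
# Sketch of a context status register plus a context switch
# configuration register driving switch decisions; event names and
# the decision rule are illustrative assumptions.

context_status = {0: "running", 1: "ready", 2: "empty"}

# Which events should trigger a switch away from the current context.
switch_config = {"page_fault": True, "timer_tick": True, "cache_miss": False}

def on_event(event, current):
    if not switch_config.get(event, False):
        return current                      # event does not force a switch
    context_status[current] = "ready"
    # Pick the next ready context (simple round-robin assumption).
    for ctx in sorted(context_status):
        if ctx != current and context_status[ctx] == "ready":
            context_status[ctx] = "running"
            return ctx
    context_status[current] = "running"     # nothing else ready; keep going
    return current

current = 0
current = on_event("cache_miss", current)   # stays on context 0
current = on_event("page_fault", current)   # switches to context 1
print(current, context_status)
```
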
  • Patent number: 8024394
    Abstract: Included are embodiments of a Multiply-Accumulate Unit to process multiple format floating point operands. For short format operands, embodiments of the Multiply-Accumulate Unit are configured to process data with twice the throughput of long and mixed format data. At least one embodiment can include a short exponent calculation component configured to receive short format data, a long exponent calculation component configured to receive long format data, and a mixed exponent calculation component configured to receive short exponent data, the mixed exponent calculation component further configured to receive long format data. Embodiments also include a mantissa datapath configured for implementation to accommodate processing of long, mixed, and short floating point operands.
    Type: Grant
    Filed: February 6, 2007
    Date of Patent: September 20, 2011
    Assignee: Via Technologies, Inc.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
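
The throughput claim in patent 8024394 above (two short-format operations in the time of one long-format operation) can be illustrated numerically. The bit widths (IEEE-754 half precision standing in for the "short" format) and the cycle accounting below are assumptions, not the patented datapath.

```python
# Numeric sketch of a multiply-accumulate unit handling short, long and
# mixed floating-point operands; widths and cycle costs are assumptions.

import struct

def to_half(x):
    # Round a Python float through IEEE-754 half precision ("short" format).
    return struct.unpack("e", struct.pack("e", x))[0]

def mac(acc, a, b, fmt):
    if fmt == "short":
        return to_half(acc + to_half(a) * to_half(b))
    return acc + a * b          # "long": keep full precision here

# Assumed cycle cost: one long/mixed MAC per cycle, two short MACs per cycle.
CYCLES = {"short": 0.5, "long": 1.0, "mixed": 1.0}

ops = [("short", 1.0005, 3.0), ("short", 2.0, 0.25), ("long", 1.0005, 3.0)]
acc, cycles = 0.0, 0.0
for fmt, a, b in ops:
    acc = mac(acc, a, b, fmt)
    cycles += CYCLES[fmt]

print(f"result={acc!r}, cycles={cycles}")  # short ops cost half a cycle each
```
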
  • Publication number: 20110208946
    Abstract: Disclosed are various embodiments of a stream processing unit for single instruction multiple data (SIMD) processing, wherein the stream processing unit executes a stage of a Multiply-Accumulate calculation. In one embodiment, the stream processing unit comprises a plurality of scalar arithmetic logic units (ALUs) configured to receive data having a plurality of data types. The number and type of scalar ALUs corresponds to an SIMD factor. In one embodiment, the scalar ALUs are executed sequentially with a delay being introduced in between execution of each of the scalar ALUs, wherein the delay corresponds to the SIMD factor.
    Type: Application
    Filed: May 4, 2011
    Publication date: August 25, 2011
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Boris Prokopenko, Timour Paltashev, Derek Gladding
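
The staggered execution in publication 20110208946 above (scalar ALUs started one after another, with the delay tied to the SIMD factor) can be sketched as a simple schedule. The SIMD factor, start delay, and latency numbers are invented for illustration.

```python
# Sketch of sequentially started scalar ALUs in one SIMD stream unit.
# SIMD factor, delays and latencies are illustrative assumptions.

SIMD_FACTOR = 4          # number of scalar ALUs
START_DELAY = 1          # cycles between starting consecutive ALUs
ALU_LATENCY = 4          # cycles for one multiply-accumulate stage

def schedule(num_elements):
    """Return (alu, start_cycle, finish_cycle) for each element."""
    plan = []
    for i in range(num_elements):
        alu = i % SIMD_FACTOR
        batch = i // SIMD_FACTOR
        start = batch * ALU_LATENCY + alu * START_DELAY
        plan.append((alu, start, start + ALU_LATENCY))
    return plan

for alu, start, finish in schedule(8):
    print(f"ALU{alu}: start cycle {start:2d}, finish cycle {finish:2d}")
```
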
  • Patent number: 7737983
    Abstract: A method for high level synchronization between an application and a graphics pipeline comprises receiving an application instruction in an input stream at a predetermined component, such as a command stream processor (CSP), as sent by a central processing unit. The CSP may have a first portion coupled to a next component in the graphics pipeline and a second portion coupled to a plurality of components of the graphics pipeline. A command associated with the application instruction may be forwarded from the first portion to the next component in the graphics pipeline or some other component coupled thereto. The command may be received and thereafter executed. A response may be communicated on a feedback path to the second portion of the CSP. Nonlimiting exemplary application instructions that may be received and executed by the CSP include check surface fault, trap, wait, signal, stall, flip, and trigger.
    Type: Grant
    Filed: October 25, 2006
    Date of Patent: June 15, 2010
    Assignee: Via Technologies, Inc.
    Inventors: John Brothers, Timour Paltashev, Hsilin Huang, Boris Prokopenko, Qunfeng (Fred) Liao
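
The wait/signal style instructions listed in patent 7737983 above can be mocked with a command stream loop that blocks on a "wait" until the matching "signal" value is visible, and reports results on a feedback list. The command encodings and the feedback format are assumptions, not the patented CSP interface.

```python
# Sketch of a command stream processor handling high-level sync
# commands (wait/signal); encodings and feedback format are assumptions.

sync_registers = {}      # named sync locations written by 'signal'
feedback = []            # responses sent back toward the host

def run_stream(commands):
    pc = 0
    while pc < len(commands):
        op, *args = commands[pc]
        if op == "signal":
            name, value = args
            sync_registers[name] = value
            feedback.append(("signaled", name, value))
        elif op == "wait":
            name, value = args
            if sync_registers.get(name, 0) < value:
                return pc            # stall: resume from here later
        else:
            feedback.append(("executed", op))
        pc += 1
    return pc

stream = [("draw", 1), ("wait", "fence", 1), ("draw", 2)]
stalled_at = run_stream(stream)
print("stalled at command", stalled_at)  # waits on the fence
sync_registers["fence"] = 1              # e.g. another engine signals
run_stream(stream[stalled_at:])          # resumes and finishes the stream
print(feedback)
```
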
  • Publication number: 20100110083
    Abstract: Included are embodiments of systems and methods for processing metacommands. In at least one exemplary embodiment a Graphics Processing Unit (GPU) includes a metaprocessor configured to process at least one context register, the metaprocessor including context management logic and a metaprocessor control register block coupled to the metaprocessor, the metaprocessor control register block configured to receive metaprocessor configuration data, the metaprocessor control register block further configured to define metacommand execution logic block behavior. Some embodiments include a Bus Interface Unit (BIU) configured to provide the access from a system processor to the metaprocessor and a GPU command stream processor configured to fetch a current context command stream and send commands for execution to a GPU pipeline and metaprocessor.
    Type: Application
    Filed: November 6, 2008
    Publication date: May 6, 2010
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Timour Paltashev, Boris Prokopenko, John Brothers
  • Publication number: 20100110089
    Abstract: Included are systems and methods for Graphics Processing Unit (GPU) synchronization. At least one embodiment of a system includes at least one producer GPU configured to receive data related to at least one context, the at least one producer GPU further configured to process at least a portion of the received data. Some embodiments include at least one consumer GPU configured to receive data from the producer GPU, the consumer GPU further configured to stall execution of the received data until a fence value is received.
    Type: Application
    Filed: November 6, 2008
    Publication date: May 6, 2010
    Applicant: VIA Technologies, Inc.
    Inventors: Timour Paltashev, Boris Prokopenko, John Brothers
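
The producer/consumer relationship in publication 20100110089 above resembles the familiar fence pattern: the consumer stalls on a fence location until the producer writes the expected value. The shared "fence memory" and the step-wise, single-process simulation below are deliberately simplified assumptions (no real GPUs involved).

```python
# Single-process sketch of producer/consumer GPU fence synchronization;
# the shared "fence memory" and step-wise interleaving are assumptions.

fence_memory = {"frame_fence": 0}

def producer(chunks):
    # The producer GPU processes data and bumps the fence after each chunk.
    for i, chunk in enumerate(chunks, start=1):
        yield f"produced {chunk}"
        fence_memory["frame_fence"] = i

def consumer(expected_fence):
    # The consumer GPU stalls until the fence reaches the expected value.
    while fence_memory["frame_fence"] < expected_fence:
        yield "consumer stalled"
    yield f"consumed data up to fence {expected_fence}"

p = producer(["tile0", "tile1"])
c = consumer(expected_fence=2)

# Interleave the two engines one step at a time.
for engine in (c, p, c, p, c, p, c):
    try:
        print(next(engine))
    except StopIteration:
        pass
```
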
  • Publication number: 20100115249
    Abstract: Included are systems and methods for supporting a plurality of Graphics Processing Units (GPUs). At least one embodiment of a system includes a context status register configured to send data related to a status of at least one context and a context switch configuration register configured to send instructions related to at least one event for the at least one context. At least one embodiment of a system includes a context status management component coupled to the context status register and the context switch configuration register.
    Type: Application
    Filed: November 6, 2008
    Publication date: May 6, 2010
    Applicant: VIA TECHNOLOGIES, INC.
    Inventors: Timour Paltashev, Boris Prokopenko, John Brothers
  • Patent number: 7675521
    Abstract: Systems for performing rasterization are described. At least one embodiment includes a span generator for performing rasterization. In accordance with such embodiments, the span generator comprises functionals representing a scissoring box, loaders configured to convert the functionals from a general form to a special case form, edge generators configured to read the special case form of the scissoring box, whereby the special case form simplifies calculations by the edge generators. The span generator further comprises sorters configured to compute the intersection of half-planes, wherein edges of the intersection are generated by the edge generators and a span buffer configured to temporarily store spans before tiling.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: March 9, 2010
    Assignee: VIA Technologies, Inc.
    Inventors: Konstantine Iourcha, Boris Prokopenko, Timour Paltashev, Derek Gladding
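
The half-plane machinery in patent 7675521 above can be illustrated with edge functions: each triangle edge (and each scissor-box side) is a linear functional whose sign says which side a pixel lies on, and a span on a row is the run of x positions where all functionals are non-negative. The functional form and the per-row loop below are standard rasterization practice used as an assumed illustration, not the patented span generator.

```python
# Edge-function sketch of span generation: a span on row y is the run of
# x where all edge/scissor functionals are >= 0. Values are assumptions.

def edge(ax, ay, bx, by):
    # Linear functional E(x, y) >= 0 on the inside of segment a->b
    # for a counter-clockwise polygon.
    return lambda x, y: (bx - ax) * (y - ay) - (by - ay) * (x - ax)

# Triangle edges (counter-clockwise) plus a scissor box as four functionals.
triangle = [edge(1, 1, 8, 2), edge(8, 2, 4, 7), edge(4, 7, 1, 1)]
scissor = [
    lambda x, y: x - 2,          # x >= 2
    lambda x, y: 6 - x,          # x <= 6
    lambda x, y: y - 2,          # y >= 2
    lambda x, y: 6 - y,          # y <= 6
]
functionals = triangle + scissor

def spans(y_range, x_range):
    for y in y_range:
        xs = [x for x in x_range if all(f(x, y) >= 0 for f in functionals)]
        if xs:
            yield y, (min(xs), max(xs))     # one span per row

for y, (x0, x1) in spans(range(0, 10), range(0, 10)):
    print(f"row {y}: span [{x0}, {x1}]")
```
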
  • Patent number: 7659898
    Abstract: A dynamically scheduled parallel graphics processor comprises a spreader that creates graphic objects for processing and assigns and distributes the created objects for processing to one or more execution blocks. Each execution block is coupled to the spreader and receives an assignment for processing a graphics object. The execution block pushes the object through each processing stage by scheduling the processing of the graphics object and executing instruction operations on the graphics object. The dynamically scheduled parallel graphics processor includes one or more fixed function units coupled to the spreader that are configured to execute one or more predetermined operations on a graphics object. An input/output unit is coupled to the spreader, the one or more fixed function units, and the plurality of execution blocks and is configured to provide access to memory external to the dynamically scheduled parallel graphics processor.
    Type: Grant
    Filed: August 8, 2005
    Date of Patent: February 9, 2010
    Assignee: VIA Technologies, Inc.
    Inventors: Boris Prokopenko, Timour Paltashev
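
The spreader's role in patent 7659898 above, creating work and distributing it across execution blocks, can be sketched as a least-loaded dispatch loop. The load metric, block count, and class names are assumptions rather than the patented scheduler.

```python
# Sketch of a spreader assigning graphics objects to execution blocks
# by current load; the load metric and block count are assumptions.

class ExecutionBlock:
    def __init__(self, name):
        self.name = name
        self.queue = []

    def load(self):
        return len(self.queue)

    def assign(self, obj):
        self.queue.append(obj)

class Spreader:
    def __init__(self, blocks):
        self.blocks = blocks

    def distribute(self, objects):
        for obj in objects:
            # Pick the least-loaded block for each created object.
            target = min(self.blocks, key=lambda b: b.load())
            target.assign(obj)

blocks = [ExecutionBlock(f"EB{i}") for i in range(3)]
Spreader(blocks).distribute([f"triangle{i}" for i in range(7)])
for b in blocks:
    print(b.name, b.queue)
```
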
  • Patent number: 7659899
    Abstract: A system and method to manage data processing stages of a logical graphics pipeline comprises a number of execution blocks coupled together and to a global spreader that assigns graphics data entities for execution to the execution blocks. Each execution block has an entity descriptor table containing information about an assigned graphics data entity corresponding to allocation of the entity and a current processing stage associated with the entity. Each execution block includes a stage parser configured to establish pointers for the assigned graphics data entity to be processed on a next processing stage. A numerical processing unit is included and configured to execute floating point and integer instructions in association with the assigned graphics data entity. The execution blocks include a data move unit for data loads and moves within the execution block, with the global spreader, and with other execution blocks of the plurality of execution blocks.
    Type: Grant
    Filed: August 8, 2005
    Date of Patent: February 9, 2010
    Assignee: Via Technologies, Inc.
    Inventors: Timour Paltashev, Boris Prokopenko
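
The entity descriptor table and stage parser described in patent 7659899 above can be caricatured as a per-entity record whose current-stage pointer the parser advances through a fixed stage sequence. The stage names and table layout are assumptions introduced only for this sketch.

```python
# Sketch of an entity descriptor table whose stage parser advances each
# assigned entity through a logical pipeline; names are assumptions.

STAGE_SEQUENCE = ["vertex", "geometry", "rasterize", "pixel", "done"]

class ExecutionBlock:
    def __init__(self):
        # entity id -> descriptor (allocation info + current stage)
        self.descriptor_table = {}

    def accept(self, entity_id, registers_needed):
        self.descriptor_table[entity_id] = {
            "registers": registers_needed,
            "stage": STAGE_SEQUENCE[0],
        }

    def stage_parser_step(self):
        # Point every live entity at its next processing stage.
        for desc in self.descriptor_table.values():
            i = STAGE_SEQUENCE.index(desc["stage"])
            if i + 1 < len(STAGE_SEQUENCE):
                desc["stage"] = STAGE_SEQUENCE[i + 1]

eb = ExecutionBlock()
eb.accept(entity_id=7, registers_needed=16)
eb.accept(entity_id=8, registers_needed=8)
for _ in range(2):
    eb.stage_parser_step()
print(eb.descriptor_table)   # both entities now at the 'rasterize' stage
```
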