Patents by Inventor Alexei Vladimirovich Bourd

Alexei Vladimirovich Bourd has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20170316076
    Abstract: In an example, a method of transferring data may include synchronizing work-items corresponding to a first subgroup and work-items corresponding to a second subgroup with a barrier. The method may include performing an inter-subgroup data transfer between the first subgroup and the second subgroup.
    Type: Application
    Filed: September 7, 2016
    Publication date: November 2, 2017
    Inventors: Alexei Vladimirovich Bourd, Vladislav Shimanskiy, Maxim Kazakov, Yun Du
  • Publication number: 20170221173
    Abstract: A graphics processing unit (GPU) may dispatch a first set of commands for execution on one or more processing units of the GPU. The GPU may receive notification from a host device indicating that a second set of commands are ready to execute on the GPU. In response, the GPU may issue a first preemption command at a first preemption granularity to the one or more processing units. In response to the GPU failing to preempt execution of the first set of commands within an elapsed time period after issuing the first preemption command, the GPU may issue a second preemption command at a second preemption granularity to the one or more processing units, where the second preemption granularity is finer-grained than the first preemption granularity.
    Type: Application
    Filed: January 28, 2016
    Publication date: August 3, 2017
    Inventors: Anirudh Rajendra Acharya, Alexei Vladimirovich Bourd, David Rigel Garcia Garcia, Milind Nilkanth Nemlekar, Vineet Goel
  • Patent number: 9652284
    Abstract: A device includes a memory, and at least one programmable processor configured to determine, for each warp of a plurality of warps, whether a Boolean expression is true for a corresponding thread of each warp, pause execution of each warp having a corresponding thread for which the expression is true, determine a number of active threads for each of the plurality of warps for which the expression is true, sort the plurality of warps for which the expression is true based on the number of active threads in each of the plurality of warps, swap thread data of an active thread of a first warp of the plurality of warps with thread data of an inactive thread of a second warp of the plurality of warps, and resume execution of the at least one of the plurality of warps for which the expression is true.
    Type: Grant
    Filed: October 1, 2013
    Date of Patent: May 16, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Chunhui Mei, Alexei Vladimirovich Bourd, Lin Chen
  • Patent number: 9626234
    Abstract: This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.
    Type: Grant
    Filed: December 15, 2014
    Date of Patent: April 18, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Alexei Vladimirovich Bourd, Colin Christopher Sharp, David Rigel Garcia Garcia, Chihong Zhang
  • Patent number: 9230518
    Abstract: This disclosure presents techniques and structures for preemption at arbitrary control points in graphics processing. A method of graphics processing may comprise executing commands in a command buffer, the commands operating on data in a read-modify-write memory resource, double buffering the data in the read-modify-write memory resource, such that a first buffer stores original data of the read-modify-write memory resource and a second buffer stores any modified data produced by executing the commands in the command buffer, receiving a request to preempt execution of the commands in the command buffer before completing all commands in the command buffer, and restarting execution of the commands at the start of the command buffer using the original data in the first buffer.
    Type: Grant
    Filed: September 10, 2013
    Date of Patent: January 5, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Christopher Paul Frascati, Murat Balci, Avinash Seetharamaiah, Andrew Evan Gruber, Alexei Vladimirovich Bourd
  • Publication number: 20150317157
    Abstract: A SIMD processor may be configured to determine one or more active threads from a plurality of threads, select one active thread from the one or more active threads, and perform a divergent operation on the selected active thread. The divergent operation may be a serial operation.
    Type: Application
    Filed: May 2, 2014
    Publication date: November 5, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Andrew Evan Gruber, Lin Chen, Yun Du, Alexei Vladimirovich Bourd
  • Publication number: 20150269065
    Abstract: This disclosure describes techniques for supporting inter-task communication in a parallel computing system. The techniques for supporting inter-task communication may use hardware-based atomic operations to maintain the state of a pipe. A pipe may refer to a First-In, First-Out (FIFO)-organized buffer that allows various tasks to interact with the buffer as data producers or data consumers. Various pipe implementations may use multiple state parameters to define the state of a pipe. The hardware-based atomic operations described in this disclosure may modify multiple pipe state parameters in an atomic fashion. Modifying multiple pipe state parameters in an atomic fashion may avoid race conditions that would otherwise occur when multiple producers and/or multiple consumers attempt to modify the state of a pipe at the same time. In this way, pipe-based inter-task communication may be supported in a parallel computing system.
    Type: Application
    Filed: March 19, 2014
    Publication date: September 24, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Alexei Vladimirovich Bourd, Swapnil Pradipkumar Sakharshete, Fei Xu
  • Publication number: 20150097849
    Abstract: This disclosure describes communication techniques that may be used within a multiple-processor computing platform. The techniques may, in some examples, provide software interfaces that may be used to support message passing within a multiple-processor computing platform that initiates tasks using command queues. The techniques may, in additional examples, provide software interfaces that may be used for shared memory inter-processor communication within a multiple-processor computing platform. In further examples, the techniques may provide a graphics processing unit (GPU) that includes hardware for supporting message passing and/or shared memory communication between the GPU and a host CPU.
    Type: Application
    Filed: December 15, 2014
    Publication date: April 9, 2015
    Inventors: Alexei Vladimirovich Bourd, Colin Christopher Sharp, David Rigel Garcia Garcia, Chihong Zhang
  • Publication number: 20150095914
    Abstract: A device includes a memory, and at least one programmable processor configured to determine, for each warp of a plurality of warps, whether a Boolean expression is true for a corresponding thread of each warp, pause execution of each warp having a corresponding thread for which the expression is true, determine a number of active threads for each of the plurality of warps for which the expression is true, sort the plurality of warps for which the expression is true based on the number of active threads in each of the plurality of warps, swap thread data of an active thread of a first warp of the plurality of warps with thread data of an inactive thread of a second warp of the plurality of warps, and resume execution of the at least one of the plurality of warps for which the expression is true.
    Type: Application
    Filed: October 1, 2013
    Publication date: April 2, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Chunhui Mei, Alexei Vladimirovich Bourd, Lin Chen
  • Publication number: 20150070369
    Abstract: This disclosure presents techniques and structures for preemption at arbitrary control points in graphics processing. A method of graphics processing may comprise executing commands in a command buffer, the commands operating on data in a read-modify-write memory resource, double buffering the data in the read-modify-write memory resource, such that a first buffer stores original data of the read-modify-write memory resource and a second buffer stores any modified data produced by executing the commands in the command buffer, receiving a request to preempt execution of the commands in the command buffer before completing all commands in the command buffer, and restarting execution of the commands at the start of the command buffer using the original data in the first buffer.
    Type: Application
    Filed: September 10, 2013
    Publication date: March 12, 2015
    Applicant: QUALCOMM Incorporated
    Inventors: Christopher Paul Frascati, Murat Balci, Avinash Seetharamaiah, Andrew Evan Gruber, Alexei Vladimirovich Bourd
  • Publication number: 20150022534
    Abstract: A graphics processor capable of efficiently performing arithmetic operations and computing elementary functions is described. The graphics processor has at least one arithmetic logic unit (ALU) that can perform arithmetic operations and at least one elementary function unit that can compute elementary functions. The ALU(s) and elementary function unit(s) may be arranged such that they can operate in parallel to improve throughput. The graphics processor may also include fewer elementary function units than ALUs, e.g., four ALUs and a single elementary function unit. The four ALUs may perform an arithmetic operation on (1) four components of an attribute for one pixel or (2) one component of an attribute for four pixels. The single elementary function unit may operate on one component of one pixel at a time. The use of a single elementary function unit may reduce cost while still providing good performance.
    Type: Application
    Filed: October 6, 2014
    Publication date: January 22, 2015
    Inventors: YUN DU, Guofang Jiao, Chun Yu, Alexei Vladimirovich Bourd