Patents by Inventor Scott M. Le Grand

Scott M. Le Grand has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9058677
    Abstract: One embodiment of the present invention sets forth a technique for efficiently performing broad phase collision detection using parallel spatial subdivision. The technique involves organizing candidate objects according to a hashed representation of each object centroid, constructing a cell identification (ID) array, sorting the cell ID array, creating a collision cell list, and traversing the collision cell list. The result is a candidate list of object groups that may collide, based on an initial assessment of spatial proximity. Whether a given pair of objects actually collides is determined by a precise narrow phase collision analysis.
    Type: Grant
    Filed: December 12, 2007
    Date of Patent: June 16, 2015
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
  • Patent number: 9058678
    Abstract: One embodiment of the present invention sets forth a technique for efficiently performing broad phase collision detection using parallel spatial subdivision. The technique involves organizing candidate objects according to a hashed representation of each object centroid, constructing a cell identification (ID) array, sorting the cell ID array, creating a collision cell list, and traversing the collision cell list. The result is a candidate list of object groups that may collide, based on an initial assessment of spatial proximity. Whether a given pair of objects actually collides is determined by a precise narrow phase collision analysis.
    Type: Grant
    Filed: December 12, 2007
    Date of Patent: June 16, 2015
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
  • Patent number: 8473948
    Abstract: One embodiment of the present invention sets forth a technique for synchronizing the execution of multiple cooperative thread arrays (CTAs) implementing a parallel algorithm that is mapped onto a graphics processing unit. An array of semaphores provides synchronization status to each CTA, while one designated thread within each CTA provides updated status for the CTA. The designated thread within each participating CTA reports completion of a given computational phase by updating a current semaphore within the array of semaphores. The designated thread then polls the status of the current semaphore until all participating CTAs have reported completion of the current computational phase. After each CTA has completed the current computational phase, all participating CTAs may proceed to the next computational phase.
    Type: Grant
    Filed: August 8, 2007
    Date of Patent: June 25, 2013
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
  • Patent number: 8370845
    Abstract: One embodiment of the present invention sets forth a technique for synchronizing the execution of multiple cooperative thread arrays (CTAs) implementing a parallel algorithm that is mapped onto a graphics processing unit. An array of semaphores provides synchronization status to each CTA, while one designated thread within each CTA provides updated status for the CTA. The designated thread within each participating CTA reports completion of a given computational phase by updating a current semaphore within the array of semaphores. The designated thread then polls the status of the current semaphore until all participating CTAs have reported completion of the current computational phase. After each CTA has completed the current computational phase, all participating CTAs may proceed to the next computational phase.
    Type: Grant
    Filed: August 8, 2007
    Date of Patent: February 5, 2013
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
  • Patent number: 8108659
    Abstract: Thread synchronization techniques are used to control access to a memory resource (e.g., a counter) that is shared among multiple threads. Each thread has a unique identifier and threads are assigned to instances of the shared resource so that at least one instance is shared by two or more threads. Each thread assigned to a particular instance of the shared resource has a unique ordering index. A thread is allowed to access its assigned instance of the resource at a point in the program code determined by its ordering index. The threads are advantageously synchronized (explicitly or implicitly) so that no more than one thread attempts to access the same instance of the resource at a given time.
    Type: Grant
    Filed: September 19, 2007
    Date of Patent: January 31, 2012
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
  • Patent number: 8094157
    Abstract: One embodiment of the present invention sets forth a technique for efficiently performing a radix sort operation on a graphics processing unit (GPU). The radix sort operation is conducted on an input list of data using one or more passes of a series of three processing phases. In each processing phase, thread groups are each associated with one segment of input data. In the first phase, occurrences of each radix symbol are counted and stored in a list of counters. In the second phase, the list of counters is processed by a parallel prefix sum operation to generate a list of offsets. In the third phase, the list of offsets is used to perform re-ordering on the list of data, according to the current radix symbol. To maintain sort stability, the one or more passes proceed from least significant data to most significant data in the sort key.
    Type: Grant
    Filed: August 9, 2007
    Date of Patent: January 10, 2012
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
  • Patent number: 7877573
    Abstract: One embodiment of the present invention sets forth a technique for computing a parallel prefix sum using one or more cooperative thread arrays (CTAs) within a graphics processing unit. The prefix sum input list is partitioned and distributed to each CTA. Within each CTA, the input list is further partitioned for processing by individual threads in a way that avoids access conflicts to memory. Each list partition within the CTA is assigned to one of a plurality of concurrent threads, which executes a prefix sum operation on the partition. The final values of the prefix sum operations form a list that is then subjected to a second prefix sum operation. Each element of the second prefix sum operation is added to each element of the subsequent partition, completing the prefix sum operation within the CTA. This technique may be extended to prefix sum operations that span two or more CTAs.
    Type: Grant
    Filed: August 8, 2007
    Date of Patent: January 25, 2011
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
  • Patent number: 7725518
    Abstract: One embodiment of the present invention sets forth a technique for computing a parallel prefix sum using one or more cooperative thread arrays (CTAs) within a graphics processing unit. The prefix sum input list is partitioned and distributed to each CTA. Within each CTA, the input list is further partitioned for processing by individual threads in a way that avoids access conflicts to memory. Each list partition within the CTA is assigned to one of a plurality of concurrent threads, which executes a prefix sum operation on the partition. The final values of the prefix sum operations form a list that is then subjected to a second prefix sum operation. Each element of the second prefix sum operation is added to each element of the subsequent partition, completing the prefix sum operation within the CTA. This technique may be extended to prefix sum operations that span two or more CTAs.
    Type: Grant
    Filed: August 8, 2007
    Date of Patent: May 25, 2010
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
  • Patent number: 7689541
    Abstract: One embodiment of the present invention sets forth a technique for efficiently performing a radix sort operation on a graphics processing unit (GPU). The radix sort operation is conducted on an input list of data using one or more passes of a series of three processing phases. In each processing phase, thread groups are each associated with one segment of input data. In the first phase, occurrences of each radix symbol are counted and stored in a list of counters. In the second phase, the list of counters is processed by a parallel prefix sum operation to generate a list of offsets. In the third phase, the list of offsets is used to perform re-ordering on the list of data, according to the current radix symbol. To maintain sort stability, the one or more passes proceed from least significant data to most significant data in the sort key.
    Type: Grant
    Filed: August 9, 2007
    Date of Patent: March 30, 2010
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
  • Patent number: 7624107
    Abstract: One embodiment of the present invention sets forth a technique for efficiently performing a radix sort operation on a graphics processing unit (GPU). The radix sort operation is conducted on an input list of data using one or more passes of a series of three processing phases. In each processing phase, thread groups are each associated with one segment of input data. In the first phase, occurrences of each radix symbol are counted and stored in a list of counters. In the second phase, the list of counters is processed by a parallel prefix sum operation to generate a list of offsets. In the third phase, the list of offsets is used to perform re-ordering on the list of data, according to the current radix symbol. To maintain sort stability, the one or more passes proceed from least significant data to most significant data in the sort key.
    Type: Grant
    Filed: August 9, 2007
    Date of Patent: November 24, 2009
    Assignee: NVIDIA Corporation
    Inventor: Scott M. Le Grand
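
Illustrative sketch (not taken from the patents): the radix sort abstracts above describe three phases per pass: counting radix symbols per segment, a prefix sum over the counters, and a re-ordering step driven by the resulting offsets. The Python below is a sequential approximation of that flow, written only to make the data movement concrete. The function name `radix_sort`, the parameters `radix_bits`, `key_bits`, and `segments`, and the symbol-major counter layout are all assumptions made for this example; the patented GPU implementation assigns segments to thread groups and is not reproduced here.

```python
from itertools import accumulate


def radix_sort(keys, radix_bits=4, key_bits=32, segments=4):
    """Sequential sketch of a three-phase LSD radix sort on non-negative ints.

    Each 'segment' stands in for the slice of input that one thread group
    would handle on a GPU (an assumption of this sketch).
    """
    radix = 1 << radix_bits
    mask = radix - 1
    data = list(keys)
    n = len(data)
    seg_len = max(1, -(-n // segments))  # ceiling division, at least 1
    bounds = [(lo, min(lo + seg_len, n)) for lo in range(0, n, seg_len)]
    n_seg = len(bounds)

    # Passes run from the least significant digit to the most significant,
    # which keeps the overall sort stable.
    for shift in range(0, key_bits, radix_bits):
        # Phase 1: each segment counts occurrences of every radix symbol.
        counts = [[0] * radix for _ in bounds]
        for seg, (lo, hi) in enumerate(bounds):
            for k in data[lo:hi]:
                counts[seg][(k >> shift) & mask] += 1

        # Phase 2: exclusive prefix sum over the counters, laid out
        # symbol-major so lower symbols (and, within a symbol, earlier
        # segments) receive lower output offsets.
        flat = [counts[seg][sym] for sym in range(radix) for seg in range(n_seg)]
        offsets = [0] + list(accumulate(flat))[:-1]

        # Phase 3: scatter every key to its offset for the current symbol.
        out = [0] * n
        for seg, (lo, hi) in enumerate(bounds):
            for k in data[lo:hi]:
                slot = ((k >> shift) & mask) * n_seg + seg
                out[offsets[slot]] = k
                offsets[slot] += 1
        data = out
    return data
```

Calling `radix_sort([170, 45, 75, 90, 802, 24, 2, 66])` should return the same list as Python's built-in `sorted` on that input. The sketch assumes non-negative integer keys that fit in `key_bits` bits; signed or floating-point keys would need a key transformation that is outside the scope of this example.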