Patents by Inventor Ronny M. KRASHINSKY

Ronny M. KRASHINSKY has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230297426
    Abstract: Various embodiments include techniques for utilizing resources on a processing unit. Thread groups executing on a processor begin execution with specified resources, such as a number of registers and an amount of shared memory. During execution, one or more thread groups may determine that they have more resources than needed to execute their current functions. Such thread groups can deallocate the excess resources to a free pool. Similarly, during execution, one or more thread groups may determine that they have fewer resources than needed to execute their current functions. Such thread groups can allocate the needed resources from the free pool. Further, producer thread groups that generate data for consumer thread groups can deallocate excess resources prior to completion. The consumer thread groups can allocate the excess resources and initiate execution while the producer thread groups complete execution, thereby decreasing latency between producer and consumer thread groups.
    Type: Application
    Filed: March 18, 2022
    Publication date: September 21, 2023
    Inventors: Rajballav DASH, Stephen JONES, Jack Hilaire CHOQUETTE, Manan PATEL, Ronny M. KRASHINSKY, Shirish GADRE, Lixia QIN
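The allocation scheme above lives in the GPU's hardware resource management and has no direct public API. As a loose host-side illustration of the bookkeeping it describes, the sketch below (hypothetical names: ResourcePool, release, tryAcquire; arbitrary register counts) models a free pool that a producer thread group returns excess registers to before it finishes, so that a waiting consumer group can acquire them and start early.

```cpp
// Toy host-side model (hypothetical names), not the hardware mechanism: thread
// groups start with a fixed register budget, return what they no longer need to
// a shared free pool, and a waiting consumer group acquires from that pool.
#include <cstdio>

struct ResourcePool {
    int freeRegisters = 0;
    void release(int n) { freeRegisters += n; }   // producer returns excess registers
    bool tryAcquire(int n) {                      // consumer takes them if available
        if (freeRegisters < n) return false;
        freeRegisters -= n;
        return true;
    }
};

int main() {
    ResourcePool pool;
    int producerRegs = 168;   // producer group launched with a large budget
    int consumerNeeds = 96;   // consumer group cannot start until this is available

    // The producer finishes its register-hungry phase early and releases the
    // excess before it completes, instead of holding everything until exit.
    int excess = 128;
    producerRegs -= excess;
    pool.release(excess);

    // The consumer can now allocate from the free pool and begin execution while
    // the producer's tail work is still running, reducing producer/consumer latency.
    if (pool.tryAcquire(consumerNeeds))
        printf("consumer starts early; producer keeps %d regs, pool has %d left\n",
               producerRegs, pool.freeRegisters);
    return 0;
}
```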
  • Publication number: 20230236878
    Abstract: In various embodiments, scheduling dependencies associated with tasks executed on a processor are decoupled from data dependencies associated with the tasks. Before the completion of a first task that is executing in the processor, a scheduling dependency specifying that a second task is dependent on the first task is resolved based on a pre-exit trigger. In response to the resolution of the scheduling dependency, the second task is launched on the processor.
    Type: Application
    Filed: January 25, 2022
    Publication date: July 27, 2023
    Inventors: Jack Hilaire CHOQUETTE, Rajballav DASH, Shayani DEB, Gentaro HIROTA, Ronny M. KRASHINSKY, Ze LONG, Chen MEI, Manan PATEL, Ming Y. SIU
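The decoupling above is performed by the GPU's task scheduler, but the idea of a pre-exit trigger can be approximated at the CUDA stream level: split the first task at the point where the data the second task needs is ready, record an event there, and have the second task's stream wait only on that event. This is an analogy rather than the patented mechanism; the kernels are placeholders.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernels: partA_producesData is the portion of task A that task B
// actually depends on; partA_tail is the rest of task A's work.
__global__ void partA_producesData(float* buf)       { buf[threadIdx.x] = threadIdx.x; }
__global__ void partA_tail(float* buf)               { buf[threadIdx.x] += 1.0f; }
__global__ void taskB(const float* buf, float* out)  { out[threadIdx.x] = 2.0f * buf[threadIdx.x]; }

int main() {
    float *bufA, *tailBuf, *outB;
    cudaMalloc(&bufA, 256 * sizeof(float));
    cudaMalloc(&tailBuf, 256 * sizeof(float));
    cudaMalloc(&outB, 256 * sizeof(float));

    cudaStream_t sA, sB;
    cudaStreamCreate(&sA);
    cudaStreamCreate(&sB);
    cudaEvent_t dataReady;
    cudaEventCreateWithFlags(&dataReady, cudaEventDisableTiming);

    // Pre-exit-trigger analogue: signal B's scheduling dependency as soon as the
    // data B needs exists, not when all of task A has finished.
    partA_producesData<<<1, 256, 0, sA>>>(bufA);
    cudaEventRecord(dataReady, sA);
    partA_tail<<<1, 256, 0, sA>>>(tailBuf);     // A's remaining work keeps running

    cudaStreamWaitEvent(sB, dataReady, 0);      // B waits only on the early trigger
    taskB<<<1, 256, 0, sB>>>(bufA, outB);       // B can overlap with A's tail

    cudaDeviceSynchronize();
    printf("done\n");
    return 0;
}
```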
  • Publication number: 20230021678
    Abstract: Various embodiments include a parallel processing computer system that provides multiple memory synchronization domains in a single parallel processor to reduce unneeded synchronization operations. During execution, one execution kernel may synchronize with one or more other execution kernels by processing outstanding memory references. The parallel processor tracks memory references for each domain to each portion of local and remote memory. During synchronization, the processor synchronizes the memory references for a specific domain while refraining from synchronizing memory references for other domains. As a result, synchronization operations between kernels complete in a reduced amount of time relative to prior approaches.
    Type: Application
    Filed: July 20, 2021
    Publication date: January 26, 2023
    Inventors: Michael Allen PARKER, Debajit BHATTACHARYA, David FONTAINE, Shirish GADRE, Wishwesh Anil GANDHI, Olivier GIROUX, Hemayet HOSSAIN, Ronny M. KRASHINSKY, Ze LONG, Raymond Hoi Man WONG
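The per-domain tracking above is a hardware feature, but the underlying idea of synchronizing only the memory references that matter is visible in CUDA's fence scopes. The classic last-block-reduction pattern below uses the device-scope __threadfence() rather than the heavier system-scope __threadfence_system(), because only consumers on the same GPU need to observe the partial sums; this is an analogy, not the domain mechanism itself.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ unsigned int blocksDone = 0;

__global__ void sumKernel(const float* in, float* partial, float* out, int n) {
    extern __shared__ float sdata[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    sdata[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {    // block-local tree reduction
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) {
        partial[blockIdx.x] = sdata[0];
        __threadfence();                 // device scope: visible to on-GPU readers only
        unsigned int done = atomicAdd(&blocksDone, 1);
        if (done == gridDim.x - 1) {     // the last block folds the partial sums
            const volatile float* vpartial = partial;
            float total = 0.0f;
            for (unsigned b = 0; b < gridDim.x; ++b) total += vpartial[b];
            *out = total;
            blocksDone = 0;
        }
    }
}

int main() {
    const int n = 1 << 20, threads = 256, blocks = (n + threads - 1) / threads;
    float *in, *partial, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&partial, blocks * sizeof(float));
    cudaMallocManaged(&out, sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;
    sumKernel<<<blocks, threads, threads * sizeof(float)>>>(in, partial, out, n);
    cudaDeviceSynchronize();
    printf("sum = %f (expected %d)\n", *out, n);
    return 0;
}
```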
  • Patent number: 9830156
    Abstract: One embodiment of the present invention sets forth a technique for optimizing parallel thread execution in a temporal single-instruction multiple thread (SIMT) architecture. When the threads in a parallel thread group execute temporally on a common processing pipeline rather than spatially on parallel processing pipelines, execution cycles may be reduced when some threads in the parallel thread group are inactive due to divergence. Similarly, an instruction can be dispatched for execution by only one thread in the parallel thread group when the threads in the parallel thread group are executing a scalar instruction. Reducing the number of threads that execute an instruction removes unnecessary or redundant operations for execution by the processing pipelines. Information about scalar operands and operations and divergence of the threads is used in the instruction dispatch logic to eliminate unnecessary or redundant activity in the processing pipelines.
    Type: Grant
    Filed: August 12, 2011
    Date of Patent: November 28, 2017
    Assignee: NVIDIA Corporation
    Inventor: Ronny M. Krashinsky
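The abstract describes instruction-dispatch hardware for a temporal SIMT pipeline, so there is no direct API to call, but the redundancy it eliminates is easy to see in ordinary CUDA code. In the hedged sketch below, a per-row scale factor is uniform across the warp, so one lane performs the "scalar" load and broadcasts it with __shfl_sync instead of all 32 lanes issuing the same operation.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Software analogy to the scalar-dispatch idea above: the per-row scale factor is
// warp-uniform, so a single lane executes the load and broadcasts the result.
__global__ void scaleRows(const float* mat, const float* rowScale, float* out, int cols) {
    int row  = blockIdx.x;
    int lane = threadIdx.x & 31;
    float s = 0.0f;
    if (lane == 0) s = rowScale[row];        // only one thread performs the uniform load
    s = __shfl_sync(0xffffffffu, s, 0);      // broadcast the value to the whole warp
    for (int c = threadIdx.x; c < cols; c += blockDim.x)
        out[row * cols + c] = s * mat[row * cols + c];
}

int main() {
    const int rows = 4, cols = 256;
    float *mat, *scale, *out;
    cudaMallocManaged(&mat, rows * cols * sizeof(float));
    cudaMallocManaged(&scale, rows * sizeof(float));
    cudaMallocManaged(&out, rows * cols * sizeof(float));
    for (int i = 0; i < rows * cols; ++i) mat[i] = 1.0f;
    for (int r = 0; r < rows; ++r) scale[r] = float(r);
    scaleRows<<<rows, 128>>>(mat, scale, out, cols);
    cudaDeviceSynchronize();
    printf("out[3][0] = %f (expected 3.0)\n", out[3 * cols + 0]);
    return 0;
}
```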
  • Patent number: 9292265
    Abstract: Basic blocks within a thread program are characterized for convergence based on variance analysis of corresponding instructions. Each basic block is marked as divergent based on transitive control dependence on a block that is either divergent or comprises a variant branch condition. Convergent basic blocks that are defined by invariant instructions are advantageously identified as candidates for scalarization by a thread program compiler.
    Type: Grant
    Filed: May 9, 2012
    Date of Patent: March 22, 2016
    Assignee: NVIDIA Corporation
    Inventors: Vinod Grover, Yunsup Lee, Xiangyun Kong, Gautam Chakrabarti, Ronny M. Krashinsky
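Based only on the abstract above, a rough host-side model of the marking rule might look like the sketch below (assumed data structures; field names are illustrative): divergence is propagated over control-dependence edges until a fixed point is reached, and the blocks that remain convergent are reported as scalarization candidates.

```cpp
// Hypothetical model of the marking pass: a block becomes divergent if it is
// transitively control-dependent on a block that is divergent or whose branch
// condition is variant (e.g., depends on per-thread values such as threadIdx).
#include <cstdio>
#include <vector>

struct BasicBlock {
    bool variantBranch = false;          // branch condition depends on per-thread values
    bool divergent = false;              // derived by the pass below
    std::vector<int> controlDependents;  // blocks control-dependent on this one
};

void markDivergence(std::vector<BasicBlock>& cfg) {
    bool changed = true;
    while (changed) {                    // propagate until a fixed point is reached
        changed = false;
        for (auto& b : cfg) {
            if (!(b.variantBranch || b.divergent)) continue;
            for (int d : b.controlDependents) {
                if (!cfg[d].divergent) { cfg[d].divergent = true; changed = true; }
            }
        }
    }
}

int main() {
    // Tiny CFG: block 1 branches on a variant condition, and blocks 2 and 3 are
    // control-dependent on it; block 4 depends only on a uniform branch in block 0.
    std::vector<BasicBlock> cfg(5);
    cfg[0].controlDependents = {4};
    cfg[1].variantBranch = true;
    cfg[1].controlDependents = {2, 3};
    markDivergence(cfg);
    for (size_t i = 0; i < cfg.size(); ++i)
        printf("block %zu: %s\n", i,
               cfg[i].divergent ? "divergent" : "convergent (scalarization candidate)");
    return 0;
}
```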
  • Publication number: 20130305021
    Abstract: Basic blocks within a thread program are characterized for convergence based on variance analysis of corresponding instructions. Each basic block is marked as divergent based on transitive control dependence on a block that is either divergent or comprises a variant branch condition. Convergent basic blocks that are defined by invariant instructions are advantageously identified as candidates for scalarization by a thread program compiler.
    Type: Application
    Filed: May 9, 2012
    Publication date: November 14, 2013
    Inventors: Vinod GROVER, Yunsup LEE, Xiangyun KONG, Gautam CHAKRABARTI, Ronny M. KRASHINSKY
  • Publication number: 20130042090
    Abstract: One embodiment of the present invention sets forth a technique for optimizing parallel thread execution in a temporal single-instruction multiple thread (SIMT) architecture. When the threads in a parallel thread group execute temporally on a common processing pipeline rather than spatially on parallel processing pipelines, execution cycles may be reduced when some threads in the parallel thread group are inactive due to divergence. Similarly, an instruction can be dispatched for execution by only one thread in the parallel thread group when the threads in the parallel thread group are executing a scalar instruction. Reducing the number of threads that execute an instruction removes unnecessary or redundant operations for execution by the processing pipelines. Information about scalar operands and operations and divergence of the threads is used in the instruction dispatch logic to eliminate unnecessary or redundant activity in the processing pipelines.
    Type: Application
    Filed: August 12, 2011
    Publication date: February 14, 2013
    Inventor: Ronny M. KRASHINSKY