Patents by Inventor Ronny M. KRASHINSKY

Ronny M. KRASHINSKY has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230297426
    Abstract: Various embodiments include techniques for utilizing resources on a processing unit. Thread groups executing on a processor begin execution with specified resources, such as a number of registers and an amount of shared memory. During execution, one or more thread groups may determine that they have more resources than needed to execute their current functions. Such thread groups can deallocate the excess resources to a free pool. Similarly, during execution, one or more thread groups may determine that they have fewer resources than needed to execute their current functions. Such thread groups can allocate the needed resources from the free pool. Further, producer thread groups that generate data for consumer thread groups can deallocate excess resources prior to completion. The consumer thread groups can allocate the excess resources and initiate execution while the producer thread groups complete execution, thereby decreasing latency between producer and consumer thread groups.
    Type: Application
    Filed: March 18, 2022
    Publication date: September 21, 2023
    Inventors: Rajballav DASH, Stephen JONES, Jack Hilaire CHOQUETTE, Manan PATEL, Ronny M. KRASHINSKY, Shirish GADRE, Lixia QIN
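The allocation scheme above lives in the GPU's hardware resource management and has no direct public API. As a loose host-side illustration of the bookkeeping it describes, the sketch below (hypothetical names: ResourcePool, release, tryAcquire; arbitrary register counts) models a free pool that a producer thread group returns excess registers to before it finishes, so that a waiting consumer group can acquire them and start early.

```cpp
// Toy host-side model (hypothetical names), not the hardware mechanism: thread
// groups start with a fixed register budget, return what they no longer need to
// a shared free pool, and a waiting consumer group acquires from that pool.
#include <cstdio>

struct ResourcePool {
    int freeRegisters = 0;
    void release(int n) { freeRegisters += n; }   // producer returns excess registers
    bool tryAcquire(int n) {                      // consumer takes them if available
        if (freeRegisters < n) return false;
        freeRegisters -= n;
        return true;
    }
};

int main() {
    ResourcePool pool;
    int producerRegs = 168;   // producer group launched with a large budget
    int consumerNeeds = 96;   // consumer group cannot start until this is available

    // The producer finishes its register-hungry phase early and releases the
    // excess before it completes, instead of holding everything until exit.
    int excess = 128;
    producerRegs -= excess;
    pool.release(excess);

    // The consumer can now allocate from the free pool and begin execution while
    // the producer's tail work is still running, reducing producer/consumer latency.
    if (pool.tryAcquire(consumerNeeds))
        printf("consumer starts early; producer keeps %d regs, pool has %d left\n",
               producerRegs, pool.freeRegisters);
    return 0;
}
```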
  • Publication number: 20230236878
    Abstract: In various embodiments, scheduling dependencies associated with tasks executed on a processor are decoupled from data dependencies associated with the tasks. Before the completion of a first task that is executing in the processor, a scheduling dependency specifying that a second task is dependent on the first task is resolved based on a pre-exit trigger. In response to the resolution of the scheduling dependency, the second task is launched on the processor.
    Type: Application
    Filed: January 25, 2022
    Publication date: July 27, 2023
    Inventors: Jack Hilaire CHOQUETTE, Rajballav DASH, Shayani DEB, Gentaro HIROTA, Ronny M. KRASHINSKY, Ze LONG, Chen MEI, Manan PATEL, Ming Y. SIU
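The decoupling above is performed by the GPU's task scheduler, but the idea of a pre-exit trigger can be approximated at the CUDA stream level: split the first task at the point where the data the second task needs is ready, record an event there, and have the second task's stream wait only on that event. This is an analogy rather than the patented mechanism; the kernels are placeholders.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernels: partA_producesData is the portion of task A that task B
// actually depends on; partA_tail is the rest of task A's work.
__global__ void partA_producesData(float* buf)       { buf[threadIdx.x] = threadIdx.x; }
__global__ void partA_tail(float* buf)               { buf[threadIdx.x] += 1.0f; }
__global__ void taskB(const float* buf, float* out)  { out[threadIdx.x] = 2.0f * buf[threadIdx.x]; }

int main() {
    float *bufA, *tailBuf, *outB;
    cudaMalloc(&bufA, 256 * sizeof(float));
    cudaMalloc(&tailBuf, 256 * sizeof(float));
    cudaMalloc(&outB, 256 * sizeof(float));

    cudaStream_t sA, sB;
    cudaStreamCreate(&sA);
    cudaStreamCreate(&sB);
    cudaEvent_t dataReady;
    cudaEventCreateWithFlags(&dataReady, cudaEventDisableTiming);

    // Pre-exit-trigger analogue: signal B's scheduling dependency as soon as the
    // data B needs exists, not when all of task A has finished.
    partA_producesData<<<1, 256, 0, sA>>>(bufA);
    cudaEventRecord(dataReady, sA);
    partA_tail<<<1, 256, 0, sA>>>(tailBuf);     // A's remaining work keeps running

    cudaStreamWaitEvent(sB, dataReady, 0);      // B waits only on the early trigger
    taskB<<<1, 256, 0, sB>>>(bufA, outB);       // B can overlap with A's tail

    cudaDeviceSynchronize();
    printf("done\n");
    return 0;
}
```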
  • Publication number: 20230021678
    Abstract: Various embodiments include a parallel processing computer system that provides multiple memory synchronization domains in a single parallel processor to reduce unneeded synchronization operations. During execution, one execution kernel may synchronize with one or more other execution kernels by processing outstanding memory references. The parallel processor tracks memory references for each domain to each portion of local and remote memory. During synchronization, the processor synchronizes the memory references for a specific domain while refraining from synchronizing memory references for other domains. As a result, synchronization operations between kernels complete in a reduced amount of time relative to prior approaches.
    Type: Application
    Filed: July 20, 2021
    Publication date: January 26, 2023
    Inventors: Michael Allen PARKER, Debajit BHATTACHARYA, David FONTAINE, Shirish GADRE, Wishwesh Anil GANDHI, Olivier GIROUX, Hemayet HOSSAIN, Ronny M. KRASHINSKY, Ze LONG, Raymond Hoi Man WONG
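The per-domain tracking above is a hardware feature, but the underlying idea of synchronizing only the memory references that matter is visible in CUDA's fence scopes. The classic last-block-reduction pattern below uses the device-scope __threadfence() rather than the heavier system-scope __threadfence_system(), because only consumers on the same GPU need to observe the partial sums; this is an analogy, not the domain mechanism itself.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__device__ unsigned int blocksDone = 0;

__global__ void sumKernel(const float* in, float* partial, float* out, int n) {
    extern __shared__ float sdata[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x + tid;
    sdata[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {    // block-local tree reduction
        if (tid < s) sdata[tid] += sdata[tid + s];
        __syncthreads();
    }
    if (tid == 0) {
        partial[blockIdx.x] = sdata[0];
        __threadfence();                 // device scope: visible to on-GPU readers only
        unsigned int done = atomicAdd(&blocksDone, 1);
        if (done == gridDim.x - 1) {     // the last block folds the partial sums
            const volatile float* vpartial = partial;
            float total = 0.0f;
            for (unsigned b = 0; b < gridDim.x; ++b) total += vpartial[b];
            *out = total;
            blocksDone = 0;
        }
    }
}

int main() {
    const int n = 1 << 20, threads = 256, blocks = (n + threads - 1) / threads;
    float *in, *partial, *out;
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&partial, blocks * sizeof(float));
    cudaMallocManaged(&out, sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;
    sumKernel<<<blocks, threads, threads * sizeof(float)>>>(in, partial, out, n);
    cudaDeviceSynchronize();
    printf("sum = %f (expected %d)\n", *out, n);
    return 0;
}
```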
  • Patent number: 9830156
    Abstract: One embodiment of the present invention sets forth a technique for optimizing parallel thread execution in a temporal single-instruction multiple thread (SIMT) architecture. When the threads in a parallel thread group execute temporally on a common processing pipeline rather than spatially on parallel processing pipelines, execution cycles may be reduced when some threads in the parallel thread group are inactive due to divergence. Similarly, an instruction can be dispatched for execution by only one thread in the parallel thread group when the threads in the parallel thread group are executing a scalar instruction. Reducing the number of threads that execute an instruction removes unnecessary or redundant operations for execution by the processing pipelines. Information about scalar operands and operations and divergence of the threads is used in the instruction dispatch logic to eliminate unnecessary or redundant activity in the processing pipelines.
    Type: Grant
    Filed: August 12, 2011
    Date of Patent: November 28, 2017
    Assignee: NVIDIA Corporation
    Inventor: Ronny M. Krashinsky
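The abstract describes instruction-dispatch hardware for a temporal SIMT pipeline, so there is no direct API to call, but the redundancy it eliminates is easy to see in ordinary CUDA code. In the hedged sketch below, a per-row scale factor is uniform across the warp, so one lane performs the "scalar" load and broadcasts it with __shfl_sync instead of all 32 lanes issuing the same operation.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Software analogy to the scalar-dispatch idea above: the per-row scale factor is
// warp-uniform, so a single lane executes the load and broadcasts the result.
__global__ void scaleRows(const float* mat, const float* rowScale, float* out, int cols) {
    int row  = blockIdx.x;
    int lane = threadIdx.x & 31;
    float s = 0.0f;
    if (lane == 0) s = rowScale[row];        // only one thread performs the uniform load
    s = __shfl_sync(0xffffffffu, s, 0);      // broadcast the value to the whole warp
    for (int c = threadIdx.x; c < cols; c += blockDim.x)
        out[row * cols + c] = s * mat[row * cols + c];
}

int main() {
    const int rows = 4, cols = 256;
    float *mat, *scale, *out;
    cudaMallocManaged(&mat, rows * cols * sizeof(float));
    cudaMallocManaged(&scale, rows * sizeof(float));
    cudaMallocManaged(&out, rows * cols * sizeof(float));
    for (int i = 0; i < rows * cols; ++i) mat[i] = 1.0f;
    for (int r = 0; r < rows; ++r) scale[r] = float(r);
    scaleRows<<<rows, 128>>>(mat, scale, out, cols);
    cudaDeviceSynchronize();
    printf("out[3][0] = %f (expected 3.0)\n", out[3 * cols + 0]);
    return 0;
}
```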
  • Patent number: 9292265
    Abstract: Basic blocks within a thread program are characterized for convergence based on variance analysis of corresponding instructions. Each basic block is marked as divergent based on transitive control dependence on a block that is either divergent or comprises a variant branch condition. Convergent basic blocks that are defined by invariant instructions are advantageously identified as candidates for scalarization by a thread program compiler.
    Type: Grant
    Filed: May 9, 2012
    Date of Patent: March 22, 2016
    Assignee: NVIDIA Corporation
    Inventors: Vinod Grover, Yunsup Lee, Xiangyun Kong, Gautam Chakrabarti, Ronny M. Krashinsky
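Based only on the abstract above, a rough host-side model of the marking rule might look like the sketch below (assumed data structures; field names are illustrative): divergence is propagated over control-dependence edges until a fixed point is reached, and the blocks that remain convergent are reported as scalarization candidates.

```cpp
// Hypothetical model of the marking pass: a block becomes divergent if it is
// transitively control-dependent on a block that is divergent or whose branch
// condition is variant (e.g., depends on per-thread values such as threadIdx).
#include <cstdio>
#include <vector>

struct BasicBlock {
    bool variantBranch = false;          // branch condition depends on per-thread values
    bool divergent = false;              // derived by the pass below
    std::vector<int> controlDependents;  // blocks control-dependent on this one
};

void markDivergence(std::vector<BasicBlock>& cfg) {
    bool changed = true;
    while (changed) {                    // propagate until a fixed point is reached
        changed = false;
        for (auto& b : cfg) {
            if (!(b.variantBranch || b.divergent)) continue;
            for (int d : b.controlDependents) {
                if (!cfg[d].divergent) { cfg[d].divergent = true; changed = true; }
            }
        }
    }
}

int main() {
    // Tiny CFG: block 1 branches on a variant condition, and blocks 2 and 3 are
    // control-dependent on it; block 4 depends only on a uniform branch in block 0.
    std::vector<BasicBlock> cfg(5);
    cfg[0].controlDependents = {4};
    cfg[1].variantBranch = true;
    cfg[1].controlDependents = {2, 3};
    markDivergence(cfg);
    for (size_t i = 0; i < cfg.size(); ++i)
        printf("block %zu: %s\n", i,
               cfg[i].divergent ? "divergent" : "convergent (scalarization candidate)");
    return 0;
}
```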
  • Publication number: 20130305021
    Abstract: Basic blocks within a thread program are characterized for convergence based on variance analysis of corresponding instructions. Each basic block is marked as divergent based on transitive control dependence on a block that is either divergent or comprises a variant branch condition. Convergent basic blocks that are defined by invariant instructions are advantageously identified as candidates for scalarization by a thread program compiler.
    Type: Application
    Filed: May 9, 2012
    Publication date: November 14, 2013
    Inventors: Vinod GROVER, Yunsup LEE, Xiangyun KONG, Gautam CHAKRABARTI, Ronny M. KRASHINSKY
  • Publication number: 20130042090
    Abstract: One embodiment of the present invention sets forth a technique for optimizing parallel thread execution in a temporal single-instruction multiple thread (SIMT) architecture. When the threads in a parallel thread group execute temporally on a common processing pipeline rather than spatially on parallel processing pipelines, execution cycles may be reduced when some threads in the parallel thread group are inactive due to divergence. Similarly, an instruction can be dispatched for execution by only one thread in the parallel thread group when the threads in the parallel thread group are executing a scalar instruction. Reducing the number of threads that execute an instruction removes unnecessary or redundant operations for execution by the processing pipelines. Information about scalar operands and operations and divergence of the threads is used in the instruction dispatch logic to eliminate unnecessary or redundant activity in the processing pipelines.
    Type: Application
    Filed: August 12, 2011
    Publication date: February 14, 2013
    Inventor: Ronny M. KRASHINSKY