Patents by Inventor Vivek KINI
Vivek KINI has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20200364821
Abstract: The present invention facilitates efficient and effective utilization of unified virtual addresses across multiple components. In one exemplary implementation, an address allocation process comprises: establishing space for managed pointers across a plurality of memories, including allocating one of the managed pointers with a first portion of memory associated with a first one of a plurality of processors; and performing a process of automatically managing accesses to the managed pointers across the plurality of processors and corresponding memories. The automated management can include ensuring consistent information associated with the managed pointers is copied from the first portion of memory to a second portion of memory associated with a second one of the plurality of processors based upon initiation of an access to the managed pointers from the second one of the plurality of processors.
Type: Application
Filed: July 2, 2020
Publication date: November 19, 2020
Inventors: Stephen Jones, Vivek Kini, Piotr Jaroszynski, Mark Hairgrove, David Fontaine, Cameron Buschardt, Lucien Dunning, John Hubbard
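The copy-on-access behavior described in this abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the patented method or any real driver API: the class `ManagedPointer`, its methods, and the processor labels "cpu" and "gpu" are all hypothetical names introduced for illustration.

```python
class ManagedPointer:
    """Toy model of one managed allocation shared by several processors.

    Hypothetical illustration only; real systems track residency per page
    and perform the copies in hardware or in the driver.
    """

    def __init__(self, size, home):
        self.home = home                   # processor whose memory holds the valid data
        self.copies = {home: [0] * size}   # per-processor backing storage
        self.migrations = 0                # number of on-access copies performed

    def _ensure_resident(self, processor):
        # When a different processor initiates an access, copy the valid
        # data into that processor's memory before the access proceeds.
        if processor != self.home:
            self.copies[processor] = list(self.copies[self.home])
            self.home = processor
            self.migrations += 1

    def read(self, processor, index):
        self._ensure_resident(processor)
        return self.copies[processor][index]

    def write(self, processor, index, value):
        self._ensure_resident(processor)
        self.copies[processor][index] = value


ptr = ManagedPointer(size=4, home="cpu")
ptr.write("cpu", 0, 42)                  # data lives in the first processor's memory
value_seen_by_gpu = ptr.read("gpu", 0)   # access from the second processor triggers a copy
```

The caller never issues an explicit copy: the access itself is what moves the data, which is the point of the "automatically managing accesses" language in the abstract.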
-
Patent number: 10762593
Abstract: The present invention facilitates efficient and effective utilization of unified virtual addresses across multiple components. In one exemplary implementation, an address allocation process comprises: establishing space for managed pointers across a plurality of memories, including allocating one of the managed pointers with a first portion of memory associated with a first one of a plurality of processors; and performing a process of automatically managing accesses to the managed pointers across the plurality of processors and corresponding memories. The automated management can include ensuring consistent information associated with the managed pointers is copied from the first portion of memory to a second portion of memory associated with a second one of the plurality of processors based upon initiation of an access to the managed pointers from the second one of the plurality of processors.
Type: Grant
Filed: December 31, 2018
Date of Patent: September 1, 2020
Assignee: NVIDIA CORPORATION
Inventors: Stephen Jones, Vivek Kini, Piotr Jaroszynski, Mark Hairgrove, David Fontaine, Cameron Buschardt, Lucien Dunning, John Hubbard
-
Publication number: 20200265543
Abstract: The present invention facilitates efficient and effective utilization of unified virtual addresses across multiple components. In one exemplary implementation, an address allocation process comprises: establishing space for managed pointers across a plurality of memories, including allocating one of the managed pointers with a first portion of memory associated with a first one of a plurality of processors; and performing a process of automatically managing accesses to the managed pointers across the plurality of processors and corresponding memories. The automated management can include ensuring consistent information associated with the managed pointers is copied from the first portion of memory to a second portion of memory associated with a second one of the plurality of processors based upon initiation of an access to the managed pointers from the second one of the plurality of processors.
Type: Application
Filed: December 31, 2018
Publication date: August 20, 2020
Inventors: Stephen Jones, Vivek Kini, Piotr Jaroszynski, Mark Hairgrove, David Fontaine, Cameron Buschardt, Lucien Dunning, John Hubbard
-
Patent number: 10546361
Abstract: The present invention facilitates efficient and effective utilization of unified virtual addresses across multiple components. In one exemplary implementation, an address allocation process comprises: establishing space for managed pointers across a plurality of memories, including allocating one of the managed pointers with a first portion of memory associated with a first one of a plurality of processors; and performing a process of automatically managing accesses to the managed pointers across the plurality of processors and corresponding memories. The automated management can include ensuring consistent information associated with the managed pointers is copied from the first portion of memory to a second portion of memory associated with a second one of the plurality of processors based upon initiation of an access to the managed pointers from the second one of the plurality of processors.
Type: Grant
Filed: September 19, 2017
Date of Patent: January 28, 2020
Assignee: NVIDIA CORPORATION
Inventors: Stephen Jones, Vivek Kini, Piotr Jaroszynski, Mark Hairgrove, David Fontaine, Cameron Buschardt, Lucien Dunning, John Hubbard
-
Publication number: 20190147561
Abstract: The present invention facilitates efficient and effective utilization of unified virtual addresses across multiple components. In one exemplary implementation, an address allocation process comprises: establishing space for managed pointers across a plurality of memories, including allocating one of the managed pointers with a first portion of memory associated with a first one of a plurality of processors; and performing a process of automatically managing accesses to the managed pointers across the plurality of processors and corresponding memories. The automated management can include ensuring consistent information associated with the managed pointers is copied from the first portion of memory to a second portion of memory associated with a second one of the plurality of processors based upon initiation of an access to the managed pointers from the second one of the plurality of processors.
Type: Application
Filed: December 31, 2018
Publication date: May 16, 2019
Inventors: Stephen Jones, Vivek Kini, Piotr Jaroszynski, Mark Hairgrove, David Fontaine, Cameron Buschardt, Lucien Dunning, John Hubbard
-
Patent number: 9971576
Abstract: A software development environment (SDE) and a method of compiling integrated source code. One embodiment of the SDE includes: (1) a parser configured to partition an integrated source code into a host code partition and a device code partition, the host code partition including a reference to a device variable, (2) a translator configured to: (2a) embed device machine code, compiled based on the device code partition, into a modified host code, (2b) define a pointer in the modified host code configured to be initialized, upon execution of the integrated source code, to a memory address allocated to the device variable, and (2c) replace the reference with a dereference to the pointer, and (3) a host compiler configured to employ a host library to compile the modified host code.
Type: Grant
Filed: November 20, 2013
Date of Patent: May 15, 2018
Assignee: Nvidia Corporation
Inventors: Stephen Jones, Mark Hairgrove, Jaydeep Marathe, Vivek Kini, Bastiaan Aarts
-
Patent number: 9886736
Abstract: A method for handling parallel processing clients associated with a server in a GPU, the method comprising: receiving a failure indication for at least one client running a thread in the GPU; determining threads in the GPU associated with the failing client; exiting threads in the GPU associated with the failing client; and continuing to execute remaining threads in the GPU for other clients running threads in the GPU.
Type: Grant
Filed: September 9, 2014
Date of Patent: February 6, 2018
Assignee: NVIDIA CORPORATION
Inventors: Kyrylo Perelygin, Vivek Kini, Vyas Venkataraman
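The four steps in this abstract (receive a failure indication, find the failing client's threads, exit them, keep the rest running) can be sketched as a simple bookkeeping function. This is a minimal illustration under assumptions: the function name `handle_client_failure`, the thread-to-client dictionary, and the client labels are hypothetical and do not come from the patent or from any GPU runtime.

```python
def handle_client_failure(running_threads, failed_client):
    """Split running threads into those to exit and those that continue.

    running_threads maps a thread id to the client that owns it
    (a hypothetical bookkeeping structure used only for this sketch).
    """
    # Determine the threads associated with the failing client.
    exited = [tid for tid, client in running_threads.items()
              if client == failed_client]
    # Every other client's threads continue to execute.
    remaining = {tid: client for tid, client in running_threads.items()
                 if client != failed_client}
    return exited, remaining


threads = {0: "client_a", 1: "client_b", 2: "client_a", 3: "client_c"}
exited, remaining = handle_client_failure(threads, "client_a")
```

Only the threads owned by the failing client are removed; the other clients' entries are untouched, which mirrors the isolation the method claims.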
-
Publication number: 20180018750
Abstract: The present invention facilitates efficient and effective utilization of unified virtual addresses across multiple components. In one exemplary implementation, an address allocation process comprises: establishing space for managed pointers across a plurality of memories, including allocating one of the managed pointers with a first portion of memory associated with a first one of a plurality of processors; and performing a process of automatically managing accesses to the managed pointers across the plurality of processors and corresponding memories. The automated management can include ensuring consistent information associated with the managed pointers is copied from the first portion of memory to a second portion of memory associated with a second one of the plurality of processors based upon initiation of an access to the managed pointers from the second one of the plurality of processors.
Type: Application
Filed: September 19, 2017
Publication date: January 18, 2018
Inventors: Stephen Jones, Vivek Kini, Piotr Jaroszynski, Mark Hairgrove, David Fontaine, Cameron Buschardt, Lucien Dunning, John Hubbard
-
Patent number: 9632834
Abstract: One embodiment sets forth a method for assigning priorities to kernels launched by a software application and executed within a stream of work on a parallel processing subsystem. First, the software application assigns a desired priority to a stream using a call included in the API. The API receives this call and passes it to a driver. The driver maps the desired priority to an appropriate device priority associated with the parallel processing subsystem. Subsequently, if the software application launches a particular kernel within the stream, then the driver assigns the device priority associated with the stream to the kernel before adding the kernel to the stream for execution on the parallel processing subsystem. Advantageously, by assigning priorities to streams and, subsequently, strategically launching kernels within the prioritized streams, an application developer may fine-tune the software application to increase the overall processing efficiency of the software application.
Type: Grant
Filed: May 17, 2013
Date of Patent: April 25, 2017
Assignee: NVIDIA Corporation
Inventors: Vivek Kini, Forrest Iandola, Timothy James Murray
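The flow in this abstract (the application requests a stream priority, the driver maps it to a device priority, and each kernel launched in the stream inherits that device priority) can be sketched as follows. The class `ToyDriver`, its methods, and the "nearest supported value" mapping rule are assumptions made for illustration; the abstract does not specify how the mapping is chosen.

```python
class ToyDriver:
    """Hypothetical driver model: maps requested stream priorities to device priorities."""

    def __init__(self, device_priorities):
        self.device_priorities = list(device_priorities)  # priorities the device supports
        self.stream_priority = {}                         # stream -> mapped device priority
        self.queues = {}                                  # stream -> [(kernel, device priority)]

    def set_stream_priority(self, stream, desired):
        # Assumed mapping rule: pick the supported device priority
        # closest to the one the application asked for.
        mapped = min(self.device_priorities, key=lambda p: abs(p - desired))
        self.stream_priority[stream] = mapped
        self.queues.setdefault(stream, [])
        return mapped

    def launch(self, stream, kernel):
        # The kernel takes the stream's device priority before it is
        # added to the stream for execution.
        priority = self.stream_priority[stream]
        self.queues[stream].append((kernel, priority))
        return priority


driver = ToyDriver(device_priorities=[0, 1, 2])
mapped = driver.set_stream_priority("stream_a", desired=5)   # 5 is not supported by the device
kernel_priority = driver.launch("stream_a", "kernel_x")
```

The application only ever talks about streams; the per-kernel priority is derived by the driver at launch time.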
-
Patent number: 9575760
Abstract: One embodiment sets forth a method for assigning priorities to kernels launched by a software application and executed within a stream of work on a parallel processing subsystem that supports dynamic parallelism. First, the software application assigns a maximum nesting depth for dynamic parallelism. The software application then assigns a stream priority to a stream. These assignments cause a driver to map the stream priority to a device priority and, subsequently, associate the device priority with the stream. As part of the mapping, the driver ensures that each device priority is at least the maximum nesting depth higher than the device priorities associated with any lower priority streams. Subsequently, the driver launches any kernel included in the stream with the device priority associated with the stream. Advantageously, by strategically assigning the maximum nesting depth and prioritizing streams, an application developer may increase the overall processing efficiency of the software application.
Type: Grant
Filed: May 17, 2013
Date of Patent: February 21, 2017
Assignee: NVIDIA Corporation
Inventors: Vivek Kini, Christopher Lamb
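The spacing constraint in this abstract (each device priority sits at least the maximum nesting depth above the device priorities of lower priority streams) can be shown with a short arithmetic sketch. The function `map_stream_priorities`, the convention that a larger number means a higher priority, and the choice of exactly one nesting depth of spacing are all assumptions for illustration only.

```python
def map_stream_priorities(stream_priorities, max_nesting_depth):
    """Map stream priorities to device priorities spaced by the nesting depth.

    Assumes larger numbers mean higher priority (a convention chosen for
    this sketch, not taken from the abstract).
    """
    spacing = max(max_nesting_depth, 1)              # keep distinct levels distinct
    levels = sorted(set(stream_priorities.values())) # lowest stream priority first
    device_for_level = {level: rank * spacing for rank, level in enumerate(levels)}
    return {stream: device_for_level[priority]
            for stream, priority in stream_priorities.items()}


streams = {"low": 0, "mid": 1, "high": 2}
device = map_stream_priorities(streams, max_nesting_depth=3)
gap_mid_low = device["mid"] - device["low"]
gap_high_mid = device["high"] - device["mid"]
```

With a maximum nesting depth of 3, adjacent stream priorities end up three device priority levels apart, so the stated "at least the maximum nesting depth higher" condition holds for every pair of streams.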
-
Patent number: 9483423
Abstract: One embodiment sets forth a method for guiding the order in which a parallel processing subsystem executes memory copies. A driver creates semaphores for all but the lowest priority included in a plurality of priorities and associates one priority with each copy hardware channel included in the parallel processing subsystem. The driver then aliases prioritized streams to the copy hardware channels based on the priorities. Upon receiving a request to execute a memory copy within one of the streams, the driver inserts commands into the aliased copy hardware channel. These commands use the semaphores to direct the parallel processing subsystem to execute the memory copy based on the priority of the copy hardware channel. Advantageously, by assigning priorities to streams and, subsequently, strategically requesting memory copies within the prioritized streams, an application developer may fine-tune their software application to increase the overall processing efficiency of the software application.
Type: Grant
Filed: May 17, 2013
Date of Patent: November 1, 2016
Assignee: NVIDIA Corporation
Inventors: Vivek Kini, Christopher Lamb, Mark Hairgrove
-
Publication number: 20150206272
Abstract: A method for handling parallel processing clients associated with a server in a GPU, the method comprising: receiving a failure indication for at least one client running a thread in the GPU; determining threads in the GPU associated with the failing client; exiting threads in the GPU associated with the failing client; and continuing to execute remaining threads in the GPU for other clients running threads in the GPU.
Type: Application
Filed: September 9, 2014
Publication date: July 23, 2015
Inventors: Kyrylo PERELYGIN, Vivek KINI, Vyas VENKATARAMAN
-
Publication number: 20150206277
Abstract: The present invention facilitates efficient and effective utilization of unified virtual addresses across multiple components. In one embodiment, the presented new approach or solution uses Operating System (OS) allocation on the central processing unit (CPU) combined with graphics processing unit (GPU) driver mappings to provide a unified virtual address (VA) across both GPU and CPU. The new approach helps ensure that a GPU VA pointer does not collide with a CPU pointer provided by OS CPU allocation (e.g., like one returned by “malloc” C runtime API, etc.).
Type: Application
Filed: January 20, 2015
Publication date: July 23, 2015
Inventors: Amit RAO, Ashish SRIVASTAVA, Yogesh KINI, Alban DOUILLET, Geoffrey GERFIN, Mayank KAUSHIK, Nikita SHULGA, Vyas VENKATARAMAN, David FONTAINE, Mark HAIRGROVE, Piotr JAROSZYNSKI, Stephen JONES, Vivek KINI
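One way to picture the collision-avoidance idea in this abstract is a single address-space allocator that both CPU and GPU allocations go through. The sketch below is a toy model under assumptions: `UnifiedAddressSpace`, `os_allocate`, `gpu_allocate`, and the bump-pointer allocation policy are hypothetical stand-ins, not the OS or driver interfaces the application describes.

```python
class UnifiedAddressSpace:
    """Toy model: one virtual address range shared by CPU and GPU."""

    def __init__(self, base=0x1000):
        self.next_free = base    # next unreserved virtual address
        self.cpu_ranges = {}     # start address -> size, reserved by the OS allocator
        self.gpu_mappings = {}   # start address -> size, mapped by the GPU driver

    def os_allocate(self, size):
        # Stand-in for an OS-level CPU allocation (simple bump allocator).
        va = self.next_free
        self.cpu_ranges[va] = size
        self.next_free += size
        return va

    def gpu_allocate(self, size):
        # Reserve the range through the OS allocator first, then map the
        # same virtual address on the GPU, so the GPU pointer cannot land
        # on an address the OS has already handed out to the CPU.
        va = self.os_allocate(size)
        self.gpu_mappings[va] = size
        return va


def overlaps(a_start, a_size, b_start, b_size):
    """Return True when two address ranges share at least one byte."""
    return a_start < b_start + b_size and b_start < a_start + a_size


space = UnifiedAddressSpace()
cpu_ptr = space.os_allocate(256)    # an ordinary CPU allocation
gpu_ptr = space.gpu_allocate(512)   # a GPU allocation in the same address space
collision = overlaps(cpu_ptr, 256, gpu_ptr, 512)
```

Because every GPU range is first reserved on the CPU side, a later CPU allocation cannot be handed the same addresses, and one pointer value is meaningful on both processors.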
-
Publication number: 20150143347
Abstract: A software development environment (SDE) and a method of compiling integrated source code. One embodiment of the SDE includes: (1) a parser configured to partition an integrated source code into a host code partition and a device code partition, the host code partition including a reference to a device variable, (2) a translator configured to: (2a) embed device machine code, compiled based on the device code partition, into a modified host code, (2b) define a pointer in the modified host code configured to be initialized, upon execution of the integrated source code, to a memory address allocated to the device variable, and (2c) replace the reference with a dereference to the pointer, and (3) a host compiler configured to employ a host library to compile the modified host code.
Type: Application
Filed: November 20, 2013
Publication date: May 21, 2015
Applicant: NVIDIA CORPORATION
Inventors: Stephen Jones, Mark Hairgrove, Jaydeep Marathe, Vivek Kini, Bastiaan Aarts
-
Publication number: 20140344528
Abstract: One embodiment sets forth a method for guiding the order in which a parallel processing subsystem executes memory copies. A driver creates semaphores for all but the lowest priority included in a plurality of priorities and associates one priority with each copy hardware channel included in the parallel processing subsystem. The driver then aliases prioritized streams to the copy hardware channels based on the priorities. Upon receiving a request to execute a memory copy within one of the streams, the driver inserts commands into the aliased copy hardware channel. These commands use the semaphores to direct the parallel processing subsystem to execute the memory copy based on the priority of the copy hardware channel. Advantageously, by assigning priorities to streams and, subsequently, strategically requesting memory copies within the prioritized streams, an application developer may fine-tune their software application to increase the overall processing efficiency of the software application.
Type: Application
Filed: May 17, 2013
Publication date: November 20, 2014
Applicant: NVIDIA CORPORATION
Inventors: Vivek KINI, Christopher LAMB, Mark HAIRGROVE
-
Publication number: 20140344822
Abstract: One embodiment sets forth a method for assigning priorities to kernels launched by a software application and executed within a stream of work on a parallel processing subsystem. First, the software application assigns a desired priority to a stream using a call included in the API. The API receives this call and passes it to a driver. The driver maps the desired priority to an appropriate device priority associated with the parallel processing subsystem. Subsequently, if the software application launches a particular kernel within the stream, then the driver assigns the device priority associated with the stream to the kernel before adding the kernel to the stream for execution on the parallel processing subsystem. Advantageously, by assigning priorities to streams and, subsequently, strategically launching kernels within the prioritized streams, an application developer may fine-tune the software application to increase the overall processing efficiency of the software application.
Type: Application
Filed: May 17, 2013
Publication date: November 20, 2014
Applicant: NVIDIA CORPORATION
Inventors: Vivek KINI, Forrest IANDOLA, Timothy James MURRAY
-
Publication number: 20140344821
Abstract: One embodiment sets forth a method for assigning priorities to kernels launched by a software application and executed within a stream of work on a parallel processing subsystem that supports dynamic parallelism. First, the software application assigns a maximum nesting depth for dynamic parallelism. The software application then assigns a stream priority to a stream. These assignments cause a driver to map the stream priority to a device priority and, subsequently, associate the device priority with the stream. As part of the mapping, the driver ensures that each device priority is at least the maximum nesting depth higher than the device priorities associated with any lower priority streams. Subsequently, the driver launches any kernel included in the stream with the device priority associated with the stream. Advantageously, by strategically assigning the maximum nesting depth and prioritizing streams, an application developer may increase the overall processing efficiency of the software application.
Type: Application
Filed: May 17, 2013
Publication date: November 20, 2014
Applicant: NVIDIA CORPORATION
Inventors: Vivek KINI, Christopher LAMB