Patents by Inventor Jaydeep Marathe

Jaydeep Marathe has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190171466
    Abstract: Embodiments of the present invention provide a novel solution to generate multiple linked device code portions within a final executable file. Embodiments of the present invention are operable to extract device code from their respective host object filesets and then link them together to form multiple linked device code portions. Also, using the identification process described by embodiments of the present invention, device code embedded within host objects may also be uniquely identified and linked in accordance with the protocols of conventional programming languages. Furthermore, these multiple linked device code portions may then be converted into distinct executable forms of code that may be encapsulated within a single executable file.
    Type: Application
    Filed: February 5, 2019
    Publication date: June 6, 2019
    Inventors: Jaydeep Marathe, Michael Murphy, Sean Y. Lee
  • Patent number: 10261807
    Abstract: Embodiments of the present invention provide a novel solution to generate multiple linked device code portions within a final executable file. Embodiments of the present invention are operable to extract device code from their respective host object filesets and then link them together to form multiple linked device code portions. Also, using the identification process described by embodiments of the present invention, device code embedded within host objects may also be uniquely identified and linked in accordance with the protocols of conventional programming languages. Furthermore, these multiple linked device code portions may then be converted into distinct executable forms of code that may be encapsulated within a single executable file.
    Type: Grant
    Filed: March 25, 2013
    Date of Patent: April 16, 2019
    Assignee: NVIDIA Corporation
    Inventors: Jaydeep Marathe, Michael Murphy, Sean Y. Lee
  • Patent number: 10241761
    Abstract: A system and method for processing source code for compilation. The method includes accessing a portion of host source code and determining whether the portion of the host source code comprises a device lambda expression. The method further includes, in response to the portion of host code comprising the device lambda expression, determining a unique placeholder type instantiation based on the device lambda expression and modifying the device lambda expression based on the unique placeholder type instantiation to produce modified host source code. The method further includes sending the modified host source code to a host compiler.
    Type: Grant
    Filed: December 14, 2015
    Date of Patent: March 26, 2019
    Assignee: Nvidia Corporation
    Inventors: Jaydeep Marathe, Vinod Grover
  • Patent number: 10025643
    Abstract: A system and method for compiling source code (e.g., with a compiler). The method includes accessing a portion of device source code and determining whether the portion of the device source code comprises a piece of work to be launched on a device from the device. The method further includes determining a plurality of application programming interface (API) calls based on the piece of work to be launched on the device and generating compiled code based on the plurality of API calls. The compiled code comprises a first portion operable to execute on a central processing unit (CPU) and a second portion operable to execute on the device (e.g., GPU).
    Type: Grant
    Filed: January 7, 2013
    Date of Patent: July 17, 2018
    Assignee: Nvidia Corporation
    Inventors: Vinod Grover, Jaydeep Marathe, Sean Lee
  • Patent number: 9971576
    Abstract: A software development environment (SDE) and a method of compiling integrated source code. One embodiment of the SDE includes: (1) a parser configured to partition an integrated source code into a host code partition and a device code partition, the host code partition including a reference to a device variable, (2) a translator configured to: (2a) embed device machine code, compiled based on the device code partition, into a modified host code, (2b) define a pointer in the modified host code configured to be initialized, upon execution of the integrated source code, to a memory address allocated to the device variable, and (2c) replace the reference with a dereference to the pointer, and (3) a host compiler configured to employ a host library to compile the modified host code.
    Type: Grant
    Filed: November 20, 2013
    Date of Patent: May 15, 2018
    Assignee: Nvidia Corporation
    Inventors: Stephen Jones, Mark Hairgrove, Jaydeep Marathe, Vivek Kini, Bastiaan Aarts
  • Patent number: 9798569
    Abstract: A system for and method of retrieving values of captured local variables for a lambda function in Java. In one embodiment, the system includes: (1) a Java virtual machine and (2) a captured variable retriever that interacts with the Java virtual machine and is configured to retrieve a signature of the lambda function from a classfile of a Java class containing the lambda function, compare the signature with a declaration of the lambda function to identify arguments corresponding to the captured local variables, modify the lambda function and cause the Java virtual machine to execute the modified lambda function.
    Type: Grant
    Filed: February 15, 2016
    Date of Patent: October 24, 2017
    Assignee: Nvidia Corporation
    Inventors: Michael Lai, Vinod Grover, Sean Lee, Jaydeep Marathe
  • Patent number: 9747107
    Abstract: A system and method for compiling or runtime executing a fork-join data parallel program with function calls. In one embodiment, the system includes: (1) a partitioner operable to partition groups into a master group and at least one worker group and (2) a thread designator associated with the partitioner and operable to designate only one thread from the master group for execution and all threads in the at least one worker group for execution.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: August 29, 2017
    Assignee: Nvidia Corporation
    Inventors: Yuan Lin, Gautam Chakrabarti, Jaydeep Marathe, Okwan Kwon, Amit Sabne
  • Publication number: 20170235586
    Abstract: A system for and method of retrieving values of captured local variables for a lambda function in Java. In one embodiment, the system includes: (1) a Java virtual machine and (2) a captured variable retriever that interacts with the Java virtual machine and is configured to retrieve a signature of the lambda function from a classfile of a Java class containing the lambda function, compare the signature with a declaration of the lambda function to identify arguments corresponding to the captured local variables, modify the lambda function and cause the Java virtual machine to execute the modified lambda function.
    Type: Application
    Filed: February 15, 2016
    Publication date: August 17, 2017
    Inventors: Michael Lai, Vinod Grover, Sean Lee, Jaydeep Marathe
  • Patent number: 9727338
    Abstract: A system and method of translating functions of a program. In one embodiment, the system includes: (1) a local-scope variable identifier operable to identify local-scope variables employed in at least some of the functions as being either thread-shared local-scope variables or thread-private local-scope variables and (2) a function translator associated with the local-scope variable identifier and operable to translate those functions to cause thread-shared memory to be employed to store the thread-shared local-scope variables and thread-private memory to be employed to store the thread-private local-scope variables.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: August 8, 2017
    Assignee: Nvidia Corporation
    Inventors: Yuan Lin, Gautam Chakrabarti, Jaydeep Marathe, Okwan Kwon, Amit Sabne
  • Patent number: 9710275
    Abstract: A system and method for allocating shared memory of differing properties to shared data objects and a hybrid stack data structure. In one embodiment, the system includes: (1) a hybrid stack creator configured to create, in the shared memory, a hybrid stack data structure having a lower portion having a more favorable property and a higher portion having a less favorable property and (2) a data object allocator associated with the hybrid stack creator and configured to allocate storage for a shared data object in the lower portion if the lower portion has a sufficient remaining capacity to contain the shared data object and alternatively allocate storage for the shared data object in the higher portion if the lower portion has an insufficient remaining capacity to contain the shared data object.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: July 18, 2017
    Assignee: Nvidia Corporation
    Inventors: Jaydeep Marathe, Yuan Lin, Gautam Chakrabarti, Okwan Kwon, Amit Sabne
  • Publication number: 20170147299
    Abstract: A system and method for optimizing multiple invocations of a graphics processing unit (GPU) program in Java. In one embodiment, the system includes: (1) a frontend component in a computer system and configured to compile Java bytecode associated with a class object that implements a functional interface into Intermediate Representation (IR) code and store the IR code with the associated jogArray and (2) a collector/composer component in the computer system, associated with the frontend and configured to traverse a tree containing the multiple invocations from the result to collect the IR code and compose the IR code collected in the traversing into aggregate IR code when a result of the GPU program is explicitly requested to be transferred to a host.
    Type: Application
    Filed: November 24, 2015
    Publication date: May 25, 2017
    Inventors: Michael Lai, Vinod Grover, Sean Lee, Jaydeep Marathe
  • Patent number: 9483235
    Abstract: Embodiments of the present invention provide a novel solution that supports the separate compilation of host code and device code used within a heterogeneous programming environment. Embodiments of the present invention are operable to link device code embedded within multiple host object files using a separate device linking operation. Embodiments of the present invention may extract device code from their respective host object files and then link them together to form linked device code. This linked device code may then be embedded back into a host object generated by embodiments of the present invention which may then be passed to a host linker to form a host executable file. As such, device code may be split into multiple files and then linked together to form a final executable file by embodiments of the present invention.
    Type: Grant
    Filed: March 25, 2013
    Date of Patent: November 1, 2016
    Assignee: NVIDIA Corporation
    Inventors: Michael Murphy, Sean Y. Lee, Stephen Jones, Girish Bharambe, Jaydeep Marathe
  • Patent number: 9436475
    Abstract: A system and method for executing sequential code in the context of a single-instruction, multiple-thread (SIMT) processor. In one embodiment, the system includes: (1) a pipeline control unit operable to create a group of counterpart threads of the sequential code, one of the counterpart threads being a master thread, remaining ones of the counterpart threads being slave threads and (2) lanes operable to: (2a) execute certain instructions of the sequential code only in the master thread, corresponding instructions in the slave threads being predicated upon the certain instructions and (2b) broadcast branch conditions in the master thread to the slave threads.
    Type: Grant
    Filed: December 21, 2012
    Date of Patent: September 6, 2016
    Assignee: NVIDIA Corporation
    Inventors: Gautam Chakrabarti, Yuan Lin, Jaydeep Marathe, Okwan Kwon, Amit Sabne
  • Publication number: 20160188352
    Abstract: A system and method for processing source code for compilation. The method includes accessing a portion of host source code and determining whether the portion of the host source code comprises a device lambda expression. The method further includes, in response to the portion of host code comprising the device lambda expression, determining a unique placeholder type instantiation based on the device lambda expression and modifying the device lambda expression based on the unique placeholder type instantiation to produce modified host source code. The method further includes sending the modified host source code to a host compiler.
    Type: Application
    Filed: December 14, 2015
    Publication date: June 30, 2016
    Inventors: Jaydeep Marathe, Vinod Grover
  • Patent number: 9367306
    Abstract: A technique is disclosed for executing a program designed for multi-threaded operation on a general purpose processor. Original source code for the program is transformed from a multi-threaded structure into a computationally equivalent single-threaded structure. A transform operation modifies the original source code to insert code constructs for serial thread execution. The transform operation also replaces synchronization barrier constructs in the original source code with synchronization barrier code that is configured to facilitate serialization. The transformed source code may then be conventionally compiled and advantageously executed on the general purpose processor.
    Type: Grant
    Filed: March 30, 2011
    Date of Patent: June 14, 2016
    Assignee: NVIDIA Corporation
    Inventors: Jaydeep Marathe, Vinod Grover
  • Patent number: 9229698
    Abstract: A method for processing a function with a plurality of execution spaces is disclosed. The method comprises creating an internal compiler representation for the function. Creating the internal compiler representation comprises copying substantially all lexical tokens corresponding to a body of the function. Further, the creating comprises inserting the lexical tokens into a plurality of conditional if-statements, wherein a conditional if-statement is generated for each corresponding execution space of said plurality of execution spaces, and wherein each conditional if-statement determines which execution space the function is executing in. During compilation, the method finally comprises performing overload resolution at a call site of an overloaded function by checking for compatibility with a first execution space specified by one of the plurality of conditional if-statements, wherein the overloaded function is called within the body of the function.
    Type: Grant
    Filed: November 25, 2013
    Date of Patent: January 5, 2016
    Assignee: NVIDIA Corporation
    Inventor: Jaydeep Marathe
  • Publication number: 20150149987
    Abstract: A method for processing a function with a plurality of execution spaces is disclosed. The method comprises creating an internal compiler representation for the function. Creating the internal compiler representation comprises copying substantially all lexical tokens corresponding to a body of the function. Further, the creating comprises inserting the lexical tokens into a plurality of conditional if-statements, wherein a conditional if-statement is generated for each corresponding execution space of said plurality of execution spaces, and wherein each conditional if-statement determines which execution space the function is executing in. During compilation, the method finally comprises performing overload resolution at a call site of an overloaded function by checking for compatibility with a first execution space specified by one of the plurality of conditional if-statements, wherein the overloaded function is called within the body of the function.
    Type: Application
    Filed: November 25, 2013
    Publication date: May 28, 2015
    Inventor: Jaydeep Marathe
  • Publication number: 20150143347
    Abstract: A software development environment (SDE) and a method of compiling integrated source code. One embodiment of the SDE includes: (1) a parser configured to partition an integrated source code into a host code partition and a device code partition, the host code partition including a reference to a device variable, (2) a translator configured to: (2a) embed device machine code, compiled based on the device code partition, into a modified host code, (2b) define a pointer in the modified host code configured to be initialized, upon execution of the integrated source code, to a memory address allocated to the device variable, and (2c) replace the reference with a dereference to the pointer, and (3) a host compiler configured to employ a host library to compile the modified host code.
    Type: Application
    Filed: November 20, 2013
    Publication date: May 21, 2015
    Applicant: NVIDIA Corporation
    Inventors: Stephen Jones, Mark Hairgrove, Jaydeep Marathe, Vivek Kini, Bastiaan Aarts
  • Publication number: 20140129783
    Abstract: A system and method for allocating shared memory of differing properties to shared data objects and a hybrid stack data structure. In one embodiment, the system includes: (1) a hybrid stack creator configured to create, in the shared memory, a hybrid stack data structure having a lower portion having a more favorable property and a higher portion having a less favorable property and (2) a data object allocator associated with the hybrid stack creator and configured to allocate storage for a shared data object in the lower portion if the lower portion has a sufficient remaining capacity to contain the shared data object and alternatively allocate storage for the shared data object in the higher portion if the lower portion has an insufficient remaining capacity to contain the shared data object.
    Type: Application
    Filed: December 21, 2012
    Publication date: May 8, 2014
    Applicant: NVIDIA
    Inventors: Jaydeep Marathe, Gautam Chakrabarti, Yuan Lin, Okwan Kwon, Amit Sabne
  • Publication number: 20140130052
    Abstract: A system and method for compiling or runtime executing a fork-join data parallel program with function calls. In one embodiment, the system includes: (1) a partitioner operable to partition groups into a master group and at least one worker group and (2) a thread designator associated with the partitioner and operable to designate only one thread from the master group for execution and all threads in the at least one worker group for execution.
    Type: Application
    Filed: December 21, 2012
    Publication date: May 8, 2014
    Applicant: Nvidia Corporation
    Inventors: Yuan Lin, Gautam Chakrabarti, Jaydeep Marathe, Okwan Kwon, Amit Sabne
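
The allocation rule stated in the hybrid stack abstract (patent 9710275, publication 20140129783) can be illustrated with a small sketch: place a shared data object in the lower, more favorable portion when it fits, and otherwise in the higher, less favorable portion. This is only a toy model under stated assumptions, not the patented implementation. The class name HybridStack, the byte-count capacities, the offset bookkeeping, and the release method are all hypothetical and do not come from the listing.

```python
from dataclasses import dataclass, field


@dataclass
class HybridStack:
    """Toy model of a stack with a lower (more favorable) and a higher
    (less favorable) portion. All names and units here are hypothetical."""

    lower_capacity: int          # size of the more favorable portion
    higher_capacity: int         # size of the less favorable portion
    lower_used: int = 0
    higher_used: int = 0
    frames: list = field(default_factory=list)  # (region, size) in push order

    def allocate(self, size):
        """Return (region, offset) for a new shared data object."""
        if self.lower_capacity - self.lower_used >= size:
            # The lower portion has sufficient remaining capacity.
            region, offset = "lower", self.lower_used
            self.lower_used += size
        elif self.higher_capacity - self.higher_used >= size:
            # Fall back to the higher portion.
            region, offset = "higher", self.higher_used
            self.higher_used += size
        else:
            raise MemoryError("object fits in neither portion")
        self.frames.append((region, size))
        return region, offset

    def release(self):
        """Pop the most recently allocated object (stack discipline)."""
        region, size = self.frames.pop()
        if region == "lower":
            self.lower_used -= size
        else:
            self.higher_used -= size


stack = HybridStack(lower_capacity=64, higher_capacity=256)
first = stack.allocate(48)   # fits in the lower portion
second = stack.allocate(32)  # only 16 left below, so it goes higher
```

Tracking each allocation's region in push order lets release undo allocations in last-in, first-out order regardless of which portion they landed in.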