Patents by Inventor Girish Bhaskarrao Bharambe

Girish Bhaskarrao Bharambe has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240169022
    Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to cause one or more other computational operations to wait until matrix multiply-accumulate (MMA) memory transactions are performed.
    Type: Application
    Filed: November 30, 2022
    Publication date: May 23, 2024
    Inventors: Harold Carter Edwards, Kyrylo Perelygin, Maciej Tyrlik, Gokul Ramaswamy Hirisave Chandra Shekhara, Balaji Krishna Yugandhar Atukuri, Rishkul Kulkarni, Konstantinos Kyriakopoulos, Edward H. Gornish, David Allan Berson, Bageshri Sathe, James Player, Aman Arora, Alan Kaatz, Andrew Kerr, Haicheng Wu, Cris Cecka, Vijay Thakkar, Sean Treichler, Jack H. Choquette, Aditya Avinash Atluri, Apoorv Parle, Ronny Meir Krashinsky, Cody Addison, Girish Bhaskarrao Bharambe
  • Publication number: 20240169023
    Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to indicate whether matrix multiply-accumulate (MMA) memory operations are complete.
    Type: Application
    Filed: November 30, 2022
    Publication date: May 23, 2024
    Inventors: Harold Carter Edwards, Kyrylo Perelygin, Maciej Tyrlik, Gokul Ramaswamy Hirisave Chandra Shekhara, Balaji Krishna Yugandhar Atukuri, Rishkul Kulkarni, Konstantinos Kyriakopoulos, Edward H. Gornish, David Allan Berson, Bageshri Sathe, James Player, Aman Arora, Alan Kaatz, Andrew Kerr, Haicheng Wu, Cris Cecka, Vijay Thakkar, Sean Treichler, Jack H. Choquette, Aditya Avinash Atluri, Apoorv Parle, Ronny Meir Krashinsky, Cody Addison, Girish Bhaskarrao Bharambe
  • Publication number: 20240168763
    Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to cause two or more other computational operations to be performed by two or more streaming multiprocessors (SMs).
    Type: Application
    Filed: November 30, 2022
    Publication date: May 23, 2024
    Inventors: Harold Carter Edwards, Kyrylo Perelygin, Maciej Tyrlik, Gokul Ramaswamy Hirisave Chandra Shekhara, Balaji Krishna Yugandhar Atukuri, Rishkul Kulkarni, Konstantinos Kyriakopoulos, Edward H. Gornish, David Allan Berson, Bageshri Sathe, James Player, Aman Arora, Alan Kaatz, Andrew Kerr, Haicheng Wu, Cris Cecka, Vijay Thakkar, Sean Treichler, Jack H. Choquette, Aditya Avinash Atluri, Apoorv Parle, Ronny Meir Krashinsky, Cody Addison, Girish Bhaskarrao Bharambe
  • Publication number: 20240168762
    Abstract: Apparatuses, systems, and techniques to perform computational operations in response to one or more compute uniform device architecture (CUDA) programs. In at least one embodiment, one or more computational operations are to cause one or more other computational operations to wait until a portion of matrix multiply-accumulate (MMA) operations have been performed.
    Type: Application
    Filed: November 30, 2022
    Publication date: May 23, 2024
    Inventors: Harold Carter Edwards, Kyrylo Perelygin, Maciej Tyrlik, Gokul Ramaswamy Hirisave Chandra Shekhara, Balaji Krishna Yugandhar Atukuri, Rishkul Kulkarni, Konstantinos Kyriakopoulos, Edward H. Gornish, David Allan Berson, Bageshri Sathe, James Player, Aman Arora, Alan Kaatz, Andrew Kerr, Haicheng Wu, Cris Cecka, Vijay Thakkar, Sean Treichler, Jack H. Choquette, Aditya Avinash Atluri, Apoorv Parle, Ronny Meir Krashinsky, Cody Addison, Girish Bhaskarrao Bharambe
  • Publication number: 20240036955
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate one or more limitations of one or more attributes of one or more groups of blocks of one or more threads.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036953
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate a scheduling policy of one or more blocks of one or more threads.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036952
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to determine which of two or more blocks of threads are to be scheduled in parallel.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036916
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate a maximum number of blocks of threads capable of being scheduled in parallel.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036917
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate a maximum number of blocks of threads to be scheduled in parallel.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036956
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate whether one or more threads within a group of blocks of threads have performed a barrier instruction and to cause performance of one or more threads within the group of blocks of threads to stop at least until all threads within the group of blocks have performed the barrier instruction.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036915
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to determine a scheduling policy of one or more blocks of one or more threads.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036945
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to cause performance of one or more threads within a group of blocks of threads to stop at least until all threads within the group of blocks have performed a barrier instruction.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036944
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate whether one or more threads within two or more blocks of threads have performed a barrier instruction.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036957
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to cause memory to be shared between two or more groups of blocks of threads.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036954
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate one or more attributes of one or more groups of blocks of one or more threads.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036918
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to cause a kernel to be generated to cause two or more blocks of two or more threads to be scheduled in parallel.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20240036951
    Abstract: Apparatuses, systems, and techniques to execute CUDA programs. In at least one embodiment, an application programming interface is performed to indicate two or more blocks of threads to be scheduled in parallel.
    Type: Application
    Filed: September 28, 2022
    Publication date: February 1, 2024
    Inventors: Ze Long, Kyrylo Perelygin, Harold Carter Edwards, Gokul Ramaswamy Hirisave Chandra Shekhara, Jaydeep Marathe, Ronny Meir Krashinsky, Girish Bhaskarrao Bharambe
  • Publication number: 20230401044
    Abstract: Systems and methods related to generating machine code using a coroutine suspension mechanism are disclosed below. An asynchronous programming model utilizing coroutines may be implemented in a compiler for a high-level programming language. The compiler is configured to include functionality related to an intrinsic function for a suspend operation of a coroutine. In accordance with an aspect of the disclosure, a method is disclosed for generating machine code that includes the coroutine mechanism. The method includes: receiving source code for a program in a high-level programming language, and compiling the source code with a compiler to generate machine code for a target processor. The source code includes a caller and a coroutine called by the caller. The compiler is configured to detect an intrinsic function for a suspend operation in the source code for the coroutine. The compiler inserts low-level code in the machine code in accordance with an ABI.
    Type: Application
    Filed: June 14, 2022
    Publication date: December 14, 2023
    Inventors: Konstantinos Kyriakopoulos, Michael F. Haidl, Ralf Andreas Karrenberg, Zhiwei Cao, Gokul Ramaswamy Hirisave Chandra Shekhara, Girish Bhaskarrao Bharambe, Justin Andrew Holewinski, Bharath Vasudevan
  • Publication number: 20230305845
    Abstract: Apparatuses, systems, and techniques to cause data to be selectively stored in one or more memory locations. In at least one embodiment, a processor is to cause data to be selectively stored in one or more memory locations based, at least in part, on one or more threads to use the data.
    Type: Application
    Filed: March 31, 2022
    Publication date: September 28, 2023
    Inventors: Harold Carter Edwards, Stephen Anthony Bernard Jones, David Anthony Fontaine, Sebastian Piotr Jodlowski, Aditya Avinash Atluri, Andrew Robert Kerr, Michael Andrew Clark, Gonzalo Brito Gadeschi, Olivier Giroux, Jaydeep Marathe, Thibaut Lutz, Hariharan Sandanagobalane, Gokul Ramaswamy Hirisave Chandra Shekhara, Girish Bhaskarrao Bharambe, Rishkul Kulkarni, Konstantinos Kyriakopoulos
  • Publication number: 20230221960
    Abstract: Apparatuses, systems, and techniques to enable a program to access data regardless of where said data is stored. In at least one embodiment, a system enables a program to access data regardless of where said data is stored, based on, for example, one or more locations encoding one or more addresses of said data.
    Type: Application
    Filed: January 13, 2022
    Publication date: July 13, 2023
    Inventors: Maciej Marcin Piechotka, Kyrylo Perelygin, Ze Long, Raphael Dominique Pierre Boissel, Michael Murphy, Anis Ladram, Isaac Gelado, Girish Bhaskarrao Bharambe, Sebastian Piotr Jodlowski