Patents by Inventor Jayaram Bobba

Jayaram Bobba has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

COMPRESSION OF MACHINE LEARNING MODELS UTILIZING PSEUDO-LABELED DATA TRAINING

Publication number: 20240070926

Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.

Type: Application

Filed: September 13, 2023

Publication date: February 29, 2024

Applicant: Intel Corporation

Inventors: Joydeep Ray, Ben Ashbaugh, Prasoonkumar Surti, Pradeep Ramani, Rama Harihara, Jerin C. Justin, Jing Huang, Xiaoming Cui, Timothy B. Costa, Ting Gong, Elmoustapha Ould-ahmed-vall, Kumar Balasubramanian, Anil Thomas, Oguz H. Elibol, Jayaram Bobba, Guozhong Zhuang, Bhavani Subramanian, Gokce Keskin, Chandrasekaran Sakthivel, Rajesh Poornachandran
Dynamic assignment of down sampling intervals for data stream processing

Patent number: 11798198

Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.

Type: Grant

Filed: January 10, 2023

Date of Patent: October 24, 2023

Assignee: INTEL CORPORATION

Inventors: Joydeep Ray, Ben Ashbaugh, Prasoonkumar Surti, Pradeep Ramani, Rama Harihara, Jerin C. Justin, Jing Huang, Xiaoming Cui, Timothy B. Costa, Ting Gong, Elmoustapha Ould-ahmed-vall, Kumar Balasubramanian, Anil Thomas, Oguz H. Elibol, Jayaram Bobba, Guozhong Zhuang, Bhavani Subramanian, Gokce Keskin, Chandrasekaran Sakthivel, Rajesh Poornachandran
DYNAMIC ASSIGNMENT OF DOWN SAMPLING INTERVALS FOR DATA STREAM PROCESSING

Publication number: 20230230289

Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.

Type: Application

Filed: January 10, 2023

Publication date: July 20, 2023

Applicant: Intel Corporation

Inventors: Joydeep Ray, Ben Ashbaugh, Prasoonkumar Surti, Pradeep Ramani, Rama Harihara, Jerin C. Justin, Jing Huang, Xiaoming Cui, Timothy B. Costa, Ting Gong, Elmoustapha Ould-ahmed-vall, Kumar Balasubramanian, Anil Thomas, Oguz H. Elibol, Jayaram Bobba, Guozhong Zhuang, Bhavani Subramanian, Gokce Keskin, Chandrasekaran Sakthivel, Rajesh Poornachandran
Policy-based system interface for a real-time autonomous system

Patent number: 11557064

Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.

Type: Grant

Filed: January 23, 2020

Date of Patent: January 17, 2023

Inventors: Joydeep Ray, Ben Ashbaugh, Prasoonkumar Surti, Pradeep Ramani, Rama Harihara, Jerin C. Justin, Jing Huang, Xiaoming Cui, Timothy B. Costa, Ting Gong, Elmoustapha Ould-ahmed-vall, Kumar Balasubramanian, Anil Thomas, Oguz H. Elibol, Jayaram Bobba, Guozhong Zhuang, Bhavani Subramanian, Gokce Keskin, Chandrasekaran Sakthivel, Rajesh Poornachandran
Systolic array accelerator systems and methods

Patent number: 11003619

Abstract: The present disclosure is directed to systems and methods for decomposing systolic array circuitry to provide a plurality of N×N systolic sub-array circuits, apportioning a first tensor or array into a plurality of N×M first input arrays, and apportioning a second tensor or array into a plurality of M×N second input arrays. Systolic array control circuitry transfers corresponding ones of the first input arrays and second input arrays to a respective one of the plurality of N×N systolic sub-array circuits. As the elements included in the first input array and the elements included in the second input array are transferred to the systolic sub-array, the systolic sub-array performs one or more mathematical operations using the first and the second input arrays. The systems and methods beneficially improve the usage of the systolic array circuitry thereby advantageously reducing the number of clock cycles needed to perform a given number of calculations.

Type: Grant

Filed: February 24, 2019

Date of Patent: May 11, 2021

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Jayaram Bobba, Ankit More
SYSTOLIC ARRAY ACCELERATOR SYSTEMS AND METHODS

Publication number: 20200272596

Abstract: The present disclosure is directed to systems and methods for decomposing systolic array circuitry to provide a plurality of N×N systolic sub-array circuits, apportioning a first tensor or array into a plurality of N×M first input arrays, and apportioning a second tensor or array into a plurality of M×N second input arrays. Systolic array control circuitry transfers corresponding ones of the first input arrays and second input arrays to a respective one of the plurality of N×N systolic sub-array circuits. As the elements included in the first input array and the elements included in the second input array are transferred to the systolic sub-array, the systolic sub-array performs one or more mathematical operations using the first and the second input arrays. The systems and methods beneficially improve the usage of the systolic array circuitry thereby advantageously reducing the number of clock cycles needed to perform a given number of calculations.

Type: Application

Filed: February 24, 2019

Publication date: August 27, 2020

Applicant: INTEL CORPORATION

Inventors: Srinivasan Narayanamoorthy, Jayaram Bobba, Ankit More
POLICY-BASED SYSTEM INTERFACE FOR A REAL-TIME AUTONOMOUS SYSTEM

Publication number: 20200258263

Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.

Type: Application

Filed: January 23, 2020

Publication date: August 13, 2020

Applicant: Intel Corporation

Inventors: Joydeep Ray, Ben Ashbaugh, Prasoonkumar Surti, Pradeep Ramani, Rama Harihara, Jerin C. Justin, Jing Huang, Xiaoming Cui, Timothy B. Costa, Ting Gong, Elmoustapha Ould-ahmed-vall, Kumar Balasubramanian, Anil Thomas, Oguz H. Elibol, Jayaram Bobba, Guozhong Zhuang, Bhavani Subramanian, Gokce Keskin, Chandrasekaran Sakthivel, Rajesh Poornachandran
Technologies for dynamic acceleration of general-purpose code using binary translation targeted to hardware accelerators with runtime execution offload

Patent number: 10740152

Abstract: Technologies for dynamic acceleration of general-purpose code include a computing device having a general-purpose processor core and one or more hardware accelerators. The computing device identifies an acceleration candidate in an application that is targeted to the processor core. The acceleration candidate may be a long-running computation of the application. The computing device translates the acceleration candidate into a translated executable targeted to the hardware accelerator. The computing device determines whether to offload execution of the acceleration candidate and, if so, executes the translated executable with the hardware accelerator. The computing device may translate the acceleration candidate into multiple translated executables, each targeted to a different hardware accelerator. The computing device may select among the translated executables in response to determining to offload execution.

Type: Grant

Filed: December 6, 2016

Date of Patent: August 11, 2020

Assignee: Intel Corporation

Inventors: Jayaram Bobba, Niranjan K. Soundararajan
Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program to multiple parallel threads

Patent number: 10725755

Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.

Type: Grant

Filed: June 6, 2017

Date of Patent: July 28, 2020

Assignee: Intel Corporation

Inventors: David J. Sager, Ruchira Sasanka, Ron Gabor, Shlomo Raikin, Joseph Nuzman, Leeor Peled, Jason A. Domer, Ho-Seop Kim, Youfeng Wu, Koichi Yamada, Tin-Fook Ngai, Howard H. Chen, Jayaram Bobba, Jeffrey J. Cook, Omar M. Shaikh, Suresh Srinivas
Compression in machine learning and deep learning processing

Patent number: 10546393

Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.

Type: Grant

Filed: December 30, 2017

Date of Patent: January 28, 2020

Assignee: INTEL CORPORATION

Inventors: Joydeep Ray, Ben Ashbaugh, Prasoonkumar Surti, Pradeep Ramani, Rama Harihara, Jerin C. Justin, Jing Huang, Xiaoming Cui, Timothy B. Costa, Ting Gong, Elmoustapha Ould-Ahmed-Vall, Kumar Balasubramanian, Anil Thomas, Oguz H. Elibol, Jayaram Bobba, Guozhong Zhuang, Bhavani Subramanian, Gokce Keskin, Chandrasekaran Sakthivel, Rajesh Poornachandran
COMPRESSION IN MACHINE LEARNING AND DEEP LEARNING PROCESSING

Publication number: 20190206090

Abstract: Embodiments are generally directed to compression in machine learning and deep learning processing. An embodiment of an apparatus for compression of untyped data includes a graphical processing unit (GPU) including a data compression pipeline, the data compression pipeline including a data port coupled with one or more shader cores, wherein the data port is to allow transfer of untyped data without format conversion, and a 3D compression/decompression unit to provide for compression of untyped data to be stored to a memory subsystem and decompression of untyped data from the memory subsystem.

Type: Application

Filed: December 30, 2017

Publication date: July 4, 2019

Applicant: Intel Corporation

Inventors: Joydeep Ray, Ben Ashbaugh, Prasoonkumar Surti, Pradeep Ramani, Rama Harihara, Jerin C. Justin, Jing Huang, Xiaoming Cui, Timothy B. Costa, Ting Gong, Elmoustapha Ould-Ahmed-Vall, Kumar Balasubramanian, Anil Thomas, Oguz H. Elibol, Jayaram Bobba, Guozhong Zhuang, Bhavani Subramanian, Gokce Keskin, Chandrasekaran Sakthivel, Rajesh Poornachandran
Inter-architecture compatability module to allow code module of one architecture to use library module of another architecture

Patent number: 10120663

Abstract: An inter-architecture compatibility apparatus of an aspect includes a control flow transfer reception module to receive a first call procedure operation, intended for a first architecture library module, from a first architecture code module. The first call procedure operation involves a first plurality of input parameters. An application binary interface (ABI) change module is coupled with the control flow transfer reception module. The ABI change module makes ABI changes to convert the first call procedure operation involving the first plurality of input parameters to a corresponding second call procedure operation involving a second plurality of input parameters. The second call procedure operation is compatible with a second architecture library module. A control flow transfer output module is coupled with the ABI change module. The control flow transfer output module provides the second call procedure operation to the second architecture library module.

Type: Grant

Filed: March 28, 2014

Date of Patent: November 6, 2018

Assignee: Intel Corporation

Inventors: Niranjan Hasabnis, Suresh Srinivas, Jayaram Bobba
TECHNOLOGIES FOR DYNAMIC ACCELERATION OF GENERAL-PURPOSE CODE USING HARDWARE ACCELERATORS

Publication number: 20180157531

Abstract: Technologies for dynamic acceleration of general-purpose code include a computing device having a general-purpose processor core and one or more hardware accelerators. The computing device identifies an acceleration candidate in an application that is targeted to the processor core. The acceleration candidate may be a long-running computation of the application. The computing device translates the acceleration candidate into a translated executable targeted to the hardware accelerator. The computing device determines whether to offload execution of the acceleration candidate and, if so, executes the translated executable with the hardware accelerator. The computing device may translate the acceleration candidate into multiple translated executables, each targeted to a different hardware accelerator. The computing device may select among the translated executables in response to determining to offload execution.

Type: Application

Filed: December 6, 2016

Publication date: June 7, 2018

Inventors: Jayaram Bobba, Niranjan K. Soundararajan
SYSTEMS, APPARATUSES, AND METHODS FOR A HARDWARE AND SOFTWARE SYSTEM TO AUTOMATICALLY DECOMPOSE A PROGRAM TO MULTIPLE PARALLEL THREADS

Publication number: 20180060049

Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.

Type: Application

Filed: June 6, 2017

Publication date: March 1, 2018

Inventors: DAVID J. SAGER, RUCHIRA SASANKA, RON GABOR, SHLOMO RAIKIN, JOSEPH NUZMAN, LEEOR PELED, JASON A. DOMER, HO-SEOP KIM, YOUFENG WU, KOICHI YAMADA, TIN-FOOK NGAI, HOWARD H. CHEN, JAYARAM BOBBA, JEFFREY J. COOK, OMAR M. SHAIKH, SURESH SRINIVAS
Using control flow data structures to direct and track instruction execution

Patent number: 9880842

Abstract: A mechanism for tracking the control flow of instructions in an application and performing one or more optimizations of a processing device, based on the control flow of the instructions in the application, is disclosed. Control flow data is generated to indicate the control flow of blocks of instructions in the application. The control flow data may include annotations that indicate whether optimizations may be performed for different blocks of instructions. The control flow data may also be used to track the execution of the instructions to determine whether an instruction in a block of instructions is assigned to a thread, a process, and/or an execution core of a processor, and to determine whether errors have occurred during the execution of the instructions.

Type: Grant

Filed: March 15, 2013

Date of Patent: January 30, 2018

Assignee: Intel Corporation

Inventors: Jayaram Bobba, Ruchira Sasanka, Jeffrey J. Cook, Abhinav Das, Arvind Krishnaswamy, David J. Sager, Jason M. Agron
Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program to multiple parallel threads

Patent number: 9672019

Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. In some embodiments, the systems and apparatuses execute a method of original code decomposition and/or generated thread execution.

Type: Grant

Filed: December 25, 2010

Date of Patent: June 6, 2017

Assignee: Intel Corporation

Inventors: David J. Sager, Ruchira Sasanka, Ron Gabor, Shlomo Raikin, Joseph Nuzman, Leeor Peled, Jason A. Domer, Ho-Seop Kim, Youfeng Wu, Koichi Yamada, Tin-Fook Ngai, Howard H. Chen, Jayaram Bobba, Jeffery J. Cook, Omar M. Shaikh, Suresh Srinivas
Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program to multiple parallel threads

Patent number: 9189233

Abstract: Systems, apparatuses, and methods for a hardware and software system to automatically decompose a program into multiple parallel threads are described. For example, a method according to one embodiment comprises: analyzing a single-threaded region of executing program code, the analysis including identifying dependencies within the single-threaded region; determining portions of the single-threaded region of executing program code which may be executed in parallel based on the analysis; assigning the portions to two or more parallel execution tracks; and executing the portions in parallel across the assigned execution tracks.

Type: Grant

Filed: June 26, 2012

Date of Patent: November 17, 2015

Assignee: INTEL CORPORATION

Inventors: Ruchira Sasanka, Abhinav Das, Jeffrey J. Cook, Jayaram Bobba, Arvind Krishnaswamy, David J. Sager, Suresh Srinivas
Analyzing potential benefits of vectorization

Patent number: 9170789

Abstract: Embodiments of computer-implemented methods, systems, computing devices, and computer-readable media (transitory and non-transitory) are described herein for analyzing execution of a plurality of executable instructions and, based on the analysis, providing an indication of a benefit to be obtained by vectorization of at least a subset of the plurality of executable instructions. In various embodiments, the analysis may include identification of the subset of the plurality of executable instructions suitable for conversion to one or more single-instruction multiple-data (“SIMD”) instructions.

Type: Grant

Filed: March 5, 2013

Date of Patent: October 27, 2015

Assignee: Intel Corporation

Inventors: Ruchira Sasanka, Jeffrey J. Cook, Abhinav Das, Jayaram Bobba, Michael R. Greenfield, Suresh Srinivas
INTER-ARCHITECTURE COMPATABILITY MODULE TO ALLOW CODE MODULE OF ONE ARCHITECTURE TO USE LIBRARY MODULE OF ANOTHER ARCHITECTURE

Publication number: 20150277867

Abstract: An inter-architecture compatibility apparatus of an aspect includes a control flow transfer reception module to receive a first call procedure operation, intended for a first architecture library module, from a first architecture code module. The first call procedure operation involves a first plurality of input parameters. An application binary interface (ABI) change module is coupled with the control flow transfer reception module. The ABI change module makes ABI changes to convert the first call procedure operation involving the first plurality of input parameters to a corresponding second call procedure operation involving a second plurality of input parameters. The second call procedure operation is compatible with a second architecture library module. A control flow transfer output module is coupled with the ABI change module. The control flow transfer output module provides the second call procedure operation to the second architecture library module.

Type: Application

Filed: March 28, 2014

Publication date: October 1, 2015

Inventors: Niranjan Hasabnis, Suresh Srinivas, Jayaram Bobba
TRACKING CONTROL FLOW OF INSTRUCTIONS

Publication number: 20140281424

Abstract: A mechanism for tracking the control flow of instructions in an application and performing one or more optimizations of a processing device, based on the control flow of the instructions in the application, is disclosed. Control flow data is generated to indicate the control flow of blocks of instructions in the application. The control flow data may include annotations that indicate whether optimizations may be performed for different blocks of instructions. The control flow data may also be used to track the execution of the instructions to determine whether an instruction in a block of instructions is assigned to a thread, a process, and/or an execution core of a processor, and to determine whether errors have occurred during the execution of the instructions.

Type: Application

Filed: March 15, 2013

Publication date: September 18, 2014

Inventors: Jayaram Bobba, Ruchira Sasanka, Jeffrey J. Cook, Abhinav Das, Arvind Krishnaswamy, David J. Sager, Jason M. Agron

1 2 next