Patents by Inventor Balaji Vembu

Balaji Vembu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

STREAMING DATA TO MULTI-TILE PROCESSING SYSTEM

Publication number: 20230342121

Abstract: A processing system comprising one or more chips, each comprising a plurality of tiles is described. Each tile comprises a respective processing unit and memory, the memory storing a codelet. The processing system has at least one encryption unit configured to encrypt and decrypt data transferred between the tiles and a trusted computing entity via an external computing device. The codelets are configured to instruct the tiles to transfer the encrypted data by reading from and writing to a plurality of memory regions at the external memory such that a plurality of streams of encrypted data are formed, each stream using an individual one of the memory regions at the external computing device.

Type: Application

Filed: July 13, 2021

Publication date: October 26, 2023

Inventors: Daniel John Pelham WILKINSON, Richard OSBORNE, Graham Bernard CUNNINGHAM, Kenneth GORDON, Samuel Alexander WEBSTER, Stavros VOLOS, Kapil VASWANI, Balaji VEMBU, Cédric Alain Marie FOURNET
Apparatus and method for dynamic provisioning, quality of service, and prioritization in a graphics processor

Patent number: 11798125

Abstract: An apparatus and method for dynamic provisioning, quality of service, and prioritization in a graphics processor. For example, one embodiment of an apparatus comprises a graphics processing unit (GPU) comprising a plurality of graphics processing resources; slice configuration hardware logic to logically subdivide the graphics processing resources into a plurality of slices; and slice allocation hardware logic to allocate a designated number of slices to each virtual machine (VM) of a plurality of VMs running in a virtualized execution environment, the slice allocation hardware logic to allocate different numbers of slices to different VMs based on graphics processing requirements and/or priorities of each of the VMs.

Type: Grant

Filed: May 31, 2022

Date of Patent: October 24, 2023

Assignee: Intel Corporation

Inventors: Abhishek R. Appu, Joydeep Ray, Altug Koker, Balaji Vembu, Pattabhiraman K, Matthew B. Callaway
Dynamic distributed training of machine learning models

Patent number: 11797837

Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: April 24, 2017

Date of Patent: October 24, 2023

Assignee: Intel Corporation

Inventors: Altug Koker, Abhishek R. Appu, Kamal Sinha, Joydeep Ray, Balaji Vembu, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, John C. Weast, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Farshad Akhbari, Nadathur Rajagopalan Satish, Liwei Ma, Jeremy Bottleson, Eriko Nurvitadhi, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy, Vasanth Ranganathan, Sanjeev Jahagirdar
DYNAMIC DISTRIBUTED TRAINING OF MACHINE LEARNING MODELS

Publication number: 20230334316

Abstract: Described herein is a graphics processor comprising a memory device and a graphics processing cluster coupled with the memory device. The graphics processing cluster includes a plurality of graphics multiprocessors interconnected via a data interconnect. A graphics multiprocessor includes circuitry configured to load a modular neural network including a plurality of subnetworks, each of the plurality of subnetworks trained to perform a computer vision operation on a separate subject.

Type: Application

Filed: May 9, 2023

Publication date: October 19, 2023

Applicant: Intel Corporation

Inventors: Altug Koker, Abhishek R. Appu, Kamal Sinha, Joydeep Ray, Balaji Vembu, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, John C. Weast, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Farshad Akhbari, Nadathur Rajagopalan Satish, Liwei Ma, Jeremy Bottleson, Eriko Nurvitadhi, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy, Vasanth Ranganathan, Sanjeev Jahagirdar
Computing platform security methods and apparatus

Patent number: 11775634

Abstract: Computing platform security methods and apparatus are disclosed. An example apparatus includes a graphics processor; and a graphics driver to facilitate access to the graphics processor, the graphics driver including: an authenticator to establish a trusted channel between the graphics driver and an application driver via mutual authentication of the graphics driver and the application driver; an offloader to offload a computing task to the graphics processor via the trusted channel, the computing task associated with the application driver; and a hypervisor to monitor memory associated with the offloaded computing task for an unauthorized access attempt.

Type: Grant

Filed: January 28, 2020

Date of Patent: October 3, 2023

Assignee: MCAFEE, LLC

Inventors: Paritosh Saxena, Adrian M. M. T. Dunbar, Michael S. Hughes, John Teddy, David Michael Durham, Balaji Vembu, Prashant Dewan, Debra Cablao, Nicholas D. Triantafillou, Jason M. Surprise
Apparatus and method for memory management in a graphics processing environment

Patent number: 11768781

Abstract: An apparatus and method are described for implementing memory management in a graphics processing system. For example, one embodiment of an apparatus comprises: a first plurality of graphics processing resources to execute graphics commands and process graphics data; a first memory management unit (MMU) to communicatively couple the first plurality of graphics processing resources to a system-level MMU to access a system memory; a second plurality of graphics processing resources to execute graphics commands and process graphics data; a second MMU to communicatively couple the second plurality of graphics processing resources to the first MMU; wherein the first MMU is configured as a master MMU having a direct connection to the system-level MMU and the second MMU comprises a slave MMU configured to send memory transactions to the first MMU, the first MMU either servicing a memory transaction or sending the memory transaction to the system-level MMU on behalf of the second MMU.

Type: Grant

Filed: May 27, 2022

Date of Patent: September 26, 2023

Assignee: Intel Corporation

Inventors: Niranjan L. Cooray, Abhishek R. Appu, Altug Koker, Joydeep Ray, Balaji Vembu, Pattabhiraman K, David Puffer, David J. Cowperthwaite, Rajesh M. Sankaran, Satyeshwar Singh, Sameer Kp, Ankur N. Shah, Kun Tian
Scheduling of threads for execution utilizing load balancing of thread groups

Patent number: 11768687

Abstract: An apparatus to facilitate thread scheduling is disclosed. The apparatus includes logic to store barrier usage data based on a magnitude of barrier messages in an application kernel and a scheduler to schedule execution of threads across a plurality of multiprocessors based on the barrier usage data.

Type: Grant

Filed: June 24, 2022

Date of Patent: September 26, 2023

Assignee: Intel Corporation

Inventors: Balaji Vembu, Abhishek R. Appu, Joydeep Ray, Altug Koker
SCALABLE I/O VIRTUALIZATION INTERRUPT AND SCHEDULING

Publication number: 20230297526

Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.

Type: Application

Filed: June 3, 2022

Publication date: September 21, 2023

Applicant: Intel Corporation

Inventors: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala
Hybrid low power homogenous grapics processing units

Patent number: 11762696

Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: November 5, 2021

Date of Patent: September 19, 2023

Assignee: INTEL CORPORATION

Inventors: Abhishek R Appu, Altug Koker, Balaji Vembu, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Kiran C. Veernapu, Subramaniam Maiyuran, Sanjeev S. Jahagirdar, Eric J. Asperheim, Guei-Yuan Lueh, David Puffer, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Josh B. Mastronarde, Linda L. Hurd, Travis T. Schluessler, Tomasz Janczak, Abhishek Venkatesh, Kai Xiao, Slawomir Grajewski
Coordination and increased utilization of graphics processors during inference

Patent number: 11748841

Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning. A method of embodiments, as described herein, includes limiting execution of workloads for the respective contexts of a plurality of contexts to a specified subset of a plurality of processing resources of a processing system according to physical resource slices of the processing system that are associated with the respective contexts of the plurality of contexts.

Type: Grant

Filed: July 22, 2022

Date of Patent: September 5, 2023

Assignee: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, Dukhwan Kim
Dynamic precision for neural network compute operations

Patent number: 11748606

Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: May 11, 2021

Date of Patent: September 5, 2023

Assignee: INTEL CORPORATION

Inventors: Kamal Sinha, Balaji Vembu, Eriko Nurvitadhi, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Farshad Akhbari, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Nadathur Rajagopalan Satish, John C. Weast, Mike B. MacPherson, Linda L. Hurd, Vasanth Ranganathan, Sanjeev S. Jahagirdar
Data operations and finite state machine for machine learning via bypass of computational tasks based on frequently-used data values

Patent number: 11748106

Abstract: A mechanism is described for facilitating fast data operations and for facilitating a finite state machine for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting input data to be used in computational tasks by a computation component of a processor including a graphics processor. The method may further include determining one or more frequently-used data values (FDVs) from the data, and pushing the one or more frequent data values to bypass the computational tasks.

Type: Grant

Filed: March 1, 2022

Date of Patent: September 5, 2023

Assignee: INTEL CORPORATION

Inventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Abhishek R. Appu, Altug Koker, Kamal Sinha, Joydeep Ray, Balaji Vembu, Vasanth Ranganathan, Sanjeev Jahagirdar
Engine to enable high speed context switching via on-die storage

Patent number: 11748302

Abstract: In an example, an apparatus comprises a plurality of execution units, and a first memory communicatively couple to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: December 23, 2021

Date of Patent: September 5, 2023

Assignee: INTEL CORPORATION

Inventors: Altug Koker, Prasoonkumar Surti, David Puffer, Subramaniam Maiyuran, Guei-Yuan Lueh, Abhishek R. Appu, Joydeep Ray, Balaji Vembu, Tomer Bar-On, Andrew T. Lauritzen, Hugues Labbe, John G. Gierach, Gabor Liktor
Scalable I/O virtualization interrupt and scheduling

Patent number: 11748283

Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.

Type: Grant

Filed: June 3, 2022

Date of Patent: September 5, 2023

Assignee: Intel Corporation

Inventors: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala
Processor power management

Patent number: 11733758

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to collect user information for a user of a data processing device, generate a user profile for the user of the data processing device from the user information, and set a power profile a processor in the data processing device using the user profile. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: August 25, 2021

Date of Patent: August 22, 2023

Assignee: INTEL CORPORATION

Inventors: Altug Koker, Abhishek R. Appu, Kiran C. Veernapu, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Eric J. Hoekstra, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy
SECTOR CACHE FOR COMPRESSION

Publication number: 20230259458

Abstract: One embodiment provides circuitry coupled with cache memory and a memory interface, the circuitry to compress compute data at multiple cache line granularity, and a processing resource coupled with the memory interface and the cache memory. The processing resource is configured to perform a general-purpose compute operation on compute data associated with multiple cache lines of the cache memory. The circuitry is configured to compress the compute data before a write of the compute data via the memory interface to the memory bus, in association with a read of the compute data associated with the multiple cache lines via the memory interface, decompress the compute data, and provide the decompressed compute data to the processing resource.

Type: Application

Filed: February 13, 2023

Publication date: August 17, 2023

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, David Puffer, Prasoonkumar Surti, Lakshminarayanan Striramassarma, Vasanth Ranganathan, Kiran C. Veernapu, Balaji Vembu, Pattabhiraman K
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS

Publication number: 20230260072

Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.

Type: Application

Filed: February 13, 2023

Publication date: August 17, 2023

Applicant: Intel Corporation

Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
Programmable coarse grained and sparse matrix compute hardware with advanced scheduling

Patent number: 11727527

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex compute operation.

Type: Grant

Filed: December 3, 2021

Date of Patent: August 15, 2023

Assignee: Intel Corporation

Inventors: Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Altug Koker, Narayan Srinivasa, Dukhwan Kim, Sara S. Baghsorkhi, Justin E. Gottschlich, Feng Chen, Elmoustapha Ould-Ahmed-Vall, Kevin Nealis, Xiaoming Chen, Anbang Yao
HANDLING PIPELINE SUBMISSIONS ACROSS MANY COMPUTE UNITS

Publication number: 20230252597

Abstract: One embodiment provides an apparatus comprising an interconnect fabric comprising a processing cluster including an array of multiprocessors coupled to an interconnect fabric, scheduling circuitry to distribute a plurality of thread groups across the array of multiprocessors, each thread group comprising a plurality of threads. A first multiprocessor of the array of multiprocessors can be assigned to process a first thread group comprising a first plurality of threads including a first thread sub-group and a second thread sub-group. The second thread sub-group has a data dependency on the first thread sub-group and the first multiprocessor includes circuitry to cause threads of the second thread sub-group to sleep until the threads of the first thread sub-group have satisfied the data dependency.

Type: Application

Filed: April 13, 2023

Publication date: August 10, 2023

Applicant: Intel Corporation

Inventors: Balaji Vembu, Altug Koker, Joydeep Ray
Instructions and logic to perform floating point and integer operations for machine learning

Patent number: 11720355

Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.

Type: Grant

Filed: June 7, 2022

Date of Patent: August 8, 2023

Assignee: Intel Corporation

Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar

prev 1 2 3 4 5 6 … next