Patents by Inventor Balaji Vembu

Balaji Vembu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

COMPUTE OPTIMIZATIONS FOR NEURAL NETWORKS

Publication number: 20230359461

Abstract: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a one-bit weight associated with a neural network, as well as an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a fused operation including an exclusive not OR (XNOR) operation and a population count operation. The adder is configured to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.

Type: Application

Filed: May 11, 2023

Publication date: November 9, 2023

Applicant: Intel Corporation

Inventors: Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha
Autonomous vehicle advanced sensing and response

Patent number: 11810405

Abstract: An autonomous vehicle is provided that includes one or more processors configured to provide a local compute manager to manage execution of compute workloads associated with the autonomous vehicle. The local compute manager can perform various compute operations, including receiving offload of compute operations from to other compute nodes and offloading compute operations to other compute notes, where the other compute nodes can be other autonomous vehicles. The local compute manager can also facilitate autonomous navigation functionality.

Type: Grant

Filed: November 30, 2021

Date of Patent: November 7, 2023

Assignee: Intel Corporation

Inventors: Barath Lakshamanan, Linda L. Hurd, Ben J. Ashbaugh, Elmoustapha Ould-Ahmed-Vall, Liwei Ma, Jingyi Jin, Justin E. Gottschlich, Chandrasekaran Sakthivel, Michael S. Strickland, Brian T. Lewis, Lindsey Kuper, Altug Koker, Abhishek R. Appu, Prasoonkumar Surti, Joydeep Ray, Balaji Vembu, Javier S. Turek, Naila Farooqui
Machine learning sparse computation mechanism

Patent number: 11803935

Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.

Type: Grant

Filed: August 5, 2022

Date of Patent: October 31, 2023

Assignee: Intel Corporation

Inventors: Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Nicolas C. Galoppo Von Borries
Handling pipeline submissions across many compute units

Patent number: 11803934

Abstract: One embodiment provides an apparatus comprising an interconnect fabric comprising one or more fabric switches, a plurality of memory interfaces coupled to the interconnect fabric to provide access to a plurality of memory devices, an input/output (IO) interface coupled to the interconnect fabric to provide access to IO devices, an array of multiprocessors coupled to the interconnect fabric, scheduling circuitry to distribute a plurality of thread groups across the array of multiprocessors, each thread group comprising a plurality of threads and each thread comprising a plurality of instructions to be executed by at least one of the multiprocessors, and a first multiprocessor of the array of multiprocessors to be assigned to process a first thread group comprising a first plurality of threads, the first multiprocessor comprising a plurality of parallel execution circuits.

Type: Grant

Filed: February 2, 2022

Date of Patent: October 31, 2023

Assignee: Intel Corporation

Inventors: Balaji Vembu, Altug Koker, Joydeep Ray
STREAMING DATA TO MULTI-TILE PROCESSING SYSTEM

Publication number: 20230342121

Abstract: A processing system comprising one or more chips, each comprising a plurality of tiles is described. Each tile comprises a respective processing unit and memory, the memory storing a codelet. The processing system has at least one encryption unit configured to encrypt and decrypt data transferred between the tiles and a trusted computing entity via an external computing device. The codelets are configured to instruct the tiles to transfer the encrypted data by reading from and writing to a plurality of memory regions at the external memory such that a plurality of streams of encrypted data are formed, each stream using an individual one of the memory regions at the external computing device.

Type: Application

Filed: July 13, 2021

Publication date: October 26, 2023

Inventors: Daniel John Pelham WILKINSON, Richard OSBORNE, Graham Bernard CUNNINGHAM, Kenneth GORDON, Samuel Alexander WEBSTER, Stavros VOLOS, Kapil VASWANI, Balaji VEMBU, Cédric Alain Marie FOURNET
Apparatus and method for dynamic provisioning, quality of service, and prioritization in a graphics processor

Patent number: 11798125

Abstract: An apparatus and method for dynamic provisioning, quality of service, and prioritization in a graphics processor. For example, one embodiment of an apparatus comprises a graphics processing unit (GPU) comprising a plurality of graphics processing resources; slice configuration hardware logic to logically subdivide the graphics processing resources into a plurality of slices; and slice allocation hardware logic to allocate a designated number of slices to each virtual machine (VM) of a plurality of VMs running in a virtualized execution environment, the slice allocation hardware logic to allocate different numbers of slices to different VMs based on graphics processing requirements and/or priorities of each of the VMs.

Type: Grant

Filed: May 31, 2022

Date of Patent: October 24, 2023

Assignee: Intel Corporation

Inventors: Abhishek R. Appu, Joydeep Ray, Altug Koker, Balaji Vembu, Pattabhiraman K, Matthew B. Callaway
Dynamic distributed training of machine learning models

Patent number: 11797837

Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: April 24, 2017

Date of Patent: October 24, 2023

Assignee: Intel Corporation

Inventors: Altug Koker, Abhishek R. Appu, Kamal Sinha, Joydeep Ray, Balaji Vembu, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, John C. Weast, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Farshad Akhbari, Nadathur Rajagopalan Satish, Liwei Ma, Jeremy Bottleson, Eriko Nurvitadhi, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy, Vasanth Ranganathan, Sanjeev Jahagirdar
DYNAMIC DISTRIBUTED TRAINING OF MACHINE LEARNING MODELS

Publication number: 20230334316

Abstract: Described herein is a graphics processor comprising a memory device and a graphics processing cluster coupled with the memory device. The graphics processing cluster includes a plurality of graphics multiprocessors interconnected via a data interconnect. A graphics multiprocessor includes circuitry configured to load a modular neural network including a plurality of subnetworks, each of the plurality of subnetworks trained to perform a computer vision operation on a separate subject.

Type: Application

Filed: May 9, 2023

Publication date: October 19, 2023

Applicant: Intel Corporation

Inventors: Altug Koker, Abhishek R. Appu, Kamal Sinha, Joydeep Ray, Balaji Vembu, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Anbang Yao, Kevin Nealis, Xiaoming Chen, John C. Weast, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Farshad Akhbari, Nadathur Rajagopalan Satish, Liwei Ma, Jeremy Bottleson, Eriko Nurvitadhi, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy, Vasanth Ranganathan, Sanjeev Jahagirdar
Computing platform security methods and apparatus

Patent number: 11775634

Abstract: Computing platform security methods and apparatus are disclosed. An example apparatus includes a graphics processor; and a graphics driver to facilitate access to the graphics processor, the graphics driver including: an authenticator to establish a trusted channel between the graphics driver and an application driver via mutual authentication of the graphics driver and the application driver; an offloader to offload a computing task to the graphics processor via the trusted channel, the computing task associated with the application driver; and a hypervisor to monitor memory associated with the offloaded computing task for an unauthorized access attempt.

Type: Grant

Filed: January 28, 2020

Date of Patent: October 3, 2023

Assignee: MCAFEE, LLC

Inventors: Paritosh Saxena, Adrian M. M. T. Dunbar, Michael S. Hughes, John Teddy, David Michael Durham, Balaji Vembu, Prashant Dewan, Debra Cablao, Nicholas D. Triantafillou, Jason M. Surprise
Apparatus and method for memory management in a graphics processing environment

Patent number: 11768781

Abstract: An apparatus and method are described for implementing memory management in a graphics processing system. For example, one embodiment of an apparatus comprises: a first plurality of graphics processing resources to execute graphics commands and process graphics data; a first memory management unit (MMU) to communicatively couple the first plurality of graphics processing resources to a system-level MMU to access a system memory; a second plurality of graphics processing resources to execute graphics commands and process graphics data; a second MMU to communicatively couple the second plurality of graphics processing resources to the first MMU; wherein the first MMU is configured as a master MMU having a direct connection to the system-level MMU and the second MMU comprises a slave MMU configured to send memory transactions to the first MMU, the first MMU either servicing a memory transaction or sending the memory transaction to the system-level MMU on behalf of the second MMU.

Type: Grant

Filed: May 27, 2022

Date of Patent: September 26, 2023

Assignee: Intel Corporation

Inventors: Niranjan L. Cooray, Abhishek R. Appu, Altug Koker, Joydeep Ray, Balaji Vembu, Pattabhiraman K, David Puffer, David J. Cowperthwaite, Rajesh M. Sankaran, Satyeshwar Singh, Sameer Kp, Ankur N. Shah, Kun Tian
Scheduling of threads for execution utilizing load balancing of thread groups

Patent number: 11768687

Abstract: An apparatus to facilitate thread scheduling is disclosed. The apparatus includes logic to store barrier usage data based on a magnitude of barrier messages in an application kernel and a scheduler to schedule execution of threads across a plurality of multiprocessors based on the barrier usage data.

Type: Grant

Filed: June 24, 2022

Date of Patent: September 26, 2023

Assignee: Intel Corporation

Inventors: Balaji Vembu, Abhishek R. Appu, Joydeep Ray, Altug Koker
SCALABLE I/O VIRTUALIZATION INTERRUPT AND SCHEDULING

Publication number: 20230297526

Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.

Type: Application

Filed: June 3, 2022

Publication date: September 21, 2023

Applicant: Intel Corporation

Inventors: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala
Hybrid low power homogenous grapics processing units

Patent number: 11762696

Abstract: In an example, an apparatus comprises a plurality of execution units comprising at least a first type of execution unit and a second type of execution unit and logic, at least partially including hardware logic, to analyze a workload and assign the workload to one of the first type of execution unit or the second type of execution unit. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: November 5, 2021

Date of Patent: September 19, 2023

Assignee: INTEL CORPORATION

Inventors: Abhishek R Appu, Altug Koker, Balaji Vembu, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Kiran C. Veernapu, Subramaniam Maiyuran, Sanjeev S. Jahagirdar, Eric J. Asperheim, Guei-Yuan Lueh, David Puffer, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Josh B. Mastronarde, Linda L. Hurd, Travis T. Schluessler, Tomasz Janczak, Abhishek Venkatesh, Kai Xiao, Slawomir Grajewski
Coordination and increased utilization of graphics processors during inference

Patent number: 11748841

Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning. A method of embodiments, as described herein, includes limiting execution of workloads for the respective contexts of a plurality of contexts to a specified subset of a plurality of processing resources of a processing system according to physical resource slices of the processing system that are associated with the respective contexts of the plurality of contexts.

Type: Grant

Filed: July 22, 2022

Date of Patent: September 5, 2023

Assignee: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, Dukhwan Kim
Dynamic precision for neural network compute operations

Patent number: 11748606

Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: May 11, 2021

Date of Patent: September 5, 2023

Assignee: INTEL CORPORATION

Inventors: Kamal Sinha, Balaji Vembu, Eriko Nurvitadhi, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Farshad Akhbari, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Nadathur Rajagopalan Satish, John C. Weast, Mike B. MacPherson, Linda L. Hurd, Vasanth Ranganathan, Sanjeev S. Jahagirdar
Data operations and finite state machine for machine learning via bypass of computational tasks based on frequently-used data values

Patent number: 11748106

Abstract: A mechanism is described for facilitating fast data operations and for facilitating a finite state machine for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting input data to be used in computational tasks by a computation component of a processor including a graphics processor. The method may further include determining one or more frequently-used data values (FDVs) from the data, and pushing the one or more frequent data values to bypass the computational tasks.

Type: Grant

Filed: March 1, 2022

Date of Patent: September 5, 2023

Assignee: INTEL CORPORATION

Inventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Abhishek R. Appu, Altug Koker, Kamal Sinha, Joydeep Ray, Balaji Vembu, Vasanth Ranganathan, Sanjeev Jahagirdar
Engine to enable high speed context switching via on-die storage

Patent number: 11748302

Abstract: In an example, an apparatus comprises a plurality of execution units, and a first memory communicatively couple to the plurality of execution units, wherein the first shared memory is shared by the plurality of execution units and a copy engine to copy context state data from at least a first of the plurality of execution units to the first shared memory. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: December 23, 2021

Date of Patent: September 5, 2023

Assignee: INTEL CORPORATION

Inventors: Altug Koker, Prasoonkumar Surti, David Puffer, Subramaniam Maiyuran, Guei-Yuan Lueh, Abhishek R. Appu, Joydeep Ray, Balaji Vembu, Tomer Bar-On, Andrew T. Lauritzen, Hugues Labbe, John G. Gierach, Gabor Liktor
Scalable I/O virtualization interrupt and scheduling

Patent number: 11748283

Abstract: Embodiments described herein provide techniques to facilitate scalable interrupts and workload submission for a virtualized graphics processor. The techniques include memory-based interrupt reporting and shared work queue submission for multiple software domains.

Type: Grant

Filed: June 3, 2022

Date of Patent: September 5, 2023

Assignee: Intel Corporation

Inventors: David Puffer, Ankur Shah, Niranjan Cooray, Bryan White, Balaji Vembu, Hema Chand Nalluri, Kritika Bala
Processor power management

Patent number: 11733758

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to collect user information for a user of a data processing device, generate a user profile for the user of the data processing device from the user information, and set a power profile a processor in the data processing device using the user profile. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: August 25, 2021

Date of Patent: August 22, 2023

Assignee: INTEL CORPORATION

Inventors: Altug Koker, Abhishek R. Appu, Kiran C. Veernapu, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Eric J. Hoekstra, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy
SECTOR CACHE FOR COMPRESSION

Publication number: 20230259458

Abstract: One embodiment provides circuitry coupled with cache memory and a memory interface, the circuitry to compress compute data at multiple cache line granularity, and a processing resource coupled with the memory interface and the cache memory. The processing resource is configured to perform a general-purpose compute operation on compute data associated with multiple cache lines of the cache memory. The circuitry is configured to compress the compute data before a write of the compute data via the memory interface to the memory bus, in association with a read of the compute data associated with the multiple cache lines via the memory interface, decompress the compute data, and provide the decompressed compute data to the processing resource.

Type: Application

Filed: February 13, 2023

Publication date: August 17, 2023

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, David Puffer, Prasoonkumar Surti, Lakshminarayanan Striramassarma, Vasanth Ranganathan, Kiran C. Veernapu, Balaji Vembu, Pattabhiraman K

prev 1 2 3 4 5 6 … next