Patents by Inventor Balaji Vembu

Balaji Vembu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

MACHINE LEARNING SPARSE COMPUTATION MECHANISM

Publication number: 20230040631

Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.

Type: Application

Filed: August 5, 2022

Publication date: February 9, 2023

Applicant: Intel Corporation

Inventors: Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Nicolas C. Galoppo Von Borries
Display engine surface blending and adaptive texel to pixel ratio sample rate system, apparatus and method

Patent number: 11574386

Abstract: Systems, apparatuses and methods may provide away to blend two or more of the scene surfaces based on the focus area and an offload threshold. More particularly, systems, apparatuses and methods may provide a way to blend, by a display engine, two or more of the focus area scene surfaces and blended non-focus area scene surfaces. The systems, apparatuses and methods may include a graphics engine to render the focus area surfaces at a higher sample rate than the non-focus area scene surfaces.

Type: Grant

Filed: April 26, 2021

Date of Patent: February 7, 2023

Assignee: Intel Corporation

Inventors: Joydeep Ray, Travis T. Schluessler, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Abhishek R. Appu, Balaji Vembu, Prasoonkumar Surti
Compute optimization mechanism for deep neural networks

Patent number: 11562461

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes one or more processing units to provide a first set of shader operations associated with a shader stage of a graphics pipeline, a scheduler to schedule shader threads for processing, and a field-programmable gate array (FPGA) dynamically configured to provide a second set of shader operations associated with the shader stage of the graphics pipeline.

Type: Grant

Filed: November 18, 2021

Date of Patent: January 24, 2023

Assignee: Intel Corporation

Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
MEASURED RESTART OF MICROCONTROLLERS

Publication number: 20230020838

Abstract: In various examples there is a computing device comprising: a first microcontroller comprising a first immutable bootloader and first mutable firmware. The first immutable bootloader uses a unique device secret burnt into hardware of the computing device in order to generate an attestation of the first mutable firmware. The computing device has a second microcontroller. There is second mutable firmware at the second microcontroller. There is a second immutable bootloader at the second microcontroller which sends a measurement of the second mutable firmware to the first immutable bootloader whenever the second microcontroller restarts, such that the first microcontroller is able to include the measurement in the attestation.

Type: Application

Filed: July 13, 2021

Publication date: January 19, 2023

Inventors: Stavros VOLOS, Colin DOAK, Simon Douglas CHAMBERS, David RUGGLES, Richard NEAL, Cédric Alain Marie FOURNET, Kapil VASWANI, Balaji VEMBU
Terminating Distributed Trusted Execution Environment via Confirmation Messages

Publication number: 20230014066

Abstract: A method for securely terminating a distributed trusted execution environment (TEE) spanning a plurality of work accelerators. After wiping sensitive data from the memory of its accelerator, a root of trust for each accelerator is configured to receive confirmation that the data has been wiped from the processor memory in relevant other accelerators prior to moving on to the next stage at which the TEE on its associated accelerator is terminated. Since the data has been wiped from the other accelerators, even if a third party were to inject malicious code into the accelerator, they would be unable to read out the secret data from the other accelerators since the data has been wiped from those other accelerators. In this way, a mechanism is provided for ensuring that when the distributed TEE is terminated, malicious third parties are unable to read out confidential data from the accelerators.

Type: Application

Filed: July 13, 2021

Publication date: January 19, 2023

Inventors: Daniel John Pelham Wilkinson, Stavros Volos, Kapil Vaswani, Balaji Vembu
Terminating Distributed Trusted Execution Environment via Self-Isolation

Publication number: 20230020255

Abstract: A method for securely terminating a distributed trusted execution environment spanning a plurality of work accelerators. Each accelerator is configured to self-isolate upon determining that the distributed TEE is to be terminated across the system of accelerators. The data is also wiped from the processor memory of each accelerator, such that the data cannot be read out from the processor memory once the accelerator's links are re-enabled. The self-isolation is performed on each accelerator prior to the step of terminating the TEE on that accelerator. An accelerator only re-enables its links to other accelerators once the data is wiped from its processor memory such that the secret data is removed from the accelerator memory.

Type: Application

Filed: July 13, 2021

Publication date: January 19, 2023

Inventors: Daniel John Pelham WILKINSON, Stavros VOLOS, Kapil VASWANI, Balaji VEMBU
GRAPHICS SCHEDULING MECHANISM

Publication number: 20220413869

Abstract: An apparatus to facilitate thread scheduling is disclosed. In one embodiment the apparatus includes a processor comprising a plurality of multiprocessors comprising single-instruction multiple thread (SIMT) execution circuitry to simultaneously execute multiple threads, a shared local memory to be shared by the multiple threads, and scheduling hardware logic to schedule the multiple threads in a thread group for execution across the plurality of multiprocessors in accordance with barrier data. The instructions of the multiple threads are to produce shared data to be stored in the shared local memory when executed by the plurality of multiprocessors, wherein additional instructions of at least a first thread of the multiple threads are to use the shared data, and wherein, in accordance with the barrier data, the first thread is to wait for other threads of the multiple threads to finish producing the shared data before executing the additional instructions.

Type: Application

Filed: August 31, 2022

Publication date: December 29, 2022

Applicant: Intel Corporation

Inventors: Balaji Vembu, Abhishek R. Appu, Joydeep Ray, Altug Koker
APPARATUS AND METHOD FOR DYNAMIC PROVISIONING, QUALITY OF SERVICE, AND PRIORITIZATION IN A GRAPHICS PROCESSOR

Publication number: 20220405877

Abstract: An apparatus and method for dynamic provisioning, quality of service, and prioritization in a graphics processor. For example, one embodiment of an apparatus comprises a graphics processing unit (GPU) comprising a plurality of graphics processing resources; slice configuration hardware logic to logically subdivide the graphics processing resources into a plurality of slices; and slice allocation hardware logic to allocate a designated number of slices to each virtual machine (VM) of a plurality of VMs running in a virtualized execution environment, the slice allocation hardware logic to allocate different numbers of slices to different VMs based on graphics processing requirements and/or priorities of each of the VMs.

Type: Application

Filed: May 31, 2022

Publication date: December 22, 2022

Inventors: Abhishek R. APPU, Joydeep RAY, Altug KOKER, Balaji VEMBU, Pattabhiraman K, Matthew B. CALLAWAY
GRAPHICS ENGINE PARTITIONING MECHANISM

Publication number: 20220405876

Abstract: An apparatus to facilitate partitioning of a graphics device is disclosed. The apparatus includes a plurality of engines and logic to partition the plurality of engines to facilitate independent access to each engine within the plurality of engines.

Type: Application

Filed: May 6, 2022

Publication date: December 22, 2022

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Balaji Vembu, Altug Koker, Bryan R. White, David J. Cowperthwaite, Joydeep Ray, Murali Ramadoss
Regional adjustment of render rate

Patent number: 11531510

Abstract: In accordance with some embodiments, the render rate is varied across and/or up and down the display screen. This may be done based on where the user is looking in order to reduce power consumption and/or increase performance. Specifically the screen display is separated into regions, such as quadrants. Each of these regions is rendered at a rate determined by at least one of what the user is currently looking at, what the user has looked at in the past and/or what it is predicted that the user will look at next. Areas of less focus may be rendered at a lower rate, reducing power consumption in some embodiments.

Type: Grant

Filed: August 11, 2021

Date of Patent: December 20, 2022

Assignee: Intel Corporation

Inventors: Eric J. Asperheim, Subramaniam M. Maiyuran, Kiran C. Veernapu, Sanjeev S. Jahagirdar, Balaji Vembu, Devan Burke, Philip R. Laws, Kamal Sinha, Abhishek R. Appu, Elmoustapha Ould-Ahmed-Vall, Peter L. Doyle, Joydeep Ray, Travis T. Schluessler, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Altug Koker
APPARATUS AND METHOD FOR SCALABLE ERROR DETECTION AND REPORTING

Publication number: 20220398147

Abstract: Apparatus and method for scalable error reporting. For example, one embodiment of an apparatus comprises error detection circuitry to detect an error in a component of a first tile within a tile-based hierarchy of a processing device; error classification circuitry to classify the error and record first error data based on the classification; a first tile interface to combine the first error data with second error data received from one or more other components associated with the first tile to generate first accumulated error data; and a master tile interface to combine the first accumulated error data with second accumulated error data received from at least one other tile interface to generate second accumulated error data and to provide the second accumulated error data to a host executing an application to process the second accumulated error data.

Type: Application

Filed: June 24, 2022

Publication date: December 15, 2022

Inventors: Balaji VEMBU, Bryan WHITE, Ankur SHAH, Murali RAMADOSS, David PUFFER, Altug KOKER, Aditya NAVALE, Mahesh NATU
SCHEDULING OF THREADS FOR EXECUTION UTILIZING LOAD BALANCING OF THREAD GROUPS

Publication number: 20220398101

Abstract: An apparatus to facilitate thread scheduling is disclosed. The apparatus includes logic to store barrier usage data based on a magnitude of barrier messages in an application kernel and a scheduler to schedule execution of threads across a plurality of multiprocessors based on the barrier usage data.

Type: Application

Filed: June 24, 2022

Publication date: December 15, 2022

Applicant: Intel Corporation

Inventors: Balaji Vembu, Abhishek R. Appu, Joydeep Ray, Altug Koker
SPECIALIZED FIXED FUNCTION HARDWARE FOR EFFICIENT CONVOLUTION

Publication number: 20220391679

Abstract: One embodiment provides a graphics processor comprising an instruction cache to store an instruction and a compute block configured to perform multiply-accumulate operations in response to execution of the instruction. The compute block includes a scheduler to schedule a plurality of threads for execution of the instruction and multiply-accumulate circuitry configured to execute the instruction via the plurality of threads, wherein the multiply-accumulate circuitry includes a plurality of functional units configured to process, in parallel via the plurality of threads, a corresponding plurality of matrix elements to multiply a first matrix and a second matrix, and to multiply the first matrix and the second matrix includes to multiply data elements in a row of the first matrix by corresponding data elements in a column of the second matrix to generate a plurality of products.

Type: Application

Filed: August 11, 2022

Publication date: December 8, 2022

Applicant: Intel Corporation

Inventors: Rajkishore Barik, Elmoustapha Ould-Ahmed-Vall, Xiaoming Chen, Dhawal Srivastava, Anbang Yao, Kevin Nealis, Eriko Nurvitadhi, Sara S. Baghsorkhi, Balaji Vembu, Tatiana Shpeisman, Ping T. Tang
Efficient merging of atomic operations at computing devices

Patent number: 11521294

Abstract: A mechanism is described for facilitating dynamic merging of atomic operations in computing devices. A method of embodiments, as described herein, includes facilitating detecting atomic messages and a plurality of slot addresses. The method further includes comparing one or more slot addresses of the plurality of slot addresses with other slot addresses of the plurality of slot addresses to seek one or more matched slot addresses, where the one or more matched slot addresses are merged into one or more merged groups. The method may further include generating one or more merged atomic operations based on and corresponding to the one or more merged groups.

Type: Grant

Filed: September 29, 2020

Date of Patent: December 6, 2022

Assignee: INTEL CORPORATION

Inventors: Joydeep Ray, Altug Koker, Abhishek R. Appu, Balaji Vembu
Collaborative multi-user virtual reality

Patent number: 11520555

Abstract: An embodiment of a graphics apparatus may include a processor, memory communicatively coupled to the processor, and a collaboration engine communicatively coupled to the processor to identify a shared graphics component between two or more users in an environment, and share the shared graphics components with the two or more users in the environment. Embodiments of the collaboration engine may include one or more of a centralized sharer, a depth sharer, a shared preprocessor, a multi-port graphics subsystem, and a decode sharer. Other embodiments are disclosed and claimed.

Type: Grant

Filed: January 29, 2021

Date of Patent: December 6, 2022

Assignee: Intel Corporation

Inventors: Deepak S. Vembar, Atsuo Kuwahara, Chandrasekaran Sakthivel, Radhakrishnan Venkataraman, Brent E. Insko, Anupreet S. Kalra, Hugues Labbe, Altug Koker, Michael Apodaca, Kai Xiao, Jeffery S. Boles, Adam T. Lake, David M. Cimini, Balaji Vembu, Elmoustapha Ould-Ahmed-Vall, Jacek Kwiatkowski, Philip R. Laws, Ankur N. Shah, Abhishek R. Appu, Joydeep Ray, Wenyin Fu, Nikos Kaburlasos, Prasoonkumar Surti, Bhushan M. Borole
Avoid thread switching in cache management

Patent number: 11520723

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to monitor a thread switching overhead parameter for an application executing in a processing system and in response to a determination that the thread switching overhead parameter exceeds a threshold, to activate a thread management algorithm to reduce thread switching in the processing system. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: July 2, 2021

Date of Patent: December 6, 2022

Assignee: INTEL CORPORATION

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Kiran C. Veernapu, Balaji Vembu, Vasanth Ranganathan, Prasoonkumar Surti
Apparatus and method for display virtualization using mapping between virtual and physical display planes

Patent number: 11514550

Abstract: An apparatus and method for managing pipes and planes within a virtual graphics processing engine. For example, one embodiment of a graphics processing apparatus comprises: a graphics processor comprising one or more display pipes to render one or more display planes, each of the one or more display pipes comprising a set of graphics processing hardware resources for executing graphics commands and rendering graphics images in the one or more display planes; and pipe and plane management hardware logic to manage pipe and plane assignment, the pipe and plane management hardware logic to associate a first virtual machine (VM) with one or more virtual display planes and to maintain a mapping between the one or more virtual display planes and at least one physical display plane.

Type: Grant

Filed: June 30, 2020

Date of Patent: November 29, 2022

Assignee: INTEL CORPORATION

Inventors: Yunbiao Lin, Changliang Wang, Satyanantha Ramagopal Musunuri, David Puffer, David J. Cowperthwaite, Bryan R. White, Balaji Vembu
Register spill/fill using shared local memory space

Patent number: 11508338

Abstract: A mechanism is described for facilitating using of a shared local memory for register spilling/filling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes reserving one or more spaces of a shared local memory (SLM) to perform one or more of spilling and filling relating to registers associated with a graphics processor of a computing device.

Type: Grant

Filed: October 5, 2020

Date of Patent: November 22, 2022

Assignee: Intel Corporation

Inventors: Joydeep Ray, Altug Koker, Balaji Vembu, Murali Ramadoss, Guei-Yuan Lueh, James A. Valerio, Prasoonkumar Surti, Abhishek R. Appu, Vasanth Ranganathan, Kalyan K. Bhiravabhatla, Arthur D. Hunter, Jr., Wei-Yu Chen, Subramaniam M. Maiyuran
COORDINATION AND INCREASED UTILIZATION OF GRAPHICS PROCESSORS DURING INFERENCE

Publication number: 20220366527

Abstract: A mechanism is described for facilitating inference coordination and processing utilization for machine learning. A method of embodiments, as described herein, includes limiting execution of workloads for the respective contexts of a plurality of contexts to a specified subset of a plurality of processing resources of a processing system according to physical resource slices of the processing system that are associated with the respective contexts of the plurality of contexts.

Type: Application

Filed: July 22, 2022

Publication date: November 17, 2022

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, John C. Weast, Mike B. Macpherson, Linda L. Hurd, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Liwei Ma, Elmoustapha Ould-Ahmed-Vall, Kamal Sinha, Joydeep Ray, Balaji Vembu, Sanjeev Jahagirdar, Vasanth Ranganathan, DUKHWAN Kim
HANDLING PIPELINE SUBMISSIONS ACROSS MANY COMPUTE UNITS

Publication number: 20220358618

Abstract: One embodiment provides a graphics processor including a plurality of processing clusters, each processing cluster including a plurality of multiprocessors and a data interconnect coupled to the plurality of multiprocessors. At least one multiprocessor of the plurality of multiprocessors is configured to share data with another multiprocessor over the data interconnect.

Type: Application

Filed: July 21, 2022

Publication date: November 10, 2022

Applicant: Intel Corporation

Inventors: Balaji Vembu, Altug Koker, Joydeep Ray

prev 1 2 3 4 5 6 7 8 … next