Patents by Inventor Balaji Vembu

Balaji Vembu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

PROGRAMMABLE COARSE GRAINED AND SPARSE MATRIX COMPUTE HARDWARE WITH ADVANCED SCHEDULING

Publication number: 20220164916

Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex compute operation.

Type: Application

Filed: December 3, 2021

Publication date: May 26, 2022

Applicant: Intel Corporation

Inventors: Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Altug Koker, Narayan Srinivasa, Dukhwan Kim, Sara S. Baghsorkhi, Justin E. Gottschlich, Feng Chen, Elmoustapha Ould-Ahmed-Vall, Kevin Nealis, Xiaoming Chen, Anbang Yao
Graphics engine partitioning mechanism

Patent number: 11341600

Abstract: An apparatus to facilitate partitioning of a graphics device is disclosed. The apparatus includes a plurality of engines and logic to partition the plurality of engines to facilitate independent access to each engine within the plurality of engines.

Type: Grant

Filed: November 13, 2019

Date of Patent: May 24, 2022

Assignee: Intel Corporation

Inventors: Abhishek R. Appu, Balaji Vembu, Altug Koker, Bryan R. White, David J. Cowperthwaite, Joydeep Ray, Murali Ramadoss
Apparatus and method for protecting content in virtualized and graphics environments

Patent number: 11341212

Abstract: An apparatus and method for protecting content in a graphics processor. For example, one embodiment of an apparatus comprises: encode/decode circuitry to decode protected audio and/or video content to generate decoded audio and/or video content; a graphics cache of a graphics processing unit (GPU) to store the decoded audio and/or video content; first protection circuitry to set a protection attribute for each cache line containing the decoded audio and/or video data in the graphics cache; a cache coherency controller to generate a coherent read request to the graphics cache; second protection circuitry to read the protection attribute to determine whether the cache line identified in the read request is protected, wherein if it is protected, the second protection circuitry to refrain from including at least some of the data from the cache line in a response.

Type: Grant

Filed: February 17, 2020

Date of Patent: May 24, 2022

Assignee: INTEL CORPORATION

Inventors: Joydeep Ray, Abhishek R. Appu, Pattabhiraman K, Balaji Vembu, Altug Koker
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS

Publication number: 20220156876

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes one or more processing units to provide a first set of shader operations associated with a shader stage of a graphics pipeline, a scheduler to schedule shader threads for processing, and a field-programmable gate array (FPGA) dynamically configured to provide a second set of shader operations associated with the shader stage of the graphics pipeline.

Type: Application

Filed: November 18, 2021

Publication date: May 19, 2022

Applicant: Intel Corporation

Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
THREAD SCHEDULING OVER COMPUTE BLOCKS FOR POWER OPTIMIZATION

Publication number: 20220156875

Abstract: Thread dispatch circuitry is configured to dispatch threads of a two-dimensional (2D) thread group based on data access locality associated with the threads. The thread dispatch circuitry can dispatch a first 2D sub-group of the 2D thread group to a compute block of the multiple compute blocks, the first 2D sub-group associated with a first 2D tile of memory and dispatch a second 2D sub-group of the 2D thread group to the compute block of the multiple compute blocks, the second 2D sub-group associated with a second 2D tile of memory.

Type: Application

Filed: November 16, 2021

Publication date: May 19, 2022

Applicant: Intel Corporation

Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, James A. Valerio, Abhishek R. Appu
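
As a rough illustration of the locality-based dispatch described in the abstract above, the Python sketch below groups the threads of a 2D thread group into 2D sub-groups, one per 2D tile of memory, so that each sub-group could be handed to a compute block as a unit. The function name `split_into_subgroups`, the tile dimensions, and the assumption that thread (x, y) accesses memory element (x, y) are hypothetical and are not taken from the application.

```python
from collections import defaultdict


def split_into_subgroups(group_w, group_h, tile_w, tile_h):
    """Group thread coordinates by the 2D memory tile they touch.

    Assumption (illustrative only): thread (x, y) accesses element (x, y).
    """
    subgroups = defaultdict(list)
    for y in range(group_h):
        for x in range(group_w):
            # Threads whose coordinates fall in the same tile share a sub-group.
            tile = (x // tile_w, y // tile_h)
            subgroups[tile].append((x, y))
    return dict(subgroups)


# A 4x4 thread group over 2x2 tiles yields four 2D sub-groups.
subgroups = split_into_subgroups(group_w=4, group_h=4, tile_w=2, tile_h=2)
for tile, threads in sorted(subgroups.items()):
    print(tile, threads)
```

Under these assumptions, dispatching whole sub-groups to the same compute block keeps accesses to one tile together, which is the locality property the abstract refers to.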
Compute optimization mechanism for deep neural networks

Patent number: 11334962

Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of processing cores of a first type and a second type. A first set of processing cores of a first type perform multi-dimensional matrix operations and a second set of processing cores of a second type perform general purpose graphics processing unit (GPGPU) operations.

Type: Grant

Filed: July 26, 2021

Date of Patent: May 17, 2022

Assignee: Intel Corporation

Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
Dual path sequential element to reduce toggles in data path

Patent number: 11320886

Abstract: Methods and apparatus relating to techniques for a dual path sequential element to reduce toggles in data path are described. In an embodiment, switching logic causes signals for a single data path of a processor to be directed to at least two separate data paths. At least one of the two separate data paths is power gated to reduce signal toggles in the at least one data path. Other embodiments are also disclosed and claimed.

Type: Grant

Filed: December 1, 2020

Date of Patent: May 3, 2022

Assignee: INTEL CORPORATION

Inventors: Subramaniam Maiyuran, Sanjeev S. Jahagirdar, Kiran C. Veernapu, Eric J. Asperheim, Altug Koker, Balaji Vembu, Joydeep Ray, Abhishek R. Appu
Cache optimization for graphics systems

Patent number: 11314654

Abstract: A mechanism is described for facilitating optimization of cache associated with graphics processors at computing devices. A method of embodiments, as described herein, includes introducing coloring bits to contents of a cache associated with a processor including a graphics processor, wherein the coloring bits to represent a signal identifying one or more caches available for use, while avoiding explicit invalidations and flushes.

Type: Grant

Filed: November 19, 2020

Date of Patent: April 26, 2022

Assignee: Intel Corporation

Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, Abhishek R. Appu
SPECIALIZED FIXED FUNCTION HARDWARE FOR EFFICIENT CONVOLUTION

Publication number: 20220114430

Abstract: One embodiment provides an apparatus comprising an instruction cache to store a plurality of instructions, a scheduler unit coupled to the instruction cache, the scheduler unit to schedule the plurality of instructions for execution, an instruction fetch and decode unit to decode the plurality of instructions to determine a set of operations to perform in response, one or more compute blocks to perform parallel multiply-accumulate operations based on the instruction fetch and decode unit decoding a first instruction of the plurality of instructions, and matrix multiplication logic to perform matrix multiplication operations based on the instruction fetch and decode unit decoding a second instruction of the plurality of instructions.

Type: Application

Filed: December 21, 2021

Publication date: April 14, 2022

Applicant: Intel Corporation

Inventors: Rajkishore Barik, Elmoustapha Ould-Ahmed-Vall, Xiaoming Chen, Dhawal Srivastava, Anbang Yao, Kevin Nealis, Eriko Nurvitadhi, Sara S. Baghsorkhi, Balaji Vembu, Tatiana Shpeisman, Ping T. Tang
PROCESSOR POWER MANAGEMENT

Publication number: 20220113783

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to collect user information for a user of a data processing device, generate a user profile for the user of the data processing device from the user information, and set a power profile for a processor in the data processing device using the user profile. Other embodiments are also disclosed and claimed.

Type: Application

Filed: August 25, 2021

Publication date: April 14, 2022

Applicant: INTEL CORPORATION

Inventors: Altug Koker, Abhishek R. Appu, Kiran C. Veernapu, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Eric J. Hoekstra, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy
Router-based transaction routing for toggle reduction

Patent number: 11281837

Abstract: In accordance with embodiments disclosed herein, there are provided systems and methods for router-based transaction routing for toggle reduction. An integrated circuit includes a transmitter circuit, receiver circuits, and a multicast bus coupled between the transmitter circuit and the receiver circuits. The multicast bus includes a first flow router circuit to route a multicast signal to a first receiver circuit of the plurality of receiver circuits and not route the multicast signal to a second receiver circuit of the plurality of receiver circuits.

Type: Grant

Filed: December 18, 2017

Date of Patent: March 22, 2022

Assignee: Intel Corporation

Inventors: Hema Chand Nalluri, Balaji Vembu, Santosh Tripathy, Altug Koker, Pattabhiraman K
Apparatus and method for managing data bias in a graphics processing architecture

Patent number: 11282161

Abstract: An apparatus and method are described for managing data which is biased towards a processor or a GPU. For example, an apparatus comprises a processor comprising one or more cores, one or more cache levels, and cache coherence controllers to maintain coherent data in the one or more cache levels; a graphics processing unit (GPU) to execute graphics instructions and process graphics data, wherein the GPU and processor cores are to share a virtual address space for accessing a system memory; a GPU memory addressable through the virtual address space shared by the processor cores and GPU; and bias management circuitry to store an indication for whether the data has a processor bias or a GPU bias, wherein if the data has a GPU bias, the data is to be accessed by the GPU without necessarily accessing the processor's cache coherence controllers.

Type: Grant

Filed: May 5, 2020

Date of Patent: March 22, 2022

Assignee: INTEL CORPORATION

Inventors: Joydeep Ray, Abhishek R. Appu, Altug Koker, Balaji Vembu
AUTONOMOUS VEHICLE ADVANCED SENSING AND RESPONSE

Publication number: 20220084329

Abstract: An autonomous vehicle is provided that includes one or more processors configured to provide a local compute manager to manage execution of compute workloads associated with the autonomous vehicle. The local compute manager can perform various compute operations, including receiving offload of compute operations from other compute nodes and offloading compute operations to other compute nodes, where the other compute nodes can be other autonomous vehicles. The local compute manager can also facilitate autonomous navigation functionality.

Type: Application

Filed: November 30, 2021

Publication date: March 17, 2022

Applicant: Intel Corporation

Inventors: Barath LAKSHAMANAN, Linda L. HURD, Ben J. ASHBAUGH, Elmoustapha OULD-AHMED-VALL, Liwei MA, Jingyi JIN, Justin E. GOTTSCHLICH, Chandrasekaran SAKTHIVEL, Michael S. STRICKLAND, Brian T. LEWIS, Lindsey KUPER, Altug KOKER, Abhishek R. APPU, Prasoonkumar SURTI, Joydeep RAY, Balaji VEMBU, Javier S. TUREK, Naila FAROOQUI
Compression Mechanism

Publication number: 20220084252

Abstract: An apparatus to facilitate compute compression is disclosed. The apparatus includes a graphics processing unit including mapping logic to map a first block of integer pixel data to a compression block and compression logic to compress the compression block.

Type: Application

Filed: June 23, 2021

Publication date: March 17, 2022

Applicant: Intel Corporation

Inventors: Abhishek Appu, Altug Koker, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Nadathur Rajagopalan Satish, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Farshad Akhbari
CLOUD BASED DISTRIBUTED SINGLE GAME CALCULATION OF SHARED COMPUTATIONAL WORK FOR MULTIPLE CLOUD GAMING CLIENT DEVICES

Publication number: 20220076480

Abstract: Systems, apparatuses, and methods may provide for technology to process graphics data in a virtual gaming environment. The technology may identify, from graphics data in a graphics application, redundant graphics calculations relating to common frame characteristics of one or more graphical scenes to be shared between client game devices of a plurality of users and calculate, in response to the identified redundant graphics calculations, frame characteristics relating to the one or more graphical scenes. Additionally, the technology may send, over a computer network, the calculation of the frame characteristics to the client game devices.

Type: Application

Filed: September 17, 2021

Publication date: March 10, 2022

Inventors: Jonathan Kennedy, Gabor Liktor, Jeffery S. Boles, Slawomir Grajewski, Balaji Vembu, Travis T. Schluessler, Abhishek R. Appu, Ankur N. Shah, Joydeep Ray, Altug Koker, Jacek Kwiatkowski
Compute cluster preemption within a general-purpose graphics processing unit

Patent number: 11270406

Abstract: Embodiments described herein provide techniques to enable a compute unit to continue processing operations when all dispatched threads are blocked. One embodiment provides for a method comprising executing multiple concurrent threads on a processing resource of a graphics processor, during execution, detecting that each of the multiple concurrent threads of the processing resource is blocked from execution, selecting a victim thread from the multiple concurrent threads, and suspending the victim thread. The thread state is stored to a thread scratch space in memory along with a blocking event associated with the victim thread.

Type: Grant

Filed: November 16, 2020

Date of Patent: March 8, 2022

Assignee: Intel Corporation

Inventors: Murali Ramadoss, Balaji Vembu, Eric C. Samson, Kun Tian, David J. Cowperthwaite, Altug Koker, Zhi Wang, Joydeep Ray, Subramaniam M. Maiyuran, Abhishek R. Appu
Data operations and finite state machine for machine learning via bypass of computational tasks based on frequently-used data values

Patent number: 11269643

Abstract: A mechanism is described for facilitating fast data operations and for facilitating a finite state machine for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting input data to be used in computational tasks by a computation component of a processor including a graphics processor. The method may further include determining one or more frequently-used data values (FDVs) from the data, and pushing the one or more frequent data values to bypass the computational tasks.

Type: Grant

Filed: April 9, 2017

Date of Patent: March 8, 2022

Assignee: Intel Corporation

Inventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Abhishek R. Appu, Altug Koker, Kamal Sinha, Joydeep Ray, Balaji Vembu, Vasanth Ranganathan, Sanjeev Jahagirdar
AVOID THREAD SWITCHING IN CACHE MANAGEMENT

Publication number: 20220066970

Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to monitor a thread switching overhead parameter for an application executing in a processing system and in response to a determination that the thread switching overhead parameter exceeds a threshold, to activate a thread management algorithm to reduce thread switching in the processing system. Other embodiments are also disclosed and claimed.

Type: Application

Filed: July 2, 2021

Publication date: March 3, 2022

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Kiran C. Veernapu, Balaji Vembu, Vasanth Ranganathan, Prasoonkumar Surti
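
The threshold check described in the abstract above can be sketched in a few lines of Python. This is only an illustration: the application does not define the thread switching overhead parameter here, so the sketch assumes it is the fraction of elapsed time spent switching threads, and the function name and default threshold are hypothetical.

```python
def should_reduce_thread_switching(switch_time, total_time, threshold=0.2):
    """Return True when the assumed overhead parameter exceeds the threshold.

    Assumption (illustrative only): overhead = switch_time / total_time.
    """
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    overhead = switch_time / total_time
    return overhead > threshold


# 30 of 100 time units spent switching exceeds the assumed 0.2 threshold.
activate = should_reduce_thread_switching(switch_time=30, total_time=100)
print(activate)
```

A caller would activate whatever thread management algorithm it uses only when this check returns true.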
REDUCE POWER BY FRAME SKIPPING

Publication number: 20220067874

Abstract: In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive an input from one or more detectors proximate a display to present an output from a graphics pipeline, determine that a user is not interacting with the display, and in response to a determination that the user is not interacting with the display, to reduce a frame rendering rate of the graphics pipeline. Other embodiments are also disclosed and claimed.

Type: Application

Filed: August 10, 2021

Publication date: March 3, 2022

Applicant: INTEL CORPORATION

Inventors: Balaji Vembu, Nikos Kaburlasos, Josh B. Mastronarde
Regional Adjustment of Render Rate

Publication number: 20220066726

Abstract: In accordance with some embodiments, the render rate is varied across and/or up and down the display screen. This may be done based on where the user is looking in order to reduce power consumption and/or increase performance. Specifically, the screen display is separated into regions, such as quadrants. Each of these regions is rendered at a rate determined by at least one of what the user is currently looking at, what the user has looked at in the past and/or what it is predicted that the user will look at next. Areas of less focus may be rendered at a lower rate, reducing power consumption in some embodiments.

Type: Application

Filed: August 11, 2021

Publication date: March 3, 2022

Inventors: Eric J. Asperheim, Subramaniam M. Maiyuran, Kiran C. Veernapu, Sanjeev S. Jahagirdar, Balaji Vembu, Devan Burke, Philip R. Laws, Kamal Sinha, Abhishek R. Appu, Elmoustapha Ould-Ahmed-Vall, Peter L. Doyle, Joydeep Ray, Travis T. Schluessler, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Altug Koker
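
To make the region-based scheme in the abstract above concrete, the following Python sketch splits the screen into quadrants and assigns a higher render rate to the quadrant the user is currently looking at. It is a simplification under stated assumptions: it uses only the current gaze point (not past or predicted gaze), and the function name `region_render_rates` and the 60 Hz and 30 Hz rates are hypothetical values chosen for illustration.

```python
def region_render_rates(gaze_xy, screen_w, screen_h, focus_hz=60, other_hz=30):
    """Return a render rate per quadrant, keyed by (column, row).

    The quadrant containing the gaze point gets focus_hz; the rest get other_hz.
    """
    gx, gy = gaze_xy
    # Column 0 is the left half, row 0 is the top half of the screen.
    focus = (int(gx >= screen_w / 2), int(gy >= screen_h / 2))
    return {
        (col, row): focus_hz if (col, row) == focus else other_hz
        for row in range(2)
        for col in range(2)
    }


# A gaze point in the upper-right of a 1920x1080 screen.
rates = region_render_rates((1500, 200), screen_w=1920, screen_h=1080)
print(rates)
```

In this sketch, the three quadrants outside the user's focus are rendered at the lower rate, which corresponds to the power-saving case mentioned in the abstract.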
