Patents by Inventor Balaji Vembu

Balaji Vembu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20210287327
    Abstract: An apparatus and method for dynamic provisioning, quality of service, and prioritization in a graphics processor. For example, one embodiment of an apparatus comprises a graphics processing unit (GPU) comprising a plurality of graphics processing resources; slice configuration hardware logic to logically subdivide the graphics processing resources into a plurality of slices; and slice allocation hardware logic to allocate a designated number of slices to each virtual machine (VM) of a plurality of VMs running in a virtualized execution environment, the slice allocation hardware logic to allocate different numbers of slices to different VMs based on graphics processing requirements and/or priorities of each of the VMs.
    Type: Application
    Filed: February 23, 2021
    Publication date: September 16, 2021
    Inventors: Abhishek R. Appu, Joydeep Ray, Altug Koker, Balaji Vembu, Pattabhiraman K, Matthew B. Callaway
  • Publication number: 20210287328
    Abstract: Methods and apparatus relating to techniques for power management. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive one or more frames for a workload, determine one or more compute resource parameters for the workload, and store the one or more compute resource parameters for the workload in a memory in association with workload context data for the workload. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: March 24, 2021
    Publication date: September 16, 2021
    Applicant: Intel Corporation
    Inventors: Balaji Vembu, Josh B. Mastronarde, Altug Koker, Nikos Kaburlasos, Abhishek R. Appu, Joydeep Ray
  • Publication number: 20210279104
    Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive a completion acknowledgment from the plurality of graphics processing units and in response to a determination that the workload is finished, to terminate one or more communication connections on the interconnect bridge. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: March 18, 2021
    Publication date: September 9, 2021
    Applicant: Intel Corporation
    Inventors: Altug Koker, Abhishek R. Appu, Kiran C. Veernapu, Joydeep Ray, Balaji Vembu
  • Publication number: 20210271539
    Abstract: Apparatus and method for scalable error reporting. For example, one embodiment of an apparatus comprises error detection circuitry to detect an error in a component of a first tile within a tile-based hierarchy of a processing device; error classification circuitry to classify the error and record first error data based on the classification; a first tile interface to combine the first error data with second error data received from one or more other components associated with the first tile to generate first accumulated error data; and a master tile interface to combine the first accumulated error data with second accumulated error data received from at least one other tile interface to generate second accumulated error data and to provide the second accumulated error data to a host executing an application to process the second accumulated error data.
    Type: Application
    Filed: February 9, 2021
    Publication date: September 2, 2021
    Inventors: Balaji Vembu, Bryan White, Ankur Shah, Murali Ramadoss, David Puffer, Altug Koker, Aditya Navale, Mahesh Natu
  • Patent number: 11106274
    Abstract: An embodiment of a graphics apparatus may include a facial expression detector to detect a facial expression of a user, and a parameter adjuster communicatively coupled to the facial expression detector to adjust a graphics parameter based on the detected facial expression of the user. The detected facial expression may include one or more of a squinting, blinking, winking, and facial muscle tension of the user. The graphics parameter may include one or more of a frame resolution, a screen contrast, a screen brightness, and a shading rate. Other embodiments are disclosed and claimed.
    Type: Grant
    Filed: April 10, 2017
    Date of Patent: August 31, 2021
    Assignee: Intel Corporation
    Inventors: Travis T. Schluessler, Joydeep Ray, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Jefferson Amstutz, Carson Brownlee, Vivek Tiwari, Sayan Lahiri, Kai Xiao, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Deepak S. Vembar, Ankur N. Shah, Balaji Vembu, Josh B. Mastronarde
  • Patent number: 11106264
    Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to collect user information for a user of a data processing device, generate a user profile for the user of the data processing device from the user information, and set a power profile for a processor in the data processing device using the user profile. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: August 31, 2021
    Assignee: Intel Corporation
    Inventors: Altug Koker, Abhishek R. Appu, Kiran C. Veernapu, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Eric J. Hoekstra, Wenyin Fu, Nikos Kaburlasos, Bhushan M. Borole, Travis T. Schluessler, Ankur N. Shah, Jonathan Kennedy
  • Publication number: 20210264558
    Abstract: Systems, apparatuses and methods may provide a way to monitor, by a process monitor, one or more processing factors of one or more client devices hosting one or more user sessions. More particularly, the systems, apparatuses and methods may provide a way to generate, responsively, a scene generation plan based on one or more of a digital representation of an N dimensional space or at least one of the one or more processing factors, and generate, by a global scene generator, a global scene common to the one or more client devices based on the digital representation of the space. The systems, apparatuses and methods may further provide for performing, by a local scene generator, at least a portion of the global illumination based on one or more of the scene generation plan, or application parameters.
    Type: Application
    Filed: December 28, 2020
    Publication date: August 26, 2021
    Inventors: Balaji Vembu, David M. Cimini, Elmoustapha Ould-Ahmed-Vall, Jacek Kwiatkowski, Philip R. Laws, Abhishek R. Appu
  • Patent number: 11099800
    Abstract: In accordance with some embodiments, the render rate is varied across and/or up and down the display screen. This may be done based on where the user is looking in order to reduce power consumption and/or increase performance. Specifically, the screen display is separated into regions, such as quadrants. Each of these regions is rendered at a rate determined by at least one of what the user is currently looking at, what the user has looked at in the past, and/or what it is predicted that the user will look at next. Areas of less focus may be rendered at a lower rate, reducing power consumption in some embodiments.
    Type: Grant
    Filed: May 22, 2020
    Date of Patent: August 24, 2021
    Assignee: Intel Corporation
    Inventors: Eric J. Asperheim, Subramaniam M. Maiyuran, Kiran C. Veernapu, Sanjeev S. Jahagirdar, Balaji Vembu, Devan Burke, Philip R. Laws, Kamal Sinha, Abhishek R. Appu, Elmoustapha Ould-Ahmed-Vall, Peter L. Doyle, Joydeep Ray, Travis T. Schluessler, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Altug Koker
  • Publication number: 20210255857
    Abstract: A mechanism is described for facilitating intelligent dispatching and vectorizing at autonomous machines. A method of embodiments, as described herein, includes detecting a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a graphics processor. The method may further include determining a first set of threads of the plurality of threads that are similar to each other or have adjacent surfaces, and physically clustering the first set of threads close together using a first set of adjacent compute blocks.
    Type: Application
    Filed: December 21, 2020
    Publication date: August 19, 2021
    Applicant: Intel Corporation
    Inventors: Feng Chen, Narayan Srinivasa, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Joydeep Ray, Nicolas C. Galoppo Von Borries, Prasoonkumar Surti, Ben J. Ashbaugh, Sanjeev Jahagirdar, Vasanth Ranganathan
  • Patent number: 11094033
    Abstract: In an example, an apparatus comprises logic, at least partially comprising hardware logic, to receive an input from one or more detectors proximate a display to present an output from a graphics pipeline, determine that a user is not interacting with the display, and in response to a determination that the user is not interacting with the display, to reduce a frame rendering rate of the graphics pipeline. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: February 14, 2020
    Date of Patent: August 17, 2021
    Assignee: Intel Corporation
    Inventors: Balaji Vembu, Nikos Kaburlasos, Josh B. Mastronarde
  • Publication number: 20210241417
    Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of execution units (EUs), wherein the plurality of EUs comprise a first EU type and a second EU type.
    Type: Application
    Filed: January 11, 2021
    Publication date: August 5, 2021
    Applicant: Intel Corporation
    Inventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
  • Publication number: 20210241418
    Abstract: Embodiments described herein provide a graphics, media, and compute device having a tiled architecture composed of a number of tiles of smaller graphics devices. The work distribution infrastructure for such device enables the distribution of workloads across multiple tiles of the device. Work items can be submitted to any one or more of the multiple tiles, with workloads able to span multiple tiles. Additionally, upon completion of a work item, graphics, media, and/or compute engines within the device can readily acquire new work items for execution with minimal latency.
    Type: Application
    Filed: April 19, 2021
    Publication date: August 5, 2021
    Applicant: Intel Corporation
    Inventors: Balaji Vembu, Brandon Fliflet, James Valerio, Michael Apodaca, Ben Ashbaugh, Hema Nalluri, Ankur Shah, Murali Ramadoss, David Puffer, Altug Koker, Aditya Navale, Abhishek R. Appu, Joydeep Ray, Travis Schluessler
  • Patent number: 11080213
    Abstract: An apparatus and method for dynamic provisioning and traffic control on a memory fabric.
    Type: Grant
    Filed: December 2, 2019
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Balaji Vembu, Altug Koker, Joydeep Ray, Abhishek R. Appu, Pattabhiraman K, Niranjan L. Cooray
  • Patent number: 11080046
    Abstract: A processing apparatus is provided comprising a multiprocessor having a multithreaded architecture. The multiprocessor can execute at least one single instruction to perform parallel mixed precision matrix operations. In one embodiment the apparatus includes a memory interface and an array of multiprocessors coupled to the memory interface. At least one multiprocessor in the array of multiprocessors is configured to execute a fused multiply-add instruction in parallel across multiple threads.
    Type: Grant
    Filed: February 5, 2021
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
  • Publication number: 20210232204
    Abstract: Methods and apparatus relating to techniques for a dual path sequential element to reduce toggles in data path are described. In an embodiment, switching logic causes signals for a single data path of a processor to be directed to at least two separate data paths. At least one of the two separate data paths is power gated to reduce signal toggles in the at least one data path. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: December 1, 2020
    Publication date: July 29, 2021
    Applicant: Intel Corporation
    Inventors: Subramaniam Maiyuran, Sanjeev S. Jahagirdar, Kiran C. Veernapu, Eric J. Asperheim, Altug Koker, Balaji Vembu, Joydeep Ray, Abhishek R. Appu
  • Patent number: 11074072
    Abstract: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a bipolar binary weight associated with a neural network and an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a multiplication operation on the multi-bit input based on the bipolar binary weight to generate an intermediate product and the adder is to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.
    Type: Grant
    Filed: July 8, 2019
    Date of Patent: July 27, 2021
    Assignee: Intel Corporation
    Inventors: Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha
  • Publication number: 20210217130
    Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.
    Type: Application
    Filed: March 5, 2021
    Publication date: July 15, 2021
    Applicant: Intel Corporation
    Inventors: Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Nicolas C. Galoppo Von Borries
  • Publication number: 20210216467
    Abstract: A mechanism is described for facilitating optimization of cache associated with graphics processors at computing devices. A method of embodiments, as described herein, includes introducing coloring bits to contents of a cache associated with a processor including a graphics processor, wherein the coloring bits to represent a signal identifying one or more caches available for use, while avoiding explicit invalidations and flushes.
    Type: Application
    Filed: November 19, 2020
    Publication date: July 15, 2021
    Applicant: Intel Corporation
    Inventors: Altug Koker, Balaji Vembu, Joydeep Ray, Abhishek R. Appu
  • Patent number: 11055248
    Abstract: Methods and apparatus relating to techniques for avoiding cache lookup for cold cache. In an example, an apparatus comprises logic, at least partially comprising hardware logic, to monitor a thread switching overhead parameter for an application executing in a processing system and in response to a determination that the thread switching overhead parameter exceeds a threshold, to activate a thread management algorithm to reduce thread switching in the processing system. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: October 11, 2019
    Date of Patent: July 6, 2021
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, Kiran C. Veernapu, Balaji Vembu, Vasanth Ranganathan, Prasoonkumar Surti
  • Publication number: 20210201556
    Abstract: An apparatus and method are described for allocating local memories to virtual machines. For example, one embodiment of an apparatus comprises: a command streamer to queue commands from a plurality of virtual machines (VMs) or applications, the commands to be distributed from the command streamer and executed by graphics processing resources of a graphics processing unit (GPU); a tile cache to store graphics data associated with the plurality of VMs or applications as the commands are executed by the graphics processing resources; and tile cache allocation hardware logic to allocate a first portion of the tile cache to a first VM or application and a second portion of the tile cache to a second VM or application; the tile cache allocation hardware logic to further allocate a first region in system memory to store spill-over data when the first portion of the tile cache and/or the second portion of the tile cache becomes full.
    Type: Application
    Filed: January 5, 2021
    Publication date: July 1, 2021
    Inventors: Joydeep Ray, Abhishek R. Appu, Pattabhiraman K, Balaji Vembu, Altug Koker, Niranjan L. Cooray, Josh B. Mastronarde