Patents by Inventor Balaji Vembu
Balaji Vembu has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11720355Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.Type: GrantFiled: June 7, 2022Date of Patent: August 8, 2023Assignee: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Patent number: 11715174Abstract: Embodiments described herein provide techniques enable a graphics processor to continue processing operations during the reset of a compute unit that has experienced a hardware fault. Threads and associated context state for a faulted compute unit can be migrated to another compute unit of the graphics processor and the faulting compute unit can be reset while processing operations continue.Type: GrantFiled: March 3, 2022Date of Patent: August 1, 2023Assignee: Intel CorporationInventors: Murali Ramadoss, Balaji Vembu, Eric C. Samson, Kun Tian, David J. Cowperthwaite, Altug Koker, Zhi Wang, Joydeep Ray, Subramaniam M. Maiyuran, Abhishek R. Appu
-
Patent number: 11710267Abstract: Systems, apparatuses, and methods may provide for technology to process graphics data in a virtual gaming environment. The technology may identify, from graphics data in a graphics application, redundant graphics calculations relating to common frame characteristics of one or more graphical scenes to be shared between client game devices of a plurality of users and calculate, in response to the identified redundant graphics calculations, frame characteristics relating to the one or more graphical scenes. Additionally, the technology may send, over a computer network, the calculation of the frame characteristics to the client game devices.Type: GrantFiled: September 17, 2021Date of Patent: July 25, 2023Assignee: Intel CorporationInventors: Jonathan Kennedy, Gabor Liktor, Jeffery S. Boles, Slawomir Grajewski, Balaji Vembu, Travis T. Schluessler, Abhishek R. Appu, Ankur N. Shah, Joydeep Ray, Altug Koker, Jacek Kwiatkowski
-
Patent number: 11704181Abstract: Apparatus and method for scalable error reporting. For example, one embodiment of an apparatus comprises error detection circuitry to detect an error in a component of a first tile within a tile-based hierarchy of a processing device; error classification circuitry to classify the error and record first error data based on the classification; a first tile interface to combine the first error data with second error data received from one or more other components associated with the first tile to generate first accumulated error data; and a master tile interface to combine the first accumulated error data with second accumulated error data received from at least one other tile interface to generate second accumulated error data and to provide the second accumulated error data to a host executing an application to process the second accumulated error data.Type: GrantFiled: June 24, 2022Date of Patent: July 18, 2023Assignee: Intel CorporationInventors: Balaji Vembu, Bryan White, Ankur Shah, Murali Ramadoss, David Puffer, Altug Koker, Aditya Navale, Mahesh Natu
-
Patent number: 11693658Abstract: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a ternary weight associated with a neural network and an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a multiplication operation on the multi-bit input based on the ternary weight to generate an intermediate product and the adder is to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.Type: GrantFiled: July 26, 2021Date of Patent: July 4, 2023Assignee: Intel CorporationInventors: Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha
-
Patent number: 11688122Abstract: An embodiment of an electronic processing system may include an application processor, persistent storage media communicatively coupled to the application processor, and a graphics subsystem communicatively coupled to the application processor. The system may include one or more of a draw call re-orderer communicatively coupled to the application processor and the graphics subsystem to re-order two or more draw calls, a workload re-orderer communicatively coupled to the application processor and the graphics subsystem to re-order two or more work items in an order independent mode, a queue primitive included in at least one of the two or more draw calls to define a producer stage and a consumer stage, and an order-independent executor communicatively coupled to the application processor and the graphics subsystem to provide tile-based order independent execution of a compute stage. Other embodiments are disclosed and claimed.Type: GrantFiled: February 2, 2022Date of Patent: June 27, 2023Assignee: Intel CorporationInventors: Devan Burke, Adam T. Lake, Jeffery S. Boles, John H. Feit, Karthik Vaidyanathan, Abhishek R. Appu, Joydeep Ray, Subramaniam Maiyuran, Altug Koker, Balaji Vembu, Murali Ramadoss, Prasoonkumar Surti, Eric J. Hoekstra, Gabor Liktor, Jonathan Kennedy, Slawomir Grajewski, Elmoustapha Ould-Ahmed-Vall
-
Patent number: 11675597Abstract: An apparatus to facilitate thread scheduling is disclosed. In one embodiment the apparatus includes a processor comprising a plurality of multiprocessors comprising single-instruction multiple thread (SIMT) execution circuitry to simultaneously execute multiple threads, a shared local memory to be shared by the multiple threads, and scheduling hardware logic to schedule the multiple threads in a thread group for execution across the plurality of multiprocessors in accordance with barrier data. The instructions of the multiple threads are to produce shared data to be stored in the shared local memory when executed by the plurality of multiprocessors, wherein additional instructions of at least a first thread of the multiple threads are to use the shared data, and wherein, in accordance with the barrier data, the first thread is to wait for other threads of the multiple threads to finish producing the shared data before executing the additional instructions.Type: GrantFiled: August 31, 2022Date of Patent: June 13, 2023Assignee: Intel CorporationInventors: Balaji Vembu, Abhishek R. Appu, Joydeep Ray, Altug Koker
-
Patent number: 11651090Abstract: A method for securely terminating a distributed trusted execution environment (TEE) spanning a plurality of work accelerators. After wiping sensitive data from the memory of its accelerator, a root of trust for each accelerator is configured to receive confirmation that the data has been wiped from the processor memory in relevant other accelerators prior to moving on to the next stage at which the TEE on its associated accelerator is terminated. Since the data has been wiped from the other accelerators, even if a third party were to inject malicious code into the accelerator, they would be unable to read out the secret data from the other accelerators since the data has been wiped from those other accelerators. In this way, a mechanism is provided for ensuring that when the distributed TEE is terminated, malicious third parties are unable to read out confidential data from the accelerators.Type: GrantFiled: July 13, 2021Date of Patent: May 16, 2023Assignee: GRAPHCORE LTD.Inventors: Daniel John Pelham Wilkinson, Stavros Volos, Kapil Vaswani, Balaji Vembu
-
Patent number: 11650928Abstract: A mechanism is described for facilitating optimization of cache associated with graphics processors at computing devices. A method of embodiments, as described herein, includes introducing coloring bits to contents of a cache associated with a processor including a graphics processor, wherein the coloring bits to represent a signal identifying one or more caches available for use, while avoiding explicit invalidations and flushes.Type: GrantFiled: April 7, 2022Date of Patent: May 16, 2023Assignee: INTEL CORPORATIONInventors: Altug Koker, Balaji Vembu, Joydeep Ray, Abhishek R. Appu
-
Patent number: 11651089Abstract: A method for securely terminating a distributed trusted execution environment spanning a plurality of work accelerators. Each accelerator is configured to self-isolate upon determining that the distributed TEE is to be terminated across the system of accelerators. The data is also wiped from the processor memory of each accelerator, such that the data cannot be read out from the processor memory once the accelerator's links are re-enabled. The self-isolation is performed on each accelerator prior to the step of terminating the TEE on that accelerator. An accelerator only re-enables its links to other accelerators once the data is wiped from its processor memory such that the secret data is removed from the accelerator memory.Type: GrantFiled: July 13, 2021Date of Patent: May 16, 2023Assignee: GRAPHCORE LTD.Inventors: Daniel John Pelham Wilkinson, Stavros Volos, Kapil Vaswani, Balaji Vembu
-
Publication number: 20230142472Abstract: In accordance with some embodiments, the render rate is varied across and/or up and down the display screen. This may be done based on where the user is looking in order to reduce power consumption and/or increase performance. Specifically the screen display is separated into regions, such as quadrants. Each of these regions is rendered at a rate determined by at least one of what the user is currently looking at, what the user has looked at in the past and/or what it is predicted that the user will look at next. Areas of less focus may be rendered at a lower rate, reducing power consumption in some embodiments.Type: ApplicationFiled: October 4, 2022Publication date: May 11, 2023Inventors: Eric J. Asperheim, Subramaniam Maiyuran, Kiran C. Veernapu, Sanjeev S. Jahagirdar, Balaji Vembu, Devan Burke, Philip R. Laws, Kamal Sinha, Abhishek R. Appu, Elmoustapha Ould-Ahmed-Vall, Peter L. Doyle, Joydeep Ray, Travis T. Schluessler, John H. Feit, Nikos Kaburlasos, Jacek Kwiatkowski, Altug Koker
-
Patent number: 11631198Abstract: An apparatus to facilitate compute compression is disclosed. The apparatus includes a graphics processing unit including mapping logic to map a first block of integer pixel data to a compression block and compression logic to compress the compression block.Type: GrantFiled: June 23, 2021Date of Patent: April 18, 2023Assignee: Intel CorporationInventors: Abhishek Appu, Altug Koker, Joydeep Ray, Balaji Vembu, Prasoonkumar Surti, Kamal Sinha, Nadathur Rajagopalan Satish, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Farshad Akhbari
-
Patent number: 11620723Abstract: One embodiment provides a graphics processor including a plurality of processing clusters, each processing cluster including a plurality of multiprocessors and a data interconnect coupled to the plurality of multiprocessors. At least one multiprocessor of the plurality of multiprocessors is configured to share data with another multiprocessor over the data interconnect.Type: GrantFiled: July 21, 2022Date of Patent: April 4, 2023Assignee: Intel CorporationInventors: Balaji Vembu, Altug Koker, Joydeep Ray
-
Patent number: 11609856Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.Type: GrantFiled: March 19, 2021Date of Patent: March 21, 2023Assignee: INTEL CORPORATIONInventors: Chandrasekaran Sakthivel, Prasoonkumar Surti, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Abhishek R. Appu, Nicolas C. Galoppo Von Borries, Joydeep Ray, Narayan Srinivasa, Feng Chen, Ben J. Ashbaugh, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Eriko Nurvitadhi, Balaji Vembu, Altug Koker
-
Patent number: 11593269Abstract: In an example, an apparatus comprises a plurality of execution units, and a cache memory communicatively coupled to the plurality of execution units, wherein the cache memory is structured into a plurality of sectors, wherein each sector in the plurality of sectors comprises at least two cache lines. Other embodiments are also disclosed and claimed.Type: GrantFiled: August 12, 2021Date of Patent: February 28, 2023Assignee: Intel CorporationInventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, David Puffer, Prasoonkumar Surti, Lakshminarayanan Striramassarma, Vasanth Ranganathan, Kiran C. Veernapu, Balaji Vembu, Pattabhiraman K
-
Patent number: 11593910Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.Type: GrantFiled: May 11, 2022Date of Patent: February 28, 2023Assignee: Intel CorporationInventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
-
Patent number: 11592817Abstract: A mechanism is described for facilitating storage management for machine learning at autonomous machines. A method of embodiments, as described herein, includes detecting one or more components associated with machine learning, where the one or more components include memory and a processor coupled to the memory, and where the processor includes a graphics processor. The method may further include allocating a storage portion of the memory and a hardware portion of the processor to a machine learning training set, where the storage and hardware portions are precise for implementation and processing of the training set.Type: GrantFiled: April 28, 2017Date of Patent: February 28, 2023Assignee: INTEL CORPORATIONInventors: Abhishek R. Appu, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Prasoonkumar Surti, Chandrasekaran Sakthivel, Altug Koker, Farshad Akhbari, Feng Chen, Dukhwan Kim, Narayan Srinivasa, Nadathur Rajagopalan Satish, Kamal Sinha, Joydeep Ray, Balaji Vembu, Mike B. Macpherson, Linda L. Hurd, Sanjeev Jahagirdar, Vasanth Ranganathan
-
Patent number: 11586548Abstract: In an example, an apparatus comprises a plurality of execution units, and a cache memory communicatively coupled to the plurality of execution units, wherein the cache memory is structured into a plurality of sectors, wherein each sector in the plurality of sectors comprises at least two cache lines. Other embodiments are also disclosed and claimed.Type: GrantFiled: March 3, 2021Date of Patent: February 21, 2023Assignee: Intel CorporationInventors: Abhishek R. Appu, Altug Koker, Joydeep Ray, David Puffer, Prasoonkumar Surti, Lakshminarayanan Striramassarma, Vasanth Ranganathan, Kiran C. Veernapu, Balaji Vembu, Pattabhiraman K
-
Publication number: 20230046506Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.Type: ApplicationFiled: October 17, 2022Publication date: February 16, 2023Applicant: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Publication number: 20230039729Abstract: Methods and apparatus relating to autonomous vehicle neural network optimization techniques are described. In an embodiment, the difference between a first training dataset to be used for a neural network and a second training dataset to be used for the neural network is detected. The second training dataset is authenticated in response to the detection of the difference. The neural network is used to assist in an autonomous vehicle/driving. Other embodiments are also disclosed and claimed.Type: ApplicationFiled: October 11, 2022Publication date: February 9, 2023Applicant: Intel CorporationInventors: Abhishek R. Appu, Altug Koker, Linda L. Hurd, Dukhwan Kim, Mike B. MacPherson, John C. Weast, Justin E. Gottschlich, Jingyi Jin, Barath Lakshmanan, Chandrasekaran Sakthivel, Michael S. Strickland, Joydeep Ray, Kamal Sinha, Prasoonkumar Surti, Balaji Vembu, Ping T. Tang, Anbang Yao, Tatiana Shpeisman, Xiaoming Chen