Patents by Inventor Eriko Nurvitadhi
Eriko Nurvitadhi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12217053Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.Type: GrantFiled: December 4, 2023Date of Patent: February 4, 2025Assignee: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Patent number: 12210873Abstract: An integrated circuit device may include programmable logic circuitry on a first integrated circuit die and memory that includes compute-in-memory circuitry on a second die. The programmable logic circuitry may be programmed with a circuit design that operates on a first set of data. The compute-in-memory circuitry of the memory may perform an arithmetic operation using the first set of data from the programmable logic circuitry and a second set of data stored in the memory.Type: GrantFiled: April 10, 2023Date of Patent: January 28, 2025Assignee: Altera CorporationInventors: Eriko Nurvitadhi, Scott J. Weber, Ravi Prakash Gutala, Aravind Raghavendra Dasu
-
Patent number: 12210900Abstract: A mechanism is described for facilitating intelligent thread scheduling at autonomous machines. A method of embodiments, as described herein, includes detecting dependency information relating to a plurality of threads corresponding to a plurality of workloads associated with tasks relating to a processor including a graphics processor. The method may further include generating a tree of thread groups based on the dependency information, where each thread group includes multiple threads, and scheduling one or more of the thread groups associated a similar dependency to avoid dependency conflicts.Type: GrantFiled: May 17, 2022Date of Patent: January 28, 2025Assignee: INTEL CORPORATIONInventors: Joydeep Ray, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Rajkishore Barik, Eriko Nurvitadhi, Nicolas Galoppo Von Borries, Tsung-Han Lin, Sanjeev Jahagirdar, Vasanth Ranganathan
-
Patent number: 12198221Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.Type: GrantFiled: February 8, 2024Date of Patent: January 14, 2025Assignee: Intel CorporationInventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
-
Patent number: 12166688Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to optimize resources in edge networks. An example apparatus includes agent managing circuitry to invoke an exploration agent to identify platform resource devices, select a first one of the identified platform resource devices, and generate first optimization metrics for the workload corresponding to the first one of the identified platform resource devices, the first optimization metrics corresponding to a first path. The example agent is to further select a second one of the identified platform resource devices, generate second optimization metrics for the workload corresponding to the second one of the identified platform resource devices, the second optimization metrics corresponding to a second path.Type: GrantFiled: June 25, 2021Date of Patent: December 10, 2024Assignee: INTEL CORPORATIONInventors: Nilesh Jain, Rajesh Poornachandran, Eriko Nurvitadhi, Anahita Bhiwandiwalla, Juan Pablo Munoz, Ravishankar Iyer, Chaunte W. Lacewell
-
Publication number: 20240403620Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network and accelerator logic to perform communicatively coupled to the processor to perform compute operations for the neural network.Type: ApplicationFiled: May 31, 2024Publication date: December 5, 2024Applicant: Intel CorporationInventors: Amit Bleiweiss, Anavai Ramesh, Asit Mishra, Deborah Marr, Jeffrey Cook, Srinivas Sridharan, Eriko Nurvitadhi, Elmoustapha Ould-Ahmed-Vall, Dheevatsa Mudigere, Mohammad Ashraf Bhuiyan, Md Faijul Amin, Wei Wang, Dhawal Srivastava, Niharika Maheshwari
-
Patent number: 12141891Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.Type: GrantFiled: September 14, 2023Date of Patent: November 12, 2024Assignee: Intel CorporationInventors: Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Nicolas C. Galoppo Von Borries
-
Patent number: 12141578Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute a 32-bit intermediate product of 16-bit operands and to compute a 32-bit sum based on the 32-bit intermediate product.Type: GrantFiled: December 9, 2020Date of Patent: November 12, 2024Assignee: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Patent number: 12135981Abstract: Embodiments of systems, methods, and apparatuses for heterogeneous computing are described. In some embodiments, a hardware heterogeneous scheduler dispatches instructions for execution on one or more plurality of heterogeneous processing elements, the instructions corresponding to a code fragment to be processed by the one or more of the plurality of heterogeneous processing elements, wherein the instructions are native instructions to at least one of the one or more of the plurality of heterogeneous processing elements.Type: GrantFiled: June 9, 2023Date of Patent: November 5, 2024Assignee: Intel CorporationInventors: Rajesh M. Sankaran, Gilbert Neiger, Narayan Ranganathan, Stephen R. Van Doren, Joseph Nuzman, Niall D. McDonnell, Michael A. O'Hanlon, Lokpraveen B. Mosur, Tracy Garrett Drysdale, Eriko Nurvitadhi, Asit K. Mishra, Ganesh Venkatesh, Deborah T. Marr, Nicholas P. Carter, Jonathan D. Pearce, Edward T. Grochowski, Richard J. Greco, Robert Valentine, Jesus Corbal, Thomas D. Fletcher, Dennis R. Bradford, Dwight P. Manley, Mark J. Charney, Jeffrey J. Cook, Paul Caprioli, Koichi Yamada, Kent D. Glossop, David B. Sheffield
-
Patent number: 12112397Abstract: One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.Type: GrantFiled: June 14, 2023Date of Patent: October 8, 2024Assignee: Intel CorporationInventors: Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Altug Koker, Narayan Srinivasa, Dukhwan Kim, Sara S. Baghsorkhi, Justin E. Gottschlich, Feng Chen, Elmoustapha Ould-Ahmed-Vall, Kevin Nealis, Xiaoming Chen, Anbang Yao
-
Patent number: 12086705Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes a at least one processor to perform operations to implement a neural network and compute logic to accelerate neural network computations.Type: GrantFiled: December 29, 2017Date of Patent: September 10, 2024Assignee: Intel CorporationInventors: Amit Bleiweiss, Abhishek Venkatesh, Gokce Keskin, John Gierach, Oguz Elibol, Tomer Bar-On, Huma Abidi, Devan Burke, Jaikrishnan Menon, Eriko Nurvitadhi, Pruthvi Gowda Thorehosur Appajigowda, Travis T. Schluessler, Dhawal Srivastava, Nishant Patel, Anil Thomas
-
Publication number: 20240257294Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.Type: ApplicationFiled: February 8, 2024Publication date: August 1, 2024Applicant: Intel CorporationInventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
-
Publication number: 20240256845Abstract: An apparatus to facilitate processing of a sparse matrix for arbitrary graph data is disclosed. The apparatus includes a graphics processing unit having a data management unit (DMU) that includes a scheduler for scheduling matrix operations, an active logic for tracking active input operands, and a skip logic for tracking unimportant input operands to be skipped by the scheduler. Processing circuitry is coupled to the DMU. The processing circuitry comprises a plurality of processing elements including logic to read operands and a multiplication unit to multiply two or more operands for the arbitrary graph data and customizable circuitry to provide custom functions.Type: ApplicationFiled: March 28, 2024Publication date: August 1, 2024Applicant: Intel CorporationInventors: Eriko Nurvitadhi, Amit Bleiweiss, Deborah Marr, Eugene Wang, Saritha Dwarakapuram, Sabareesh Ganapathy
-
Patent number: 12050984Abstract: One embodiment provides for a general-purpose graphics processing unit including a scheduler to schedule multiple matrix operations for execution by a general-purpose graphics processing unit. The multiple matrix operations are determined based on a single machine learning compute instruction. The single machine learning compute instruction is a convolution instruction and the multiple matrix operations are associated with a convolution operation.Type: GrantFiled: October 28, 2020Date of Patent: July 30, 2024Assignee: Intel CorporationInventors: Rajkishore Barik, Elmoustapha Ould-Ahmed-Vall, Xiaoming Chen, Dhawal Srivastava, Anbang Yao, Kevin Nealis, Eriko Nurvitadhi, Sara S. Baghsorkhi, Balaji Vembu, Tatiana Shpeisman, Ping T. Tang
-
Patent number: 12039331Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.Type: GrantFiled: October 17, 2022Date of Patent: July 16, 2024Assignee: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Patent number: 12039435Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises at least one processor to perform operations to implement a neural network and accelerator logic to perform communicatively coupled to the processor to perform compute operations for the neural network.Type: GrantFiled: June 21, 2022Date of Patent: July 16, 2024Assignee: INTEL CORPORATIONInventors: Amit Bleiweiss, Anavai Ramesh, Asit Mishra, Deborah Marr, Jeffrey Cook, Srinivas Sridharan, Eriko Nurvitadhi, Elmoustapha Ould-Ahmed-Vall, Dheevatsa Mudigere, Mohammad Ashraf Bhuiyan, Md Faijul Amin, Wei Wang, Dhawal Srivastava, Niharika Maheshwari
-
Patent number: 12014265Abstract: An apparatus to facilitate processing of a sparse matrix for arbitrary graph data is disclosed. The apparatus includes a graphics processing unit having a data management unit (DMU) that includes a scheduler for scheduling matrix operations, an active logic for tracking active input operands, and a skip logic for tracking unimportant input operands to be skipped by the scheduler. Processing circuitry is coupled to the DMU. The processing circuitry comprises a plurality of processing elements including logic to read operands and a multiplication unit to multiply two or more operands for the arbitrary graph data and customizable circuitry to provide custom functions.Type: GrantFiled: April 19, 2023Date of Patent: June 18, 2024Inventors: Eriko Nurvitadhi, Amit Bleiweiss, Deborah Marr, Eugene Wang, Saritha Dwarakapuram, Sabareesh Ganapathy
-
Publication number: 20240184572Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.Type: ApplicationFiled: December 4, 2023Publication date: June 6, 2024Applicant: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Publication number: 20240086683Abstract: An apparatus to facilitate workload scheduling is disclosed. The apparatus includes one or more clients, one or more processing units to processes workloads received from the one or more clients, including hardware resources and scheduling logic to schedule direct access of the hardware resources to the one or more clients to process the workloads.Type: ApplicationFiled: September 21, 2023Publication date: March 14, 2024Applicant: Intel CorporationInventors: Liwei Ma, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Eriko Nurvitadhi, Chandrasekaran Sakthivel, Barath Lakshmanan, Jingyi Jin, Justin E. Gottschlich, Michael Strickland
-
Publication number: 20240078629Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.Type: ApplicationFiled: September 14, 2023Publication date: March 7, 2024Applicant: Intel CorporationInventors: Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Nicolas C. Galoppo Von Borries