Patents by Inventor Nicolas C. Galoppo Von Borries
Nicolas C. Galoppo Von Borries has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240078629Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.Type: ApplicationFiled: September 14, 2023Publication date: March 7, 2024Applicant: Intel CorporationInventors: Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Nicolas C. Galoppo Von Borries
-
Patent number: 11922535Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.Type: GrantFiled: February 13, 2023Date of Patent: March 5, 2024Assignee: Intel CorporationInventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
-
Publication number: 20240005136Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.Type: ApplicationFiled: July 12, 2023Publication date: January 4, 2024Applicant: Intel CorporationInventors: Kamal Sinha, Balaji Vembu, Eriko Nurvitadhi, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Farshad Akhbari, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Nadathur Rajagopalan Satish, John C. Weast, Mike B. MacPherson, Linda L. Hurd, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Publication number: 20230394616Abstract: One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.Type: ApplicationFiled: June 14, 2023Publication date: December 7, 2023Applicant: Intel CorporationInventors: Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Altug Koker, Narayan Srinivasa, Dukhwan Kim, Sara S. Baghsorkhi, Justin E. Gottschlich, Feng Chen, Elmoustapha Ould-Ahmed-Vall, Kevin Nealis, Xiaoming Chen, Anbang Yao
-
Publication number: 20230359461Abstract: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a one-bit weight associated with a neural network, as well as an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a fused operation including an exclusive not OR (XNOR) operation and a population count operation. The adder is configured to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.Type: ApplicationFiled: May 11, 2023Publication date: November 9, 2023Applicant: Intel CorporationInventors: Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha
-
Patent number: 11809905Abstract: One embodiment provides for a general-purpose graphics processing unit comprising a set of processing elements to execute one or more thread groups of a second kernel to be executed by the general-purpose graphics processor, an on-chip memory coupled to the set of processing elements, and a scheduler coupled with the set of processing elements, the scheduler to schedule the thread groups of the kernel to the set of processing elements, wherein the scheduler is to schedule a thread group of the second kernel to execute subsequent to a thread group of a first kernel, the thread group of the second kernel configured to access a region of the on-chip memory that contains data written by the thread group of the first kernel in response to a determination that the second kernel is dependent upon the first kernel.Type: GrantFiled: September 13, 2021Date of Patent: November 7, 2023Assignee: Intel CorporationInventors: Valentin Andrei, Aravindh Anantaraman, Abhishek R. Appu, Nicolas C. Galoppo von Borries, Altug Koker, SungYe Kim, Elmoustapha Ould-Ahmed-Vall, Mike Macpherson, Subramaniam Maiyuran, Vasanth Ranganathan, Joydeep Ray, Varghese George
-
Patent number: 11803935Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.Type: GrantFiled: August 5, 2022Date of Patent: October 31, 2023Assignee: Intel CorporationInventors: Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Nicolas C. Galoppo Von Borries
-
Patent number: 11748606Abstract: In an example, an apparatus comprises a compute engine comprising a high precision component and a low precision component; and logic, at least partially including hardware logic, to receive instructions in the compute engine; select at least one of the high precision component or the low precision component to execute the instructions; and apply a gate to at least one of the high precision component or the low precision component to execute the instructions. Other embodiments are also disclosed and claimed.Type: GrantFiled: May 11, 2021Date of Patent: September 5, 2023Assignee: INTEL CORPORATIONInventors: Kamal Sinha, Balaji Vembu, Eriko Nurvitadhi, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Anbang Yao, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Farshad Akhbari, Narayan Srinivasa, Feng Chen, Dukhwan Kim, Nadathur Rajagopalan Satish, John C. Weast, Mike B. MacPherson, Linda L. Hurd, Vasanth Ranganathan, Sanjeev S. Jahagirdar
-
Publication number: 20230260072Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.Type: ApplicationFiled: February 13, 2023Publication date: August 17, 2023Applicant: Intel CorporationInventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
-
Patent number: 11727527Abstract: One embodiment provides for a compute apparatus to perform machine learning operations, the compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction, the decoded instruction to cause the compute apparatus to perform a complex compute operation.Type: GrantFiled: December 3, 2021Date of Patent: August 15, 2023Assignee: Intel CorporationInventors: Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Altug Koker, Narayan Srinivasa, Dukhwan Kim, Sara S. Baghsorkhi, Justin E. Gottschlich, Feng Chen, Elmoustapha Ould-Ahmed-Vall, Kevin Nealis, Xiaoming Chen, Anbang Yao
-
Patent number: 11720355Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.Type: GrantFiled: June 7, 2022Date of Patent: August 8, 2023Assignee: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Publication number: 20230245452Abstract: Embodiments of adaptive multi-modal event detection are disclosed herein. In one example, sensor data captured by multiple sensors is received, and an inconsistency among the sensors is detected based on performing event detection on the sensor data. An external environment of the sensors is detected based on the sensor data, and one or more configuration parameter(s) for event detection are adjusted based on the external environment of the sensors. Event detection is then performed on the sensor data based on the adjusted configuration parameter(s).Type: ApplicationFiled: April 6, 2023Publication date: August 3, 2023Applicant: Intel CorporationInventors: Say Chuan Tan, Mee Sim Lai, Ying Wei Liew, Chien Chern Yew, Kuan Heng Lee, Nicolas C. Galoppo von Borries, Rita Chattopadhyay
-
Patent number: 11693658Abstract: One embodiment provides for a compute apparatus comprising a decode unit to decode a single instruction into a decoded instruction that specifies multiple operands including a multi-bit input value and a ternary weight associated with a neural network and an arithmetic logic unit including a multiplier, an adder, and an accumulator register. To execute the decoded instruction, the multiplier is to perform a multiplication operation on the multi-bit input based on the ternary weight to generate an intermediate product and the adder is to add the intermediate product to a value stored in the accumulator register and update the value stored in the accumulator register.Type: GrantFiled: July 26, 2021Date of Patent: July 4, 2023Assignee: Intel CorporationInventors: Kevin Nealis, Anbang Yao, Xiaoming Chen, Elmoustapha Ould-Ahmed-Vall, Sara S. Baghsorkhi, Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha
-
Patent number: 11609856Abstract: In an example, an apparatus comprises a plurality of processing unit cores, a plurality of cache memory modules associated with the plurality of processing unit cores, and a machine learning model communicatively coupled to the plurality of processing unit cores, wherein the plurality of cache memory modules share cache coherency data with the machine learning model. Other embodiments are also disclosed and claimed.Type: GrantFiled: March 19, 2021Date of Patent: March 21, 2023Assignee: INTEL CORPORATIONInventors: Chandrasekaran Sakthivel, Prasoonkumar Surti, John C. Weast, Sara S. Baghsorkhi, Justin E. Gottschlich, Abhishek R. Appu, Nicolas C. Galoppo Von Borries, Joydeep Ray, Narayan Srinivasa, Feng Chen, Ben J. Ashbaugh, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Eriko Nurvitadhi, Balaji Vembu, Altug Koker
-
Patent number: 11593910Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.Type: GrantFiled: May 11, 2022Date of Patent: February 28, 2023Assignee: Intel CorporationInventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
-
Publication number: 20230046506Abstract: One embodiment provides for a graphics processing unit to accelerate machine-learning operations, the graphics processing unit comprising a multiprocessor having a single instruction, multiple thread (SIMT) architecture, the multiprocessor to execute at least one single instruction; and a first compute unit included within the multiprocessor, the at least one single instruction to cause the first compute unit to perform a two-dimensional matrix multiply and accumulate operation, wherein to perform the two-dimensional matrix multiply and accumulate operation includes to compute an intermediate product of 16-bit operands and to compute a 32-bit sum based on the intermediate product.Type: ApplicationFiled: October 17, 2022Publication date: February 16, 2023Applicant: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Publication number: 20230040631Abstract: Techniques to improve performance of matrix multiply operations are described in which a compute kernel can specify one or more element-wise operations to perform on output of the compute kernel before the output is transferred to higher levels of a processor memory hierarchy.Type: ApplicationFiled: August 5, 2022Publication date: February 9, 2023Applicant: Intel CorporationInventors: Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Nicolas C. Galoppo Von Borries
-
Patent number: 11562461Abstract: An apparatus to facilitate compute optimization is disclosed. The apparatus includes one or more processing units to provide a first set of shader operations associated with a shader stage of a graphics pipeline, a scheduler to schedule shader threads for processing, and a field-programmable gate array (FPGA) dynamically configured to provide a second set of shader operations associated with the shader stage of the graphics pipeline.Type: GrantFiled: November 18, 2021Date of Patent: January 24, 2023Assignee: Intel CorporationInventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu
-
Publication number: 20220357945Abstract: One embodiment provides a graphics processor comprising a memory controller and a graphics processing resource coupled with the memory controller. The graphics processing resource includes circuitry configured to execute an instruction to perform a matrix operation on first input including weight data and second input including input activation data, generate intermediate data based on a result of the matrix operation, quantize the intermediate data to a floating-point format determined based on a statistical distribution of first output data, and output, as second output data, quantized intermediate data in a determined floating-point format.Type: ApplicationFiled: June 7, 2022Publication date: November 10, 2022Applicant: Intel CorporationInventors: Himanshu Kaul, Mark A. Anders, Sanu K. Mathew, Anbang Yao, Joydeep Ray, Ping T. Tang, Michael S. Strickland, Xiaoming Chen, Tatiana Shpeisman, Abhishek R. Appu, Altug Koker, Kamal Sinha, Balaji Vembu, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Rajkishore Barik, Tsung-Han Lin, Vasanth Ranganathan, Sanjeev Jahagirdar
-
Publication number: 20220335562Abstract: Embodiments provide mechanisms to facilitate compute operations for deep neural networks. One embodiment comprises a graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including a register file to store a plurality of different types of operands and a plurality of processing cores. The plurality of processing cores includes a first set of processing cores of a first type and a second set of processing cores of a second type. The first set of processing cores are associated with a first memory channel and the second set of processing cores are associated with a second memory channel.Type: ApplicationFiled: May 11, 2022Publication date: October 20, 2022Applicant: Intel CorporationInventors: Prasoonkumar Surti, Narayan Srinivasa, Feng Chen, Joydeep Ray, Ben J. Ashbaugh, Nicolas C. Galoppo Von Borries, Eriko Nurvitadhi, Balaji Vembu, Tsung-Han Lin, Kamal Sinha, Rajkishore Barik, Sara S. Baghsorkhi, Justin E. Gottschlich, Altug Koker, Nadathur Rajagopalan Satish, Farshad Akhbari, Dukhwan Kim, Wenyin Fu, Travis T. Schluessler, Josh B. Mastronarde, Linda L. Hurd, John H. Feit, Jeffery S. Boles, Adam T. Lake, Karthik Vaidyanathan, Devan Burke, Subramaniam Maiyuran, Abhishek R. Appu