Patents by Inventor Jonathan Pearce

Jonathan Pearce has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220261289
    Abstract: Embodiments are generally directed to thread group scheduling for graphics processing. An embodiment of an apparatus includes a plurality of processors including a plurality of graphics processors to process data; a memory; and one or more caches for storage of data for the plurality of graphics processors, wherein the one or more processors are to schedule a plurality of groups of threads for processing by the plurality of graphics processors, the scheduling of the plurality of groups of threads including the plurality of processors to apply a bias for scheduling the plurality of groups of threads according to a cache locality for the one or more caches.
    Type: Application
    Filed: March 3, 2022
    Publication date: August 18, 2022
    Applicant: Intel Corporation
    Inventors: Ben Ashbaugh, Jonathan Pearce, Murali Ramadoss, Vikranth Vemulapalli, William B. Sadler, Sungye Kim, Marian Alin Petre
  • Publication number: 20220261347
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.
    Type: Application
    Filed: April 28, 2022
    Publication date: August 18, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Ben Ashbaugh, Jonathan Pearce, Abhishek Appu, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Elmoustapha Ould-Ahmed-Vall, Aravindh Anantaraman, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Yoav Harel, Arthur Hunter,, JR., Brent Insko, Scott Janus, Pattabhiraman K, Mike Macpherson, Subramaniam Maiyuran, Marian Alin Petre, Murali Ramadoss, Shailesh Shah, Kamal Sinha, Prasoonkumar Surti, Vikranth Vemulapalli
  • Patent number: 11409658
    Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
    Type: Grant
    Filed: January 28, 2021
    Date of Patent: August 9, 2022
    Assignee: Intel Corporation
    Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Jr., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
  • Patent number: 11409693
    Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: August 9, 2022
    Assignee: INTEL CORPORATION
    Inventors: Joydeep Ray, Aravindh Anantaraman, Abhishek R. Appu, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Subramaniam Maiyuran, Nicolas Galoppo Von Borries, Varghese George, Mike MacPherson, Ben Ashbaugh, Murali Ramadoss, Vikranth Vemulapalli, William Sadler, Jonathan Pearce, Sungye Kim
  • Patent number: 11379229
    Abstract: An apparatus and method for performing efficient, adaptable tensor operations. For example, one embodiment of a processor comprises: front end circuitry to schedule matrix operations responsive to a matrix multiplication instruction; a plurality of lanes to perform parallel execution of the matrix operations, wherein a lane comprises an arithmetic logic unit to multiply a block of a first matrix with a block of a second matrix to generate a product and to accumulate the product with a block of a third matrix, and wherein the matrix blocks are to be stored in registers within the lane; and broadcast circuitry to broadcast one or more invariant matrix blocks to at least one of different registers within the lane and different registers across different lanes.
    Type: Grant
    Filed: August 7, 2020
    Date of Patent: July 5, 2022
    Assignee: INTEL CORPORATION
    Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Debbie Marr, Abhijit Davare, Asit Mishra, Steven Burns, Desmond A. Kirkpatrick, Andrey Ayupov, Anton Alexandrovich Sorokin, Eriko Nurvitadhi
  • Publication number: 20220179787
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.
    Type: Application
    Filed: March 14, 2020
    Publication date: June 9, 2022
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Ben Ashbaugh, Jonathan Pearce, Abhishek Appu, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Elmoustapha Ould-Ahmed-Vall, Aravindh Anantaraman, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Yoav Harel, Arthur Hunter, Jr., Brent Insko, Scott Janus, Pattabhiraman K, Mike Macpherson, Subramaniam Maiyuran, Marian Alin Petre, Murali Ramadoss, Shailesh Shah, Kamal Sinha, Prasoonkumar Surti, Vikranth Vemulapalli
  • Patent number: 11281496
    Abstract: Embodiments are generally directed to thread group scheduling for graphics processing. An embodiment of an apparatus includes a plurality of processors including a plurality of graphics processors to process data; a memory; and one or more caches for storage of data for the plurality of graphics processors, wherein the one or more processors are to schedule a plurality of groups of threads for processing by the plurality of graphics processors, the scheduling of the plurality of groups of threads including the plurality of processors to apply a bias for scheduling the plurality of groups of threads according to a cache locality for the one or more caches.
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: March 22, 2022
    Assignee: INTEL CORPORATION
    Inventors: Ben Ashbaugh, Jonathan Pearce, Murali Ramadoss, Vikranth Vemulapalli, William B. Sadler, Sungye Kim, Marian Alin Petre
  • Publication number: 20210349848
    Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: May 17, 2021
    Publication date: November 11, 2021
    Applicant: Intel Corporation
    Inventors: JOYDEEP RAY, ARAVINDH ANANTARAMAN, ABHISHEK R. APPU, ALTUG KOKER, ELMOUSTAPHA OULD-AHMED-VALL, VALENTIN ANDREI, SUBRAMANIAM MAIYURAN, NICOLAS GALOPPO VON BORRIES, VARGHESE GEORGE, MIKE MACPHERSON, BEN ASHBAUGH, MURALI RAMADOSS, VIKRANTH VEMULAPALLI, WILLIAM SADLER, JONATHAN PEARCE, SUNGYE KIM
  • Patent number: 11161965
    Abstract: Organopolysulfides such as organodisulfides, organotrisulfides and/or organotetrasulfides are useful stabilizers for polymer compositions, wherein the tendency of a polymer to degrade when exposed to environmental conditions such as heat, light and oxygen may be ameliorated by the incorporation of one or more of such organopolysulfides, optionally together with one or more additional stabilization additives such as a hindered phenol antioxidant, phosp(on)ite stabilizer or hindered amine light stabilizer.
    Type: Grant
    Filed: March 6, 2019
    Date of Patent: November 2, 2021
    Assignee: Arkema Inc.
    Inventors: George Charles Fortman, Jonathan Pearce Stehman, Stephanie Christina Vrakas, Kurt Wood
  • Publication number: 20210255957
    Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
    Type: Application
    Filed: January 28, 2021
    Publication date: August 19, 2021
    Applicant: Intel Corporation
    Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, JR., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
  • Patent number: 11093250
    Abstract: An apparatus and method for efficiently processing invariant operations on a parallel execution engine.
    Type: Grant
    Filed: September 29, 2018
    Date of Patent: August 17, 2021
    Assignee: Intel Corporation
    Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jaewoong Sim, Andrey Ayupov
  • Patent number: 11016929
    Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: May 25, 2021
    Assignee: INTEL CORPORATION
    Inventors: Joydeep Ray, Aravindh Anantaraman, Abhishek R. Appu, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Subramaniam Maiyuran, Nicolas Galappo Von Borries, Varghese George, Mike Macpherson, Ben Ashbaugh, Murali Ramadoss, Vikranth Vemulapalli, William Sadler, Jonathan Pearce, Sungye Kim
  • Patent number: 10915328
    Abstract: An apparatus and method for offloading iterative, parallel work to a data parallel cluster. For example, one embodiment of a processor comprises: a host processor to execute a primary thread; a data parallel cluster coupled to the host processor over a high speed interconnect, the data parallel cluster comprising a plurality of execution lanes to perform parallel execution of one or more secondary threads related to the primary thread; and a data parallel cluster controller integral to the host processor to offload processing of the one or more secondary threads to the data parallel cluster in response to one of the cores executing a parallel processing call instruction from the primary thread.
    Type: Grant
    Filed: December 14, 2018
    Date of Patent: February 9, 2021
    Assignee: Intel Corporation
    Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Deborah Marr
  • Patent number: 10909039
    Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: February 2, 2021
    Assignee: INTEL CORPORATION
    Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Jr., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
  • Publication number: 20200364045
    Abstract: An apparatus and method for performing efficient, adaptable tensor operations. For example, one embodiment of a processor comprises: front end circuitry to schedule matrix operations responsive to a matrix multiplication instruction; a plurality of lanes to perform parallel execution of the matrix operations, wherein a lane comprises an arithmetic logic unit to multiply a block of a first matrix with a block of a second matrix to generate a product and to accumulate the product with a block of a third matrix, and wherein the matrix blocks are to be stored in registers within the lane; and broadcast circuitry to broadcast one or more invariant matrix blocks to at least one of different registers within the lane and different registers across different lanes.
    Type: Application
    Filed: August 7, 2020
    Publication date: November 19, 2020
    Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Debbie Marr, Abhijit Davare, Asit Mishra, Steven Burns, Desmond A. Kirkpatrick, Andrey Ayupov, Anton Alexandrovich Sorokin, Eriko Nurvitadhi
  • Patent number: 10831505
    Abstract: An apparatus and method for data parallel single program multiple data (SPMD) execution. For example, one embodiment of a processor comprises: instruction fetch circuitry to fetch instructions of one or more primary threads; a decoder to decode the instructions to generate uops; a data parallel cluster (DPC) to execute microthreads comprising a subset of the uops, the DPC further comprising: a plurality of execution lanes to perform parallel execution of the microthreads; an instruction decode queue (IDQ) to store the uops prior to execution; and a scheduler to evaluate the microthreads based on associated variables including instruction pointer (IP) values, the scheduler to gang microthreads into fragments for parallel execution on the execution lanes based on the evaluation.
    Type: Grant
    Filed: September 29, 2018
    Date of Patent: November 10, 2020
    Assignee: Intel Corporation
    Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Deborah Marr, Abhijit Davare, Andrey Ayupov
  • Publication number: 20200293488
    Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: March 15, 2019
    Publication date: September 17, 2020
    Applicant: Intel Corporation
    Inventors: JOYDEEP RAY, ARAVINDH ANANTARAMAN, ABHISHEK R. APPU, ALTUG KOKER, ELMOUSTAPHA OULD-AHMED-VALL, VALENTIN ANDREI, SUBRAMANIAM MAIYURAN, NICOLAS GALAPPO VON BORRIES, VARGHESE GEORGE, MIKE MACPHERSON, BEN ASHBAUGH, MURALI RAMADOSS, VIKRANTH VEMULAPALLI, WILLIAM SADLER, JONATHAN PEARCE, SUNGYE KIM
  • Publication number: 20200293450
    Abstract: Embodiments are generally directed to data prefetching for graphics data processing. An embodiment of an apparatus includes one or more processors including one or more graphics processing units (GPUs); and a plurality of caches to provide storage for the one or more GPUs, the plurality of caches including at least an L1 cache and an L3 cache, wherein the apparatus to provide intelligent prefetching of data by a prefetcher of a first GPU of the one or more GPUs including measuring a hit rate for the L1 cache; upon determining that the hit rate for the L1 cache is equal to or greater than a threshold value, limiting a prefetch of data to storage in the L3 cache, and upon determining that the hit rate for the L1 cache is less than a threshold value, allowing the prefetch of data to the L1 cache.
    Type: Application
    Filed: March 15, 2019
    Publication date: September 17, 2020
    Applicant: Intel Corporation
    Inventors: Vikranth Vemulapalli, Lakshminarayanan Striramassarma, Mike MacPherson, Aravindh Anantaraman, Ben Ashbaugh, Murali Ramadoss, William B. Sadler, Jonathan Pearce, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, JR., Prasoonkumar Surti, Nicolas Galoppo von Borries, Joydeep Ray, Abhishek R. Appu, ElMoustapha Ould-Ahmed-Vall, Altug Koker, Sungye Kim, Subramaniam Maiyuran, Valentin Andrei
  • Publication number: 20200293380
    Abstract: Embodiments are generally directed to thread group scheduling for graphics processing. An embodiment of an apparatus includes a plurality of processors including a plurality of graphics processors to process data; a memory; and one or more caches for storage of data for the plurality of graphics processors, wherein the one or more processors are to schedule a plurality of groups of threads for processing by the plurality of graphics processors, the scheduling of the plurality of groups of threads including the plurality of processors to apply a bias for scheduling the plurality of groups of threads according to a cache locality for the one or more caches.
    Type: Application
    Filed: March 15, 2019
    Publication date: September 17, 2020
    Applicant: Intel Corporation
    Inventors: Ben Ashbaugh, Jonathan Pearce, Murali Ramadoss, Vikranth Vemulapalli, William B. Sadler, Sungye Kim, Marian Alin Petre
  • Patent number: 10776110
    Abstract: An apparatus and method for performing efficient, adaptable tensor operations.
    Type: Grant
    Filed: September 29, 2018
    Date of Patent: September 15, 2020
    Assignee: Intel Corporation
    Inventors: Jonathan Pearce, David Sheffield, Srikanth Srinivasan, Jeffrey Cook, Deborah Marr, Abhijit Davare, Asit Mishra, Steven Burns, Desmond Kirkpatrick, Andrey Ayupov, Anton Alexandrovich Sorokin, Eriko Nurvitadhi