Patents by Inventor Valentin Andrei

Valentin Andrei has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12561276
    Abstract: Systems and methods for updating remote memory side caches in a multi-GPU configuration are disclosed herein. In one embodiment, a graphics processor for a multi-tile architecture includes a first graphics processing unit (GPU) having a first memory, a first memory side cache memory, a first communication fabric, and a first memory management unit (MMU). The graphics processor includes a second graphics processing unit (GPU) having a second memory, a second memory side cache memory, a second memory management unit (MMU), and a second communication fabric that is communicatively coupled to the first communication fabric. The first MMU is configured to control memory requests for the first memory, to update content in the first memory, to update content in the first memory side cache memory, and to determine whether to update the content in the second memory side cache memory.
    Type: Grant
    Filed: March 14, 2020
    Date of Patent: February 24, 2026
    Assignee: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Aravindh Anantaraman, Valentin Andrei, Abhishek Appu, Sean Coleman, Nicolas Galoppo Von Borries, Varghese George, Pattabhiraman K, SungYe Kim, Mike Macpherson, Subramaniam Maiyuran, Elmoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, James Valerio
  • Patent number: 12561277
    Abstract: Methods and apparatus relating to memory controller techniques. In an example, an apparatus comprises a cache memory, a high-bandwidth memory, and a processor communicatively coupled to the cache memory and the high-bandwidth memory, the processor to manage data transfer between the cache memory and the high-bandwidth memory for memory access operations directed to the high-bandwidth memory. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: March 14, 2020
    Date of Patent: February 24, 2026
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Aravindh Anantaraman, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Altug Koker, Mike Macpherson, Subramaniam Maiyuran, Joydeep Ray, Lakshminarayana Pappu, Guadalupe Garcia
  • Patent number: 12554674
    Abstract: Methods and apparatus relating to techniques for multi-tile memory management. In an example, a graphics processor includes an interposer, a first chiplet coupled with the interposer, the first chiplet including a graphics processing resource and an interconnect network coupled with the graphics processing resource, cache circuitry coupled with the graphics processing resource via the interconnect network, and a second chiplet coupled with the first chiplet via the interposer, the second chiplet including a memory-side cache and a memory controller coupled with the memory-side cache. The memory controller is configured to enable access to a high-bandwidth memory (HBM) device, the memory-side cache is configured to cache data associated with a memory access performed via the memory controller, and the cache circuitry is logically positioned between the graphics processing resource and a chiplet interface.
    Type: Grant
    Filed: October 15, 2024
    Date of Patent: February 17, 2026
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Aravindh Anantaraman, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Mike Macpherson, Subramaniam Maiyuran, Joydeep Ray, Lakshminarayanan Striramassarma, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Prasoonkumar Surti, David Puffer, James Valerio, Ankur N. Shah
  • Publication number: 20250378045
    Abstract: Embodiments described herein provide techniques to enable the dynamic reconfiguration of memory on a general-purpose graphics processing unit. One embodiment described herein enables dynamic reconfiguration of cache memory bank assignments based on hardware statistics. One embodiment enables virtual memory address translation using mixed four kilobyte and sixty-four kilobyte pages within the same page table hierarchy and under the same page directory. One embodiment provides for a graphics processor and associated heterogeneous processing system having near and far regions of the same level of a cache hierarchy.
    Type: Application
    Filed: May 22, 2025
    Publication date: December 11, 2025
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Niranjan Cooray, Subramaniam Maiyuran, Altug Koker, Prasoonkumar Surti, Varghese George, Valentin Andrei, Abhishek Appu, Guadalupe Garcia, Pattabhiraman K, Sungye Kim, Sanjay Kumar, Pratik Marolia, Elmoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, William Sadler, Lakshminarayanan Striramassarma
  • Patent number: 12386779
    Abstract: Embodiments described herein provide techniques to enable the dynamic reconfiguration of memory on a general-purpose graphics processing unit. One embodiment described herein enables dynamic reconfiguration of cache memory bank assignments based on hardware statistics. One embodiment enables virtual memory address translation using mixed four kilobyte and sixty-four kilobyte pages within the same page table hierarchy and under the same page directory. One embodiment provides for a graphics processor and associated heterogeneous processing system having near and far regions of the same level of a cache hierarchy.
    Type: Grant
    Filed: February 5, 2024
    Date of Patent: August 12, 2025
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Niranjan Cooray, Subramaniam Maiyuran, Altug Koker, Prasoonkumar Surti, Varghese George, Valentin Andrei, Abhishek Appu, Guadalupe Garcia, Pattabhiraman K, Sungye Kim, Sanjay Kumar, Pratik Marolia, Elmoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, William Sadler, Lakshminarayanan Striramassarma
  • Patent number: 12373912
    Abstract: Embodiments are generally directed to memory prefetching in a multiple-GPU environment. An embodiment of an apparatus includes multiple processors including a host processor and multiple graphics processing units (GPUs) to process data, each of the GPUs including a prefetcher and a cache; and a memory for storage of data, the memory including a plurality of memory elements, wherein the prefetcher of each of the GPUs is to prefetch data from the memory to the cache of the GPU; and wherein the prefetcher of a GPU is prohibited from prefetching from a page that is not owned by the GPU or by the host processor.
    Type: Grant
    Filed: November 16, 2023
    Date of Patent: July 29, 2025
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Nicolas Galoppo von Borries, Varghese George, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Mike Macpherson, Subramaniam Maiyuran
  • Publication number: 20250209564
    Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. Embodiments described herein provide techniques to detect zero value elements within a vector or a set of packed data elements output by a processing resource and generate metadata to indicate a location of the zero value elements within the plurality of data elements.
    Type: Application
    Filed: February 20, 2025
    Publication date: June 26, 2025
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Scott Janus, Varghese George, Subramaniam Maiyuran, Altug Koker, Abhishek Appu, Prasoonkumar Surti, Vasanth Ranganathan, Valentin Andrei, Ashutosh Garg, Yoav Harel, Arthur Hunter, Jr., SungYe Kim, Mike Macpherson, Elmoustapha Ould-Ahmed-Vall, William Sadler, Lakshminarayanan Striramassarma, Vikranth Vemulapalli
  • Publication number: 20250200697
    Abstract: Apparatuses including general-purpose graphics processing units having on chip dense memory for temporal buffering are disclosed. In one embodiment, a graphics multiprocessor includes a plurality of compute engines to perform first computations to generate a first set of data, cache for storing data, and a high density memory that is integrated on chip with the plurality of compute engines and the cache. The high density memory is to receive the first set of data, to temporarily store the first set of data, and to provide the first set of data to the cache during a first time period that is prior to a second time period when the plurality of compute engines will use the first set of data for second computations.
    Type: Application
    Filed: February 7, 2025
    Publication date: June 19, 2025
    Applicant: Intel Corporation
    Inventors: Varghese George, Altug Koker, Aravindh Anantaraman, Subramaniam Maiyuran, SungYe Kim, Valentin Andrei, Elmoustapha Ould-Ahmed-Vall, Joydeep Ray, Abhishek R. Appu, Nicolas C. Galoppo von Borries, Prasoonkumar Surti, Mike Macpherson
  • Publication number: 20250173308
    Abstract: Embodiments are generally directed to graphics processor data access and sharing. An embodiment of an apparatus includes a circuit element to produce a result in processing of an application; a load-store unit to receive the result and generate pre-fetch information for a cache utilizing the result; and a prefetch generator to produce prefetch addresses based at least in part on the pre-fetch information; wherein the load-store unit is to receive software assistance for prefetching, and wherein generation of the pre-fetch information is based at least in part on the software assistance.
    Type: Application
    Filed: November 25, 2024
    Publication date: May 29, 2025
    Applicant: Intel Corporation
    Inventors: Altug Koker, Varghese George, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Niranjan Cooray, Nicolas Galoppo Von Borries, Mike MacPherson, Subramaniam Maiyuran, ElMoustapha Ould-Ahmed-Vall, David Puffer, Vasanth Ranganathan, Joydeep Ray, Ankur N. Shah, Lakshminarayanan Striramassarma, Prasoonkumar Surti, Saurabh Tangri
  • Publication number: 20250156371
    Abstract: Methods and apparatus relating to data initialization techniques. In an example, an apparatus comprises a processor to read one or more metadata codes which map to one or more cache lines in a cache memory and invoke a random number generator to generate random numerical data for the one or more cache lines in response to a determination that the one or more metadata codes indicate that the cache lines are to contain random numerical data. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: January 9, 2025
    Publication date: May 15, 2025
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Aravindh Anantaraman, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Altug Koker, Mike Macpherson, Subramaniam Maiyuran, Joydeep Ray, Vasanth Ranganathan
  • Publication number: 20250156222
    Abstract: Apparatuses to synchronize lanes that diverge or threads that drift are disclosed. In one embodiment, a graphics multiprocessor includes a queue having an initial state of groups with a first group having threads of first and second instruction types and a second group having threads of the first and second instruction types. A regroup engine (or regroup circuitry) regroups threads into a third group having threads of the first instruction type and a fourth group having threads of the second instruction type.
    Type: Application
    Filed: January 3, 2025
    Publication date: May 15, 2025
    Applicant: Intel Corporation
    Inventors: Valentin Andrei, Subramaniam Maiyuran, SungYe Kim, Varghese George, Altug Koker, Aravindh Anantaraman
  • Patent number: 12293431
    Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. Embodiments described herein provide techniques to detect zero value elements within a vector or a set of packed data elements output by a processing resource and generate metadata to indicate a location of the zero value elements within the plurality of data elements.
    Type: Grant
    Filed: May 2, 2023
    Date of Patent: May 6, 2025
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Scott Janus, Varghese George, Subramaniam Maiyuran, Altug Koker, Abhishek Appu, Prasoonkumar Surti, Vasanth Ranganathan, Valentin Andrei, Ashutosh Garg, Yoav Harel, Arthur Hunter, Jr., SungYe Kim, Mike Macpherson, Elmoustapha Ould-Ahmed-Vall, William Sadler, Lakshminarayanan Striramassarma, Vikranth Vemulapalli
  • Publication number: 20250117356
    Abstract: Methods and apparatus relating to techniques for multi-tile memory management. In an example, a graphics processor includes an interposer, a first chiplet coupled with the interposer, the first chiplet including a graphics processing resource and an interconnect network coupled with the graphics processing resource, cache circuitry coupled with the graphics processing resource via the interconnect network, and a second chiplet coupled with the first chiplet via the interposer, the second chiplet including a memory-side cache and a memory controller coupled with the memory-side cache. The memory controller is configured to enable access to a high-bandwidth memory (HBM) device, the memory-side cache is configured to cache data associated with a memory access performed via the memory controller, and the cache circuitry is logically positioned between the graphics processing resource and a chiplet interface.
    Type: Application
    Filed: October 15, 2024
    Publication date: April 10, 2025
    Applicant: Intel Corporation
    Inventors: Abhishek R. Appu, Altug Koker, Aravindh Anantaraman, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Mike Macpherson, Subramaniam Maiyuran, Joydeep Ray, Lakshminarayanan Striramassarma, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Prasoonkumar Surti, David Puffer, James Valerio, Ankur N. Shah
  • Publication number: 20250103546
    Abstract: Embodiments are generally directed to cache structure and utilization. An embodiment of an apparatus includes one or more processors including a graphics processor; a memory for storage of data for processing by the one or more processors; and a cache to cache data from the memory; wherein the apparatus is to provide for dynamic overfetching of cache lines for the cache, including receiving a read request and accessing the cache for the requested data, and upon a miss in the cache, overfetching data from memory or a higher level cache in addition to fetching the requested data, wherein the overfetching of data is based at least in part on a current overfetch boundary, and provides for data to be prefetched extending to the current overfetch boundary.
    Type: Application
    Filed: October 4, 2024
    Publication date: March 27, 2025
    Applicant: Intel Corporation
    Inventors: Altug Koker, Lakshminarayanan Striramassarma, Aravindh Anantaraman, Valentin Andrei, Abhishek R. Appu, Sean Coleman, Varghese George, Pattabhiraman K, Mike MacPherson, Subramaniam Maiyuran, ElMoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, Joydeep Ray, Jayakrishna P S, Prasoonkumar Surti
  • Publication number: 20250103547
    Abstract: Embodiments described herein include software, firmware, and hardware logic that provides techniques to perform arithmetic on sparse data via a systolic processing unit. One embodiment provides techniques to optimize training and inference on a systolic array when using sparse data. One embodiment provides techniques to use decompression information when performing sparse compute operations. One embodiment enables the disaggregation of special function compute arrays via a shared reg file. One embodiment enables packed data compress and expand operations on a GPGPU. One embodiment provides techniques to exploit block sparsity within the cache hierarchy of a GPGPU.
    Type: Application
    Filed: October 4, 2024
    Publication date: March 27, 2025
    Applicant: Intel Corporation
    Inventors: Prasoonkumar Surti, Subramaniam Maiyuran, Valentin Andrei, Abhishek Appu, Varghese George, Altug Koker, Mike Macpherson, Elmoustapha Ould-Ahmed-Vall, Vasanth Ranganathan, Joydeep Ray, Lakshminarayanan Striramassarma, SungYe Kim
  • Publication number: 20250103511
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache memory that is coupled to the processing resources. The cache controller is configured to set an initial aging policy using an aging field based on age of cache lines within the cache memory and to determine whether a hint or an instruction to indicate a level of aging has been received.
    Type: Application
    Filed: October 3, 2024
    Publication date: March 27, 2025
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Elmoustapha Ould-Ahmed-Vall, Abhishek Appu, Aravindh Anantaraman, Valentin Andrei, Durgaprasad Bilagi, Varghese George, Brent Insko, Sanjeev Jahagirdar, Scott Janus, Pattabhiraman K, SungYe Kim, Subramaniam Maiyuran, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Xinmin Tian
  • Publication number: 20250103548
    Abstract: Systems and methods for improving cache efficiency and utilization are disclosed. In one embodiment, a graphics processor includes processing resources to perform graphics operations and a cache controller of a cache coupled to the processing resources. The cache controller is configured to control cache priority by determining whether default settings or an instruction will control cache operations for the cache.
    Type: Application
    Filed: November 14, 2024
    Publication date: March 27, 2025
    Applicant: Intel Corporation
    Inventors: Altug Koker, Joydeep Ray, Ben Ashbaugh, Jonathan Pearce, Abhishek Appu, Vasanth Ranganathan, Lakshminarayanan Striramassarma, Elmoustapha Ould-Ahmed-Vall, Aravindh Anantaraman, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Yoav Harel, Arthur Hunter, Jr., Brent Insko, Scott Janus, Pattabhiraman K, Mike Macpherson, Subramaniam Maiyuran, Marian Alin Petre, Murali Ramadoss, Shailesh Shah, Kamal Sinha, Prasoonkumar Surti, Vikranth Vemulapalli
  • Patent number: 12254526
    Abstract: Apparatuses including general-purpose graphics processing units having on chip dense memory for temporal buffering are disclosed. In one embodiment, a graphics multiprocessor includes a plurality of compute engines to perform first computations to generate a first set of data, cache for storing data, and a high density memory that is integrated on chip with the plurality of compute engines and the cache. The high density memory is to receive the first set of data, to temporarily store the first set of data, and to provide the first set of data to the cache during a first time period that is prior to a second time period when the plurality of compute engines will use the first set of data for second computations.
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: March 18, 2025
    Assignee: Intel Corporation
    Inventors: Varghese George, Altug Koker, Aravindh Anantaraman, Subramaniam Maiyuran, SungYe Kim, Valentin Andrei, Elmoustapha Ould-Ahmed-Vall, Joydeep Ray, Abhishek R. Appu, Nicolas C. Galoppo von Borries, Prasoonkumar Surti, Mike Macpherson
  • Patent number: 12242414
    Abstract: Methods and apparatus relating to data initialization techniques. In an example, an apparatus comprises a processor to read one or more metadata codes which map to one or more cache lines in a cache memory and invoke a random number generator to generate random numerical data for the one or more cache lines in response to a determination that the one or more metadata codes indicate that the cache lines are to contain random numerical data. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: March 14, 2020
    Date of Patent: March 4, 2025
    Assignee: Intel Corporation
    Inventors: Abhishek R. Appu, Aravindh Anantaraman, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Altug Koker, Mike Macpherson, Subramaniam Maiyuran, Joydeep Ray, Vasanth Ranganathan
  • Publication number: 20250068588
    Abstract: Methods and apparatus relating to scalar core integration in a graphics processor. In an example, an apparatus comprises a processor to receive a set of workload instructions for a graphics workload from a host complex, determine a first subset of operations in the set of operations that is suitable for execution by a scalar processor complex of the graphics processing device and a second subset of operations in the set of operations that is suitable for execution by a vector processor complex of the graphics processing device, assign the first subset of operations to the scalar processor complex for execution to generate a first set of outputs, and assign the second subset of operations to the vector processor complex for execution to generate a second set of outputs. Other embodiments are also disclosed and claimed.
    Type: Application
    Filed: September 3, 2024
    Publication date: February 27, 2025
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Aravindh Anantaraman, Abhishek R. Appu, Altug Koker, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Subramaniam Maiyuran, Nicolas Galoppo Von Borries, Varghese George, Mike Macpherson, Ben Ashbaugh, Murali Ramadoss, Vikranth Vemulapalli, William Sadler, Jonathan Pearce, Sungye Kim