Patents by Inventor John Wiegert

John Wiegert has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12639779
    Abstract: Embodiments described herein provide a technique to merge partial cache line writes to a cache memory. One embodiment provides a graphics processor comprising a graphics core, a cache coupled with the graphics core, and memory access circuitry to process memory access messages received from the graphics core. The memory access circuitry includes partial cache line write merge circuitry configured to merge a first partial write to a cache line of the cache with a second partial write to the cache line of the cache.
    Type: Grant
    Filed: September 14, 2022
    Date of Patent: May 26, 2026
    Assignee: Intel Corporation
    Inventors: Joydeep Ray, Abhishek R. Appu, Prathamesh Raghunath Shinde, John Wiegert
  • Publication number: 20260064805
    Abstract: An apparatus to facilitate hardware support for n-dimensional matrix load and store instructions is disclosed. The apparatus includes a graphics processor comprising general-purpose graphics execution resources, the general-purpose graphics execution resources including a matrix accelerator, the matrix accelerator configured to perform a matrix operation on a plurality of tensors stored in a memory; and circuitry configured to facilitate access to the memory by the general-purpose graphics execution resources, wherein the circuitry is configured to: receive a request to access a tensor of the plurality of tensors; and generate an n-dimensional block access message along a dimension of n>2 of the tensor, the n-dimensional block access message to enable access to the tensor by the matrix accelerator, wherein the n-dimensional block access message comprises an application programming interface (API) descriptor defining a tensor width, a tensor pitch, a tensor block offset, and a tensor block size of the tensor.
    Type: Application
    Filed: October 1, 2022
    Publication date: March 5, 2026
    Applicant: Intel Corporation
    Inventors: Fangwen Fu, Biju George, Sabareesh Ganapathy, Wei Xiong, Chengxi Wu, John Wiegert, Joydeep Ray
  • Publication number: 20250348321
    Abstract: Embodiments described herein provide a technique to decompose 64-bit per-lane virtual addresses to access a plurality of data elements on behalf of a multi-lane parallel processing execution resource of a graphics or compute accelerator. The 64-bit per-lane addresses are decomposed into a base address and a plurality of per-lane offsets for transmission to memory access circuitry. The memory access circuitry then combines the base address and the per-lane offsets to reconstruct the per-lane addresses.
    Type: Application
    Filed: May 9, 2025
    Publication date: November 13, 2025
    Applicant: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Timothy Bauer, James Valerio
  • Publication number: 20250307155
    Abstract: Apparatus and method for extended cache control operations for scratch space usage. For example, one embodiment of an apparatus comprises: a cache subsystem comprising at least a first level (L1) cache; a graphics processor core block to execute a workload using a temporary scratch memory space containing cacheable data, resulting in partially dirty cache lines in the cache subsystem containing data which is no longer needed, the graphics processor core block to execute a cache control instruction including fields to identify one or more of the partially dirty cache lines associated with the workload, the cache control instruction executed to reduce one or more instances of unnecessary memory read and memory write operations.
    Type: Application
    Filed: March 26, 2024
    Publication date: October 2, 2025
    Inventors: Karol A. Szerszen, Pawel Majewski, Radoslaw Drabinski, Joshua Barczak, Pazhani Pillai, Ruijin Wu, John Wiegert, Sven Woop
  • Patent number: 12333310
    Abstract: Embodiments described herein provide a technique to decompose 64-bit per-lane virtual addresses to access a plurality of data elements on behalf of a multi-lane parallel processing execution resource of a graphics or compute accelerator. The 64-bit per-lane addresses are decomposed into a base address and a plurality of per-lane offsets for transmission to memory access circuitry. The memory access circuitry then combines the base address and the per-lane offsets to reconstruct the per-lane addresses.
    Type: Grant
    Filed: March 28, 2024
    Date of Patent: June 17, 2025
    Assignee: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Timothy Bauer, James Valerio
  • Publication number: 20250147762
    Abstract: Described herein is a graphics processor having processing resources with configurable thread and register configurations. Program code can configure a number of registers and accumulators that will be used by hardware threads during execution of the program code by the graphics processor. Processing resources within the graphics processor can be configured to assign different numbers of registers and accumulators to hardware threads based on the configuration requested by program code to be executed by the processing resource.
    Type: Application
    Filed: November 8, 2023
    Publication date: May 8, 2025
    Applicant: Intel Corporation
    Inventors: Vasanth Ranganathan, Gang Chen, Supratim Pal, Jorge Eduardo Parra Osorio, Arthur Hunter, Boris Kuznetsov, Deepak N K, Siva Kumar Seemakurthi, James Valerio, Shubham Dinesh Chavan, Abhishek Kumar Singh, Samir Pandya, Sandeep Tippannanavar Niranjan, Alan Curtis, Jain Philip, Maltesh Kulkarni, Fangwen Fu, John Wiegert, Brent Schwartz
  • Publication number: 20250130848
    Abstract: An apparatus to facilitate barrier state save and restore for preemption in a graphics environment is disclosed. The apparatus includes processing resources to execute a plurality of execution threads that are comprised in a thread group (TG) and mid-thread preemption barrier save and restore hardware circuitry to: initiate an exception handling routine in response to a mid-thread preemption event, the exception handling routine to cause a barrier signaling event to be issued; receive indication of a valid designated thread status for a thread of a thread group (TG) in response to the barrier signaling event; and in response to receiving the indication of the valid designated thread status for the thread of the TG, cause, by the thread of the TG having the valid designated thread status, a barrier save routine and a barrier restore routine to be initiated for named barriers of the TG.
    Type: Application
    Filed: November 1, 2024
    Publication date: April 24, 2025
    Applicant: Intel Corporation
    Inventors: Vasanth Ranganathan, James Valerio, Joydeep Ray, Abhishek R. Appu, Alan Curtis, Prathamesh Raghunath Shinde, Brandon Fliflet, Ben J. Ashbaugh, John Wiegert
  • Publication number: 20250118003
    Abstract: An apparatus to facilitate exception handling for debugging in a graphics environment is disclosed. The apparatus includes load store pipeline hardware circuitry to: in response to a page fault exception being enabled for a memory access request received from a thread of the plurality of threads, allocate a memory dependency token correlated to a scoreboard identifier (SBID) that is included with the memory access request; send, to memory fabric of the graphics processor, the memory access request comprising the memory dependency token; receive, from the memory fabric in response to the memory access request, a memory access response comprising the memory dependency token and indicating occurrence of a page fault error condition and fault details associated with the page fault error condition; and return the SBID associated with the memory access response and fault details of the page fault error condition to a debug register of the thread.
    Type: Application
    Filed: October 18, 2024
    Publication date: April 10, 2025
    Applicant: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Fabian Schnell, Kelvin Thomas Gardiner
  • Patent number: 12164952
    Abstract: An apparatus to facilitate barrier state save and restore for preemption in a graphics environment is disclosed. The apparatus includes processing resources to execute a plurality of execution threads that are comprised in a thread group (TG) and mid-thread preemption barrier save and restore hardware circuitry to: initiate an exception handling routine in response to a mid-thread preemption event, the exception handling routine to cause a barrier signaling event to be issued; receive indication of a valid designated thread status for a thread of a thread group (TG) in response to the barrier signaling event; and in response to receiving the indication of the valid designated thread status for the thread of the TG, cause, by the thread of the TG having the valid designated thread status, a barrier save routine and a barrier restore routine to be initiated for named barriers of the TG.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: December 10, 2024
    Assignee: Intel Corporation
    Inventors: Vasanth Ranganathan, James Valerio, Joydeep Ray, Abhishek R. Appu, Alan Curtis, Prathamesh Raghunath Shinde, Brandon Fliflet, Ben J. Ashbaugh, John Wiegert
  • Patent number: 12154207
    Abstract: An apparatus to facilitate exception handling for debugging in a graphics environment is disclosed. The apparatus includes load store pipeline hardware circuitry to: in response to a page fault exception being enabled for a memory access request received from a thread of the plurality of threads, allocate a memory dependency token correlated to a scoreboard identifier (SBID) that is included with the memory access request; send, to memory fabric of the graphics processor, the memory access request comprising the memory dependency token; receive, from the memory fabric in response to the memory access request, a memory access response comprising the memory dependency token and indicating occurrence of a page fault error condition and fault details associated with the page fault error condition; and return the SBID associated with the memory access response and fault details of the page fault error condition to a debug register of the thread.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: November 26, 2024
    Assignee: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Fabian Schnell, Kelvin Thomas Gardiner
  • Publication number: 20240330001
    Abstract: Embodiments described herein provide a technique to decompose 64-bit per-lane virtual addresses to access a plurality of data elements on behalf of a multi-lane parallel processing execution resource of a graphics or compute accelerator. The 64-bit per-lane addresses are decomposed into a base address and a plurality of per-lane offsets for transmission to memory access circuitry. The memory access circuitry then combines the base address and the per-lane offsets to reconstruct the per-lane addresses.
    Type: Application
    Filed: March 28, 2024
    Publication date: October 3, 2024
    Applicant: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Timothy Bauer, James Valerio
  • Patent number: 12014183
    Abstract: Embodiments described herein provide a technique to decompose 64-bit per-lane virtual addresses to access a plurality of data elements on behalf of a multi-lane parallel processing execution resource of a graphics or compute accelerator. The 64-bit per-lane addresses are decomposed into a base address and a plurality of per-lane offsets for transmission to memory access circuitry. The memory access circuitry then combines the base address and the per-lane offsets to reconstruct the per-lane addresses.
    Type: Grant
    Filed: September 21, 2022
    Date of Patent: June 18, 2024
    Assignee: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Timothy Bauer, James Valerio
  • Publication number: 20240160478
    Abstract: An apparatus to facilitate increasing processing resources in processing cores of a graphics environment is disclosed. The apparatus includes a plurality of processing resources to execute one or more execution threads; a plurality of message arbiter-processing resource (MA-PR) routers, wherein a respective MA-PR router of the plurality of MA-PR routers corresponds to a pair of processing resources of the plurality of processing resources and is to arbitrate routing of a thread control message from a message arbiter between the pair of processing resources; a plurality of local shared cache (LSC) sequencers to provide an interface between at least one LSC of the processing core and the plurality of processing resources; and a plurality of instruction caches (ICs) to store instructions of the one or more execution threads, wherein a respective IC of the plurality of ICs interfaces with a portion of the plurality of processing resources.
    Type: Application
    Filed: November 15, 2022
    Publication date: May 16, 2024
    Applicant: Intel Corporation
    Inventors: Jiasheng Chen, Chunhui Mei, Ben J. Ashbaugh, Naveen Matam, Joydeep Ray, Timothy Bauer, Guei-Yuan Lueh, Vasanth Ranganathan, Prashant Chaudhari, Vikranth Vemulapalli, Nishanth Reddy Pendluru, Piotr Reiter, Jain Philip, Marek Rudniewski, Christopher Spencer, Parth Damani, Prathamesh Raghunath Shinde, John Wiegert, Fataneh Ghodrat
  • Publication number: 20240095038
    Abstract: Embodiments described herein provide a technique to decompose 64-bit per-lane virtual addresses to access a plurality of data elements on behalf of a multi-lane parallel processing execution resource of a graphics or compute accelerator. The 64-bit per-lane addresses are decomposed into a base address and a plurality of per-lane offsets for transmission to memory access circuitry. The memory access circuitry then combines the base address and the per-lane offsets to reconstruct the per-lane addresses.
    Type: Application
    Filed: September 21, 2022
    Publication date: March 21, 2024
    Applicant: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Timothy Bauer, James Valerio
  • Publication number: 20240086064
    Abstract: Embodiments described herein enable the offload of address calculations required to access a data element within an array of data elements from primary compute resources of a graphics processor to the memory access circuitry of the graphics processor. The memory access circuitry is configured to receive a message to access a data element of an array of data elements in the memory, the message to include an index of the data element in the array of data elements, calculate a byte address for the data element based in part on the index of the data element in the array of data elements, and submit a memory access request to the memory to access the data element at the byte address.
    Type: Application
    Filed: September 14, 2022
    Publication date: March 14, 2024
    Applicant: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Timothy Bauer, James Valerio
  • Publication number: 20240087077
    Abstract: Embodiments described herein provide a technique to merge partial cache line writes to a cache memory. One embodiment provides a graphics processor comprising a graphics core, a cache coupled with the graphics core, and memory access circuitry to process memory access messages received from the graphics core. The memory access circuitry includes partial cache line write merge circuitry configured to merge a first partial write to a cache line of the cache with a second partial write to the cache line of the cache.
    Type: Application
    Filed: September 14, 2022
    Publication date: March 14, 2024
    Applicant: Intel Corporation
    Inventors: Joydeep Ray, Abhishek R. Appu, Prathamesh Raghunath Shinde, John Wiegert
  • Publication number: 20220414968
    Abstract: An apparatus to facilitate exception handling for debugging in a graphics environment is disclosed. The apparatus includes load store pipeline hardware circuitry to: in response to a page fault exception being enabled for a memory access request received from a thread of the plurality of threads, allocate a memory dependency token correlated to a scoreboard identifier (SBID) that is included with the memory access request; send, to memory fabric of the graphics processor, the memory access request comprising the memory dependency token; receive, from the memory fabric in response to the memory access request, a memory access response comprising the memory dependency token and indicating occurrence of a page fault error condition and fault details associated with the page fault error condition; and return the SBID associated with the memory access response and fault details of the page fault error condition to a debug register of the thread.
    Type: Application
    Filed: June 25, 2021
    Publication date: December 29, 2022
    Applicant: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Fabian Schnell, Kelvin Thomas Gardiner
  • Publication number: 20220413994
    Abstract: An apparatus to facilitate watchpoints for debugging in a graphics environment is disclosed. The apparatus includes processing resources to perform graphics operations using a plurality of threads; and load store pipeline hardware circuitry coupled to the processing resources to: configure a watchpoint register with a value of a watchpoint address, the watchpoint address comprising an address of a memory location in the processor; receive a memory access request from a thread of the plurality of threads; determine, using the watchpoint register, whether the memory access request is requesting access to the watchpoint address; and responsive to the memory access request requesting access to the watchpoint address, return an exception payload to the thread, the exception payload comprising watchpoint details corresponding to the watchpoint address and a scoreboard identifier (SBID) associated with the memory access request.
    Type: Application
    Filed: June 25, 2021
    Publication date: December 29, 2022
    Applicant: Intel Corporation
    Inventors: John Wiegert, Joydeep Ray, Fabian Schnell, Kelvin Thomas Gardiner
  • Publication number: 20220413899
    Abstract: An apparatus to facilitate barrier state save and restore for preemption in a graphics environment is disclosed. The apparatus includes processing resources to execute a plurality of execution threads that are comprised in a thread group (TG) and mid-thread preemption barrier save and restore hardware circuitry to: initiate an exception handling routine in response to a mid-thread preemption event, the exception handling routine to cause a barrier signaling event to be issued; receive indication of a valid designated thread status for a thread of a thread group (TG) in response to the barrier signaling event; and in response to receiving the indication of the valid designated thread status for the thread of the TG, cause, by the thread of the TG having the valid designated thread status, a barrier save routine and a barrier restore routine to be initiated for named barriers of the TG.
    Type: Application
    Filed: June 25, 2021
    Publication date: December 29, 2022
    Applicant: Intel Corporation
    Inventors: Vasanth Ranganathan, James Valerio, Joydeep Ray, Abhishek R. Appu, Alan Curtis, Prathamesh Raghunath Shinde, Brandon Fliflet, Ben J. Ashbaugh, John Wiegert
  • Publication number: 20070011358
    Abstract: Mechanisms to implement memory management to enable protocol-aware asynchronous, zero-copy transmits. A transport protocol engine exposes interfaces via which memory buffers from a memory pool in operating system (OS) kernel space may be allocated to applications running in an OS user layer. The memory buffers may be used to store data that is to be transferred to a network destination using a zero-copy transmit mechanism, wherein the data is directly transmitted from the memory buffers to the network via a network interface controller. The transport protocol engine also exposes a buffer reuse API to the user layer to enable applications to obtain buffer availability information maintained by the protocol engine. In view of the buffer availability information, the application may adjust its data transfer rate.
    Type: Application
    Filed: June 30, 2005
    Publication date: January 11, 2007
    Inventors: John Wiegert, Annie Foong
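
Several abstracts in this listing (for example, Patent number 12014183) describe decomposing 64-bit per-lane virtual addresses into a base address and per-lane offsets, which memory access circuitry later recombines. The following is a minimal software sketch of that idea only, not the patented hardware design. The function names `decompose` and `reconstruct`, the choice of the lowest lane address as the base, and the full-width offsets are all assumptions made for illustration; the abstracts do not specify how the base is selected or how wide the offsets are.

```python
from typing import List, Tuple

MASK64 = (1 << 64) - 1  # keep all arithmetic within 64 bits


def decompose(lane_addresses: List[int]) -> Tuple[int, List[int]]:
    """Split per-lane 64-bit addresses into one base plus per-lane offsets.

    Assumption (not from the abstracts): the base is the lowest lane
    address, so every offset is a non-negative integer.
    """
    if not lane_addresses:
        raise ValueError("at least one lane address is required")
    base = min(lane_addresses) & MASK64
    offsets = [(addr - base) & MASK64 for addr in lane_addresses]
    return base, offsets


def reconstruct(base: int, offsets: List[int]) -> List[int]:
    """Recombine the base and per-lane offsets into full 64-bit addresses."""
    return [(base + off) & MASK64 for off in offsets]


# Hypothetical lane addresses for a four-lane access.
lanes = [0x7FFF_0000_1000, 0x7FFF_0000_1040, 0x7FFF_0000_1008, 0x7FFF_0000_2000]
base, offsets = decompose(lanes)
assert reconstruct(base, offsets) == lanes  # round trip recovers every lane
```

In this sketch the four offsets are 0x0, 0x40, 0x8, and 0x1000, each far smaller than the 64-bit addresses they stand in for, because the hypothetical lanes touch nearby memory.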