Patents by Inventor John Shalf

John Shalf has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240411834
    Abstract: A system and method of performing sparse accumulation in column-wise sparse general matrix-matrix multiplication (SpGEMM) algorithms. The method includes receiving a request to perform SpGEMM based on a first matrix and a second matrix. The method includes accumulating, in a hardware buffer, a hash key and an intermediate multiplication result of the first matrix and the second matrix. The method includes performing a probe search of a hardware cache to identify a match between the hash key and a partial sum associated with the first matrix and the second matrix. The method includes generating, by a hardware adder, a multiplication result based on the partial sum and the intermediate multiplication result from the accumulation waiting buffer.
    Type: Application
    Filed: June 7, 2024
    Publication date: December 12, 2024
    Inventors: Chao Zhang, Xiaochen Guo, Maximilian Bremer, Cy Chan, John Shalf
  • Patent number: 12075201
    Abstract: Disclosed herein are methods, systems, and devices for bandwidth steering. Systems may include a plurality of compute nodes configured to execute one or more applications, a plurality of first level resources communicatively coupled to the plurality of compute nodes, a plurality of second level resources communicatively coupled to the plurality of first level resources, and a plurality of third level resources communicatively coupled to the plurality of second level resources. Systems may also include a plurality of optical switch circuits communicatively coupled to the plurality of first level resources and the plurality of second level resources, wherein each of the plurality of optical switch circuits is coupled to more than one of the plurality of the first level resources and is also coupled to more than one of the plurality of the second level resources.
    Type: Grant
    Filed: November 13, 2020
    Date of Patent: August 27, 2024
    Assignee: The Regents of the University of California
    Inventors: Georgios Michelogiannakis, Yiwen Shen, Min Yee Teh, John Shalf, Madeleine Glick, Keren Bergman
  • Publication number: 20240204783
    Abstract: A primitive race-logic temporal operator is described, comprising superconducting logic single flux quantum (SFQ) cells.
    Type: Application
    Filed: March 3, 2021
    Publication date: June 20, 2024
    Inventors: Georgios Tzimpragos, Dilip Vasudevan, Nestan Tsiskaridze, Georgios Michelogiannakis, Advait Madhavan, Jennifer Volk, John Shalf, Timothy Sherwood
  • Patent number: 11599470
    Abstract: A last-level collective hardware prefetcher (LLCHP) is described. The LLCHP is to detect a first off-chip memory access request by a first processor core of a plurality of processor cores. The LLCHP is further to determine, based on the first off-chip memory access request, that first data associated with the first off-chip memory access request is associated with second data of a second processor core of the plurality of processor cores. The LLCHP is further to prefetch the first data and the second data based on the determination.
    Type: Grant
    Filed: November 6, 2019
    Date of Patent: March 7, 2023
    Assignee: The Regents of the University of California
    Inventors: Georgios Michelogiannakis, John Shalf
  • Publication number: 20220394362
    Abstract: Disclosed herein are methods, systems, and devices for bandwidth steering. Systems may include a plurality of compute nodes configured to execute one or more applications, a plurality of first level resources communicatively coupled to the plurality of compute nodes, a plurality of second level resources communicatively coupled to the plurality of first level resources, and a plurality of third level resources communicatively coupled to the plurality of second level resources. Systems may also include a plurality of optical switch circuits communicatively coupled to the plurality of first level resources and the plurality of second level resources, wherein each of the plurality of optical switch circuits is coupled to more than one of the plurality of the first level resources and is also coupled to more than one of the plurality of the second level resources.
    Type: Application
    Filed: November 13, 2020
    Publication date: December 8, 2022
    Inventors: Georgios Michelogiannakis, Yiwen Shen, Min Yee Teh, John Shalf, Madeleine Glick, Keren Bergman
  • Publication number: 20220012178
    Abstract: A last-level collective hardware prefetcher (LLCHP) is described. The LLCHP is to detect a first off-chip memory access request by a first processor core of a plurality of processor cores. The LLCHP is further to determine, based on the first off-chip memory access request, that first data associated with the first off-chip memory access request is associated with second data of a second processor core of the plurality of processor cores. The LLCHP is further to prefetch the first data and the second data based on the determination.
    Type: Application
    Filed: November 6, 2019
    Publication date: January 13, 2022
    Inventors: Georgios Michelogiannakis, John Shalf
  • Patent number: 10318444
    Abstract: This disclosure provides systems, methods, and apparatus for collective memory transfers. A control unit may be configured to coordinate a transfer of data between a memory and processor cores. For a read data transfer operation, the control unit may receive a trigger packet identifying a read data transfer operation and identifying a first plurality of data lines based on data values included in the trigger packet. The control unit may read the first plurality of data lines from the memory sequentially and send a second plurality of data lines to the processor cores. For a write data transfer operation, the control unit may send a request for at least one data line to a plurality of processor cores, may receive and reorder the requested data lines, and may write the data lines to a memory. The control unit may determine a mapping between processor cores and the memory.
    Type: Grant
    Filed: April 10, 2014
    Date of Patent: June 11, 2019
    Assignee: The Regents of the University of California
    Inventors: Georgios Michelogiannakis, John Shalf
  • Patent number: 10102179
    Abstract: A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores.
    Type: Grant
    Filed: August 22, 2016
    Date of Patent: October 16, 2018
    Assignee: The Regents of the University of California
    Inventors: John Shalf, David Donofrio, Leonid Oliker
  • Patent number: 10078593
    Abstract: A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores, wherein at least one of a number of the processor cores, a size of each of the plurality of caches, or a size of each of the plurality of memories is configured for performing a reverse-time-migration (RTM) computation.
    Type: Grant
    Filed: October 26, 2012
    Date of Patent: September 18, 2018
    Assignee: The Regents of the University of California
    Inventors: John Shalf, David Donofrio, Leonid Oliker, Jens Kruger, Samuel Williams
  • Publication number: 20160371226
    Abstract: A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores.
    Type: Application
    Filed: August 22, 2016
    Publication date: December 22, 2016
    Inventors: John Shalf, David Donofrio, Leonid Oliker
  • Patent number: 9448940
    Abstract: A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores.
    Type: Grant
    Filed: October 26, 2012
    Date of Patent: September 20, 2016
    Assignee: The Regents of the University of California
    Inventors: John Shalf, David Donofrio, Leonid Oliker
  • Publication number: 20140310495
    Abstract: This disclosure provides systems, methods, and apparatus for collective memory transfers. A control unit may be configured to coordinate a transfer of data between a memory and processor cores. For a read data transfer operation, the control unit may receive a trigger packet identifying a read data transfer operation and identifying a first plurality of data lines based on data values included in the trigger packet. The control unit may read the first plurality of data lines from the memory sequentially and send a second plurality of data lines to the processor cores. For a write data transfer operation, the control unit may send a request for at least one data line to a plurality of processor cores, may receive and reorder the requested data lines, and may write the data lines to a memory. The control unit may determine a mapping between processor cores and the memory.
    Type: Application
    Filed: April 10, 2014
    Publication date: October 16, 2014
    Applicant: The Regents of the University of California
    Inventors: Georgios Michelogiannakis, John Shalf
  • Publication number: 20140310467
    Abstract: A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores, wherein at least one of a number of the processor cores, a size of each of the plurality of caches, or a size of each of the plurality of memories is configured for performing a reverse-time-migration (RTM) computation.
    Type: Application
    Filed: October 26, 2012
    Publication date: October 16, 2014
    Inventors: John Shalf, David Donofrio, Leonid Oliker, Jens Kruger, Samuel Williams
  • Publication number: 20140281243
    Abstract: A multi-core computer processor including a plurality of processor cores interconnected in a Network-on-Chip (NoC) architecture, a plurality of caches, each of the plurality of caches being associated with one and only one of the plurality of processor cores, and a plurality of memories, each of the plurality of memories being associated with a different set of at least one of the plurality of processor cores and each of the plurality of memories being configured to be visible in a global memory address space such that the plurality of memories are visible to two or more of the plurality of processor cores.
    Type: Application
    Filed: October 26, 2012
    Publication date: September 18, 2014
    Inventors: John Shalf, David Donofrio, Leonid Oliker
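
The abstract for publication 20240411834 describes sparse accumulation in column-wise SpGEMM: each intermediate product is keyed by a hash key and added to a matching partial sum when one exists. The following is a minimal software sketch of that general idea only, not the claimed hardware design. It assumes a Python dict stands in for the hardware cache and buffer, and the function name `spgemm_columnwise` and the `{column: [(row, value), ...]}` matrix layout are hypothetical choices made for illustration.

```python
def spgemm_columnwise(a_cols, b_cols):
    """Multiply sparse matrices A and B, each stored column-wise as
    {column_index: [(row_index, value), ...]}, and return C = A * B
    in the same layout. Illustrative sketch, not the patented method."""
    c_cols = {}
    for j, b_col in b_cols.items():
        # Hash accumulator for output column j: row index -> partial sum.
        acc = {}
        for k, b_kj in b_col:
            # Column k of A is scaled by B[k, j] and merged into column j of C.
            for i, a_ik in a_cols.get(k, []):
                product = a_ik * b_kj  # intermediate multiplication result
                if i in acc:
                    # Probe hit: add the product to the existing partial sum.
                    acc[i] += product
                else:
                    # Probe miss: start a new partial sum for this row.
                    acc[i] = product
        if acc:
            c_cols[j] = sorted(acc.items())
    return c_cols


# A = [[1, 0], [2, 3]] and B = [[4, 0], [5, 6]] in column-wise form.
A = {0: [(0, 1.0), (1, 2.0)], 1: [(1, 3.0)]}
B = {0: [(0, 4.0), (1, 5.0)], 1: [(1, 6.0)]}
C = spgemm_columnwise(A, B)
```

Here row 1 of output column 0 receives two products (2 x 4 and 3 x 5), so the second one hits an existing key and is summed rather than stored separately. That merge step is what the abstract assigns to the probe search and the hardware adder.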