Patents by Inventor David Alexander Majnemer

David Alexander Majnemer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230418797
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a kNN computation using a hardware accelerator. One of the methods includes obtaining a set of one or more query vectors; obtaining a set of database vectors; and performing, on a hardware accelerator and for each query vector in the set, a search for the k most similar database vectors to the query vector, comprising: computing, by circuitry of the hardware accelerator and for each query vector, a respective similarity value between the query vector and each database vector; and for each query vector, identifying, by the hardware accelerator and for each bin, (i) an index of the most similar database vector within the bin and (ii) the respective similarity value for the most similar database vector within the bin.
    Type: Application
    Filed: June 26, 2023
    Publication date: December 28, 2023
    Inventors: Felix Ren-Chyan Chern, Blake Alan Hechtman, Andrew Thomas Davis, Ruiqi Guo, Sanjiv Kumar, David Alexander Majnemer
  • Patent number: 11763142
    Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that when executed by the hardware circuit cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
    Type: Grant
    Filed: September 2, 2022
    Date of Patent: September 19, 2023
    Assignee: Google LLC
    Inventors: David Alexander Majnemer, Blake Alan Hechtman, Bjarke Hammersholt Roune
  • Publication number: 20220414441
    Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that when executed by the hardware circuit cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
    Type: Application
    Filed: September 2, 2022
    Publication date: December 29, 2022
    Inventors: David Alexander Majnemer, Blake Alan Hechtman, Bjarke Hammersholt Roune
  • Patent number: 11500959
    Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors and similarly structured data that are generated in parallel, for example, on nodes organized in a mesh or torus topology defined by connections in at least two dimensions between the nodes. The methods provide parallel computation and communication between nodes in the topology.
    Type: Grant
    Filed: August 16, 2019
    Date of Patent: November 15, 2022
    Assignee: Google LLC
    Inventors: David Alexander Majnemer, Blake Alan Hechtman
  • Patent number: 11449739
    Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that when executed by the hardware circuit cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
    Type: Grant
    Filed: August 22, 2019
    Date of Patent: September 20, 2022
    Assignee: Google LLC
    Inventors: David Alexander Majnemer, Blake Alan Hechtman, Bjarke Hammersholt Roune
  • Publication number: 20220245453
    Abstract: Methods, systems, and apparatus, including an apparatus for redistributing tensor elements among computing units are described. In one aspect, a method includes distributing tensor elements of an N-dimensional tensor among multiple computing units of a computation system. Each computing unit redistributes the subset of tensor elements previously distributed to the computing unit to computing units. Each computing unit accesses redistribution partitioning data that specifies, for each computing unit, the tensor elements that are to be stored by the computing unit after redistributing the tensor elements. For each tensor element previously distributed to the particular computing unit, the computing unit determines a global linearized index value for the tensor element based on a multi-dimensional index for the tensor element.
    Type: Application
    Filed: October 7, 2020
    Publication date: August 4, 2022
    Inventors: David Alexander Majnemer, Ravi Narayanaswami, Dong Hyuk Woo, Carrell Daniel Killebrew
  • Patent number: 11221879
    Abstract: Methods, systems, and apparatus for scheduling first-in-first-out instructions are described. In one aspect, a method includes receiving data representing code of a program to be executed by a processing unit comprising hardware processors. For each of one or more of the hardware processors, an order of independent groups of first-in-first-out (FIFO) instructions for execution by the hardware processor is identified in the data representing the code of the program. For each independent group of FIFO instructions for execution by the hardware processor, a path length metric that represents how long it will take to reach an end of the program from the independent group of FIFO instructions is determined. A new order of the independent groups of FIFO instructions for execution by the hardware processor is generated based at least on the path length metric for each independent group of FIFO instructions for execution by the hardware processor.
    Type: Grant
    Filed: July 2, 2020
    Date of Patent: January 11, 2022
    Assignee: Google LLC
    Inventors: Yuanzhong Xu, James M. Stichnoth, David Alexander Majnemer
  • Publication number: 20210056396
    Abstract: Methods and systems, including computer programs encoded on a computer storage medium. In one aspect, a method includes the actions of receiving a request to perform convolutional computations for a neural network on a hardware circuit having a matrix computation unit, the request specifying the convolutional computation to be performed on a feature tensor and a filter and padding applied to the feature tensor prior to performing the convolutional computation; and generating instructions that when executed by the hardware circuit cause the hardware circuit to perform operations comprising: transferring feature tensor data from a main memory of the hardware circuit to a scratchpad memory of the hardware circuit; and repeatedly performing the following operations: identifying a current subset of the feature tensor; and determining whether a memory view into the scratchpad memory for the current subset is consistent with a memory view of the current subset in the main memory.
    Type: Application
    Filed: August 22, 2019
    Publication date: February 25, 2021
    Inventors: David Alexander Majnemer, Blake Alan Hechtman, Bjarke Hammersholt Roune
  • Publication number: 20210049231
    Abstract: Methods, systems, and apparatus, including instructions encoded on storage media, for performing reduction of gradient vectors and similarly structured data that are generated in parallel, for example, on nodes organized in a mesh or torus topology defined by connections in at least two dimensions between the nodes. The methods provide parallel computation and communication between nodes in the topology.
    Type: Application
    Filed: August 16, 2019
    Publication date: February 18, 2021
    Inventors: David Alexander Majnemer, Blake Alan Hechtman
  • Publication number: 20200341807
    Abstract: Methods, systems, and apparatus for scheduling first-in-first-out instructions are described. In one aspect, a method includes receiving data representing code of a program to be executed by a processing unit comprising hardware processors. For each of one or more of the hardware processors, an order of independent groups of first-in-first-out (FIFO) instructions for execution by the hardware processor is identified in the data representing the code of the program. For each independent group of FIFO instructions for execution by the hardware processor, a path length metric that represents how long it will take to reach an end of the program from the independent group of FIFO instructions is determined. A new order of the independent groups of FIFO instructions for execution by the hardware processor is generated based at least on the path length metric for each independent group of FIFO instructions for execution by the hardware processor.
    Type: Application
    Filed: July 2, 2020
    Publication date: October 29, 2020
    Inventors: Yuanzhong Xu, James M. Stichnoth, David Alexander Majnemer
  • Patent number: 10733016
    Abstract: Methods, systems, and apparatus for scheduling first-in-first-out instructions are described. In one aspect, a method includes receiving data representing code of a program to be executed by a processing unit comprising hardware processors. For each of one or more of the hardware processors, an order of independent groups of first-in-first-out (FIFO) instructions for execution by the hardware processor is identified in the data representing the code of the program. For each independent group of FIFO instructions for execution by the hardware processor, a path length metric that represents how long it will take to reach an end of the program from the independent group of FIFO instructions is determined. A new order of the independent groups of FIFO instructions for execution by the hardware processor is generated based at least on the path length metric for each independent group of FIFO instructions for execution by the hardware processor.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: August 4, 2020
    Assignee: Google LLC
    Inventors: Yuanzhong Xu, James M. Stichnoth, David Alexander Majnemer
  • Patent number: 9575976
    Abstract: Methods and apparatuses that maintain birth time for a file system to optimize file update operations are described. The file system can include a plurality of snapshots or clones of data stored in one or more extents of blocks allocated in a storage device. Each extent may be associated with a time stamp according to the birth time. A request may be received from an executable using the file system to update data in a particular extent associated with a particular time stamp. In response, the current birth time in the file system and the particular time stamp may be compared to determine if the particular extent is not shared by more than one of the snapshots. If the particular time stamp is equal to the current birth time, the particular extent may be updated directly without performing an expensive operation to check whether a reference count of the particular extent is equal to one.
    Type: Grant
    Filed: September 17, 2014
    Date of Patent: February 21, 2017
    Assignee: Apple Inc.
    Inventors: Wenguang Wang, Deric Horn, David Alexander Majnemer, Owen Strain
  • Patent number: 8972690
    Abstract: Methods and apparatuses that maintain an access history of a file allocated with allocation blocks in storage devices are described. In response to receiving a usage request to allocate additional space for the file, an allocation block size may be adjusted or adapted based on the access history. The storage devices may be allocated with one or more allocation blocks using the adapted allocation block size to provide requested space for the file.
    Type: Grant
    Filed: January 5, 2010
    Date of Patent: March 3, 2015
    Inventors: Deric Horn, Donald James Brady, David Alexander Majnemer, Eric Brandon Tamura
  • Publication number: 20150006495
    Abstract: Methods and apparatuses that maintain birth time for a file system to optimize file update operations are described. The file system can include a plurality of snapshots or clones of data stored in one or more extents of blocks allocated in a storage device. Each extent may be associated with a time stamp according to the birth time. A request may be received from an executable using the file system to update data in a particular extent associated with a particular time stamp. In response, the current birth time in the file system and the particular time stamp may be compared to determine if the particular extent is not shared by more than one of the snapshots. If the particular time stamp is equal to the current birth time, the particular extent may be updated directly without performing an expensive operation to check whether a reference count of the particular extent is equal to one.
    Type: Application
    Filed: September 17, 2014
    Publication date: January 1, 2015
    Inventors: Wenguang Wang, Deric Horn, David Alexander Majnemer, Owen Strain
  • Patent number: 8849876
    Abstract: Methods and apparatuses that maintain birth time for a file system to optimize file update operations are described. The file system can include a plurality of snapshots or clones of data stored in one or more extents of blocks allocated in a storage device. Each extent may be associated with a time stamp according to the birth time. A request may be received from an executable using the file system to update data in a particular extent associated with a particular time stamp. In response, the current birth time in the file system and the particular time stamp may be compared to determine if the particular extent is not shared by more than one of the snapshots. If the particular time stamp is equal to the current birth time, the particular extent may be updated directly without performing an expensive operation to check whether a reference count of the particular extent is equal to one.
    Type: Grant
    Filed: December 28, 2009
    Date of Patent: September 30, 2014
    Inventors: Wenguang Wang, Deric Horn, David Alexander Majnemer, Owen Strain
  • Patent number: 8504792
    Abstract: Methods and apparatuses that search tree representations of a bitmap for available blocks to allocate in storage devices are described. An allocation request for a file may be received to initiate the search. In one embodiment, the bitmap may include an array of bits corresponding to blocks in the storage devices. Each bit may indicate whether one of the blocks is available. The tree representations may include at least one red-black tree having nodes corresponding to one or more consecutive bits in the bitmap indicating an extent of available blocks. One of the tree representations may be selected according to a file associated with an allocation request to identify an extent of available blocks matching the allocation request. The tree representations may be synchronized as the bitmap is updated with changes of block allocations in the storage devices.
    Type: Grant
    Filed: December 22, 2009
    Date of Patent: August 6, 2013
    Assignee: Apple Inc.
    Inventors: Eric Brandon Tamura, David Alexander Majnemer
  • Publication number: 20110167239
    Abstract: Methods and apparatuses that maintain an access history of a file allocated with allocation blocks in storage devices are described. In response to receiving a usage request to allocate additional space for the file, an allocation block size may be adjusted or adapted based on the access history. The storage devices may be allocated with one or more allocation blocks using the adapted allocation block size to provide requested space for the file.
    Type: Application
    Filed: January 5, 2010
    Publication date: July 7, 2011
    Inventors: Deric Horn, Donald James Brady, David Alexander Majnemer, Eric Brandon Tamura
  • Publication number: 20110161381
    Abstract: Methods and apparatuses that maintain birth time for a file system to optimize file update operations are described. The file system can include a plurality of snapshots or clones of data stored in one or more extents of blocks allocated in a storage device. Each extent may be associated with a time stamp according to the birth time. A request may be received from an executable using the file system to update data in a particular extent associated with a particular time stamp. In response, the current birth time in the file system and the particular time stamp may be compared to determine if the particular extent is not shared by more than one of the snapshots. If the particular time stamp is equal to the current birth time, the particular extent may be updated directly without performing an expensive operation to check whether a reference count of the particular extent is equal to one.
    Type: Application
    Filed: December 28, 2009
    Publication date: June 30, 2011
    Inventors: Wenguang Wang, Deric Horn, David Alexander Majnemer, Owen Strain
  • Publication number: 20110153976
    Abstract: Methods and apparatuses that search tree representations of a bitmap for available blocks to allocate in storage devices are described. An allocation request for a file may be received to initiate the search. In one embodiment, the bitmap may include an array of bits corresponding to blocks in the storage devices. Each bit may indicate whether one of the blocks is available. The tree representations may include at least one red-black tree having nodes corresponding to one or more consecutive bits in the bitmap indicating an extent of available blocks. One of the tree representations may be selected according to a file associated with an allocation request to identify an extent of available blocks matching the allocation request. The tree representations may be synchronized as the bitmap is updated with changes of block allocations in the storage devices.
    Type: Application
    Filed: December 22, 2009
    Publication date: June 23, 2011
    Inventors: Eric Brandon Tamura, David Alexander Majnemer
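
The bitmap search described in the abstract above (publication 20110153976 and patent 8504792) can be illustrated with a short sketch. This is not the patented implementation. It assumes that a bit value of 0 marks an available block, it uses a sorted list with binary search as a stand-in for the red-black tree, and it picks a best-fit policy that the abstract does not specify. The names free_extents, ExtentIndex, and allocate are hypothetical.

```python
from bisect import bisect_left, insort


def free_extents(bitmap):
    """Return (start, length) for each run of consecutive free bits.

    Assumption: a bit value of 0 means the block is available.
    """
    extents = []
    start = None
    for i, bit in enumerate(bitmap):
        if bit == 0 and start is None:
            start = i                      # a free run begins here
        elif bit != 0 and start is not None:
            extents.append((start, i - start))
            start = None                   # the free run ended at i - 1
    if start is not None:                  # run reaches the end of the bitmap
        extents.append((start, len(bitmap) - start))
    return extents


class ExtentIndex:
    """Free extents kept sorted by (length, start).

    A sorted list stands in for the red-black tree; both give ordered
    lookup, but list insertion and removal are O(n) rather than O(log n).
    """

    def __init__(self, bitmap):
        self.bitmap = list(bitmap)
        self.by_size = sorted(
            (length, start) for start, length in free_extents(self.bitmap)
        )

    def allocate(self, count):
        """Best fit: claim `count` blocks from the smallest extent that fits.

        Returns the starting block number, or None if no extent is large enough.
        """
        if count <= 0:
            raise ValueError("count must be positive")
        pos = bisect_left(self.by_size, (count, 0))
        if pos == len(self.by_size):
            return None
        length, start = self.by_size.pop(pos)
        # Update the bitmap and the index together so they stay in sync.
        for i in range(start, start + count):
            self.bitmap[i] = 1
        if length > count:
            insort(self.by_size, (length - count, start + count))
        return start


if __name__ == "__main__":
    index = ExtentIndex([1, 0, 0, 1, 0, 0, 0, 0, 1, 0])
    print(index.allocate(3))
    print(free_extents(index.bitmap))
```

Keeping the index ordered by extent length lets a request find a large enough extent without scanning the whole bitmap. The abstract also mentions selecting among several tree representations according to the file, which this sketch leaves out.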