Patents by Inventor Robert Geva

Robert Geva has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 12645605
    Abstract: A computer-implemented method includes generating or receiving instruction code for executing by a computing device to implement a neural network model, where the instruction code includes a plurality of direct memory access (DMA) instructions for data transferring between a local memory of an accelerator of the computing device and a system memory of the computing device; modifying the instruction code to arrange sources or destinations of a group of DMA instructions of the plurality of DMA instructions into a contiguous block in the local memory; and replacing the group of DMA instructions with a single DMA instruction, wherein a source address or a destination address of the single DMA instruction is the contiguous block of the local memory.
    Type: Grant
    Filed: September 30, 2021
    Date of Patent: June 2, 2026
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Yunxuan Yu, Taylor Goodhart, Robert Geva
  • Patent number: 12632693
    Abstract: A technique for packing matrix multiplications for concurrent execution in an integrated circuit device may include obtaining a description of a neural network model, and generating an intermediate representation of the neural network model. Matrix multiplication instructions in the intermediate representation of the neural network model can then be vectorized for concurrent execution on an integrated circuit device, and machine instructions can be generated for the integrated circuit device based on the vectorized matrix multiplication instructions.
    Type: Grant
    Filed: March 30, 2022
    Date of Patent: May 19, 2026
    Assignee: Amazon Technologies, Inc.
    Inventors: Jiading Gai, Tobias Joseph Kastulus Edler von Koch, Robert Geva, Paul Gilbert Meyer, Donald John Kretsch, Ron Diamant
  • Patent number: 12530178
    Abstract: A technique for arranging matrix multiplications for concurrent execution in an integrated circuit device may include obtaining a representation of a data dependency graph of a neural network model. The data dependency graph may include an accumulation group (AG) pack of accumulation groups (AGs), in which each of the AGs has one or more matrix multiplication instructions. A representation of a memory location base partition constraint graph of the AG pack can be generated, and an AG row group constraint graph can be generated based on the memory location base partition constraint graph. The AGs of the AG pack can then be assigned to tiles in an integrated circuit device based on the AG row group constraint graph.
    Type: Grant
    Filed: March 30, 2022
    Date of Patent: January 20, 2026
    Assignee: Amazon Technologies, Inc.
    Inventors: Jiading Gai, Tobias Joseph Kastulus Edler von Koch, Robert Geva, Paul Gilbert Meyer, Donald John Kretsch, Ron Diamant
  • Patent number: 12430110
    Abstract: In one example, a method performed by a compiler comprises: receiving a dataflow graph of a neural network, the neural network comprising a neural network operator; receiving information of computation resources and memory resources of a neural network hardware accelerator intended to execute the neural network operator; determining, based on the dataflow graph, iterations of an operation on elements of a tensor included in the neural network operator; determining, based on the information, a mapping between the elements of the tensor and addresses in the portion of the local memory, and a number of the iterations of the operation to be included in a batch, wherein the number of the iterations in the batch are to be executed in parallel by the neural network hardware accelerator; and generating a schedule of execution of the batches of the iterations of the operations.
    Type: Grant
    Filed: August 28, 2023
    Date of Patent: September 30, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Hongbin Zheng, Randy Renfu Huang, Robert Geva
  • Patent number: 12400106
    Abstract: A computer-implemented method includes receiving a neural network model that includes memory load operations and a plurality of computation operations, selecting a memory load operation having an arithmetic intensity factor (AIF) greater than a threshold value from the memory load operations, grouping computation operations associated with data loaded by the selected memory load operation into two or more clusters of computation operations, and incorporating an instance of the selected memory load operation before each cluster of the two or more clusters of computation operations in the neural network model.
    Type: Grant
    Filed: June 18, 2021
    Date of Patent: August 26, 2025
    Assignee: Amazon Technologies, Inc.
    Inventors: Ron Diamant, Robert Geva, Jindrich Zejda, Tiandong Zhao
  • Patent number: 12182549
    Abstract: A compiler-implemented technique for performing a storage allocation is described. Computer code to be converted into machine instructions for execution on an integrated circuit device is received. The integrated circuit device includes a memory having a set of memory locations. Based on the computer code, a set of values that are to be stored on the integrated circuit device are determined. An interference graph that includes the set of values and a set of interferences is constructed. While traversing the interference graph, a set of memory location assignments are generated by assigning the set of values to the set of memory locations in accordance with one or more color selection schemes.
    Type: Grant
    Filed: August 7, 2023
    Date of Patent: December 31, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Preston Pengra Briggs, Ron Diamant, Robert Geva
  • Patent number: 12131188
    Abstract: A technique for scheduling instructions includes obtaining a set of instructions that operate on memory objects, and determining the dependencies of the memory objects. The memory objects are then sorted into a sequence of memory objects based on the dependencies of the memory objects, and the set of instructions are scheduled into a sequence of instructions according to the sequence of memory objects. Sorting memory objects allows instructions that operate on the same memory object to be kept together. This helps minimize spilling conditions because intervening instructions that do not operate on the same memory object can be avoided.
    Type: Grant
    Filed: March 29, 2023
    Date of Patent: October 29, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Robert Geva, Taylor Goodhart, Ron Diamant, Preston Pengra Briggs
  • Patent number: 11809849
    Abstract: In one example, a method performed by a compiler comprises: receiving a dataflow graph of a neural network, the neural network comprising a neural network operator; receiving information of computation resources and memory resources of a neural network hardware accelerator intended to execute the neural network operator; determining, based on the dataflow graph, iterations of an operation on elements of a tensor included in the neural network operator; determining, based on the information, a mapping between the elements of the tensor and addresses in the portion of the local memory, and a number of the iterations of the operation to be included in a batch, wherein the number of the iterations in the batch are to be executed in parallel by the neural network hardware accelerator; and generating a schedule of execution of the batches of the iterations of the operations.
    Type: Grant
    Filed: May 20, 2021
    Date of Patent: November 7, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Hongbin Zheng, Randy Renfu Huang, Robert Geva
  • Patent number: 11775268
    Abstract: A compiler-implemented technique for performing a storage allocation is described. Computer code to be converted into machine instructions for execution on an integrated circuit device is received. The integrated circuit device includes a memory having a set of memory locations. Based on the computer code, a set of values that are to be stored on the integrated circuit device are determined. An interference graph that includes the set of values and a set of interferences is constructed. While traversing the interference graph, a set of memory location assignments are generated by assigning the set of values to the set of memory locations in accordance with one or more color selection schemes.
    Type: Grant
    Filed: June 8, 2021
    Date of Patent: October 3, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Preston Pengra Briggs, Ron Diamant, Robert Geva
  • Patent number: 11625269
    Abstract: A technique for scheduling instructions includes obtaining a set of instructions that operate on memory objects, and determining the dependencies of the memory objects. The memory objects are then sorted into a sequence of memory objects based on the dependencies of the memory objects, and the set of instructions are scheduled into a sequence of instructions according to the sequence of memory objects. Sorting memory objects allows instructions that operate on the same memory object to be kept together. This helps minimize spilling conditions because intervening instructions that do not operate on the same memory object can be avoided.
    Type: Grant
    Filed: March 31, 2021
    Date of Patent: April 11, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Robert Geva, Taylor Goodhart, Ron Diamant, Preston Pengra Briggs
  • Patent number: 11372677
    Abstract: When scheduling instructions for execution on a computing device, load instructions are processed before their dependent computational instructions. This can result in the load instructions being scheduled in a non-optimal order. To schedule the load instructions in a preferred order, a scheduler can speculatively schedule the load instructions without committing to their order. Subsequently, when the scheduler encounters the dependent computational instructions, the scheduler can reorder the speculatively scheduled load instructions according to the execution order of the dependent computational instructions.
    Type: Grant
    Filed: June 4, 2020
    Date of Patent: June 28, 2022
    Assignee: Amazon Technologies, Inc.
    Inventor: Robert Geva
  • Patent number: 8762694
    Abstract: Method, apparatus, and system for a programmable event-driven yield mechanism. The mechanism may disrupt processing of a program to deliver a yield event. The event may be treated as a fault-like yield event or a trap-like event. For a fault-like yield event, the faulting instruction is canceled before retirement and processor state is not updated before the yield event is delivered. For a trap-like yield event the instruction causing the trap is retired and the yield event is delivered on an interrupt boundary. Multiple pending yield events may be handled according to priority. Other embodiments are also described and claimed.
    Type: Grant
    Filed: March 31, 2006
    Date of Patent: June 24, 2014
    Assignee: Intel Corporation
    Inventors: Xiang Zou, Hong Wang, Robert Knight, Robert Geva, Gautham Chinya, Scott Dion Rodgers, Chris Newburn, Bryant E. Bigbee, Per Hammarlund, Ittai Anati, Jim B. Crossland, John P. Shen
  • Patent number: 8719839
    Abstract: A computer system may comprise a computer platform and input-output devices. The computer platform may include a plurality of heterogeneous processors comprising a central processing unit (CPU) and a graphics processing unit (GPU), for example. The GPU may be coupled to a GPU compiler and a GPU linker/loader and the CPU may be coupled to a CPU compiler and a CPU linker/loader. The user may create a shared object in an object oriented language and the shared object may include virtual functions. The shared object may be fine grain partitioned between the heterogeneous processors. The GPU compiler may allocate the shared object to the CPU and may create a first and a second enabling path to allow the GPU to invoke virtual functions of the shared object. Thus, the shared object that may include virtual functions may be shared seamlessly between the CPU and the GPU.
    Type: Grant
    Filed: October 30, 2009
    Date of Patent: May 6, 2014
    Assignee: Intel Corporation
    Inventors: Shoumeng Yan, Xiaocheng Zhou, Ying Gao, Mohan Rajagopalan, Rajiv Deodhar, David Putzolu, Clark Nelson, Milind Girkar, Robert Geva, Tiger Chen, Sai Luo, Stephen Junkins, Bratin Saha, Ravi Narayanaswamy, Patrick Xi
  • Patent number: 8566567
    Abstract: Method, apparatus, and system for monitoring performance within a processing resource, which may be used to modify user-level software. Some embodiments of the invention pertain to an architecture to allow a user to improve software running on a processing resource on a per-thread basis in real-time and without incurring significant processing overhead.
    Type: Grant
    Filed: June 21, 2012
    Date of Patent: October 22, 2013
    Assignee: Intel Corporation
    Inventors: Chris J. Newburn, Robert Knight, Robert Geva, Dion Rodgers, Xiang Zou, Hong Wang, Bryant E. Bigbee, Ittai Anati
  • Publication number: 20130061240
    Abstract: A computer system may comprise a computer platform and input-output devices. The computer platform may include a plurality of heterogeneous processors comprising a central processing unit (CPU) and a graphics processing unit (GPU), for example. The GPU may be coupled to a GPU compiler and a GPU linker/loader and the CPU may be coupled to a CPU compiler and a CPU linker/loader. The user may create a shared object in an object oriented language and the shared object may include virtual functions. The shared object may be fine grain partitioned between the heterogeneous processors. The GPU compiler may allocate the shared object to the CPU and may create a first and a second enabling path to allow the GPU to invoke virtual functions of the shared object. Thus, the shared object that may include virtual functions may be shared seamlessly between the CPU and the GPU.
    Type: Application
    Filed: October 30, 2009
    Publication date: March 7, 2013
    Inventors: Shoumeng Yan, Xiaocheng Zhou, Ying Gao, Mohan Rajagopalan, Rajiv Deodhar, David Putzolu, Clark Nelson, Milind Girkar, Robert Geva, Tiger Chen, Sai Luo, Stephen Junkins, Bratin Saha, Ravi Narayanaswamy, Patrick Xi
  • Publication number: 20130031557
    Abstract: Method, apparatus, and system for monitoring performance within a processing resource, which may be used to modify user-level software. Some embodiments of the invention pertain to an architecture to allow a user to improve software running on a processing resource on a per-thread basis in real-time and without incurring significant processing overhead.
    Type: Application
    Filed: June 21, 2012
    Publication date: January 31, 2013
    Inventors: Chris J. Newburn, Robert Knight, Robert Geva, Dion Rodgers, Xiang Zou, Hong Wang, Bryant E. Bigbee, Ittai Anati
  • Patent number: 8301868
    Abstract: Method, apparatus, and system for monitoring performance within a processing resource, which may be used to modify user-level software. Some embodiments of the invention pertain to an architecture to allow a user to improve software running on a processing resource on a per-thread basis in real-time and without incurring significant processing overhead.
    Type: Grant
    Filed: September 23, 2005
    Date of Patent: October 30, 2012
    Assignee: Intel Corporation
    Inventors: Chris J. Newburn, Robert Knight, Robert Geva, Dion Rodgers, Xiang Zou, Hong Wang, Bryant E. Bigbee, Ittai Anati
  • Patent number: 8296552
    Abstract: In one embodiment, the present invention includes a method of determining a relative priority between a first agent and a second agent, and assigning the first agent to a first channel and the second agent to a second channel according to the relative priority. Depending on the currently programmed status of the channels, information stored in at least one of the channels may be dynamically migrated to another channel based on the assignments. Other embodiments are described and claimed.
    Type: Grant
    Filed: June 29, 2011
    Date of Patent: October 23, 2012
    Assignee: Intel Corporation
    Inventors: Gautham Chinya, Robert Geva, Robert Knight, Hong Wang, Xiang Zou
  • Publication number: 20110258632
    Abstract: In one embodiment, the present invention includes a method of determining a relative priority between a first agent and a second agent, and assigning the first agent to a first channel and the second agent to a second channel according to the relative priority. Depending on the currently programmed status of the channels, information stored in at least one of the channels may be dynamically migrated to another channel based on the assignments. Other embodiments are described and claimed.
    Type: Application
    Filed: June 29, 2011
    Publication date: October 20, 2011
    Inventors: Gautham Chinya, Robert Geva, Robert Knight, Hong Wang, Xiang Zou
  • Patent number: 8001364
    Abstract: In one embodiment, the present invention includes a method of determining a relative priority between a first agent and a second agent, and assigning the first agent to a first channel and the second agent to a second channel according to the relative priority. Depending on the currently programmed status of the channels, information stored in at least one of the channels may be dynamically migrated to another channel based on the assignments. Other embodiments are described and claimed.
    Type: Grant
    Filed: October 23, 2009
    Date of Patent: August 16, 2011
    Assignee: Intel Corporation
    Inventors: Gautham Chinya, Robert Geva, Robert Knight, Hong Wang, Xiang Zou
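
Illustrative note on the storage-allocation abstracts (patent numbers 12182549 and 11775268): those abstracts describe building an interference graph over the values to be stored and assigning memory locations while traversing it under one or more color selection schemes. The sketch below is a generic greedy graph coloring, not the patented method. The live-range input format, the function names build_interference_graph and assign_locations, the highest-degree-first traversal order, and the "lowest free location" selection scheme are all assumptions made for illustration; the abstracts do not specify any of them.

```python
def build_interference_graph(live_ranges):
    """Return an adjacency map for values whose live ranges overlap.

    live_ranges maps a value name to a half-open (start, end) interval.
    This input format is a hypothetical stand-in for whatever liveness
    information a real compiler would derive from the computer code.
    """
    graph = {name: set() for name in live_ranges}
    names = sorted(live_ranges)
    for i, a in enumerate(names):
        a_start, a_end = live_ranges[a]
        for b in names[i + 1:]:
            b_start, b_end = live_ranges[b]
            # Two half-open intervals overlap when each starts before the other ends.
            if a_start < b_end and b_start < a_end:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def assign_locations(graph, num_locations):
    """Greedily give each value a location not used by any interfering value.

    Values are visited from highest to lowest degree (ties broken by name),
    and each takes the lowest-numbered free location. Both choices are
    assumed example schemes, not ones taken from the patents.
    """
    assignment = {}
    for value in sorted(graph, key=lambda v: (-len(graph[v]), v)):
        taken = {assignment[n] for n in graph[value] if n in assignment}
        free = [loc for loc in range(num_locations) if loc not in taken]
        if not free:
            raise ValueError(f"no free memory location for value {value!r}")
        assignment[value] = free[0]
    return assignment


if __name__ == "__main__":
    ranges = {"a": (0, 4), "b": (2, 6), "c": (5, 8), "d": (7, 9)}
    graph = build_interference_graph(ranges)
    print(assign_locations(graph, num_locations=2))
```

In this example the four values form a chain of overlapping live ranges, so two memory locations are enough. A production allocator would also need a fallback, such as spilling, when the greedy pass runs out of locations; here that case simply raises an error.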