Patents by Inventor Robert Geva
Robert Geva has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 12645605
Abstract: A computer-implemented method includes generating or receiving instruction code for executing by a computing device to implement a neural network model, where the instruction code includes a plurality of direct memory access (DMA) instructions for data transferring between a local memory of an accelerator of the computing device and a system memory of the computing device; modifying the instruction code to arrange sources or destinations of a group of DMA instructions of the plurality of DMA instructions into a contiguous block in the local memory; and replacing the group of DMA instructions with a single DMA instruction, wherein a source address or a destination address of the single DMA instruction is the contiguous block of the local memory.
Type: Grant
Filed: September 30, 2021
Date of Patent: June 2, 2026
Assignee: Amazon Technologies, Inc.
Inventors: Ron Diamant, Yunxuan Yu, Taylor Goodhart, Robert Geva
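The idea in the abstract above can be illustrated with a rough Python sketch. It is not the patented method: the `Dma` record, the `coalesce` function, and the rule that transfers are merged only when their system-memory ranges are also back to back are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Dma:
    system_addr: int  # address in system memory (hypothetical field)
    local_addr: int   # address in accelerator local memory (hypothetical field)
    size: int         # number of bytes transferred

def coalesce(dmas, local_base):
    """Pack local-memory addresses contiguously, then merge transfers
    whose system-memory ranges are also adjacent."""
    merged = []
    next_local = local_base
    for d in sorted(dmas, key=lambda d: d.system_addr):
        # Re-home the transfer into the next free slot of the contiguous block.
        placed = Dma(d.system_addr, next_local, d.size)
        next_local += d.size
        last = merged[-1] if merged else None
        if last is not None and last.system_addr + last.size == placed.system_addr:
            last.size += placed.size  # extend the previous transfer
        else:
            merged.append(placed)
    return merged
```

Under these assumptions, three 16-, 16-, and 32-byte transfers that were scattered across local memory but adjacent in system memory collapse into one 64-byte transfer.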
-
Patent number: 12632693
Abstract: A technique for packing matrix multiplications for concurrent execution in an integrated circuit device may include obtaining a description of a neural network model, and generating an intermediate representation of the neural network model. Matrix multiplication instructions in the intermediate representation of the neural network model can then be vectorized for concurrent execution on an integrated circuit device, and machine instructions can be generated for the integrated circuit device based on the vectorized matrix multiplication instructions.
Type: Grant
Filed: March 30, 2022
Date of Patent: May 19, 2026
Assignee: Amazon Technologies, Inc.
Inventors: Jiading Gai, Tobias Joseph Kastulus Edler von Koch, Robert Geva, Paul Gilbert Meyer, Donald John Kretsch, Ron Diamant
-
Patent number: 12530178
Abstract: A technique for arranging matrix multiplications for concurrent execution in an integrated circuit device may include obtaining a representation of a data dependency graph of a neural network model. The data dependency graph may include having an accumulation group (AG) pack of accumulation groups (AGs), in which each of the AGs has one or more matrix multiplication instructions. A representation of a memory location base partition constraint graph of the AG pack can be generated, and an AG row group constraint graph can be generated based on the memory location base partition constraint graph. The AGs of the AG pack can then be assigned to tiles in an integrated circuit device based on the AG row group constraint graph.
Type: Grant
Filed: March 30, 2022
Date of Patent: January 20, 2026
Assignee: Amazon Technologies, Inc.
Inventors: Jiading Gai, Tobias Joseph Kastulus Edler von Koch, Robert Geva, Paul Gilbert Meyer, Donald John Kretsch, Ron Diamant
-
Patent number: 12430110
Abstract: In one example, a method performed by a compiler comprises: receiving a dataflow graph of a neural network, the neural network comprising a neural network operator; receiving information of computation resources and memory resources of a neural network hardware accelerator intended to execute the neural network operator; determining, based on the dataflow graph, iterations of an operation on elements of a tensor included in the neural network operator; determining, based on the information, a mapping between the elements of the tensor to addresses in the portion of the local memory, and a number of the iterations of the operation to be included in a batch, wherein the number of the iterations in the batch are to be executed in parallel by the neural network hardware accelerator; and generating a schedule of execution of the batches of the iterations of the operations.
Type: Grant
Filed: August 28, 2023
Date of Patent: September 30, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Hongbin Zheng, Randy Renfu Huang, Robert Geva
-
Patent number: 12400106
Abstract: A computer-implemented method includes receiving a neural network model that includes memory load operations and a plurality of computation operations, selecting a memory load operation having an arithmetic intensity factor (AIF) greater than a threshold value from the memory load operations, grouping computation operations associated with data loaded by the selected memory load operation into two or more clusters of computation operations, and incorporating an instance of the selected memory load operation before each cluster of the two or more clusters of computation operations in the neural network model.
Type: Grant
Filed: June 18, 2021
Date of Patent: August 26, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Ron Diamant, Robert Geva, Jindrich Zejda, Tiandong Zhao
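A loose Python sketch of the transformation described in the abstract above follows. Everything in it is an assumption made for illustration: the abstract does not say how the AIF is computed, so it is passed in as a plain number, and the hypothetical `replicate_load` function simply clusters consumers by position in fixed-size groups.

```python
def replicate_load(load, aif, threshold, consumers, cluster_size):
    """Return an op sequence in which `load` is re-issued before each
    cluster of the computation ops that consume its data."""
    if aif <= threshold:
        # Load not selected: keep the single original load.
        return [load] + list(consumers)
    out = []
    for start in range(0, len(consumers), cluster_size):
        out.append(load)  # a fresh instance of the selected load
        out.extend(consumers[start:start + cluster_size])
    return out
```

Re-issuing the load shortens the time its data must stay resident in local memory, at the cost of extra memory traffic. The choice of threshold and clustering rule is where a real compiler would make that trade-off.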
-
Patent number: 12182549
Abstract: A compiler-implemented technique for performing a storage allocation is described. Computer code to be converted into machine instructions for execution on an integrated circuit device is received. The integrated circuit device includes a memory having a set of memory locations. Based on the computer code, a set of values that are to be stored on the integrated circuit device are determined. An interference graph that includes the set of values and a set of interferences is constructed. While traversing the interference graph, a set of memory location assignments are generated by assigning the set of values to the set of memory locations in accordance with one or more color selection schemes.
Type: Grant
Filed: August 7, 2023
Date of Patent: December 31, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Preston Pengra Briggs, Ron Diamant, Robert Geva
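Interference-graph allocation of the kind described above is commonly illustrated with greedy graph coloring. The Python sketch below is a textbook-style greedy coloring, not the patent's color selection schemes: the `assign_locations` function, the most-constrained-first visiting order, and the lowest-free-location rule are all assumptions.

```python
def assign_locations(interferences, num_locations):
    """interferences: dict mapping each value to the set of values that are
    live at the same time. Returns a dict mapping value -> location index."""
    assignment = {}
    # Visit the values with the most interferences first (one possible scheme).
    order = sorted(interferences, key=lambda v: (-len(interferences[v]), v))
    for value in order:
        # Locations already held by interfering values are off limits.
        taken = {assignment[n] for n in interferences[value] if n in assignment}
        free = [loc for loc in range(num_locations) if loc not in taken]
        if not free:
            raise RuntimeError(f"no location left for {value}")
        assignment[value] = free[0]  # pick the lowest free location
    return assignment
```

Two values that never interfere may share a location, which is what lets a small on-chip memory hold more values than it has locations.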
-
Patent number: 12131188
Abstract: A technique for scheduling instructions includes obtaining a set of instructions that operate on memory objects, and determining the dependencies of the memory objects. The memory objects are then sorted into a sequence of memory objects based on the dependencies of the memory objects, and the set of instructions are scheduled into a sequence of instructions according to the sequence of memory objects. Sorting memory objects allows instructions that operate on the same memory object to be kept together. This helps minimize spilling conditions because intervening instructions that do not operate on the same memory object can be avoided.
Type: Grant
Filed: March 29, 2023
Date of Patent: October 29, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Robert Geva, Taylor Goodhart, Ron Diamant, Preston Pengra Briggs
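As a minimal illustration of the ordering idea in the abstract above, the Python sketch below sorts memory objects topologically and then orders instructions by the rank of the object they touch. It is a simplification under stated assumptions: each instruction is taken to operate on exactly one memory object, and the `schedule` function and its input shapes are hypothetical.

```python
from graphlib import TopologicalSorter

def schedule(instructions, object_deps):
    """instructions: list of (name, memory_object) pairs in program order.
    object_deps: dict mapping a memory object to the set of objects it depends on."""
    # Sort the memory objects so that every object follows its dependencies.
    order = list(TopologicalSorter(object_deps).static_order())
    rank = {obj: i for i, obj in enumerate(order)}
    # sorted() is stable, so instructions on the same object stay in program order.
    return sorted(instructions, key=lambda ins: rank[ins[1]])
```

For example, interleaved instructions on objects `X` and `Y` are regrouped so that all work on `X` finishes before work on `Y` begins, which keeps fewer objects live at once.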
-
Patent number: 11809849
Abstract: In one example, a method performed by a compiler comprises: receiving a dataflow graph of a neural network, the neural network comprising a neural network operator; receiving information of computation resources and memory resources of a neural network hardware accelerator intended to execute the neural network operator; determining, based on the dataflow graph, iterations of an operation on elements of a tensor included in the neural network operator; determining, based on the information, a mapping between the elements of the tensor to addresses in the portion of the local memory, and a number of the iterations of the operation to be included in a batch, wherein the number of the iterations in the batch are to be executed in parallel by the neural network hardware accelerator; and generating a schedule of execution of the batches of the iterations of the operations.
Type: Grant
Filed: May 20, 2021
Date of Patent: November 7, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Hongbin Zheng, Randy Renfu Huang, Robert Geva
-
Patent number: 11775268
Abstract: A compiler-implemented technique for performing a storage allocation is described. Computer code to be converted into machine instructions for execution on an integrated circuit device is received. The integrated circuit device includes a memory having a set of memory locations. Based on the computer code, a set of values that are to be stored on the integrated circuit device are determined. An interference graph that includes the set of values and a set of interferences is constructed. While traversing the interference graph, a set of memory location assignments are generated by assigning the set of values to the set of memory locations in accordance with one or more color selection schemes.
Type: Grant
Filed: June 8, 2021
Date of Patent: October 3, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Preston Pengra Briggs, Ron Diamant, Robert Geva
-
Patent number: 11625269
Abstract: A technique for scheduling instructions includes obtaining a set of instructions that operate on memory objects, and determining the dependencies of the memory objects. The memory objects are then sorted into a sequence of memory objects based on the dependencies of the memory objects, and the set of instructions are scheduled into a sequence of instructions according to the sequence of memory objects. Sorting memory objects allows instructions that operate on the same memory object to be kept together. This helps minimize spilling conditions because intervening instructions that do not operate on the same memory object can be avoided.
Type: Grant
Filed: March 31, 2021
Date of Patent: April 11, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Robert Geva, Taylor Goodhart, Ron Diamant, Preston Pengra Briggs
-
Patent number: 11372677
Abstract: When scheduling instructions for execution on a computing device, load instructions are processed before their dependent computational instructions. This can result in the load instructions being scheduled in a non-optimal order. To schedule the load instructions in a preferred order, a scheduler can speculatively schedule the load instructions without committing to their order. Subsequently, when the scheduler encounters the dependent computational instructions, the scheduler can reorder the speculatively scheduled load instructions according to the execution order of the dependent computational instructions.
Type: Grant
Filed: June 4, 2020
Date of Patent: June 28, 2022
Assignee: Amazon Technologies, Inc.
Inventor: Robert Geva
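The reordering step described in the abstract above can be sketched in a few lines of Python. This is an illustration under assumptions, not the patented scheduler: the hypothetical `reorder_loads` function takes the loads in the order they were first seen and the computes in their final execution order, then orders each load by its first consumer.

```python
def reorder_loads(loads, computes):
    """loads: load names in the order they were speculatively scheduled.
    computes: list of (name, [load names it reads]) in execution order."""
    first_use = {}
    for position, (_, inputs) in enumerate(computes):
        for load in inputs:
            # Record only the earliest compute that reads each load.
            first_use.setdefault(load, position)
    # Loads that no compute reads go last; sorted() is stable, so ties keep
    # their speculative order.
    return sorted(loads, key=lambda name: first_use.get(name, len(computes)))
```

With this ordering, the data needed by the first compute arrives first, so the compute engine is less likely to stall waiting on a load that was issued late.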
-
Patent number: 8762694
Abstract: Method, apparatus, and system for a programmable event-driven yield mechanism. The mechanism may disrupt processing of a program to deliver a yield event. The event may be treated as a fault-like yield event or a trap-like event. For a fault-like yield event, the faulting instruction is canceled before retirement and processor state is not updated before the yield event is delivered. For a trap-like yield event the instruction causing the trap is retired and the yield event is delivered on an interrupt boundary. Multiple pending yield events may be handled according to priority. Other embodiments are also described and claimed.
Type: Grant
Filed: March 31, 2006
Date of Patent: June 24, 2014
Assignee: Intel Corporation
Inventors: Xiang Zou, Hong Wang, Robert Knight, Robert Geva, Gautham Chinya, Scott Dion Rodgers, Chris Newburn, Bryant E. Bigbee, Per Hammarlund, Ittai Anati, Jim B. Crossland, John P. Shen
-
Patent number: 8719839
Abstract: A computer system may comprise a computer platform and input-output devices. The computer platform may include a plurality of heterogeneous processors comprising a central processing unit (CPU) and a graphics processing unit (GPU), for example. The GPU may be coupled to a GPU compiler and a GPU linker/loader and the CPU may be coupled to a CPU compiler and a CPU linker/loader. The user may create a shared object in an object oriented language and the shared object may include virtual functions. The shared object may be fine grain partitioned between the heterogeneous processors. The GPU compiler may allocate the shared object to the CPU and may create a first and a second enabling path to allow the GPU to invoke virtual functions of the shared object. Thus, the shared object that may include virtual functions may be shared seamlessly between the CPU and the GPU.
Type: Grant
Filed: October 30, 2009
Date of Patent: May 6, 2014
Assignee: Intel Corporation
Inventors: Shoumeng Yan, Xiaocheng Zhou, Ying Gao, Mohan Rajagopalan, Rajiv Deodhar, David Putzolu, Clark Nelson, Milind Girkar, Robert Geva, Tiger Chen, Sai Luo, Stephen Junkins, Bratin Saha, Ravi Narayanaswamy, Patrick Xi
-
Patent number: 8566567
Abstract: Method, apparatus, and system for monitoring performance within a processing resource, which may be used to modify user-level software. Some embodiments of the invention pertain to an architecture to allow a user to improve software running on a processing resource on a per-thread basis in real-time and without incurring significant processing overhead.
Type: Grant
Filed: June 21, 2012
Date of Patent: October 22, 2013
Assignee: Intel Corporation
Inventors: Chris J. Newburn, Robert Knight, Robert Geva, Dion Rodgers, Xiang Zou, Hong Wang, Bryant E. Bigbee, Ittai Anati
-
Publication number: 20130061240
Abstract: A computer system may comprise a computer platform and input-output devices. The computer platform may include a plurality of heterogeneous processors comprising a central processing unit (CPU) and a graphics processing unit (GPU), for example. The GPU may be coupled to a GPU compiler and a GPU linker/loader and the CPU may be coupled to a CPU compiler and a CPU linker/loader. The user may create a shared object in an object oriented language and the shared object may include virtual functions. The shared object may be fine grain partitioned between the heterogeneous processors. The GPU compiler may allocate the shared object to the CPU and may create a first and a second enabling path to allow the GPU to invoke virtual functions of the shared object. Thus, the shared object that may include virtual functions may be shared seamlessly between the CPU and the GPU.
Type: Application
Filed: October 30, 2009
Publication date: March 7, 2013
Inventors: Shoumeng Yan, Xiaocheng Zhou, Ying Gao, Mohan Rajagopalan, Rajiv Deodhar, David Putzolu, Clark Nelson, Milind Girkar, Robert Geva, Tiger Chen, Sai Luo, Stephen Junkins, Bratin Saha, Ravi Narayanaswamy, Patrick Xi
-
Publication number: 20130031557
Abstract: Method, apparatus, and system for monitoring performance within a processing resource, which may be used to modify user-level software. Some embodiments of the invention pertain to an architecture to allow a user to improve software running on a processing resource on a per-thread basis in real-time and without incurring significant processing overhead.
Type: Application
Filed: June 21, 2012
Publication date: January 31, 2013
Inventors: Chris J. Newburn, Robert Knight, Robert Geva, Dion Rodgers, Xiang Zou, Hong Wang, Bryant E. Bigbee, Ittai Anati
-
Patent number: 8301868
Abstract: Method, apparatus, and system for monitoring performance within a processing resource, which may be used to modify user-level software. Some embodiments of the invention pertain to an architecture to allow a user to improve software running on a processing resource on a per-thread basis in real-time and without incurring significant processing overhead.
Type: Grant
Filed: September 23, 2005
Date of Patent: October 30, 2012
Assignee: Intel Corporation
Inventors: Chris J. Newburn, Robert Knight, Robert Geva, Dion Rodgers, Xiang Zou, Hong Wang, Bryant E. Bigbee, Ittai Anati
-
Patent number: 8296552
Abstract: In one embodiment, the present invention includes a method of determining a relative priority between a first agent and a second agent, and assigning the first agent to a first channel and the second agent to a second channel according to the relative priority. Depending on the currently programmed status of the channels, information stored in at least one of the channels may be dynamically migrated to another channel based on the assignments. Other embodiments are described and claimed.
Type: Grant
Filed: June 29, 2011
Date of Patent: October 23, 2012
Assignee: Intel Corporation
Inventors: Gautham Chinya, Robert Geva, Robert Knight, Hong Wang, Xiang Zou
-
Publication number: 20110258632
Abstract: In one embodiment, the present invention includes a method of determining a relative priority between a first agent and a second agent, and assigning the first agent to a first channel and the second agent to a second channel according to the relative priority. Depending on the currently programmed status of the channels, information stored in at least one of the channels may be dynamically migrated to another channel based on the assignments. Other embodiments are described and claimed.
Type: Application
Filed: June 29, 2011
Publication date: October 20, 2011
Inventors: Gautham Chinya, Robert Geva, Robert Knight, Hong Wang, Xiang Zou
-
Patent number: 8001364
Abstract: In one embodiment, the present invention includes a method of determining a relative priority between a first agent and a second agent, and assigning the first agent to a first channel and the second agent to a second channel according to the relative priority. Depending on the currently programmed status of the channels, information stored in at least one of the channels may be dynamically migrated to another channel based on the assignments. Other embodiments are described and claimed.
Type: Grant
Filed: October 23, 2009
Date of Patent: August 16, 2011
Assignee: Intel Corporation
Inventors: Gautham Chinya, Robert Geva, Robert Knight, Hong Wang, Xiang Zou