Patents by Inventor Gary R. Frost

Gary R. Frost has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9152601
    Abstract: An approach and a method for efficient execution of nested map-reduce framework workloads to take advantage of the combined execution of central processing units (CPUs) and graphics processing units (GPUs) and lower latency of data access in accelerated processing units (APUs) is described. In embodiments, metrics are generated to determine whether a map or reduce function is more efficiently processed on a CPU or a GPU. A first metric is based on ratio of a number of branch instructions to a number of non-branch instructions, and a second metric is based on the comparison of execution times on each of the CPU and the GPU. Selecting execution of map and reduce functions based on the first and second metrics result in accelerated computations. Some embodiments include scheduling pipelined executions of functions on the CPU and functions on the GPU concurrently to achieve power-efficient nested map reduce framework execution.
    Type: Grant
    Filed: May 9, 2013
    Date of Patent: October 6, 2015
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Patryk Kaminski, Mauricio Breternitz, Gary R. Frost, Christophe Harle
  • Publication number: 20140333638
    Abstract: An approach and a method for efficient execution of nested map-reduce framework workloads to take advantage of the combined execution of central processing units (CPUs) and graphics processing units (GPUs) and lower latency of data access in accelerated processing units (APUs) is described. In embodiments, metrics are generated to determine whether a map or reduce function is more efficiently processed on a CPU or a GPU. A first metric is based on ratio of a number of branch instructions to a number of non-branch instructions, and a second metric is based on the comparison of execution times on each of the CPU and the GPU. Selecting execution of map and reduce functions based on the first and second metrics result in accelerated computations. Some embodiments include scheduling pipelined executions of functions on the CPU and functions on the GPU concurrently to achieve power-efficient nested map reduce framework execution.
    Type: Application
    Filed: May 9, 2013
    Publication date: November 13, 2014
    Applicant: Advanced Micro Devices, Inc.
    Inventors: Patryk KAMINSKI, Mauricio Breternitz, Gary R. Frost, Christophe Harle
  • Patent number: 8639730
    Abstract: A system and method for efficient garbage collection. A general-purpose central processing unit (CPU) sends a garbage collection request and a first log to a special processing unit (SPU). The first log includes an address and a data size of each allocated data object stored in a heap in memory corresponding to the CPU. The SPU has a single instruction multiple data (SIMD) parallel architecture and may be a graphics processing unit (GPU). The SPU efficiently performs operations of a garbage collection algorithm due to its architecture on a local representation of the data objects stored in the memory. The SPU records a list of changes it performs to remove dead data objects and compact live data objects. This list is subsequently sent to the CPU, which performs the included operations.
    Type: Grant
    Filed: September 24, 2012
    Date of Patent: January 28, 2014
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Azeem S. Jiva, Gary R. Frost
  • Patent number: 8473900
    Abstract: A system and method for creating synthetic immutable classes. A processor identifies first and second classes, instances of which include first and second data fields, respectively. The first data fields include a data field that references the second class. In response to determining that the first class is immutable and the second class is immutable, the processor constructs a first synthetic immutable class, an instance of which comprises a combination of the first data fields and the second data fields. The processor creates an instance of the first synthetic immutable class in which the first data fields and the second data fields occupy a contiguous region of a memory. In response to determining the first synthetic immutable class does not include an accessor for the second class, the processor combines header fields of the first and second data fields into a single data field in the first synthetic immutable class.
    Type: Grant
    Filed: July 1, 2009
    Date of Patent: June 25, 2013
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Gary R. Frost
  • Patent number: 8301672
    Abstract: A system and method for efficient garbage collection. A general-purpose central processing unit (CPU) sends a garbage collection request and a first log to a special processing unit (SPU). The first log includes an address and a data size of each allocated data object stored in a heap in memory corresponding to the CPU. The SPU has a single instruction multiple data (SIMD) parallel architecture and may be a graphics processing unit (GPU). The SPU efficiently performs operations of a garbage collection algorithm due to its architecture on a local representation of the data objects stored in the memory. The SPU records a list of changes it performs to remove dead data objects and compact live data objects. This list is subsequently sent to the CPU, which performs the included operations.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: October 30, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Azeem S. Jiva, Gary R. Frost
  • Patent number: 8195583
    Abstract: A system and method are disclosed for correlating instruction sequences. A plurality of instructions is processed to parse a first sequence of instructions comprising a first area of interest. A first instruction sequence pattern is then generated from the first sequence of instructions. Pattern matching operations are performed with the first instruction sequence pattern. A second sequence of instructions are parsed, comprising a second instruction sequence pattern and a second address of interest that is a substantially equivalent match to the first instruction sequence pattern.
    Type: Grant
    Filed: May 27, 2009
    Date of Patent: June 5, 2012
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Gary R. Frost
  • Publication number: 20110289519
    Abstract: Techniques are disclosed relating to distributing workloads between processors. In one embodiment, a computer system includes a first processor and a second processor. The first processor executes program instructions to receive a first set of bytecode specifying a first set of tasks and to determine whether to offload the first set of tasks to the second processor. In response to determining to offload the first set of tasks to the second processor, the program instructions are further executable to cause generation of a set of instructions to perform the first set of tasks, where the set of instructions are in a format different from that of the first set of bytecode, and where the format is supported by the second processor. The program instructions are further executable to cause the second processor to execute the set of instructions by causing the set of instructions to be provided to the second processor.
    Type: Application
    Filed: May 21, 2010
    Publication date: November 24, 2011
    Inventor: Gary R. Frost
  • Publication number: 20110004866
    Abstract: A system and method for creating synthetic immutable classes. A processor identifies first and second classes, instances of which include first and second data fields, respectively. The first data fields include a data field that references the second class. In response to determining that the first class is immutable and the second class is immutable, the processor constructs a first synthetic immutable class, an instance of which comprises a combination of the first data fields and the second data fields. The processor creates an instance of the first synthetic immutable class in which the first data fields and the second data fields occupy a contiguous region of a memory. In response to determining the first synthetic immutable class does not include an accessor for the second class, the processor combines header fields of the first and second data fields into a single data field in the first synthetic immutable class.
    Type: Application
    Filed: July 1, 2009
    Publication date: January 6, 2011
    Inventor: Gary R. Frost
  • Publication number: 20100306514
    Abstract: A system and method are disclosed for correlating instruction sequences. A plurality of instructions is processed to parse a first sequence of instructions comprising a first area of interest. A first instruction sequence pattern is then generated from the first sequence of instructions. Pattern matching operations are performed with the first instruction sequence pattern. A second sequence of instructions are parsed, comprising a second instruction sequence pattern and a second address of interest that is a substantially equivalent match to the first instruction sequence pattern.
    Type: Application
    Filed: May 27, 2009
    Publication date: December 2, 2010
    Inventor: Gary R. Frost
  • Publication number: 20100115502
    Abstract: A system and method are disclosed for improving the performance of compiled Java code. Java source code is annotated and then compiled by a Java compiler to produce annotated Java bytecode, which in turn is compiled by a just-in-time (JIT) compiler into annotated native code. The execution of the annotated native code is monitored with a patching agent, which captures the annotated native code as it is being executed. The captured native code is then provided through an application program interface to a dynamic linkage module, which in turn provides the captured native code to a user or to an application plug-in module for modifications. The modifications are saved as a patch. The annotated native code is then re-executed and the modifications to the annotated native code are applied as a patch by the patching agent.
    Type: Application
    Filed: November 6, 2008
    Publication date: May 6, 2010
    Inventors: Azeem S. Jiva, Gary R. Frost
  • Publication number: 20100082930
    Abstract: A system and method for efficient garbage collection. A general-purpose central processing unit (CPU) sends a garbage collection request and a first log to a special processing unit (SPU). The first log includes an address and a data size of each allocated data object stored in a heap in memory corresponding to the CPU. The SPU has a single instruction multiple data (SIMD) parallel architecture and may be a graphics processing unit (GPU). The SPU efficiently performs operations of a garbage collection algorithm due to its architecture on a local representation of the data objects stored in the memory. The SPU records a list of changes it performs to remove dead data objects and compact live data objects. This list is subsequently sent to the CPU, which performs the included operations.
    Type: Application
    Filed: September 22, 2008
    Publication date: April 1, 2010
    Inventors: Azeem S. Jiva, Gary R. Frost