Patents by Inventor Mark R. Nutter

Mark R. Nutter has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 9727241
    Abstract: A processor maintains a count of accesses to each memory page. When the accesses to a memory page exceed a threshold amount for that memory page, the processor sets an indicator for the page. Based on the indicators for the memory pages, the processor manages data at one or more levels of the processor's memory hierarchy.
    Type: Grant
    Filed: February 6, 2015
    Date of Patent: August 8, 2017
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Gabriel H. Loh, David A. Roberts, Mitesh R. Meswani, Mark R. Nutter, John R. Slice, Prashant Nair, Michael Ignatowski
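
A minimal software sketch of the hot-page tracking idea in patent 9727241 above: count accesses per page, set a per-page indicator once a threshold is crossed, and let a management step act on the indicators. The PageTracker class, the 4 KiB page size, and the migration placeholder are illustrative assumptions, not the patented hardware mechanism.

```cpp
// Illustrative sketch only: PageTracker, the 4 KiB page size, and the
// threshold are assumptions, not the patented hardware mechanism.
#include <cstdint>
#include <iostream>
#include <unordered_map>

struct PageState {
    uint64_t access_count = 0;      // accesses observed for this page
    bool     hot          = false;  // indicator set once the threshold is crossed
};

class PageTracker {
public:
    explicit PageTracker(uint64_t threshold) : threshold_(threshold) {}

    // Record one access to the page containing `addr` (4 KiB pages assumed).
    void on_access(uint64_t addr) {
        PageState& p = pages_[addr >> 12];
        if (++p.access_count >= threshold_)
            p.hot = true;                      // set the per-page indicator
    }

    // Act on the indicators: report pages the memory hierarchy should
    // place in its faster level (placeholder for real data management).
    void manage() const {
        for (const auto& [page, state] : pages_)
            if (state.hot)
                std::cout << "migrate page " << page << " to fast memory\n";
    }

private:
    uint64_t threshold_;
    std::unordered_map<uint64_t, PageState> pages_;
};

int main() {
    PageTracker tracker(/*threshold=*/3);
    for (int i = 0; i < 5; ++i) tracker.on_access(0x1000);  // hot page
    tracker.on_access(0x2000);                              // cold page
    tracker.manage();                                       // only page 1 reported
}
```
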
  • Patent number: 9696995
    Abstract: Mechanisms for extracting data dependencies during runtime are provided. With these mechanisms, a portion of code having a loop is executed. A first parallel execution group is generated for the loop, the group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The first parallel execution group is executed by executing each iteration in parallel. Store data for iterations are stored in corresponding store caches of the processor. Dependency checking logic of the processor determines, for each iteration, whether the iteration has a data dependence. Only the store data for stores where there was no data dependence determined are committed to memory.
    Type: Grant
    Filed: December 30, 2009
    Date of Patent: July 4, 2017
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
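
A toy model of the runtime dependence extraction in patent 9696995 above (and the related patent 9696996 below): a group of loop iterations runs speculatively, each iteration's stores are buffered in its own store cache, and only iterations found free of cross-iteration dependences commit to memory. The data structures and the example loop are invented for illustration.

```cpp
// Illustrative sketch only: IterationLog and the example loop are invented.
#include <cstdint>
#include <iostream>
#include <map>
#include <set>
#include <vector>

struct IterationLog {
    std::map<uint64_t, int> stores;  // address -> buffered value ("store cache")
    std::set<uint64_t>      loads;   // addresses read by this iteration
};

int main() {
    std::vector<int> memory = {1, 2, 3, 4};

    // Speculatively "execute" a group of 4 iterations of: a[i] = a[i-1] + 1
    // (every iteration after the first has a loop-carried dependence).
    std::vector<IterationLog> group(4);
    for (uint64_t i = 0; i < 4; ++i) {
        uint64_t src = (i == 0) ? 0 : i - 1;
        group[i].loads.insert(src);
        group[i].stores[i] = memory[src] + 1;   // computed from stale memory
    }

    // Dependence check: iteration i conflicts if it loads an address that a
    // logically earlier iteration in the group stored to.
    for (uint64_t i = 0; i < group.size(); ++i) {
        bool dependent = false;
        for (uint64_t j = 0; j < i && !dependent; ++j)
            for (uint64_t addr : group[i].loads)
                if (group[j].stores.count(addr)) { dependent = true; break; }

        if (!dependent)
            for (auto [addr, val] : group[i].stores) memory[addr] = val;  // commit
        else
            std::cout << "iteration " << i << " squashed (data dependence)\n";
    }
}
```
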
  • Patent number: 9696996
    Abstract: Mechanisms for extracting data dependencies during runtime are provided. With these mechanisms, a portion of code having a loop is executed. A first parallel execution group is generated for the loop, the group comprising a subset of iterations of the loop less than a total number of iterations of the loop. The first parallel execution group is executed by executing each iteration in parallel. Store data for iterations are stored in corresponding store caches of the processor. Dependency checking logic of the processor determines, for each iteration, whether the iteration has a data dependence. Only the store data for stores where there was no data dependence determined are committed to memory.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: July 4, 2017
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
  • Publication number: 20170161040
    Abstract: Mechanisms are provided for arranging binary code to reduce instruction cache conflict misses. These mechanisms generate a call graph of a portion of code. Nodes and edges in the call graph are weighted to generate a weighted call graph. The weighted call graph is then partitioned according to the weights, affinities between nodes of the call graph, and the size of cache lines in an instruction cache of the data processing system, so that binary code associated with one or more subsets of nodes in the call graph are combined into individual cache lines based on the partitioning. The binary code corresponding to the partitioned call graph is then output for execution in a computing device.
    Type: Application
    Filed: February 20, 2017
    Publication date: June 8, 2017
    Inventors: Tong Chen, Brian Flachs, Brad W. Michael, Mark R. Nutter, John K. P. O'Brien, Kathryn M. O'Brien, Tao Zhang
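
A rough greedy sketch of the call-graph partitioning idea shared by publication 20170161040 above and the related patents below: weight the call edges, then merge caller/callee groups whose combined code still fits in one instruction-cache line. The function names, code sizes, call weights, and 128-byte line size are assumptions, not details from the application.

```cpp
// Illustrative sketch only: functions, sizes, call weights, and the
// 128-byte line size are invented.
#include <algorithm>
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct Edge { std::string caller, callee; int weight; };     // weight ~ call frequency

int main() {
    const int kLineSize = 128;                                // bytes per cache line
    std::map<std::string, int> size = {                       // code size per function
        {"parse", 48}, {"lex", 64}, {"emit", 96}, {"log", 32}};
    std::vector<Edge> edges = {
        {"parse", "lex", 900}, {"parse", "emit", 50}, {"emit", "log", 400}};

    // Each function starts in its own partition.
    std::map<std::string, int> part;
    int next = 0;
    for (const auto& entry : size) part[entry.first] = next++;

    std::map<int, int> part_size;
    for (auto& [f, p] : part) part_size[p] = size[f];

    // Greedily merge the highest-affinity caller/callee pairs as long as the
    // combined code still fits in one cache line.
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight > b.weight; });
    for (const Edge& e : edges) {
        int a = part[e.caller], b = part[e.callee];
        if (a == b || part_size[a] + part_size[b] > kLineSize) continue;
        for (auto& [f, p] : part) if (p == b) p = a;          // merge b into a
        part_size[a] += part_size[b];
        part_size.erase(b);
    }

    for (auto& [f, p] : part)
        std::cout << f << " -> cache line group " << p << "\n";
}
```
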
  • Patent number: 9600253
    Abstract: Mechanisms are provided for arranging binary code to reduce instruction cache conflict misses. These mechanisms generate a call graph of a portion of code. Nodes and edges in the call graph are weighted to generate a weighted call graph. The weighted call graph is then partitioned according to the weights, affinities between nodes of the call graph, and the size of cache lines in an instruction cache of the data processing system, so that binary code associated with one or more subsets of nodes in the call graph are combined into individual cache lines based on the partitioning. The binary code corresponding to the partitioned call graph is then output for execution in a computing device.
    Type: Grant
    Filed: April 12, 2012
    Date of Patent: March 21, 2017
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Brian Flachs, Brad W. Michael, Mark R. Nutter, John K. P. O'Brien, Kathryn M. O'Brien, Tao Zhang
  • Publication number: 20170010873
    Abstract: Mechanisms are provided for arranging binary code to reduce instruction cache conflict misses. These mechanisms generate a call graph of a portion of code. Nodes and edges in the call graph are weighted to generate a weighted call graph. The weighted call graph is then partitioned according to the weights, affinities between nodes of the call graph, and the size of cache lines in an instruction cache of the data processing system, so that binary code associated with one or more subsets of nodes in the call graph are combined into individual cache lines based on the partitioning. The binary code corresponding to the partitioned call graph is then output for execution in a computing device.
    Type: Application
    Filed: September 23, 2016
    Publication date: January 12, 2017
    Inventors: Tong Chen, Brian Flachs, Brad W. Michael, Mark R. Nutter, John K. P. O'Brien, Kathryn M. O'Brien, Tao Zhang
  • Patent number: 9501809
    Abstract: A graphics client receives a frame, the frame comprising scene model data. A server load balancing factor is set based on the scene model data. A prospective rendering factor is set based on the scene model data. The frame is partitioned into a plurality of server bands based on the server load balancing factor and the prospective rendering factor. The server bands are distributed to a plurality of compute servers. Processed server bands are received from the compute servers. A processed frame is assembled based on the received processed server bands. The processed frame is transmitted for display to a user as an image.
    Type: Grant
    Filed: February 21, 2016
    Date of Patent: November 22, 2016
    Assignee: International Business Machines Corporation
    Inventors: Joaquin Madruga, Barry L. Minor, Mark R. Nutter
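
A simplified sketch of the frame partitioning in patent 9501809 above, with the server load balancing factor modeled as a relative capacity per compute server and the prospective rendering factor as a few extra overlap rows per band. Both mappings and all numbers are assumptions, not the patented method.

```cpp
// Illustrative sketch only: capacities, overlap, and frame size are invented.
#include <algorithm>
#include <iostream>
#include <vector>

struct Band { int server, first_row, last_row; };

std::vector<Band> partition_frame(int height,
                                  const std::vector<double>& capacity,  // load balancing factor
                                  int overlap_rows) {                   // prospective rendering factor
    double total = 0;
    for (double c : capacity) total += c;

    std::vector<Band> bands;
    int row = 0;
    for (size_t s = 0; s < capacity.size(); ++s) {
        int rows = static_cast<int>(height * capacity[s] / total);
        if (s + 1 == capacity.size()) rows = height - row;    // last server takes the rest
        // Extend the band a few rows past its strict share so the server can
        // render slightly ahead (the "prospective" part).
        int last = std::min(height - 1, row + rows - 1 + overlap_rows);
        bands.push_back({static_cast<int>(s), row, last});
        row += rows;
    }
    return bands;
}

int main() {
    // Three compute servers with unequal capacity rendering a 1080-row frame.
    for (const Band& b : partition_frame(1080, {1.0, 2.0, 1.0}, 8))
        std::cout << "server " << b.server << ": rows "
                  << b.first_row << "-" << b.last_row << "\n";
}
```
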
  • Patent number: 9483864
    Abstract: Scene model data, including a scene geometry model and a plurality of pixel data describing objects arranged in a scene, is received. A primary pixel color and a primary ray are generated based on a selected first pixel data. If the primary ray intersects an object in the scene, an intersection point is determined. A surface normal is determined based on the object intersected and the intersection point. The primary pixel color is modified based on a primary hit color, determined based on the intersection point. A plurality of ambient occlusion (AO) rays each having a direction, D, are generated based on the intersection point, P and the surface normal. Each AO ray direction is reversed and the AO ray origin is set to a point outside the scene. An AO ray that does not intersect an object before reaching the intersection point is included in ambient occlusion calculations.
    Type: Grant
    Filed: December 5, 2008
    Date of Patent: November 1, 2016
    Assignee: International Business Machines Corporation
    Inventors: Mark R. Nutter, Joaquin Madruga, Barry L. Minor
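
A compact sketch of the reversed ambient-occlusion rays in patent 9483864 above: each AO ray starts far outside the scene and travels back toward the intersection point P, and a sample direction counts as unoccluded only if nothing is hit before the ray reaches P. The sphere occluder, the two sample directions, and the constants are invented for the example.

```cpp
// Illustrative sketch only: geometry, constants, and sampling are invented.
#include <cmath>
#include <iostream>

struct Vec { double x, y, z; };
static Vec operator+(Vec a, Vec b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static Vec operator-(Vec a, Vec b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec operator*(Vec a, double s) { return {a.x * s, a.y * s, a.z * s}; }
static double dot(Vec a, Vec b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Sphere { Vec c; double r; };

// Distance along the (unit-direction) ray to the first hit, or -1 on a miss.
double hit(Vec origin, Vec dir, const Sphere& s) {
    Vec oc = origin - s.c;
    double b = dot(oc, dir), d = b * b - (dot(oc, oc) - s.r * s.r);
    if (d < 0) return -1;
    double t = -b - std::sqrt(d);
    return t > 1e-6 ? t : -1;
}

int main() {
    Sphere occluder{{0, 2, 0}, 0.5};       // object floating above the hit point
    Vec P{0, 0, 0};                        // primary-ray intersection point
    Vec dirs[] = {{0, 1, 0}, {1, 0, 0}};   // two sample AO directions D

    const double kFar = 100.0;             // far enough to be outside the scene
    int unoccluded = 0;
    for (Vec D : dirs) {
        Vec origin   = P + D * kFar;       // origin set outside the scene
        Vec reversed = D * -1.0;           // direction reversed, back toward P
        double t = hit(origin, reversed, occluder);
        bool blocked = (t >= 0 && t < kFar - 1e-6);  // hit something before P
        if (!blocked) ++unoccluded;        // include this ray in the AO estimate
    }
    std::cout << "ambient term ~ " << double(unoccluded) / 2 << "\n";
}
```
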
  • Patent number: 9459851
    Abstract: Mechanisms are provided for arranging binary code to reduce instruction cache conflict misses. These mechanisms generate a call graph of a portion of code. Nodes and edges in the call graph are weighted to generate a weighted call graph. The weighted call graph is then partitioned according to the weights, affinities between nodes of the call graph, and the size of cache lines in an instruction cache of the data processing system, so that binary code associated with one or more subsets of nodes in the call graph are combined into individual cache lines based on the partitioning. The binary code corresponding to the partitioned call graph is then output for execution in a computing device.
    Type: Grant
    Filed: June 25, 2010
    Date of Patent: October 4, 2016
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Brian Flachs, Brad W. Michael, Mark R. Nutter, John K. P. O'Brien, Kathryn M. O'Brien, Tao Zhang
  • Publication number: 20160231933
    Abstract: A processor maintains a count of accesses to each memory page. When the accesses to a memory page exceed a threshold amount for that memory page, the processor sets an indicator for the page. Based on the indicators for the memory pages, the processor manages data at one or more levels of the processor's memory hierarchy.
    Type: Application
    Filed: February 6, 2015
    Publication date: August 11, 2016
    Inventors: Gabriel H. Loh, David A. Roberts, Mitesh R. Meswani, Mark R. Nutter, John R. Slice, Prashant Nair, Michael Ignatowski
  • Publication number: 20160171643
    Abstract: A graphics client receives a frame, the frame comprising scene model data. A server load balancing factor is set based on the scene model data. A prospective rendering factor is set based on the scene model data. The frame is partitioned into a plurality of server bands based on the server load balancing factor and the prospective rendering factor. The server bands are distributed to a plurality of compute servers. Processed server bands are received from the compute servers. A processed frame is assembled based on the received processed server bands. The processed frame is transmitted for display to a user as an image.
    Type: Application
    Filed: February 21, 2016
    Publication date: June 16, 2016
    Inventors: Joaquin Madruga, Barry L. Minor, Mark R. Nutter
  • Patent number: 9270783
    Abstract: A graphics client receives a frame, the frame comprising scene model data. A server load balancing factor is set based on the scene model data. A prospective rendering factor is set based on the scene model data. The frame is partitioned into a plurality of server bands based on the server load balancing factor and the prospective rendering factor. The server bands are distributed to a plurality of compute servers. Processed server bands are received from the compute servers. A processed frame is assembled based on the received processed server bands. The processed frame is transmitted for display to a user as an image.
    Type: Grant
    Filed: December 6, 2008
    Date of Patent: February 23, 2016
    Assignee: International Business Machines Corporation
    Inventors: Joaquin Madruga, Barry L. Minor, Mark R. Nutter
  • Patent number: 9053069
    Abstract: A mechanism is provided for efficient communication of producer/consumer buffer status. With the mechanism, devices in a data processing system notify each other of updates to head and tail pointers of a shared buffer region when the devices perform operations on the shared buffer region using signal notification channels of the devices. Thus, when a producer device that produces data to the shared buffer region writes data to the shared buffer region, an update to the head pointer is written to a signal notification channel of a consumer device. When a consumer device reads data from the shared buffer region, the consumer device writes a tail pointer update to a signal notification channel of the producer device. In addition, channels may operate in a blocking mode so that the corresponding device is kept in a low power state until an update is received over the channel.
    Type: Grant
    Filed: August 23, 2012
    Date of Patent: June 9, 2015
    Assignee: International Business Machines Corporation
    Inventors: Daniel A. Brokenshire, Charles R. Johns, Mark R. Nutter, Barry L. Minor
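
A rough user-space analogue of the signal-notification scheme in patent 9053069 above: each side owns a channel, the other side writes head or tail pointer updates into it, and a blocking read stands in for the low-power wait. The SignalChannel class, buffer size, and thread-based setup are assumptions; real hardware channels differ.

```cpp
// Illustrative sketch only: SignalChannel and the ring-buffer setup are invented.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

class SignalChannel {                       // one writable word plus a blocking read
public:
    void write(int v) {
        { std::lock_guard<std::mutex> g(m_); value_ = v; fresh_ = true; }
        cv_.notify_one();
    }
    int read_blocking() {                   // "sleep" until an update arrives
        std::unique_lock<std::mutex> l(m_);
        cv_.wait(l, [&] { return fresh_; });
        fresh_ = false;
        return value_;
    }
private:
    std::mutex m_; std::condition_variable cv_;
    int value_ = 0; bool fresh_ = false;
};

int main() {
    constexpr int N = 4;
    int buffer[N];                          // shared producer/consumer buffer
    SignalChannel to_consumer, to_producer; // each device's notification channel
    to_producer.write(0);                   // initial tail

    std::thread producer([&] {
        int head = 0, tail = 0;
        for (int i = 1; i <= 8; ++i) {
            while ((head + 1) % N == tail)  // buffer full: wait for a tail update
                tail = to_producer.read_blocking();
            buffer[head] = i;
            head = (head + 1) % N;
            to_consumer.write(head);        // tell the consumer where data ends
        }
    });
    std::thread consumer([&] {
        int head = 0, tail = 0;
        for (int consumed = 0; consumed < 8; ++consumed) {
            while (tail == head)            // buffer empty: wait for a head update
                head = to_consumer.read_blocking();
            std::cout << "consumed " << buffer[tail] << "\n";
            tail = (tail + 1) % N;
            to_producer.write(tail);        // tell the producer space is free
        }
    });
    producer.join(); consumer.join();
}
```
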
  • Patent number: 8782381
    Abstract: Mechanisms are provided for evicting cache lines from an instruction cache of the data processing system. The mechanisms store, for a portion of code in a current cache line, a linked list of call sites that directly or indirectly target the portion of code in the current cache line. A determination is made as to whether the current cache line is to be evicted from the instruction cache. The linked list of call sites is processed to identify one or more rewritten branch instructions having associated branch stubs, that either directly or indirectly target the portion of code in the current cache line. In addition, the one or more rewritten branch instructions are rewritten to restore the one or more rewritten branch instructions to an original state based on information in the associated branch stubs.
    Type: Grant
    Filed: April 12, 2012
    Date of Patent: July 15, 2014
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Brian Flachs, Brad W. Michael, Mark R. Nutter, John K. P. O'Brien, Kathryn M. O'Brien, Tao Zhang
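
A schematic model of the eviction path in patent 8782381 above: each cache line keeps a list of the call sites that were patched to branch into it, and evicting the line walks that list and restores every branch to its original state from its stub. Data layout and names are invented for illustration.

```cpp
// Illustrative sketch only: the data layout and addresses are invented.
#include <iostream>
#include <vector>

struct BranchStub {
    int original_target;   // where the branch pointed before it was rewritten
    int call_site;         // index of the rewritten branch instruction
};

struct Branch { int target; };              // a branch "instruction" in code

struct CacheLine {
    std::vector<int> call_site_list;        // linked list in the patent; a vector here
};

int main() {
    std::vector<Branch> code = {{100}, {100}, {200}};   // three call sites
    std::vector<BranchStub> stubs;
    CacheLine line;                                     // holds code cached at 500

    // Call sites 0 and 1 were rewritten to branch straight into the line.
    for (int site : {0, 1}) {
        stubs.push_back({code[site].target, site});
        code[site].target = 500;                        // now targets the cached copy
        line.call_site_list.push_back((int)stubs.size() - 1);
    }

    // Evicting the line: restore every branch that targets it.
    for (int stub_idx : line.call_site_list) {
        const BranchStub& s = stubs[stub_idx];
        code[s.call_site].target = s.original_target;   // back to the original state
        std::cout << "restored call site " << s.call_site
                  << " -> " << s.original_target << "\n";
    }
}
```
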
  • Patent number: 8734254
    Abstract: A mechanism is provided for generating event notifications for offline characters from within a persistent world online game. A player agent for an offline player includes an event monitor that monitors for events that occur in a persistent virtual world maintained by a game server. When a game event occurs that triggers an offline player rule, the player agent composes an event notification message and sends the message to the offline player. Event notification messages may include images, voice (text-to-speech), sound, or video. Offline players may receive event notifications at various messaging clients, such as personal computers and wireless telephones. A notification server may transmit the event notifications via existing communications channels, such as electronic mail, facsimile, instant messaging, text messaging, and voice communications.
    Type: Grant
    Filed: April 25, 2006
    Date of Patent: May 27, 2014
    Assignee: International Business Machines Corporation
    Inventors: Maximino Aguilar, Jr., Charles R. Johns, Mark R. Nutter
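
A small sketch of the agent-side rule matching described in patent 8734254 above: incoming game events are checked against an offline player's rules and, on a match, a notification is composed for one of that player's messaging channels. Event types, rules, and channels are made up for the example.

```cpp
// Illustrative sketch only: event names, rules, and channels are invented.
#include <iostream>
#include <string>
#include <vector>

struct GameEvent { std::string type, detail; };
struct Rule      { std::string event_type, channel; };   // e.g. "email", "sms"

int main() {
    std::vector<Rule> offline_player_rules = {
        {"castle_attacked", "sms"}, {"auction_sold", "email"}};

    std::vector<GameEvent> events = {
        {"auction_sold", "Your sword sold for 120 gold"},
        {"guild_chat",   "hello?"},
        {"castle_attacked", "North wall under siege"}};

    // The event monitor: forward only the events an offline-player rule asks for.
    for (const GameEvent& e : events)
        for (const Rule& r : offline_player_rules)
            if (r.event_type == e.type)
                std::cout << "notify via " << r.channel << ": " << e.detail << "\n";
}
```
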
  • Patent number: 8713548
    Abstract: Mechanisms are provided for rewriting branch instructions in a portion of code. The mechanisms receive a portion of source code having an original branch instruction. The mechanisms generate a branch stub for the original branch instruction. The branch stub stores information about the original branch instruction including an original target address of the original branch instruction. Moreover, the mechanisms rewrite the original branch instruction so that a target of the rewritten branch instruction references the branch stub. In addition, the mechanisms output compiled code including the rewritten branch instruction and the branch stub for execution by a computing device. The branch stub is utilized by the computing device at runtime to determine if execution of the rewritten branch instruction can be redirected directly to a target instruction corresponding to the original target address in an instruction cache of the computing device without intervention by an instruction cache runtime system.
    Type: Grant
    Filed: April 10, 2012
    Date of Patent: April 29, 2014
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Brian Flachs, Brad W. Michael, Mark R. Nutter, John K. P. O'Brien, Kathryn M. O'Brien, Tao Zhang
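
A toy version of the compile-time transformation in patent 8713548 above: every branch gets a stub that records its original target, and the branch is rewritten so it references the stub instead. The addresses and structures are invented; the real mechanism operates on machine code emitted by the compiler.

```cpp
// Illustrative sketch only: addresses and structures are invented.
#include <iostream>
#include <vector>

struct BranchStub { int original_target; };            // info kept about the branch
struct Branch     { int target; bool to_stub = false; };

int main() {
    std::vector<Branch> source = {{0x200}, {0x340}};   // original branch instructions
    std::vector<BranchStub> stubs;

    // Rewrite: each branch now references its stub; the stub remembers
    // the original target (0x200 or 0x340).
    for (Branch& b : source) {
        stubs.push_back({b.target});
        b.target = (int)stubs.size() - 1;
        b.to_stub = true;
    }

    // "Output the compiled code": rewritten branches plus their stubs.
    for (size_t i = 0; i < source.size(); ++i)
        std::cout << "branch " << i << " -> stub " << source[i].target
                  << " (original target 0x" << std::hex
                  << stubs[source[i].target].original_target << std::dec << ")\n";
}
```
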
  • Patent number: 8631225
    Abstract: Mechanisms are provided for dynamically rewriting branch instructions in a portion of code. The mechanisms execute a branch instruction in the portion of code. The mechanisms determine if a target instruction of the branch instruction, to which the branch instruction branches, is present in an instruction cache associated with the processor. Moreover, the mechanisms directly branch execution of the portion of code to the target instruction in the instruction cache, without intervention from an instruction cache runtime system, in response to a determination that the target instruction is present in the instruction cache. In addition, the mechanisms redirect execution of the portion of code to the instruction cache runtime system in response to a determination that the target instruction cannot be determined to be present in the instruction cache.
    Type: Grant
    Filed: June 25, 2010
    Date of Patent: January 14, 2014
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Brian Flachs, Brad W. Michael, Mark R. Nutter, John K. P. O'Brien, Kathryn M. O'Brien, Tao Zhang
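
A sketch of the runtime decision in patent 8631225 above (and the related patent 8627051 below): when a rewritten branch executes, check whether its target is resident in the instruction cache and either branch to it directly or redirect to the instruction-cache runtime. The lookup table and addresses are assumptions.

```cpp
// Illustrative sketch only: the icache map and addresses are invented.
#include <iostream>
#include <unordered_map>

std::unordered_map<int, int> icache;   // original target address -> resident address

int execute_branch(int original_target) {
    auto it = icache.find(original_target);
    if (it != icache.end())
        return it->second;             // direct branch, no runtime intervention
    // Target not (known to be) resident: redirect to the icache runtime,
    // which loads the code and records where it landed.
    std::cout << "icache runtime loading 0x" << std::hex << original_target
              << std::dec << "\n";
    int resident = 0x900;              // placeholder address chosen by the runtime
    icache[original_target] = resident;
    return resident;
}

int main() {
    int first  = execute_branch(0x200);   // miss: the icache runtime intervenes
    int second = execute_branch(0x200);   // hit: direct branch
    std::cout << std::hex << "jump to 0x" << first << ", then 0x" << second << "\n";
}
```
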
  • Patent number: 8627043
    Abstract: Mechanisms for performing data parallel function calls in code during runtime are provided. These mechanisms may operate to execute, in the processor, a portion of code having a data parallel function call to a target portion of code. The mechanisms may further operate to determine, at runtime by the processor, whether the target portion of code is a data parallel portion of code or a scalar portion of code and determine whether the calling code is data parallel code or scalar code. Moreover, the mechanisms may operate to execute the target portion of code based on the determination of whether the target portion of code is a data parallel portion of code or a scalar portion of code, and the determination of whether the calling code is data parallel code or scalar code.
    Type: Grant
    Filed: March 26, 2012
    Date of Patent: January 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter
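
A schematic of the runtime dispatch in patent 8627043 above (and the related patent 8627042 below): caller and callee each carry a data-parallel-or-scalar flag, and the call is executed differently for each combination. The 4-lane vector type, function table, and dispatch policy are invented for the example.

```cpp
// Illustrative sketch only: the 4-lane "vector", function table, and policy are invented.
#include <array>
#include <iostream>

constexpr int kLanes = 4;
using Vec = std::array<int, kLanes>;

struct Callee {
    bool data_parallel;
    int  (*scalar_fn)(int);     // scalar entry point
    Vec  (*vector_fn)(Vec);     // data-parallel entry point (may be null)
};

int  square_scalar(int x)  { return x * x; }
Vec  square_vector(Vec v)  { for (int& x : v) x = x * x; return v; }

// Execute a call made from data-parallel code: use the vector entry if the
// target is data parallel, otherwise fall back to one scalar call per lane.
Vec call_from_data_parallel(const Callee& f, Vec args) {
    if (f.data_parallel)
        return f.vector_fn(args);
    Vec out;
    for (int lane = 0; lane < kLanes; ++lane)
        out[lane] = f.scalar_fn(args[lane]);
    return out;
}

int main() {
    Callee vec_target    {true,  square_scalar, square_vector};
    Callee scalar_target {false, square_scalar, nullptr};

    Vec args = {1, 2, 3, 4};
    for (int v : call_from_data_parallel(vec_target, args))    std::cout << v << ' ';
    std::cout << '\n';
    for (int v : call_from_data_parallel(scalar_target, args)) std::cout << v << ' ';
    std::cout << '\n';
}
```
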
  • Patent number: 8627051
    Abstract: Mechanisms are provided for dynamically rewriting branch instructions in a portion of code. The mechanisms execute a branch instruction in the portion of code. The mechanisms determine if a target instruction of the branch instruction, to which the branch instruction branches, is present in an instruction cache associated with the processor. Moreover, the mechanisms directly branch execution of the portion of code to the target instruction in the instruction cache, without intervention from an instruction cache runtime system, in response to a determination that the target instruction is present in the instruction cache. In addition, the mechanisms redirect execution of the portion of code to the instruction cache runtime system in response to a determination that the target instruction cannot be determined to be present in the instruction cache.
    Type: Grant
    Filed: April 10, 2012
    Date of Patent: January 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Tong Chen, Brian Flachs, Brad W. Michael, Mark R. Nutter, John K. P. O'Brien, Kathryn M. O'Brien, Tao Zhang
  • Patent number: 8627042
    Abstract: Mechanisms for performing data parallel function calls in code during runtime are provided. These mechanisms may operate to execute, in the processor, a portion of code having a data parallel function call to a target portion of code. The mechanisms may further operate to determine, at runtime by the processor, whether the target portion of code is a data parallel portion of code or a scalar portion of code and determine whether the calling code is data parallel code or scalar code. Moreover, the mechanisms may operate to execute the target portion of code based on the determination of whether the target portion of code is a data parallel portion of code or a scalar portion of code, and the determination of whether the calling code is data parallel code or scalar code.
    Type: Grant
    Filed: December 30, 2009
    Date of Patent: January 7, 2014
    Assignee: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, Brian K. Flachs, Charles R. Johns, Mark R. Nutter