Patents by Inventor Robert A. Shearer

Robert A. Shearer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

VECTOR EXECUTION UNIT WITH PRENORMALIZATION OF DENORMAL VALUES

Publication number: 20140164464

Abstract: A method, circuit arrangement, and program product for executing instructions including denormal values for one or more operands in a vector execution unit. A denormal value operand may be prenormalized by a first processing lane of the vector execution unit upon detecting the denormal value. The prenormalized value and any other operands of the instruction may be communicated to a dot product adder of the vector execution unit. The dot product adder performs at least a portion of the floating point operation with the prenormalized value and any other operands of the instruction.

Type: Application

Filed: December 6, 2012

Publication date: June 12, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
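
As a rough illustration of the prenormalization step named in the abstract above, the following Python sketch takes an IEEE 754 single-precision denormal and shifts its fraction left until the leading one is in place, lowering the exponent to compensate. The function names (`unpack_float32`, `prenormalize`, `to_float`) and the internal (sign, exponent, significand) format are assumptions made for this example. The sketch does not model the processing lanes or the dot product adder of the claimed circuit, and it leaves out infinities and NaNs.

```python
import struct

FRAC_BITS = 23   # stored fraction bits in IEEE 754 single precision
EXP_BIAS = 127   # exponent bias in IEEE 754 single precision


def unpack_float32(x):
    """Split a value, rounded to float32, into (sign, biased exponent, fraction)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    exp = (bits >> FRAC_BITS) & 0xFF
    frac = bits & ((1 << FRAC_BITS) - 1)
    return sign, exp, frac


def prenormalize(sign, exp, frac):
    """Return (sign, unbiased exponent, 24-bit significand with explicit leading one).

    Hypothetical internal format for illustration only.
    """
    if exp == 0 and frac != 0:
        # Denormal: no implicit one, so shift the fraction left until
        # bit 23 is set and lower the exponent by the same amount.
        shift = (FRAC_BITS + 1) - frac.bit_length()
        return sign, (1 - EXP_BIAS) - shift, frac << shift
    if exp == 0:
        # Signed zero: nothing to normalize.
        return sign, 0, 0
    # Normal number: make the implicit leading one explicit.
    return sign, exp - EXP_BIAS, frac | (1 << FRAC_BITS)


def to_float(sign, exponent, significand):
    """Rebuild the numeric value from the prenormalized fields."""
    return (-1.0) ** sign * significand * 2.0 ** (exponent - FRAC_BITS)
```

For example, `prenormalize(*unpack_float32(2.0 ** -149))` yields a significand with its leading one set and an exponent of -149, which is below the range the stored 8-bit exponent field can express for normal numbers. This is why such a sketch needs a wider internal exponent.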

TRANSLATION MANAGEMENT INSTRUCTIONS FOR UPDATING ADDRESS TRANSLATION DATA STRUCTURES IN REMOTE PROCESSING NODES

Publication number: 20140164732

Abstract: Translation management instructions are used in a multi-node data processing system to facilitate remote management of address translation data structures distributed throughout such a system. Thus, in multi-node data processing systems where multiple processing nodes collectively handle a workload, the address translation data structures for such nodes may be collectively managed to minimize translation misses and the performance penalties typically associated therewith.

Type: Application

Filed: December 10, 2012

Publication date: June 12, 2014

Applicant: International Business Machines Corporation

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

VECTOR EXECUTION UNIT WITH PRENORMALIZATION OF DENORMAL VALUES

Publication number: 20140164465

Abstract: A method, circuit arrangement, and program product for executing instructions including denormal values for one or more operands in a vector execution unit. A denormal value operand may be prenormalized by a first processing lane of the vector execution unit upon detecting the denormal value. The prenormalized value and any other operands of the instruction may be communicated to a dot product adder of the vector execution unit. The dot product adder performs at least a portion of the floating point operation with the prenormalized value and any other operands of the instruction.

Type: Application

Filed: March 11, 2013

Publication date: June 12, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

CACHE SWIZZLE WITH INLINE TRANSPOSITION

Publication number: 20140164704

Abstract: A method and circuit arrangement selectively swizzle data in one or more levels of cache memory coupled to a processing unit based upon one or more swizzle-related page attributes stored in a memory address translation data structure such as an Effective To Real Translation (ERAT) or Translation Lookaside Buffer (TLB). A memory address translation data structure may be accessed, for example, in connection with a memory access request for data in a memory page, such that attributes associated with the memory page in the data structure may be used to control whether data is swizzled, and if so, how the data is to be formatted in association with handling the memory access request.

Type: Application

Filed: March 13, 2013

Publication date: June 12, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer

CACHE SWIZZLE WITH INLINE TRANSPOSITION

Publication number: 20140164703

Abstract: A method and circuit arrangement selectively swizzle data in one or more levels of cache memory coupled to a processing unit based upon one or more swizzle-related page attributes stored in a memory address translation data structure such as an Effective To Real Translation (ERAT) or Translation Lookaside Buffer (TLB). A memory address translation data structure may be accessed, for example, in connection with a memory access request for data in a memory page, such that attributes associated with the memory page in the data structure may be used to control whether data is swizzled, and if so, how the data is to be formatted in association with handling the memory access request.

Type: Application

Filed: December 12, 2012

Publication date: June 12, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer

Memory address translation-based data encryption/compression

Patent number: 8751830

Abstract: A method and circuit arrangement selectively stream data to an encryption or compression engine based upon encryption and/or compression-related page attributes stored in a memory address translation data structure such as an Effective To Real Translation (ERAT) or Translation Lookaside Buffer (TLB). A memory address translation data structure may be accessed, for example, in connection with a memory access request for data in a memory page, such that attributes associated with the memory page in the data structure may be used to control whether data is encrypted/decrypted and/or compressed/decompressed in association with handling the memory access request.

Type: Grant

Filed: January 23, 2012

Date of Patent: June 10, 2014

Assignee: International Business Machines Corporation

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

FLOATING POINT EXECUTION UNIT FOR CALCULATING PACKED SUM OF ABSOLUTE DIFFERENCES

Publication number: 20140149720

Abstract: A method and circuit arrangement provide support for packed sum of absolute difference operations in a floating point execution unit, e.g., a scalar or vector floating point execution unit. Existing adders in a floating point execution unit may be utilized along with minimal additional logic in the floating point execution unit to support efficient execution of a fixed point packed sum of absolute differences instruction within the floating point execution unit, often eliminating the need for a separate vector fixed point execution unit in a processor architecture, and thereby leading to less logic and circuit area, lower power consumption and lower cost.

Type: Application

Filed: November 29, 2012

Publication date: May 29, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
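
The packed sum of absolute differences operation mentioned in the abstract above can be illustrated in software. This is a minimal sketch assuming four unsigned 8-bit lanes packed into a 32-bit word; the lane count, lane width, and the name `packed_sad` are assumptions for this example and are not taken from the application, which describes a hardware implementation that reuses floating point adders.

```python
def packed_sad(a, b, lanes=4, lane_bits=8):
    """Sum of absolute differences over unsigned lanes packed into two integers.

    Lane layout (4 x 8 bits by default) is an assumption for illustration.
    """
    mask = (1 << lane_bits) - 1
    total = 0
    for i in range(lanes):
        # Extract lane i from each packed operand.
        x = (a >> (i * lane_bits)) & mask
        y = (b >> (i * lane_bits)) & mask
        # Accumulate the absolute difference of the two lane values.
        total += abs(x - y)
    return total
```

With the default layout, `packed_sad(0x01020304, 0x04030201)` compares the byte pairs (4, 1), (3, 2), (2, 3), and (1, 4) and returns 8.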

DISTRIBUTED CHIP LEVEL MANAGED POWER SYSTEM

Publication number: 20140143558

Abstract: A method, circuit arrangement, and program product for dynamically reallocating power consumption at a component level of a processor. Power tokens representative of a power consumption metric are allocated to interconnected IP blocks of the processor, and as additional power is required by an IP block to perform assigned operations, the IP block may communicate a request for additional power tokens to one or more interconnected IP blocks. The interconnected IP blocks may grant power tokens for the request based on a priority, availability, and/or power consumption target. The requesting IP block may modify power consumption based on power tokens granted by interconnected IP blocks for the request. A power management block may adjust power token allocation of one or more IP blocks by communicating a command to one or more IP blocks and/or by adjusting a power token request.

Type: Application

Filed: November 21, 2012

Publication date: May 22, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer
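
The token exchange described in the abstract above can be sketched as a small simulation. In this hedged example, each block grants only the tokens it does not currently need (the "availability" criterion); priority and power consumption targets, which the abstract also mentions, are deliberately left out. The class `IPBlock`, its fields, and its methods are hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class IPBlock:
    """Hypothetical model of an IP block holding power tokens."""
    name: str
    tokens: int          # power tokens currently allocated to this block
    needed: int = 0      # tokens this block needs for its assigned work
    neighbors: list = field(default_factory=list, repr=False)

    def grant(self, amount):
        """Give away up to `amount` tokens, but only from the spare ones."""
        spare = max(self.tokens - self.needed, 0)
        granted = min(amount, spare)
        self.tokens -= granted
        return granted

    def request_tokens(self, amount):
        """Ask interconnected blocks in turn until the request is covered."""
        received = 0
        for block in self.neighbors:
            if received >= amount:
                break
            received += block.grant(amount - received)
        self.tokens += received
        return received
```

Because tokens only move between blocks, the total number of tokens in the simulated system stays constant, which mirrors the idea of reallocating a fixed power budget rather than increasing it.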

DISTRIBUTED CHIP LEVEL POWER SYSTEM

Publication number: 20140143557

Abstract: A method, circuit arrangement, and program product for dynamically reallocating power consumption at a component level of a processor. Power tokens representative of a power consumption metric are allocated to interconnected IP blocks of the processor, and as additional power is required by an IP block to perform assigned operations, the IP block may communicate a request for additional power tokens to one or more interconnected IP blocks. The interconnected IP blocks may grant power tokens for the request based on a priority, availability, and/or power consumption target. The requesting IP block may modify power consumption based on power tokens granted by interconnected IP blocks for the request.

Type: Application

Filed: November 21, 2012

Publication date: May 22, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer

Near neighbor data cache sharing

Patent number: 8719508

Abstract: Parallel computing environments, where threads executing in neighboring processors may access the same set of data, may be designed and configured to share one or more levels of cache memory. Before a processor forwards a request for data to a higher level of cache memory following a cache miss, the processor may determine whether a neighboring processor has the data stored in a local cache memory. If so, the processor may forward the request to the neighboring processor to retrieve the data. Because access to the cache memories for the two processors is shared, the effective size of the memory is increased. This may advantageously decrease cache misses for each level of shared cache memory without increasing the individual size of the caches on the processor chip.

Type: Grant

Filed: December 10, 2012

Date of Patent: May 6, 2014

Assignee: International Business Machines Corporation

Inventors: Miguel Comparan, Robert A. Shearer
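
The lookup order described in the abstract above (local cache, then the neighboring processor's cache, then the next cache level) can be sketched as follows. This is a simplified software model, not the patented design: the `Core` class, the dictionary-based caches, and the choice not to copy neighbor hits into the local cache are all assumptions made for illustration.

```python
class Core:
    """Hypothetical processor model with a private L1 and one neighbor."""

    def __init__(self, name, shared_l2):
        self.name = name
        self.l1 = {}           # local cache: address -> value
        self.neighbor = None   # neighboring Core, set after construction
        self.l2 = shared_l2    # higher-level cache, modeled as a dict

    def load(self, addr):
        """Return (value, source) where source names the level that hit."""
        if addr in self.l1:
            return self.l1[addr], "local L1"
        # On a local miss, check the neighbor before going to the next level.
        if self.neighbor is not None and addr in self.neighbor.l1:
            return self.neighbor.l1[addr], "neighbor L1"
        # Miss in both L1 caches: fetch from L2 and fill the local L1.
        value = self.l2[addr]
        self.l1[addr] = value
        return value, "L2"
```

In this model, a line fetched by one core is visible to its neighbor without being duplicated, so the two L1 caches together behave like a larger cache, as the abstract suggests.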

Near neighbor data cache sharing

Patent number: 8719507

Abstract: Parallel computing environments, where threads executing in neighboring processors may access the same set of data, may be designed and configured to share one or more levels of cache memory. Before a processor forwards a request for data to a higher level of cache memory following a cache miss, the processor may determine whether a neighboring processor has the data stored in a local cache memory. If so, the processor may forward the request to the neighboring processor to retrieve the data. Because access to the cache memories for the two processors is shared, the effective size of the memory is increased. This may advantageously decrease cache misses for each level of shared cache memory without increasing the individual size of the caches on the processor chip.

Type: Grant

Filed: January 4, 2012

Date of Patent: May 6, 2014

Assignee: International Business Machines Corporation

Inventors: Miguel Comparan, Robert A. Shearer

DMA-based acceleration of command push buffer between host and target devices

Patent number: 8719455

Abstract: Direct Memory Access (DMA) is used in connection with passing commands between a host device and a target device coupled via a push buffer. Commands passed to a push buffer by a host device may be accumulated by the host device prior to forwarding the commands to the push buffer, such that DMA may be used to collectively pass a block of commands to the push buffer. In addition, a host device may utilize DMA to pass command parameters for commands to a command buffer that is accessible by the target device but is separate from the push buffer, with the commands that are passed to the push buffer including pointers to the associated command parameters in the command buffer.

Type: Grant

Filed: June 28, 2010

Date of Patent: May 6, 2014

Assignee: International Business Machines Corporation

Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

Regular expression searches utilizing general purpose processors on a network interconnect

Patent number: 8719404

Abstract: A first hardware node in a network interconnect receives a data packet from a network. The first hardware node examines the data packet for a regular expression. In response to the first hardware node failing to identify the regular expression in the data packet, the data packet is forwarded to a second hardware node in the network interconnect for further examination of the data packet in order to search for the regular expression in the data packet.

Type: Grant

Filed: February 28, 2011

Date of Patent: May 6, 2014

Assignee: International Business Machines Corporation

Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer

Reuse of static image data from prior image frames to reduce rasterization requirements

Patent number: 8711163

Abstract: An apparatus, program product and method reuse static image data generated during rasterization of static geometry to reduce the processing overhead associated with rasterizing subsequent image frames. In particular, static image data generated in one frame may be reused in a subsequent image frame such that the subsequent image frame is generated without having to re-rasterize the static geometry from the scene, i.e., with only the dynamic geometry rasterized. The resulting image frame includes dynamic image data generated as a result of rasterizing the dynamic geometry during that image frame, and static image data generated as a result of rasterizing the static geometry during a prior image frame.

Type: Grant

Filed: January 6, 2011

Date of Patent: April 29, 2014

Assignee: International Business Machines Corporation

Inventors: Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs, Eric O. Mejdrich

Parallelized streaming accelerated data structure generation

Patent number: 8692825

Abstract: A method includes receiving at a master processing element primitive data that includes properties of a primitive. The method includes partially traversing a spatial data structure that represents a three-dimensional image to identify an internal node of the spatial data structure. The internal node represents a portion of the three-dimensional image. The method also includes selecting a slave processing element from a plurality of slave processing elements. The selected processing element is associated with the internal node. The method further includes sending the primitive data to the selected slave processing element to traverse a portion of the spatial data structure to identify a leaf node of the spatial data structure.

Type: Grant

Filed: June 24, 2010

Date of Patent: April 8, 2014

Assignee: International Business Machines Corporation

Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

Performance event triggering through direct interthread communication on a network on chip

Patent number: 8661455

Abstract: Performance event triggering through direct interthread communication (‘DITC’) on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, with each IP block adapted to a router through a memory communications controller and a network interface controller, where each memory communications controller controls communications between an IP block and memory, and each network interface controller controls inter-IP block communications through routers, including enabling performance event monitoring in a selected set of IP blocks distributed throughout the NOC, each IP block within the selected set of IP blocks having one or more event counters; collecting performance results from the one or more event counters; and returning performance results from the one or more event counters to a destination repository, the returning being initiated by a triggering event occurring within the NOC.

Type: Grant

Filed: April 21, 2009

Date of Patent: February 25, 2014

Assignee: International Business Machines Corporation

Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

Inter-thread communication with software security

Patent number: 8640230

Abstract: A circuit arrangement and method utilize a process context translation data structure in connection with an on-chip network of a processor chip to implement secure inter-thread communication between hardware threads in the processor chip. The process context translation data structure maps processes to inter-thread communication hardware resources, e.g., the inbox and/or outbox buffers of a NOC processor, such that a user process is only allowed to access the inter-thread communication hardware resources that it has been granted access to, and typically with only certain authorized access types. Moreover, a hypervisor or supervisor may manage the process context translation data structure to grant or deny access rights to user processes such that, once those rights are established in the data structure, user processes are permitted to perform inter-thread communications without requiring context switches to a hypervisor or supervisor in order to handle the communications.

Type: Grant

Filed: December 19, 2011

Date of Patent: January 28, 2014

Assignee: International Business Machines Corporation

Inventors: Jason Greenwood, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer

Multithreaded physics engine with predictive load balancing

Patent number: 8627329

Abstract: A circuit arrangement and method utilize predictive load balancing to allocate the workload among hardware threads in a multithreaded physics engine. The predictive load balancing is based at least in part upon the detection of predicted future collisions between objects in a scene, such that the reallocation of respective loads of a plurality of hardware threads may be initiated prior to detection of the actual collisions, thereby increasing the likelihood that hardware threads will be optimally allocated when the actual collisions occur.

Type: Grant

Filed: June 24, 2010

Date of Patent: January 7, 2014

Assignee: International Business Machines Corporation

Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

Parallelized ray tracing

Patent number: 8619078

Abstract: A method includes assigning a priority to a ray data structure of a plurality of ray data structures based on one or more priorities. The ray data structure includes properties of a ray to be traced from an illumination source in a three-dimensional image. The method includes identifying a portion of the three-dimensional image through which the ray passes. The method also includes identifying a slave processing element associated with the portion of the three-dimensional image. The method further includes sending the ray data structure to the slave processing element.

Type: Grant

Filed: May 21, 2010

Date of Patent: December 31, 2013

Assignee: International Business Machines Corporation

Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

Allocating resources based on a performance statistic

Patent number: 8587594

Abstract: A method includes rendering an object of a three dimensional image via a pixel shader based on a render context data structure associated with the object. The method includes measuring a performance statistic associated with rendering the object. The method also includes storing the performance statistic in the render context data structure associated with the object. The performance statistic is accessible to a host interface processor to determine whether to allocate a second pixel shader to render the object in a subsequent three-dimensional image.

Type: Grant

Filed: May 21, 2010

Date of Patent: November 19, 2013

Assignee: International Business Machines Corporation

Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
