Patents by Inventor Robert A. Shearer

Robert A. Shearer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20130160114
    Abstract: A circuit arrangement and method utilize a process context translation data structure in connection with an on-chip network of a processor chip to implement secure inter-thread communication between hardware threads in the processor chip. The process context translation data structure maps processes to inter-thread communication hardware resources, e.g., the inbox and/or outbox buffers of a NOC processor, such that a user process is only allowed to access the inter-thread communication hardware resources that it has been granted access to, and typically with only certain types of authorized access types. Moreover, a hypervisor or supervisor may manage the process context translation data structure to grant or deny access rights to user processes such that, once those rights are established in the data structure, user processes are permitted to perform inter-thread communications without requiring context switches to a hypervisor or supervisor in order to handle the communications.
    Type: Application
    Filed: December 19, 2011
    Publication date: June 20, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jason Greenwood, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer
  • Publication number: 20130159668
    Abstract: A circuit arrangement, method, and program product for substituting a plurality of scalar instructions in an instruction stream with a functionally equivalent vector instruction for execution by a vector execution unit. Predecode logic is coupled to an instruction buffer which stores instructions in an instruction stream to be executed by the vector execution unit. The predecode logic analyzes the instructions passing through the instruction buffer to identify a plurality of scalar instructions that may be replaced by a vector instruction in the instruction stream. The predecode logic may generate the functionally equivalent vector instruction based on the plurality of scalar instructions, and the functionally equivalent vector instruction may be substituted into the instruction stream, such that the vector execution unit executes the vector instruction in lieu of the plurality of scalar instructions.
    Type: Application
    Filed: December 20, 2011
    Publication date: June 20, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20130159799
    Abstract: A method and circuit arrangement utilize scan logic disposed on a multi-core processor integrated circuit device or chip to perform internal voting-based built in self test (BIST) of the chip. Test patterns are generated internally on the chip and communicated to the scan chains within multiple processing cores on the chip. Test results output by the scan chains are compared with one another on the chip, and majority voting is used to identify outlier test results that are indicative of a faulty processing core. A bit position in a faulty test result may be used to identify a faulty latch in a scan chain and/or a faulty functional unit in the faulty processing core, and a faulty processing core and/or a faulty functional unit may be automatically disabled in response to the testing.
    Type: Application
    Filed: December 20, 2011
    Publication date: June 20, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jeffrey D. Brown, Miguel Comparan, Robert A. Shearer, Alfred T. Watson, III
  • Publication number: 20130159669
    Abstract: A method and circuit arrangement utilize a low latency variable transfer network between the register files of multiple processing cores in a multi-core processor chip to support fine grained parallelism of virtual threads across multiple hardware threads. The communication of a variable over the variable transfer network may be initiated by a move from a local register in a register file of a source processing core to a variable register that is allocated to a destination hardware thread in a destination processing core, so that the destination hardware thread can then move the variable from the variable register to a local register in the destination processing core.
    Type: Application
    Filed: December 20, 2011
    Publication date: June 20, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Miguel Comparan, Russell D. Hoover, Robert A. Shearer, Alfred T. Watson, III
  • Publication number: 20130145128
    Abstract: A method and circuit arrangement utilize a programmable microcode unit that is capable of being programmed via software to modify the instruction sequences output by the microcode unit in response to microcode instructions issued to the microcode unit. Among other benefits, a programmable microcode unit consistent with the invention enables customization of a processor design to handle specific applications or tasks, as well as to support specific hardware configurations such as specific execution units. In addition, a programmable microcode unit may be updatable, e.g., to correct bugs or faults found in previous instruction sequences supported by the unit.
    Type: Application
    Filed: December 6, 2011
    Publication date: June 6, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20130138918
    Abstract: A circuit arrangement, method, and program product for compressing and decompressing data in a node of a system including a plurality of nodes interconnected via an on-chip network. Compressed data may be received and stored at an input buffer of a node, and in parallel with moving the compressed data to an execution register of the node, decompression logic of the node may decompress the data to generate uncompressed data, such that uncompressed data is stored in the execution register for utilization by an execution unit of the node. Uncompressed data may be output by the execution unit into the execution register, and in parallel with moving the uncompressed data to an output buffer of the node connected to the on-chip network, compression logic may compress the uncompressed data to generate compressed data, such that compressed data is stored at the output buffer.
    Type: Application
    Filed: November 30, 2011
    Publication date: May 30, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20130111190
    Abstract: A circuit arrangement and method support compression and expansion of instruction opcodes by detecting successive address targeting and decoding a first opcode of an instruction into a second opcode in response to detecting successive address targeting. The circuit arrangement and method execute instructions in an instruction stream and detect successive address targeting by two or more instructions in the instruction stream without the targeted address being utilized as a source address in an instruction executed between the first and second instructions in the instruction stream. Then, based on that detection, the opcode of the second instruction is modified, changed, or appended to such that a different opcode is indicated by the second instruction, such that executing the second instruction causes a different unique type of operation to be performed.
    Type: Application
    Filed: November 2, 2011
    Publication date: May 2, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 8423749
    Abstract: A computer-implemented method, system and computer program product for controlling an algorithm that is performed on a unit of work in a subsequent software pipeline stage in a Network On a Chip (NOC) is presented. In one embodiment, the method executes a first operation in a first node of the NOC. The first node generates payload, and then loads that payload into a message. The message with the payload is transmitted to a nanokernel that controls a second node in the NOC. The nanokernel calls an algorithm that is needed by a second operation in a second node in the NOC, which uses the algorithm to execute the second operation.
    Type: Grant
    Filed: October 22, 2008
    Date of Patent: April 16, 2013
    Assignee: International Business Machines Corporation
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 8423715
    Abstract: A memory hierarchy in a computer includes levels of cache. The computer also includes a processor operatively coupled through two or more levels of cache to a main random access memory. Caches closer to the processor in the hierarchy are characterized as higher in the hierarchy. Memory management among the levels of cache includes identifying a line in a first cache that is preferably retained in the first cache, where the first cache is backed up by at least one cache lower in the memory hierarchy and the lower cache implements an LRU-type cache line replacement policy. Memory management also includes updating LRU information for the lower cache to indicate that the line has been recently accessed.
    Type: Grant
    Filed: May 1, 2008
    Date of Patent: April 16, 2013
    Assignee: International Business Machines Corporation
    Inventors: Timothy H. Heil, Robert A. Shearer
  • Patent number: 8413166
    Abstract: A circuit arrangement and method implement impulse propagation in a multithreaded physics engine by assigning ownership of objects in a scene to individual threads and propagating impulses between objects that are in contact with one another by passing inter-thread impulse messages between the threads that own the contacting objects, while locally propagating impulses through objects using the threads to which such objects are assigned.
    Type: Grant
    Filed: August 18, 2011
    Date of Patent: April 2, 2013
    Assignee: International Business Machines Corporation
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 8405670
    Abstract: A multithreaded rendering software pipeline architecture utilizes a rolling texture context data structure to store multiple texture contexts that are associated with different textures that are being processed in the software pipeline. Each texture context stores state data for a particular texture, and facilitates the access to texture data by multiple, parallel stages in a software pipeline. In addition, texture contexts are capable of being “rolled”, or copied to enable different stages of a rendering pipeline that require different state data for a particular texture to separately access the texture data independently from one another, and without the necessity for stalling the pipeline to ensure synchronization of shared texture data among the stages of the pipeline.
    Type: Grant
    Filed: May 25, 2010
    Date of Patent: March 26, 2013
    Assignee: International Business Machines Corporation
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer
  • Publication number: 20130046518
    Abstract: A circuit arrangement and method implement impulse propagation in a multithreaded physics engine by assigning ownership of objects in a scene to individual threads and propagating impulses between objects that are in contact with one another by passing inter-thread impulse messages between the threads that own the contacting objects, while locally propagating impulses through objects using the threads to which such objects are assigned.
    Type: Application
    Filed: August 18, 2011
    Publication date: February 21, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20130044117
    Abstract: Frequently accessed state data used in a multithreaded graphics processing architecture is cached within a vector register file of a processing unit to optimize accesses to the state data and minimize memory bus utilization associated therewith. A processing unit may include a fixed point execution unit as well as a vector floating point execution unit, and a vector register file utilized by the vector floating point execution unit may be used to cache state data used by the fixed point execution unit and transferred as needed into the general purpose registers accessible by the fixed point execution unit, thereby reducing the need to repeatedly retrieve and write back the state data from and to an L1 or lower level cache accessed by the fixed point execution unit.
    Type: Application
    Filed: August 18, 2011
    Publication date: February 21, 2013
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 8363669
    Abstract: A method includes receiving a plurality of packets at an integrated processor block of a network on a chip device. The plurality of packets includes a first packet that includes an indication of a start of data associated with a pixel shader application. The method includes recovering the data from the plurality of packets. The method also includes storing the recovered data in a dedicated packet collection memory within the network on the chip device. The method further includes retaining the data stored in the dedicated packet collection memory during an interruption event. Upon completion of the interruption event, the method includes copying packets stored in the dedicated packet collection memory prior to the interruption event to an inbox of the network on the chip device for processing.
    Type: Grant
    Filed: June 25, 2010
    Date of Patent: January 29, 2013
    Assignee: International Business Machines Corporation
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 8339398
    Abstract: According to embodiments of the invention, a data structure may be created which may be used by both an image processing system and by a physics engine. The data structure may have an initial or upper portion representing bounding volumes which partition a three dimensional scene and a second or lower portion representing objects within the three dimensional scene. The integrated acceleration data structure may be used by an image processing system to render a two dimensional image from a three dimensional scene, and by a physics engine to perform physics based calculations in order to simulate physical phenomena in the three dimensional scene. Furthermore, the physics engine may update the integrated acceleration data structure in response to changes in position or shape of objects due to physical phenomena.
    Type: Grant
    Filed: September 28, 2006
    Date of Patent: December 25, 2012
    Assignee: International Business Machines Corporation
    Inventor: Robert A. Shearer
  • Publication number: 20120260252
    Abstract: A computer-implemented method, system, and/or computer program product schedules execution of software threads. A first software thread is executed together with a second software thread as a first software thread pair. A first content, which resulted from executing the first software pair together, of at least one performance counter, is stored. The first software thread is then executed with a third software thread as a second software thread pair, and the resulting second content of the performance counter(s) is stored. An identification is made of a most efficient software thread pair from the first and second software thread pairs. Upon receiving a request to re-execute the first software thread, the first software thread is selectively matched with either the second software thread or the third software thread for execution based on whether the first software thread pair or the second software thread pair has been identified as the most efficient software thread pair.
    Type: Application
    Filed: April 8, 2011
    Publication date: October 11, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: JAMIE R. KUESEL, MARK G. KUPFERSCHMIDT, PAUL E. SCHARDT, ROBERT A. SHEARER
  • Publication number: 20120254548
    Abstract: A method and apparatus dynamically allocates and deallocates a portion of a cache for use as a dedicated local storage. Cache lines may be dynamically allocated and deallocated for inclusion in the dedicated local storage. Cache entries that are included in the dedicated local storage may not be evicted or invalidated. Additionally, coherence is not maintained between the cache entries that are included in the dedicated local storage and the backing memory. A load instruction may be configured to allocate, e.g., lock, a portion of the data cache for inclusion in the dedicated local storage and load data into the dedicated local storage. A load instruction may be configured to read data from the dedicated local storage and to deallocate, e.g., unlock, a portion of the data cache that was included in the dedicated local storage.
    Type: Application
    Filed: April 4, 2011
    Publication date: October 4, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Miguel Comparan, Russell D. Hoover, Robert A. Shearer, Alfred T. Watson, III
  • Publication number: 20120254877
    Abstract: A method and apparatus for transferring architected state bypasses system memory by directly transmitting architected state between processor cores over a dedicated interconnect. The transfer may be performed by state transfer interface circuitry with or without software interaction. The architected state for a thread may be transferred from a first processing core to a second processing core when the state transfer interface circuitry detects an error that prevents proper execution of the thread corresponding to the architected state. A program instruction may be used to initiate the transfer of the architected state for the thread to one or more other threads in order to parallelize execution of the thread or perform load balancing between multiple processor cores by distributing processing of multiple threads.
    Type: Application
    Filed: April 1, 2011
    Publication date: October 4, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Miguel Comparan, Russell D. Hoover, Robert A. Shearer, Alfred T. Watson, III
  • Patent number: 8261025
    Abstract: Memory sharing in a software pipeline on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, with each IP block adapted to a router through a memory communications controller and a network interface controller, where each memory communications controller controlling communications between an IP block and memory, and each network interface controller controlling inter-IP block communications through routers, including segmenting a computer software application into stages of a software pipeline, the software pipeline comprising one or more paths of execution; allocating memory to be shared among at least two stages including creating a smart pointer, the smart pointer including data elements for determining when the shared memory can be deallocated; determining, in dependence upon the data elements for determining when the shared memory can be deallocated, that the shared memory can be deallocated; and d
    Type: Grant
    Filed: November 12, 2007
    Date of Patent: September 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer
  • Publication number: 20120221711
    Abstract: A first hardware node in a network interconnect receives a data packet from a network. The first hardware node examines the data packet for a regular expression. In response to the first hardware node failing to identify the regular expression in the data packet, the data packet is forwarded to a second hardware node in the network interconnect for further examination of the data packet in order to search for the regular expression in the data packet.
    Type: Application
    Filed: February 28, 2011
    Publication date: August 30, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: JAMIE R. KUESEL, MARK G. KUPFERSCHMIDT, PAUL E. SCHARDT, ROBERT A. SHEARER