Patents by Inventor Paul E. Schardt

Paul E. Schardt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20150032999
    Abstract: A method and circuit arrangement decode instructions based in part on one or more decode-related attributes stored in a memory address translation data structure such as an Effective To Real Translation (ERAT) or Translation Lookaside Buffer (TLB). A memory address translation data structure may be accessed, for example, in connection with a decode of an instruction stored in a page of memory, such that one or more attributes associated with the page in the data structure may be used to control how that instruction is decoded.
    Type: Application
    Filed: July 23, 2013
    Publication date: January 29, 2015
    Applicant: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20150026500
    Abstract: A method and circuit arrangement utilize a general purpose processing unit having a low power DSP mode for reconfiguring the general purpose processing unit to efficiently execute DSP workloads with reduced power consumption. When in a DSP mode, one or more of a data cache, an execution unit, and simultaneous multithreading may be disabled to reduce power consumption and improve performance for DSP workloads. Furthermore, partitioning of a register file to support multithreading, and register renaming functionality, may be disabled to provide an expanded set of registers for use with DSP workloads. As a result, a general purpose processing unit may be provided with enhanced performance for DSP workloads with reduced power consumption, while also not sacrificing performance for other non-DSP/general purpose workloads.
    Type: Application
    Filed: July 22, 2013
    Publication date: January 22, 2015
    Applicant: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20150026435
    Abstract: A method and circuit arrangement selectively source and/or write data from/to extended registers of an extended register file based in part on whether an operand address of an instruction references a primary register of primary register file configured to store a pointer to the extended register. Control logic connected to the primary register file and the extended register file determines whether the operand address references a primary register configured to store a pointer, and responsive to the determination, the control logic causes execution logic to selectively source and/or write data from/to the extended register pointed to by the pointer stored in the referenced primary register.
    Type: Application
    Filed: July 22, 2013
    Publication date: January 22, 2015
    Applicant: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20150020078
    Abstract: A system, method, and program product for scheduling processes of a workload on a plurality of hardware threads configured in a plurality of processing elements of a multithreading parallel computing system for processing thereby. Process dimensions for each process are determined based on processing attributes associated with each process, and a place and route algorithm is utilized to map the processes to a processor space representative of the processing resources of the computing system based at least in part on the process dimensions to thereby distribute the processes of the workload.
    Type: Application
    Filed: July 10, 2013
    Publication date: January 15, 2015
    Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer
  • Patent number: 8935694
    Abstract: A hypervisor and one or more programs, e.g., guest operating systems and/or user processes or applications hosted by the hypervisor to configured to selectively save and restore the state of branch prediction logic through separate hypervisor-mode and guest-mode and/or user-mode instructions. By doing so, different branch prediction strategies may be employed for different operating systems and user applications hosted thereby to provide finer grained optimization of the branch prediction logic.
    Type: Grant
    Filed: January 23, 2012
    Date of Patent: January 13, 2015
    Assignee: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 8898396
    Abstract: Memory sharing in a software pipeline on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, with each IP block adapted to a router through a memory communications controller and a network interface controller, where each memory communications controller controlling communications between an IP block and memory, and each network interface controller controlling inter-IP block communications through routers, including segmenting a computer software application into stages of a software pipeline, the software pipeline comprising one or more paths of execution; allocating memory to be shared among at least two stages including creating a smart pointer, the smart pointer including data elements for determining when the shared memory can be deallocated; determining, in dependence upon the data elements for determining when the shared memory can be deallocated, that the shared memory can be deallocated; and d
    Type: Grant
    Filed: April 23, 2012
    Date of Patent: November 25, 2014
    Assignee: International Business Machines Corporation
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer
  • Patent number: 8892851
    Abstract: A circuit arrangement and method support compression and expansion of instruction opcodes by detecting successive address targeting and decoding a first opcode of an instruction into a second opcode in response to detecting successive address targeting. The circuit arrangement and method execute instructions in an instruction stream and detect successive address targeting by two or more instructions in the instruction stream without the targeted address being utilized as a source address in an instruction executed between the first and second instructions in the instruction stream. Then, based on that detection, the opcode of the second instruction is modified, changed, or appended to such that a different opcode is indicated by the second instruction, such that executing the second instruction causes a different unique type of operation to be performed.
    Type: Grant
    Filed: November 2, 2011
    Date of Patent: November 18, 2014
    Assignee: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 8836709
    Abstract: Frequently accessed state data used in a multithreaded graphics processing architecture is cached within a vector register file of a processing unit to optimize accesses to the state data and minimize memory bus utilization associated therewith. A processing unit may include a fixed point execution unit as well as a vector floating point execution unit, and a vector register file utilized by the vector floating point execution unit may be used to cache state data used by the fixed point execution unit and transferred as needed into the general purpose registers accessible by the fixed point execution unit, thereby reducing the need to repeatedly retrieve and write back the state data from and to an L1 or lower level cache accessed by the fixed point execution unit.
    Type: Grant
    Filed: August 18, 2011
    Date of Patent: September 16, 2014
    Assignee: International Business Machines Corporation
    Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20140229713
    Abstract: A method and circuit arrangement tightly couple together decode logic associated with multiple types of execution units and having varying priorities to enable instructions that are decoded as valid instructions for multiple types of execution units to be forwarded to a highest priority type of execution unit among the multiple types of execution units. Among other benefits, when an auxiliary execution unit is coupled to a general purpose processing core with the decode logic for the auxiliary execution unit tightly coupled with the decode logic for the general purpose processing core, the auxiliary execution unit may be used to effectively overlay new functionality for an existing instruction that is normally executed by the general purpose processing core, e.g., to patch a design flaw in the general purpose processing core or to provide improved performance for specialized applications.
    Type: Application
    Filed: March 11, 2013
    Publication date: August 14, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20140229711
    Abstract: A method, circuit arrangement, and program product for selectively predicating instructions in an instruction stream by determining a first register address from an instruction, determining a second register address based on a value stored at the first register address, and determining whether to predicate the instruction based at least in part on a value stored at the second register address. Predication logic may analyze the instruction to determine the first register address, analyze a register corresponding to the first register address to determine the second register address, and communicate a predication signal to an execution unit based at least in part on the value stored at the second register address.
    Type: Application
    Filed: February 13, 2013
    Publication date: August 14, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20140229712
    Abstract: A method, circuit arrangement, and program product for selectively predicating instructions in an instruction stream by determining a first register address from an instruction, determining a second register address based on a value stored at the first register address, and determining whether to predicate the instruction based at least in part on a value stored at the second register address. Predication logic may analyze the instruction to determine the first register address, analyze a register corresponding to the first register address to determine the second register address, and communicate a predication signal to an execution unit based at least in part on the value stored at the second register address.
    Type: Application
    Filed: February 27, 2013
    Publication date: August 14, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20140230077
    Abstract: A method and circuit arrangement utilize secure clear instructions defined in an instruction set architecture (ISA) for a processing unit to clear, overwrite or otherwise restrict unauthorized access to the internal architected state of the processing unit in association with context switch operations. The secure clear instructions are executable by a hypervisor, operating system, or other supervisory program code in connection with a context switch operation, and the processing unit includes security logic that is responsive to such instructions to restrict access by an operating system or process associated with an incoming context to architected state information associated with an operating system or process associated with an outgoing context.
    Type: Application
    Filed: February 14, 2013
    Publication date: August 14, 2014
    Applicant: International Business Machines Corporation
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20140229714
    Abstract: A method and circuit arrangement utilize a register file of an execution unit as a local instruction loop buffer to enable suitable algorithms, such as DSP algorithms, to be fetched and executed directly within the execution unit, and often enabling other logic circuits utilized for other, general purpose workloads to either be powered down or freed up to handle other workloads.
    Type: Application
    Filed: March 12, 2013
    Publication date: August 14, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20140229709
    Abstract: A circuit arrangement, method, and program product for dynamically providing a status of a hardware thread/hardware resource independent of the operation of the hardware thread/hardware resource using an inter-thread communication protocol. A master hardware thread may be configured to communicate status requests to associated slave hardware threads and/or hardware resources. Each slave hardware thread/hardware resource may be configured with hardware logic configured to automatically determine status information for the slave hardware thread/hardware resource and communicate a status response to the master hardware thread independent of the operation of the slave hardware thread/hardware resource.
    Type: Application
    Filed: February 14, 2013
    Publication date: August 14, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer
  • Publication number: 20140229706
    Abstract: A circuit arrangement, method, and program product for dynamically providing a status of a hardware thread/hardware resource independent of the operation of the hardware thread/hardware resource using an inter-thread communication protocol. A master hardware thread may be configured to communicate status requests to associated slave hardware threads and/or hardware resources. Each slave hardware thread/hardware resource may be configured with hardware logic configured to automatically determine status information for the slave hardware thread/hardware resource and communicate a status response to the master hardware thread independent of the operation of the slave hardware thread/hardware resource.
    Type: Application
    Filed: March 11, 2013
    Publication date: August 14, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer
  • Publication number: 20140229708
    Abstract: A method and circuit arrangement tightly couple together decode logic associated with multiple types of execution units and having varying priorities to enable instructions that are decoded as valid instructions for multiple types of execution units to be forwarded to a highest priority type of execution unit among the multiple types of execution units. Among other benefits, when an auxiliary execution unit is coupled to a general purpose processing core with the decode logic for the auxiliary execution unit tightly coupled with the decode logic for the general purpose processing core, the auxiliary execution unit may be used to effectively overlay new functionality for an existing instruction that is normally executed by the general purpose processing core, e.g., to patch a design flaw in the general purpose processing core or to provide improved performance for specialized applications.
    Type: Application
    Filed: February 13, 2013
    Publication date: August 14, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Patent number: 8776035
    Abstract: A compiler may optimize source code and any referenced libraries to execute on a plurality of different processor architecture implementations. For example, if a compute node has three different types of processors with three different architecture implementations, the compiler may compile the source code and generate three versions of object code where each version is optimized for one of the three different processor types. After compiling the source code, the resultant executable code may contain the necessary information for selecting between the three versions. For example, when a program loader assigns the executable code to the processor, the system determines the processor's type and ensures only the optimized version that corresponds to that type is executed. Thus, the operating system is free to assign the executable code to any processor based on, for example, the current status of the processor (i.e.
    Type: Grant
    Filed: December 10, 2012
    Date of Patent: July 8, 2014
    Assignee: International Business Machines Corporation
    Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer
  • Publication number: 20140173308
    Abstract: A circuit arrangement, method, and program product communicate data over a communication bus by selectively encoding data values queued for communication over the communication bus based at least in part on at least one data value queued to be communicated thereafter and at least one previously communicated encoded data value to reduce bit transitions for communication of the encoded data values. By reducing bit transitions in the data communicated over the communication bus, power consumption by the communication bus is likewise reduced.
    Type: Application
    Filed: March 5, 2013
    Publication date: June 19, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20140173296
    Abstract: A circuit arrangement, method, and program product communicate data over a communication bus by selectively encoding data values queued for communication over the communication bus based at least in part on at least one data value queued to be communicated thereafter and at least one previously communicated encoded data value to reduce bit transitions for communication of the encoded data values. By reducing bit transitions in the data communicated over the communication bus, power consumption by the communication bus is likewise reduced.
    Type: Application
    Filed: December 14, 2012
    Publication date: June 19, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
  • Publication number: 20140164464
    Abstract: A method, circuit arrangement, and program product for executing instructions including denormal values for one or more operands in a vector execution unit. A denormal value operand may be prenormalized by a first processing lane of the vector execution unit upon detecting the denormal value. The prenormalized value and any other operands of the instruction may be communicated to a dot product adder of the vector execution unit. The dot product adder performs at least a portion of the floating point operation with the prenormalized value and any other operands of the instruction.
    Type: Application
    Filed: December 6, 2012
    Publication date: June 12, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs