Patents by Inventor Paul E. Schardt

Paul E. Schardt has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

INSTRUCTION SET ARCHITECTURE WITH OPCODE LOOKUP USING MEMORY ATTRIBUTE

Publication number: 20150032999

Abstract: A method and circuit arrangement decode instructions based in part on one or more decode-related attributes stored in a memory address translation data structure such as an Effective To Real Translation (ERAT) or Translation Lookaside Buffer (TLB). A memory address translation data structure may be accessed, for example, in connection with a decode of an instruction stored in a page of memory, such that one or more attributes associated with the page in the data structure may be used to control how that instruction is decoded.

Type: Application

Filed: July 23, 2013

Publication date: January 29, 2015

Applicant: International Business Machines Corporation

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

GENERAL PURPOSE PROCESSING UNIT WITH LOW POWER DIGITAL SIGNAL PROCESSING (DSP) MODE

Publication number: 20150026500

Abstract: A method and circuit arrangement utilize a general purpose processing unit having a low power DSP mode for reconfiguring the general purpose processing unit to efficiently execute DSP workloads with reduced power consumption. When in a DSP mode, one or more of a data cache, an execution unit, and simultaneous multithreading may be disabled to reduce power consumption and improve performance for DSP workloads. Furthermore, partitioning of a register file to support multithreading, and register renaming functionality, may be disabled to provide an expanded set of registers for use with DSP workloads. As a result, a general purpose processing unit may be provided with enhanced performance for DSP workloads with reduced power consumption, while also not sacrificing performance for other non-DSP/general purpose workloads.

Type: Application

Filed: July 22, 2013

Publication date: January 22, 2015

Applicant: International Business Machines Corporation

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

INSTRUCTION SET ARCHITECTURE WITH EXTENSIBLE REGISTER ADDRESSING

Publication number: 20150026435

Abstract: A method and circuit arrangement selectively source and/or write data from/to extended registers of an extended register file based in part on whether an operand address of an instruction references a primary register of primary register file configured to store a pointer to the extended register. Control logic connected to the primary register file and the extended register file determines whether the operand address references a primary register configured to store a pointer, and responsive to the determination, the control logic causes execution logic to selectively source and/or write data from/to the extended register pointed to by the pointer stored in the referenced primary register.

Type: Application

Filed: July 22, 2013

Publication date: January 22, 2015

Applicant: International Business Machines Corporation

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

THREAD SCHEDULING ACROSS HETEROGENEOUS PROCESSING ELEMENTS WITH RESOURCE MAPPING

Publication number: 20150020078

Abstract: A system, method, and program product for scheduling processes of a workload on a plurality of hardware threads configured in a plurality of processing elements of a multithreading parallel computing system for processing thereby. Process dimensions for each process are determined based on processing attributes associated with each process, and a place and route algorithm is utilized to map the processes to a processor space representative of the processing resources of the computing system based at least in part on the process dimensions to thereby distribute the processes of the workload.

Type: Application

Filed: July 10, 2013

Publication date: January 15, 2015

Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer

System and method for selectively saving and restoring state of branch prediction logic through separate hypervisor-mode and guest-mode and/or user-mode instructions

Patent number: 8935694

Abstract: A hypervisor and one or more programs, e.g., guest operating systems and/or user processes or applications hosted by the hypervisor, are configured to selectively save and restore the state of branch prediction logic through separate hypervisor-mode and guest-mode and/or user-mode instructions. By doing so, different branch prediction strategies may be employed for different operating systems and user applications hosted thereby to provide finer grained optimization of the branch prediction logic.

Type: Grant

Filed: January 23, 2012

Date of Patent: January 13, 2015

Assignee: International Business Machines Corporation

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

Software pipelining on a network on chip

Patent number: 8898396

Abstract: Memory sharing in a software pipeline on a network on chip (‘NOC’), the NOC including integrated processor (‘IP’) blocks, routers, memory communications controllers, and network interface controllers, with each IP block adapted to a router through a memory communications controller and a network interface controller, where each memory communications controller controlling communications between an IP block and memory, and each network interface controller controlling inter-IP block communications through routers, including segmenting a computer software application into stages of a software pipeline, the software pipeline comprising one or more paths of execution; allocating memory to be shared among at least two stages including creating a smart pointer, the smart pointer including data elements for determining when the shared memory can be deallocated; determining, in dependence upon the data elements for determining when the shared memory can be deallocated, that the shared memory can be deallocated; and d

Type: Grant

Filed: April 23, 2012

Date of Patent: November 25, 2014

Assignee: International Business Machines Corporation

Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer

Changing opcode of subsequent instruction when same destination address is not used as source address by intervening instructions

Patent number: 8892851

Abstract: A circuit arrangement and method support compression and expansion of instruction opcodes by detecting successive address targeting and decoding a first opcode of an instruction into a second opcode in response to detecting successive address targeting. The circuit arrangement and method execute instructions in an instruction stream and detect successive address targeting by two or more instructions in the instruction stream without the targeted address being utilized as a source address in an instruction executed between the first and second instructions in the instruction stream. Then, based on that detection, the opcode of the second instruction is modified, changed, or appended to such that a different opcode is indicated by the second instruction, such that executing the second instruction causes a different unique type of operation to be performed.

Type: Grant

Filed: November 2, 2011

Date of Patent: November 18, 2014

Assignee: International Business Machines Corporation

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

Vector register file caching of context data structure for maintaining state data in a multithreaded image processing pipeline

Patent number: 8836709

Abstract: Frequently accessed state data used in a multithreaded graphics processing architecture is cached within a vector register file of a processing unit to optimize accesses to the state data and minimize memory bus utilization associated therewith. A processing unit may include a fixed point execution unit as well as a vector floating point execution unit, and a vector register file utilized by the vector floating point execution unit may be used to cache state data used by the fixed point execution unit and transferred as needed into the general purpose registers accessible by the fixed point execution unit, thereby reducing the need to repeatedly retrieve and write back the state data from and to an L1 or lower level cache accessed by the fixed point execution unit.

Type: Grant

Filed: August 18, 2011

Date of Patent: September 16, 2014

Assignee: International Business Machines Corporation

Inventors: Eric O. Mejdrich, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

EXTENSIBLE EXECUTION UNIT INTERFACE ARCHITECTURE

Publication number: 20140229713

Abstract: A method and circuit arrangement tightly couple together decode logic associated with multiple types of execution units and having varying priorities to enable instructions that are decoded as valid instructions for multiple types of execution units to be forwarded to a highest priority type of execution unit among the multiple types of execution units. Among other benefits, when an auxiliary execution unit is coupled to a general purpose processing core with the decode logic for the auxiliary execution unit tightly coupled with the decode logic for the general purpose processing core, the auxiliary execution unit may be used to effectively overlay new functionality for an existing instruction that is normally executed by the general purpose processing core, e.g., to patch a design flaw in the general purpose processing core or to provide improved performance for specialized applications.

Type: Application

Filed: March 11, 2013

Publication date: August 14, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

INDIRECT INSTRUCTION PREDICATION

Publication number: 20140229711

Abstract: A method, circuit arrangement, and program product for selectively predicating instructions in an instruction stream by determining a first register address from an instruction, determining a second register address based on a value stored at the first register address, and determining whether to predicate the instruction based at least in part on a value stored at the second register address. Predication logic may analyze the instruction to determine the first register address, analyze a register corresponding to the first register address to determine the second register address, and communicate a predication signal to an execution unit based at least in part on the value stored at the second register address.

Type: Application

Filed: February 13, 2013

Publication date: August 14, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

INDIRECT INSTRUCTION PREDICATION

Publication number: 20140229712

Abstract: A method, circuit arrangement, and program product for selectively predicating instructions in an instruction stream by determining a first register address from an instruction, determining a second register address based on a value stored at the first register address, and determining whether to predicate the instruction based at least in part on a value stored at the second register address. Predication logic may analyze the instruction to determine the first register address, analyze a register corresponding to the first register address to determine the second register address, and communicate a predication signal to an execution unit based at least in part on the value stored at the second register address.

Type: Application

Filed: February 27, 2013

Publication date: August 14, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
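The two-step register lookup described in the indirect predication abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed circuit: the `Instruction` class, its `pred_reg` field, the flat `regs` list, and the convention that a nonzero value at the second register address lets the instruction execute are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    name: str
    pred_reg: int  # hypothetical field: the first register address


def should_execute(instr, regs):
    """Model the indirect lookup: the first register holds the address
    of a second register, and the second register holds the value that
    decides predication (assumed here: nonzero means execute)."""
    second_addr = regs[instr.pred_reg]  # value stored at the first address
    return regs[second_addr] != 0       # value stored at the second address


regs = [0] * 8
regs[1] = 5  # register 1 points at register 5
regs[5] = 1  # register 5 holds the predicate value
add = Instruction("add", pred_reg=1)
print(should_execute(add, regs))
```

Changing only `regs[5]` flips the decision without touching the instruction itself, which is the indirection the abstract describes.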
INSTRUCTION SET ARCHITECTURE WITH SECURE CLEAR INSTRUCTIONS FOR PROTECTING PROCESSING UNIT ARCHITECTED STATE INFORMATION

Publication number: 20140230077

Abstract: A method and circuit arrangement utilize secure clear instructions defined in an instruction set architecture (ISA) for a processing unit to clear, overwrite or otherwise restrict unauthorized access to the internal architected state of the processing unit in association with context switch operations. The secure clear instructions are executable by a hypervisor, operating system, or other supervisory program code in connection with a context switch operation, and the processing unit includes security logic that is responsive to such instructions to restrict access by an operating system or process associated with an incoming context to architected state information associated with an operating system or process associated with an outgoing context.

Type: Application

Filed: February 14, 2013

Publication date: August 14, 2014

Applicant: International Business Machines Corporation

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

LOCAL INSTRUCTION LOOP BUFFER UTILIZING EXECUTION UNIT REGISTER FILE

Publication number: 20140229714

Abstract: A method and circuit arrangement utilize a register file of an execution unit as a local instruction loop buffer to enable suitable algorithms, such as DSP algorithms, to be fetched and executed directly within the execution unit, and often enabling other logic circuits utilized for other, general purpose workloads to either be powered down or freed up to handle other workloads.

Type: Application

Filed: March 12, 2013

Publication date: August 14, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

DYNAMIC THREAD STATUS RETRIEVAL USING INTER-THREAD COMMUNICATION

Publication number: 20140229709

Abstract: A circuit arrangement, method, and program product for dynamically providing a status of a hardware thread/hardware resource independent of the operation of the hardware thread/hardware resource using an inter-thread communication protocol. A master hardware thread may be configured to communicate status requests to associated slave hardware threads and/or hardware resources. Each slave hardware thread/hardware resource may be configured with hardware logic configured to automatically determine status information for the slave hardware thread/hardware resource and communicate a status response to the master hardware thread independent of the operation of the slave hardware thread/hardware resource.

Type: Application

Filed: February 14, 2013

Publication date: August 14, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer

DYNAMIC THREAD STATUS RETRIEVAL USING INTER-THREAD COMMUNICATION

Publication number: 20140229706

Abstract: A circuit arrangement, method, and program product for dynamically providing a status of a hardware thread/hardware resource independent of the operation of the hardware thread/hardware resource using an inter-thread communication protocol. A master hardware thread may be configured to communicate status requests to associated slave hardware threads and/or hardware resources. Each slave hardware thread/hardware resource may be configured with hardware logic configured to automatically determine status information for the slave hardware thread/hardware resource and communicate a status response to the master hardware thread independent of the operation of the slave hardware thread/hardware resource.

Type: Application

Filed: March 11, 2013

Publication date: August 14, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer

EXTENSIBLE EXECUTION UNIT INTERFACE ARCHITECTURE

Publication number: 20140229708

Abstract: A method and circuit arrangement tightly couple together decode logic associated with multiple types of execution units and having varying priorities to enable instructions that are decoded as valid instructions for multiple types of execution units to be forwarded to a highest priority type of execution unit among the multiple types of execution units. Among other benefits, when an auxiliary execution unit is coupled to a general purpose processing core with the decode logic for the auxiliary execution unit tightly coupled with the decode logic for the general purpose processing core, the auxiliary execution unit may be used to effectively overlay new functionality for an existing instruction that is normally executed by the general purpose processing core, e.g., to patch a design flaw in the general purpose processing core or to provide improved performance for specialized applications.

Type: Application

Filed: February 13, 2013

Publication date: August 14, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

Providing performance tuned versions of compiled code to a CPU in a system of heterogeneous cores

Patent number: 8776035

Abstract: A compiler may optimize source code and any referenced libraries to execute on a plurality of different processor architecture implementations. For example, if a compute node has three different types of processors with three different architecture implementations, the compiler may compile the source code and generate three versions of object code where each version is optimized for one of the three different processor types. After compiling the source code, the resultant executable code may contain the necessary information for selecting between the three versions. For example, when a program loader assigns the executable code to the processor, the system determines the processor's type and ensures only the optimized version that corresponds to that type is executed. Thus, the operating system is free to assign the executable code to any processor based on, for example, the current status of the processor (i.e.

Type: Grant

Filed: December 10, 2012

Date of Patent: July 8, 2014

Assignee: International Business Machines Corporation

Inventors: Jamie R. Kuesel, Mark G. Kupferschmidt, Paul E. Schardt, Robert A. Shearer

CHIP LEVEL POWER REDUCTION USING ENCODED COMMUNICATIONS

Publication number: 20140173308

Abstract: A circuit arrangement, method, and program product communicate data over a communication bus by selectively encoding data values queued for communication over the communication bus based at least in part on at least one data value queued to be communicated thereafter and at least one previously communicated encoded data value to reduce bit transitions for communication of the encoded data values. By reducing bit transitions in the data communicated over the communication bus, power consumption by the communication bus is likewise reduced.

Type: Application

Filed: March 5, 2013

Publication date: June 19, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs

CHIP LEVEL POWER REDUCTION USING ENCODED COMMUNICATIONS

Publication number: 20140173296

Abstract: A circuit arrangement, method, and program product communicate data over a communication bus by selectively encoding data values queued for communication over the communication bus based at least in part on at least one data value queued to be communicated thereafter and at least one previously communicated encoded data value to reduce bit transitions for communication of the encoded data values. By reducing bit transitions in the data communicated over the communication bus, power consumption by the communication bus is likewise reduced.

Type: Application

Filed: December 14, 2012

Publication date: June 19, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
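To give a feel for how encoding can reduce bit transitions on a bus, the sketch below uses classic bus-invert coding. This is a swapped-in, well-known technique and not the scheme in the application: it looks only at the previously transmitted word, whereas the abstract also considers a data value queued to be sent afterwards. The 8-bit `WIDTH`, the assumption that the bus starts at zero, and all function names are illustrative.

```python
WIDTH = 8                 # assumed bus width
MASK = (1 << WIDTH) - 1


def transitions(a, b):
    """Number of bus lines that toggle when going from word a to word b."""
    return bin(a ^ b).count("1")


def bus_invert_encode(values):
    """Send each word inverted (flag=1) when that toggles fewer lines."""
    encoded = []
    prev = 0  # assumption: all bus lines start low
    for v in values:
        if transitions(prev, v) > WIDTH // 2:
            word, flag = ~v & MASK, 1
        else:
            word, flag = v, 0
        encoded.append((word, flag))
        prev = word  # the next decision depends on what was actually sent
    return encoded


def bus_invert_decode(encoded):
    """Undo the inversion using the per-word flag."""
    return [(~w & MASK) if f else w for w, f in encoded]


def total_transitions(words):
    """Total data-line toggles for a sequence, starting from an all-low bus."""
    total, prev = 0, 0
    for w in words:
        total += transitions(prev, w)
        prev = w
    return total


values = [0x00, 0xFF, 0x00, 0xFF]
encoded = bus_invert_encode(values)
raw_cost = total_transitions(values)
encoded_cost = total_transitions([w for w, _ in encoded])
print(raw_cost, encoded_cost)
```

The count above covers the data lines only; the extra flag line also toggles and would need to be included in a full power estimate.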
VECTOR EXECUTION UNIT WITH PRENORMALIZATION OF DENORMAL VALUES

Publication number: 20140164464

Abstract: A method, circuit arrangement, and program product for executing instructions including denormal values for one or more operands in a vector execution unit. A denormal value operand may be prenormalized by a first processing lane of the vector execution unit upon detecting the denormal value. The prenormalized value and any other operands of the instruction may be communicated to a dot product adder of the vector execution unit. The dot product adder performs at least a portion of the floating point operation with the prenormalized value and any other operands of the instruction.

Type: Application

Filed: December 6, 2012

Publication date: June 12, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
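As a rough illustration of what prenormalizing a denormal operand means numerically, the sketch below takes an IEEE 754 single-precision bit pattern and, for a denormal, shifts the fraction left until a leading 1 appears while lowering the exponent to compensate. It models only the arithmetic idea, not the processing lane or dot product adder in the application; the function names and the returned `(sign, exponent, significand)` layout are assumptions, and infinities and NaNs are deliberately not handled.

```python
import struct


def float_bits(x):
    """Bit pattern of x rounded to IEEE 754 single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def prenormalize(bits):
    """Return (sign, unbiased exponent, 24-bit significand).

    The represented magnitude is significand * 2**(exponent - 23).
    Infinities and NaNs (exponent field 0xFF) are out of scope here.
    """
    sign = bits >> 31
    exp = (bits >> 23) & 0xFF
    frac = bits & 0x7FFFFF
    if exp == 0:
        if frac == 0:
            return sign, 0, 0          # zero has nothing to normalize
        e = -126                       # fixed exponent of all denormals
        while frac & 0x800000 == 0:    # shift until the leading 1 is explicit
            frac <<= 1
            e -= 1
        return sign, e, frac
    return sign, exp - 127, frac | 0x800000  # normal: add the implicit 1


sign, e, sig = prenormalize(float_bits(2.0 ** -149))
print(sign, e, hex(sig))
```

After this step a denormal operand has the same leading-1 significand form as a normal one, so later stages can treat both uniformly.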
