Patents Examined by Daniel Pan

Parallel slice processor with dynamic instruction stream mapping

Patent number: 9665372

Abstract: A processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The dispatch routing network is controlled to dynamically vary the relationship between the slices and instruction streams according to execution requirements for the instruction streams and the availability of resources in the instruction execution slices. The instruction execution slices may be dynamically reconfigured as between single-instruction-multiple-data (SIMD) instruction execution and ordinary instruction execution on a per-instruction basis, permitting the mixture of those instruction types. Instructions having an operand width greater than the width of a single instruction execution slice may be processed by multiple instruction execution slices configured to act in concert for the particular instructions.

Type: Grant

Filed: May 12, 2014

Date of Patent: May 30, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.

Techniques for increasing instruction issue rate and reducing latency in an out-of order processor

Patent number: 9658853

Abstract: A technique for operating a processor includes storing a first result to a writeback buffer, in response to a first execution unit of the processor attempting to write the first result of a first completed instruction to a register file of the processor at a same processor time as a second execution unit of the processor is attempting to write a second result of a second completed instruction to the register file. The writeback buffer is positioned in a dataflow between the first execution unit and the register file. A buffer full indicator logic is used to detect that the writeback buffer is unavailable. A buffer unavailable signal is transmitted, from the buffer full indicator logic, in response to detecting the writeback buffer is unavailable. In response to receiving the buffer unavailable signal, a buffer retrieving logic writes the first result from the writeback buffer to the register file.

Type: Grant

Filed: July 31, 2014

Date of Patent: May 23, 2017

Assignee: GLOBALFOUNDRIES INC

Inventors: Harry Barowski, Tim Niggemeier

Stack pointer value prediction

Patent number: 9652240

Abstract: Methods and apparatus for predicting the value of a stack pointer store data when an instruction that grows the stack is seen. The stored information includes a size parameter, which indicates by how much the stack is grown, and one or both of: the ID of the register currently holding the stack pointer value, or the current stack pointer value. When a subsequent instruction shrinking the stack is seen, the stored data is searched for one or more entries that have a corresponding size parameter. If such an entry is identified, the other information stored in that entry is used to predict the value of the stack pointer instead of using the instruction to calculate the new stack pointer value. Where register renaming is used, the information in the entry is used to remap the stack pointer to a different physical register.
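
The store-and-match scheme in this abstract can be pictured with a small software model. The sketch below is illustrative only and is not taken from the patent: the names `GrowthRecord` and `StackPointerPredictor`, the most-recent-first search, and the discarding of newer entries after a match are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GrowthRecord:
    """One stored entry (hypothetical layout): growth size and the SP at that time."""
    size: int       # number of bytes by which the stack was grown
    sp_value: int   # stack pointer value current when the growing instruction was seen


class StackPointerPredictor:
    """Toy model: record stack growth, predict the SP when a matching shrink is seen."""

    def __init__(self) -> None:
        self.records: List[GrowthRecord] = []

    def on_grow(self, current_sp: int, size: int) -> None:
        # Store the size parameter together with the current stack pointer value.
        self.records.append(GrowthRecord(size, current_sp))

    def on_shrink(self, size: int) -> Optional[int]:
        # Assumption: search the most recent entries first.
        for i in range(len(self.records) - 1, -1, -1):
            if self.records[i].size == size:
                predicted = self.records[i].sp_value
                # Assumption: the matched entry and anything newer are discarded.
                del self.records[i:]
                return predicted
        # No matching entry: the caller would compute the new SP from the instruction.
        return None
```

With a downward-growing stack, growing by 16 bytes from `0x1000` and later shrinking by 16 yields a prediction of `0x1000` without performing the addition.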

Type: Grant

Filed: January 14, 2015

Date of Patent: May 16, 2017

Assignee: Imagination Technologies Limited

Inventor: Hugh Jackson

Data processing apparatus with instruction encodings to enable near and far memory access modes

Patent number: 9652241

Abstract: Apparatus comprises a processor configured for operation under a sequence of instructions from an instruction set, wherein said processor comprises: means for conditionally inhibiting at least one type of trap, interrupt or exception (TIE) event, wherein, when operating under a sequence of instructions, said inhibition means is inaccessible by said instructions to inhibit the or each type of TIE event, without interrupting said sequence. A data processing apparatus includes a processor adapted to operate under control of program code comprising instructions selected from an instruction set, the apparatus comprising: a predefined memory space providing a predefined addressable memory for storing program code and data, a larger memory space providing a larger addressable memory, means for accessing program code and data within the predefined memory space, and means for controlling the access means so as to enable the access means to access program code located within the larger memory space.

Type: Grant

Filed: April 10, 2007

Date of Patent: May 16, 2017

Assignee: Cambridge Consultants Ltd.

Inventors: Alistair G. Morfey, Karl Leighton Swepson, Neil Edward Johnson

GPU predication

Patent number: 9633409

Abstract: Techniques are disclosed relating to predication. In one embodiment, a graphics processing unit is disclosed that includes a first set of architecturally-defined registers configured to store predication information. The graphics processing unit further includes a second set of registers configured to mirror the first set of registers and an execution pipeline configured to discontinue execution of an instruction sequence based on predication information in the second set of registers. In one embodiment, the second set of registers includes one or more registers proximal to an output of the execution pipeline. In some embodiments, the execution pipeline writes back a predicate value determined for a predicate writer to the second set of registers. The first set of architecturally-defined registers is then updated with the predicate value written back to the second set of registers. In some embodiments, the execution pipeline discontinues execution of the instruction sequence without stalling.

Type: Grant

Filed: August 26, 2013

Date of Patent: April 25, 2017

Assignee: Apple Inc.

Inventors: Andrew M. Havlir, Brian K. Reynolds, Michael A. Geary

Relative offset branching in a fixed-width reduced instruction set computing architecture

Patent number: 9626188

Abstract: Embodiments relate to a method and computer program product for relative offset branching in a reduced instruction set computing (RISC) architecture. One aspect is a method that includes fetching a branch instruction from an instruction stream having a fixed instruction width. A relative offset value is acquired from the instruction stream. The relative offset value is formatted as an offset relative to a program counter value and sized as a multiple of the fixed instruction width. The relative offset value is added with the program counter value to form a branch target address value. The branch target address value is loaded into a program counter based on the branch instruction. Execution of the instruction stream is redirected to a next instruction based on the branch target address value in the program counter.
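
The target computation described here amounts to adding a signed offset to the program counter. The following is a minimal sketch under stated assumptions: a 4-byte instruction width, a two's-complement offset field, and 64-bit addresses are choices made for the example, and `to_signed` and `branch_target` are hypothetical helper names, not terms from the patent.

```python
INSTRUCTION_WIDTH_BITS = 32  # assumed fixed instruction width for this example


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def branch_target(pc: int, offset_field: int, offset_bits: int = 32,
                  address_bits: int = 64) -> int:
    """Add a PC-relative offset read from the instruction stream to the PC."""
    # The offset field is sized as a multiple of the fixed instruction width.
    if offset_bits % INSTRUCTION_WIDTH_BITS != 0:
        raise ValueError("offset size must be a multiple of the instruction width")
    offset = to_signed(offset_field, offset_bits)
    # Wrap to the address width so backward branches stay in range.
    return (pc + offset) & ((1 << address_bits) - 1)
```

For instance, `branch_target(0x1000, 0xFFFFFFF0)` treats the field as -16 and gives `0x0FF0`; that value would then be loaded into the program counter.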

Type: Grant

Filed: September 5, 2014

Date of Patent: April 18, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind

Systems, apparatuses, and methods for performing a horizontal add or subtract in response to a single instruction

Patent number: 9619226

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed horizontal add or subtract of packed data elements in response to a single vector packed horizontal add or subtract instruction that includes a destination vector register operand, a source vector register operand, and an opcode are described.

Type: Grant

Filed: December 23, 2011

Date of Patent: April 11, 2017

Assignee: Intel Corporation

Inventors: Mostafa Hagog, Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

Processor system with predicate register, computer system, method for managing predicates and computer program product

Patent number: 9606802

Abstract: A processor system is adapted to carry out a predicate swap instruction of an instruction set to swap, via a data pathway, predicate data in a first predicate data location of a predicate register with data in a corresponding additional predicate data location of a first additional predicate data container and to swap, via a data pathway, predicate data in a second predicate storage location of the predicate register with data in a corresponding additional predicate data location in a second additional predicate data container.

Type: Grant

Filed: March 25, 2011

Date of Patent: March 28, 2017

Assignee: NXP USA, INC.

Inventors: Yuval Peled, Itzhak Barak, Uri Dayan, Amir Kleen, Idan Rozenberg

Highly integrated scalable, flexible DSP megamodule architecture

Patent number: 9606803

Abstract: This invention implements a range of interesting technologies in a single block. Each DSP CPU has a streaming engine. The streaming engines include: an SE to L2 interface that can request 512 bits/cycle from L2; a loose binding between SE and L2 interface, to allow a single stream to peak at 1024 bits/cycle; one-way coherence where the SE sees all earlier writes cached in the system, but not writes that occur after the stream opens; and full protection against single-bit data errors within its internal storage via single-bit parity with semi-automatic restart on parity error.

Type: Grant

Filed: July 15, 2014

Date of Patent: March 28, 2017

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Timothy D. Anderson, Joseph Zbiciak, Duc Quang Bui, Abhijeet A. Chachad, Kai Chirca, Naveen Bhoria, Matthew D. Pierson, Daniel Wu, Ramakrishnan Venkatasubramanian

Absolute address branching in a fixed-width reduced instruction set computing architecture

Patent number: 9606804

Abstract: Embodiments relate to a method and computer program product for absolute address branching in a reduced instruction set computing (RISC) architecture. One aspect is a method that includes fetching a branch instruction from an instruction stream having a fixed instruction width. A branch target address value is acquired from the instruction stream. The branch target address value represents a target address of the branch instruction. The branch target address value is formatted as an absolute address and sized as a multiple of the fixed instruction width. The branch target address value is loaded into a program counter based on the branch instruction. Execution of the instruction stream is redirected to a next instruction based on the branch target address value in the program counter.

Type: Grant

Filed: September 5, 2014

Date of Patent: March 28, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind

Integrating sign extensions for loads

Patent number: 9600194

Abstract: An address and a data size are provided to a rotator. The rotator stores, based on the address and the data size, a data element in a location having a defined number of positions. The data element includes one or more data units and the one or more data units are aligned correctly in one or more positions of the location based on a predefined position in the location to receive a selected data unit of the one or more data units. The rotator replicates a value of a chosen data unit of the one or more data units to one or more other positions of the location.
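
As a rough software analogy for the align-then-replicate behavior in this abstract, the sketch below loads `size` bytes and fills the remaining positions of a fixed-width location. It is not the patented rotator: the little-endian layout, the 8-position width, the use of bytes as data units, and the function name `load_sign_extended` are assumptions, and the fill byte is derived from the sign bit of the most significant loaded byte.

```python
def load_sign_extended(memory: bytes, address: int, size: int, width: int = 8) -> bytes:
    """Return `width` little-endian bytes holding the sign-extended loaded value."""
    if not 1 <= size <= width:
        raise ValueError("size must be between 1 and width")
    if address < 0 or address + size > len(memory):
        raise ValueError("access is outside memory")
    # Place the loaded data units in the low positions of the location.
    data = memory[address:address + size]
    # In little-endian order the last loaded byte is the most significant one.
    fill = 0xFF if data[-1] & 0x80 else 0x00
    # Replicate the fill value into every remaining position.
    return bytes(data) + bytes([fill]) * (width - size)
```

Decoding the result with `int.from_bytes(result, "little", signed=True)` recovers the signed value of the original narrow element.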

Type: Grant

Filed: November 25, 2015

Date of Patent: March 21, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind

Honoring hardware entitlement of a hardware thread

Patent number: 9582323

Abstract: A method for scheduling the execution of a computer instruction receives an entitlement processor resource percentage for a logical partition on a computer system. The logical partition is associated with a hardware thread of a processor of the computer system. The entitlement processor resource percentage for the logical partition is stored in a register of the hardware thread associated with the logical partition. An instruction is received from the logical partition of the computer system and the processor dispatches the instruction based on the entitlement processor resource percentage stored in the register of the hardware thread associated with the logical partition.

Type: Grant

Filed: June 19, 2014

Date of Patent: February 28, 2017

Assignee: International Business Machines Corporation

Inventors: Nitin Gupta, Mehulkumar J. Patel, Deepak C. Shetty

Systems, apparatuses, and methods for performing a double blocked sum of absolute differences

Patent number: 9582464

Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector double block packed sum of absolute differences (SAD) in response to a single vector double block packed sum of absolute differences instruction that includes a destination vector register operand, first and second source operands, an immediate, and an opcode are described.

Type: Grant

Filed: December 23, 2011

Date of Patent: February 28, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Mostafa Hagog, Robert Valentine, Amit Gradstein, Simon Rubanovich, Zeev Sperber

Architected store and verify guard word instructions

Patent number: 9582274

Abstract: Corruption of call stacks is detected by using guard words placed in the call stacks. A store guard word instruction is used to store a guard word on a stack frame of a caller routine, and a verify guard word instruction issued by one or more callee routines is used to verify the guard word is an expected value. If the guard word is an unexpected value, corruption is indicated.
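
The store/verify pairing can be mimicked in software to show the idea. In this sketch the stack is a plain Python list, and `GUARD_WORD`, `store_guard_word`, `verify_guard_word`, and `StackCorruptionError` are hypothetical names chosen for the example; the abstract describes architected instructions, not these functions.

```python
from typing import List

GUARD_WORD = 0xDEADBEEF  # arbitrary example value, not specified by the abstract


class StackCorruptionError(Exception):
    """Raised when a guard word no longer holds the expected value."""


def store_guard_word(stack: List[int], slot: int, guard: int) -> None:
    # Caller side: place the guard word in a slot of its own stack frame.
    stack[slot] = guard


def verify_guard_word(stack: List[int], slot: int, expected: int) -> None:
    # Callee side: an unexpected value indicates the frame was overwritten.
    if stack[slot] != expected:
        raise StackCorruptionError(f"guard word at slot {slot} was modified")
```

A callee that writes past the end of its own buffer overwrites the guard slot, so the next verification raises instead of returning through a damaged frame.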

Type: Grant

Filed: January 6, 2016

Date of Patent: February 28, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind

Translation entry invalidation in a multithreaded data processing system

Patent number: 9575815

Abstract: In a multithreaded data processing system including a plurality of processor cores, storage-modifying requests of a plurality of concurrently executing hardware threads are received in a shared queue. The storage-modifying requests include a translation invalidation request of an initiating hardware thread. The translation invalidation request is removed from the shared queue and buffered in sidecar logic in one of a plurality of sidecars each associated with a respective one of the plurality of hardware threads. While the translation invalidation request is buffered in the sidecar, the sidecar logic broadcasts the translation invalidation request so that it is received and processed by the plurality of processor cores. In response to confirmation of completion of processing of the translation invalidation request by the initiating processor core, the sidecar logic removes the translation invalidation request from the sidecar.

Type: Grant

Filed: December 22, 2015

Date of Patent: February 21, 2017

Assignee: International Business Machines Corporation

Inventors: Guy L. Guthrie, Hugh Shen, Derek E. Williams

Propagation of updates to per-core-instantiated architecturally-visible storage resource

Patent number: 9575541

Abstract: A microprocessor includes a plurality of processing cores, wherein each of the plurality of processing cores instantiates a respective architecturally-visible storage resource. A first core of the plurality of processing cores is configured to encounter an architectural instruction that instructs the first core to update the respective architecturally-visible storage resource of the first core with a value specified by the architectural instruction. The first core is further configured to, in response to encountering the architectural instruction, provide the value to each of the other of the plurality of processing cores and update the respective architecturally-visible storage resource of the first core with the value. Each core of the plurality of processing cores other than the first core is configured to update the respective architecturally-visible storage resource of the core with the value provided by the first core without encountering the architectural instruction.

Type: Grant

Filed: May 19, 2014

Date of Patent: February 21, 2017

Assignee: VIA TECHNOLOGIES, INC.

Inventors: G. Glenn Henry, Stephan Gaskins

Dynamic thread sharing in branch prediction structures

Patent number: 9563430

Abstract: Embodiments relate to multithreaded branch prediction. An aspect includes a system for dynamically evaluating how to share entries of a multithreaded branch prediction structure. The system includes a first-level branch target buffer coupled to a processor circuit. The processor circuit is configured to perform a method. The method includes receiving a search request to locate branch prediction information associated with the search request, and searching for an entry corresponding to the search request in the first-level branch prediction structure. The entry is not allowed based on a thread state of the entry indicating that the entry has caused a problem on a thread associated with the thread state.

Type: Grant

Filed: March 19, 2014

Date of Patent: February 7, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: James J. Bonanno, Daniel Lipetz, Brian R. Prasky, Anthony Saporito

Relative offset branching in a fixed-width reduced instruction set computing architecture

Patent number: 9563427

Abstract: Embodiments relate to a system for relative offset branching in a reduced instruction set computing (RISC) architecture. One aspect is a system that includes memory and a processing circuit communicatively coupled to the memory. The system is configured to perform a method that includes fetching a branch instruction from an instruction stream having a fixed instruction width. A relative offset value is acquired from the instruction stream. The relative offset value is formatted as an offset relative to a program counter value and sized as a multiple of the fixed instruction width. The relative offset value is added with the program counter value to form a branch target address value. The branch target address value is loaded into a program counter based on the branch instruction. Execution of the instruction stream is redirected to a next instruction based on the branch target address value in the program counter.

Type: Grant

Filed: May 30, 2014

Date of Patent: February 7, 2017

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind

Techniques for implementing barriers to efficiently support cumulativity in a weakly-ordered memory system

Patent number: 9563558

Abstract: A technique for operating a cache memory of a data processing system includes creating respective pollution vectors to track which of multiple concurrent threads executed by an associated processor core are currently polluted by a store operation resident in the cache memory. Dependencies in a dependency data structure of a store queue of the cache memory are set based on the pollution vectors to reduce unnecessary ordering effects. Store operations are dispatched from the store queue in accordance with the dependencies indicated by the dependency data structure.

Type: Grant

Filed: August 28, 2014

Date of Patent: February 7, 2017

Assignee: International Business Machines Corporation

Inventors: Guy L. Guthrie, Hugh Shen, William J. Starke, Derek E. Williams

Vector indexed memory access plus arithmetic and/or logical operation processors, methods, systems, and instructions

Patent number: 9552205

Abstract: A processor including a decode unit to receive a vector indexed load plus arithmetic and/or logical (A/L) operation plus store instruction. The instruction is to indicate a source packed memory indices operand that is to have a plurality of packed memory indices. The instruction is also to indicate a source packed data operand that is to have a plurality of packed data elements. The processor also includes an execution unit coupled with the decode unit. The execution unit, in response to the instruction, is to load a plurality of data elements from memory locations corresponding to the plurality of packed memory indices, perform A/L operations on the plurality of packed data elements of the source packed data operand and the loaded plurality of data elements, and store a plurality of result data elements in the memory locations corresponding to the plurality of packed memory indices.
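
The load, operate, and store sequence in this abstract corresponds to a gather, an element-wise operation, and a scatter to the same locations. A scalar sketch follows; the function name `indexed_load_op_store`, the operand order passed to `op`, and the sequential handling of repeated indices are assumptions made for the example rather than details from the patent.

```python
import operator
from typing import Callable, List, Sequence


def indexed_load_op_store(memory: List[int],
                          indices: Sequence[int],
                          data: Sequence[int],
                          op: Callable[[int, int], int] = operator.add) -> None:
    """For each lane: load memory[index], combine it with data, store it back."""
    if len(indices) != len(data):
        raise ValueError("indices and data must have the same number of elements")
    # Assumption: lanes are processed in order, so a repeated index accumulates.
    for index, element in zip(indices, data):
        loaded = memory[index]               # load from the indexed location
        memory[index] = op(loaded, element)  # A/L operation, then store to the same location
```

Passing `operator.and_` or `operator.xor` as `op` gives the logical variants with the same access pattern.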

Type: Grant

Filed: September 27, 2013

Date of Patent: January 24, 2017

Assignee: Intel Corporation

Inventors: Igor Ermolaev, Bret L. Toll, Robert Valentine, Jesus Corbal San Adrian, Gautam B. Doshi, Rama Kishan V. Malladi, Prasenjit Chakraborty
