Simultaneous Issuance Of Multiple Instructions Patents (Class 712/215)

Hardware apparatuses and methods to fuse instructions

Patent number: 10324724

Abstract: Methods and apparatuses relating to a fusion manager to fuse instructions are described. In one embodiment, a hardware processor includes a hardware binary translator to translate an instruction stream into a translated instruction stream, a hardware fusion manager to fuse multiple instructions of the translated instruction stream into a single fused instruction, a hardware decode unit to decode the single fused instruction into a decoded, single fused instruction, and a hardware execution unit to execute the decoded, single fused instruction.

Type: Grant

Filed: December 16, 2015

Date of Patent: June 18, 2019

Assignee: Intel Corporation

Inventors: Patrick P. Lai, Tyler N. Sondag, Sebastian Winkel, Polychronis Xekalakis, Ethan Schuchman, Jayesh Iyer
Multi-slice processor issue of a dependent instruction in an issue queue based on issue of a producer instruction

Patent number: 10282207

Abstract: Operation of a multi-slice processor that includes execution slices and load/store slices coupled via a results bus includes: receiving, by an execution slice, a producer instruction, including: storing, in an entry of an issue queue, the producer instruction; and storing, in a register, an issue queue entry identifier representing the entry of the issue queue in which the producer instruction is stored; receiving, by the execution slice, a source instruction, the source instruction dependent upon the result of the producer instruction, including: storing, in another entry of the issue queue, the source instruction and the issue queue entry identifier of the producer instruction; determining in dependence upon the issue queue entry identifier of the producer instruction that the producer instruction has issued from the issue queue; and responsive to the determination that the producer instruction has issued from the issue queue, issuing the source instruction from the issue queue.

Type: Grant

Filed: February 18, 2016

Date of Patent: May 7, 2019

Assignee: International Business Machines Corporation

Inventors: Brian D. Barrick, Sundeep Chadha, Michael J. Genden, Jerry Y. Lu, Dung Q. Nguyen, Nasrin Sultana, David R. Terry, David S. Walder
Multi-slice processor issue of a dependent instruction in an issue queue based on issue of a producer instruction

Patent number: 10268482

Abstract: Operation of a multi-slice processor that includes execution slices and load/store slices coupled via a results bus includes: receiving, by an execution slice, a producer instruction, including: storing, in an entry of an issue queue, the producer instruction; and storing, in a register, an issue queue entry identifier representing the entry of the issue queue in which the producer instruction is stored; receiving, by the execution slice, a source instruction, the source instruction dependent upon the result of the producer instruction, including: storing, in another entry of the issue queue, the source instruction and the issue queue entry identifier of the producer instruction; determining in dependence upon the issue queue entry identifier of the producer instruction that the producer instruction has issued from the issue queue; and responsive to the determination that the producer instruction has issued from the issue queue, issuing the source instruction from the issue queue.

Type: Grant

Filed: December 15, 2015

Date of Patent: April 23, 2019

Assignee: International Business Machines Corporation

Inventors: Brian D. Barrick, Sundeep Chadha, Michael J. Genden, Jerry Y. Lu, Dung Q. Nguyen, Nasrin Sultana, David R. Terry, David S. Walder
Dynamic thread execution arbitration

Patent number: 10269088

Abstract: A mechanism is described for facilitating thread execution arbitration for thread scheduling relating to graphics processors at computing devices. A method of embodiments, as described herein, includes assigning priority levels to threads based on stall signals communicated from the one or more shared function units to one or more execution units of a processor including a graphics processor, and selecting a first thread to be scheduled and a second thread to be ignored based on the stall signals.

Type: Grant

Filed: April 21, 2017

Date of Patent: April 23, 2019

Assignee: INTEL CORPORATION

Inventors: Joydeep Ray, Abhishek R. Appu, Subramaniam M. Maiyuran, Eric J. Hoekstra, Prasoonkumar Surti, Balaji Vembu, Altug Koker
Operation of a multi-slice processor with reduced flush and restore latency

Patent number: 10248421

Abstract: Operation of a multi-slice processor that includes execution slices and load/store slices coupled via a results bus, including: for a target instruction targeting a logical register, determining whether an entry in a general purpose register representing the logical register is pending a flush; if the entry in the general purpose register representing the logical register is pending a flush: cancelling the flush in the entry of the general purpose register; storing the target instruction in the entry of the general purpose register representing the logical register, and if an entry in a history buffer targeting the logical register is pending a restore, cancelling the restore for the entry of the history buffer.

Type: Grant

Filed: February 16, 2016

Date of Patent: April 2, 2019

Assignee: International Business Machines Corporation

Inventors: Salma Ayub, Brian D. Barrick, Joshua W. Bowman, Sundeep Chadha, Cliff Kucharski, Dung Q. Nguyen, David R. Terry, Jing Zhang
Reducing power consumption in a processor

Patent number: 10241557

Abstract: A processor includes a mechanism for disabling a memory array of a branch prediction unit. The processor may include a next fetch prediction unit that may include a number of entries. Each entry may correspond to a next instruction fetch group and may store an indication of whether or not the corresponding the next fetch group includes a conditional branch instruction. In response to an indication that the next fetch group does not include a conditional branch instruction, the fetch prediction unit may be configured to disable, in a next instruction execution cycle, the memory array of the branch prediction unit.

Type: Grant

Filed: December 12, 2013

Date of Patent: March 26, 2019

Assignee: Apple Inc.

Inventors: Conrado Blasco, Ronald P Hall, Ramesh B Gunna, Ian D Kountanis, Shyam Sundar, André Seznec
Operation of a multi-slice processor with reduced flush and restore latency

Patent number: 10241790

Abstract: Operation of a multi-slice processor that includes execution slices and load/store slices coupled via a results bus, including: for a target instruction targeting a logical register, determining whether an entry in a general purpose register representing the logical register is pending a flush; if the entry in the general purpose register representing the logical register is pending a flush: cancelling the flush in the entry of the general purpose register; storing the target instruction in the entry of the general purpose register representing the logical register, and if an entry in a history buffer targeting the logical register is pending a restore, cancelling the restore for the entry of the history buffer.

Type: Grant

Filed: December 15, 2015

Date of Patent: March 26, 2019

Assignee: International Business Machines Corporation

Inventors: Salma Ayub, Brian D. Barrick, Joshua W. Bowman, Sundeep Chadha, Cliff Kucharski, Dung Q. Nguyen, David R. Terry, Jing Zhang
Out-of-order processor and method for back to back instruction issue

Patent number: 10235181

Abstract: An out-of-order (OOO) processor includes ready logic that provides a signal indicating an instruction is ready when all operands for the instruction are ready, or when all operands are either ready or are marked back-to-back to a current instruction. By marking a second instruction that consumes an operand as ready when it is back-to-back with a first instruction that produces the operand, but the first instruction has not yet produced the operand, latency due to missed cycles in executing back-to-back instructions is minimized.

Type: Grant

Filed: February 3, 2017

Date of Patent: March 19, 2019

Assignee: International Business Machines Corporation

Inventor: Brian W. Thompto
Processor with approximate computing execution unit that includes an approximation control register having an approximation mode flag, an approximation amount, and an error threshold, where the approximation control register is writable by an instruction set instruction

Patent number: 10235232

Abstract: A processor includes an indicator configured to indicate a first mode or a second mode and a functional unit configured to perform computations with a full degree of accuracy when the indicator indicates the first mode and to perform computations with less than the full degree of accuracy when the indicator indicates the second mode.

Type: Grant

Filed: October 23, 2014

Date of Patent: March 19, 2019

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD

Inventors: G. Glenn Henry, Terry Parks, Rodney E. Hooker
Queuing memory access requests

Patent number: 10229066

Abstract: A data processing apparatus is provided including queue circuitry to respond to control signals each associated with a memory access instruction, and to queue a plurality of requests for data, each associated with a reference to a storage location. Resolution circuitry acquires a request for data, and issues the request for data, the resolution circuitry having a resolution circuitry limit. When a current capacity of the resolution circuitry is below the resolution circuitry limit, the resolution circuitry acquires the request for data by receiving the request for data from the queue circuitry, stores the request for data in association with the storage location, issues the request for data, and causes a result of issuing the request for data to be provided to said storage location.

Type: Grant

Filed: September 30, 2016

Date of Patent: March 12, 2019

Assignee: ARM Limited

Inventors: Miles Robert Dooley, Matthew Andrew Rafacz, Huzefa Moiz Sanjeliwala, Michael Filippo
Hyper-threaded processor allocation to nodes in multi-tenant distributed software systems

Patent number: 10228982

Abstract: A mechanism is provided for allocating a hyper-threaded processor to nodes of multi-tenant distributed software systems. Responsive to receiving a request to provision a node of the multi-tenant distributed software system on the host data processing system, a cluster to which the node belongs is identified. Responsive to the node being a second type of node, responsive to determining that another second type of node in the same cluster has been provisioned on the host data processing system, and responsive to the number of unallocated VPs on different physical processors from that of the other second type of node being greater than or equal to the requested number of VPs for the second type of node, the requested number of VPs for the second type of node is allocated each to a different physical processor from that of the other second type of node.

Type: Grant

Filed: January 25, 2018

Date of Patent: March 12, 2019

Assignee: International Business Machines Corporation

Inventors: Rachit Arora, Dharmesh K. Jain, Padmanabhan Krishnan, Shrinivas S. Kulkarni, Subin Shekhar
Out-of-order processor and method for back to back instruction issue

Patent number: 10223126

Abstract: An out-of-order (OOO) processor includes ready logic that provides a signal indicating an instruction is ready when all operands for the instruction are ready, or when all operands are either ready or are marked back-to-back to a current instruction. By marking a second instruction that consumes an operand as ready when it is back-to-back with a first instruction that produces the operand, but the first instruction has not yet produced the operand, latency due to missed cycles in executing back-to-back instructions is minimized.

Type: Grant

Filed: January 6, 2017

Date of Patent: March 5, 2019

Assignee: International Business Machines Corporation

Inventor: Brian W. Thompto
Hyper-threaded processor allocation to nodes in multi-tenant distributed software systems

Patent number: 10216547

Abstract: A mechanism is provided for allocating a hyper-threaded processor to nodes of multi-tenant distributed software systems. Responsive to receiving a request to provision a node of the multi-tenant distributed software system on the host data processing system, a cluster to which the node belongs is identified. Responsive to the node being a second type of node, responsive to determining that another second type of node in the same cluster has been provisioned on the host data processing system, and responsive to the number of unallocated VPs on different physical processors from that of the other second type of node being greater than or equal to the requested number of VPs for the second type of node, the requested number of VPs for the second type of node is allocated each to a different physical processor from that of the other second type of node.

Type: Grant

Filed: November 22, 2016

Date of Patent: February 26, 2019

Assignee: International Business Machines Corporation

Inventors: Rachit Arora, Dharmesh K. Jain, Padmanabhan Krishnan, Shrinivas S. Kulkarni, Subin Shekhar
Processor core having a saturating event counter for making performance measurements

Patent number: 10169187

Abstract: A performance monitor including a saturating counter provides a relative measure of event frequency without requiring a minimum polling rate or periodic reset to avoid or account for counter overflow. The saturating counter is incremented upon detection of an event and decremented if an event is not detected within a predetermined period. The period of detecting may be programmable and may be determined by real time clock, processor or instruction cycles. Multiple event types may be selected from for detection and input to a single counter, or alternatively multiple event counters may be provided for various event types. The saturating counter may additionally be periodically reset in a selected operating mode, in combination with the decrementing action performed on the counter.

Type: Grant

Filed: August 18, 2010

Date of Patent: January 1, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Venkat Rajeev Indukuru, Alexander Erik Mericas
Processing core having shared front end unit

Patent number: 10140129

Abstract: A processor having one or more processing cores is described. Each of the one or more processing cores has front end logic circuitry and a plurality of processing units. The front end logic circuitry is to fetch respective instructions of threads and decode the instructions into respective micro-code and input operand and resultant addresses of the instructions. Each of the plurality of processing units is to be assigned at least one of the threads, is coupled to said front end unit, and has a respective buffer to receive and store microcode of its assigned at least one of the threads.

Type: Grant

Filed: December 28, 2012

Date of Patent: November 27, 2018

Assignee: Intel Corporation

Inventors: Ilan Pardo, Dror Markovich, Oren Ben-Kiki, Yuval Yosef
Zero-overhead loop in an embedded digital signal processor

Patent number: 10114644

Abstract: A decoding logic method is arranged to execute a zero-overhead loop in an embedded digital signal processor (DSP). In the method, instruction data is fetched from a memory, and a plurality of instruction tokens, which are derived from the instruction data, are stored in a token buffer. A first portion of one or more instruction tokens from the token buffer are passed to a first decode module, which may be an instruction decode module, and a second portion of the one or more instruction tokens from the token buffer are passed to a second decode module, which may be a loop decode module. The second decode module detects a special loop instruction token, and based on the detection of the special loop instruction token, a loop counter is conditionally tested. Using the first decode module, at least one instruction token of an iterative algorithm is assembled into a single instruction, which is executable in a single execution cycle.

Type: Grant

Filed: July 26, 2016

Date of Patent: October 30, 2018

Assignee: STMICROELECTRONICS (BEIJING) R&D CO. LTD

Inventors: PengFei Zhu, Xiao Kang Jiao
Command message generation and execution using a machine code-instruction

Patent number: 10114638

Abstract: In one embodiment, command message generation and execution using a machine code-instruction is performed. One embodiment includes a particular machine executing a single machine-code instruction including a reference into a command-message-building data structure stored in memory. This executing the single machine-code instruction includes generating a command message and initiating communication of the command message to a hardware accelerator, including copying command information from the command-message-building data structure based on the reference into the command message. The hardware accelerator receives and executes the command message. In one embodiment, the command message is message-switched from a processor to a hardware accelerator, such as, but not limited to, a memory controller, a table lookup unit, or a prefix lookup unit. In one embodiment, a plurality of threads share the command-message-building data structure.

Type: Grant

Filed: December 15, 2014

Date of Patent: October 30, 2018

Assignee: Cisco Technology, Inc.

Inventor: Donald Edward Steiss
Speculative retirement of post-lock instructions

Patent number: 10095637

Abstract: Techniques for improving execution of a lock instruction are provided herein. A lock instruction and younger instructions are allowed to speculatively retire prior to the store portion of the lock instruction committing its value to memory. These instructions thus do not have to wait for the lock instruction to complete before retiring. In the event that the processor detects a violation of the atomic or fencing properties of the lock instruction prior to committing the value of the lock instruction, the processor rolls back state and executes the lock instruction in a slow mode in which younger instructions are not allowed to retire until the stored value of the lock instruction is committed. Speculative retirement of these instructions results in increased processing speed, as instructions no longer need to wait to retire after execution of a lock instruction.

Type: Grant

Filed: September 15, 2016

Date of Patent: October 9, 2018

Assignee: ADVANCED MICRO DEVICES, INC.

Inventors: Gregory W. Smaus, John M. King, Michael D. Achenbach, Kevin M. Lepak, Matthew A. Rafacz, Noah Bamford
Method and/or apparatus for generating signal processing pipelines

Patent number: 10089081

Abstract: A method and apparatus for generating a signal processing pipeline based, at least in part, on manipulation of a GUI, the signal processing pipeline comprising one or more signal processing operations.

Type: Grant

Filed: December 31, 2014

Date of Patent: October 2, 2018

Assignee: Excalibur IP, LLC

Inventors: Guangxin Yang, Ji Zhou, Shuo Yang, Yan Xia, Delu Zhu
Reconfigurable processor with load-store slices supporting reorder and controlling access to cache slices

Patent number: 10083039

Abstract: A processor core having multiple parallel instruction execution slices and coupled to multiple dispatch queues by a dispatch routing network provides flexible and efficient use of internal resources. The configuration of the execution slices is selectable so that capabilities of the processor core can be adjusted according to execution requirements for the instruction streams. Two or more execution slices can be combined as super-slices to handle wider data, wider operands and/or vector operations, according to one or more mode control signal that also serves as a configuration control signal. The mode control signal is also used to partition clusters of the execution slices within the processor core according to whether single-threaded or multi-threaded operation is selected, and additionally according to a number of hardware threads that are active.

Type: Grant

Filed: January 30, 2018

Date of Patent: September 25, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lee Evan Eisen, Hung Qui Le, Jentje Leenstra, Jose Eduardo Moreira, Bruce Joseph Ronchetti, Brian William Thompto, Albert James Van Norstrand, Jr.
Techniques to wake-up dependent instructions for back-to-back issue in a microprocessor

Patent number: 10078516

Abstract: Techniques are disclosed for back-to-back issue of instructions in a processor. A first instruction is stored in a queue position in an issue queue. The issue queue stores instructions in a corresponding queue position. The first instruction includes a target instruction tag and at least a source instruction tag. The target instruction tag is stored in a table storing a plurality of target instruction tags associated with a corresponding instruction. Each stored target instruction tag specifies a logical register that stores a target operand. Upon determining, based on the source instruction tag associated with the first instruction and the target instruction tag associated with a second instruction, that the first instruction is dependent on the second instruction, a pointer to the first instruction is associated with the second instruction. The pointer is used to wake up the first instruction upon issue of the second instruction.

Type: Grant

Filed: August 24, 2015

Date of Patent: September 18, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jeffrey C. Brownscheidle, Sundeep Chadha, Maureen A. Delaney, Dung Q. Nguyen
Operation of a multi-slice processor preventing early dependent instruction wakeup

Patent number: 9983875

Abstract: Operation of a multi-slice processor that includes a plurality of execution slices, a plurality of load/store slices, and an instruction sequencing unit, where operation includes: receiving, at a load/store slice, a load instruction to be issued; determining, at the load/store slice, that the load instruction has not completed and is to be reissued; and responsive to determining that the load instruction is to be reissued, delaying a signal, from the load/store slice to the instruction sequencing unit, that allows the instruction sequencing unit to issue one or more instructions dependent upon the load instruction.

Type: Grant

Filed: March 4, 2016

Date of Patent: May 29, 2018

Assignee: International Business Machines Corporation

Inventors: Sundeep Chadha, David A. Hrusecky, Elizabeth A. McGlone, Jennifer L. Molnar
Dynamic assignment across dispatch pipes of source ports to be used to obtain indication of physical registers

Patent number: 9971601

Abstract: Dynamic resource allocation is provided in which additional resources, such as additional architected registers, are provided to an instruction, if it is determined that resources in addition to those configured to be provided to the instruction are to be used for the particular instruction. An instruction to be executed is dispatched on a pipe of a pipeline and that pipe is configured to have a set number of architected registers for use by the instruction. However, if one or more other architected registers are needed, those additional architected registers are dynamically allocated to the instruction by assigning one or more source ports of an additional pipe to the instruction.

Type: Grant

Filed: February 13, 2015

Date of Patent: May 15, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Gregory W. Alexander, Brian D. Barrick, Fadi Y. Busaba, Wen H. Li, Edward T. Malley
Techniques to wake-up dependent instructions for back-to-back issue in a microprocessor

Patent number: 9971600

Abstract: Techniques are disclosed for back-to-back issue of instructions in a processor. A first instruction is stored in a queue position in an issue queue. The issue queue stores instructions in a corresponding queue position. The first instruction includes a target instruction tag and at least a source instruction tag. The target instruction tag is stored in a table storing a plurality of target instruction tags associated with a corresponding instruction. Each stored target instruction tag specifies a logical register that stores a target operand. Upon determining, based on the source instruction tag associated with the first instruction and the target instruction tag associated with a second instruction, that the first instruction is dependent on the second instruction, a pointer to the first instruction is associated with the second instruction. The pointer is used to wake up the first instruction upon issue of the second instruction.

Type: Grant

Filed: June 26, 2015

Date of Patent: May 15, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jeffrey C. Brownscheidle, Sundeep Chadha, Maureen A. Delaney, Dung Q. Nguyen
Independent mapping of threads

Patent number: 9870229

Abstract: Embodiments of the present invention provide systems and methods for mapping the architected state of one or more threads to a set of distributed physical register files to enable independent execution of one or more threads in a multiple slice processor. In one embodiment, a system is disclosed including a plurality of dispatch queues which receive instructions from one or more threads and an even number of parallel execution slices, each parallel execution slice containing a register file. A routing network directs an output from the dispatch queues to the parallel execution slices and the parallel execution slices independently execute the one or more threads.

Type: Grant

Filed: September 29, 2015

Date of Patent: January 16, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sam G. Chu, Markus Kaltenbach, Hung Q. Le, Jentje Leenstra, Jose E. Moreira, Dung Q. Nguyen, Brian W. Thompto
Method and system for yield operation supporting thread-like behavior

Patent number: 9811343

Abstract: A method, system, and computer program product synchronize a group of workitems executing an instruction stream on a processor. The processor is yielded by a first workitem responsive to a synchronization instruction in the instruction stream. A first one of a plurality of program counters is updated to point to a next instruction following the synchronization instruction in the instruction stream to be executed by the first workitem. A second workitem is run on the processor after the yielding.

Type: Grant

Filed: May 26, 2017

Date of Patent: November 7, 2017

Assignee: Advanced Micro Devices, Inc.

Inventors: Lee W. Howes, Benedict R. Gaster, Michael C. Houston
Method for executing multithreaded instructions grouped into blocks

Patent number: 9811377

Abstract: A method for executing multithreaded instructions grouped into blocks. The method includes receiving an incoming instruction sequence using a global front end; grouping the instructions to form instruction blocks, wherein the instructions of the instruction blocks are interleaved with multiple threads; scheduling the instructions of the instruction block to execute in accordance with the multiple threads; and tracking execution of the multiple threads to enforce fairness in an execution pipeline.

Type: Grant

Filed: March 14, 2014

Date of Patent: November 7, 2017

Assignee: INTEL CORPORATION

Inventor: Mohammad Abdallah
Computer processor employing split crossbar circuit for operand routing and slot-based organization of functional units

Patent number: 9747238

Abstract: A computer processor including a plurality of functional units that performs operations that produce result operands at different characteristic latencies over multiple cycles. An interconnect network provides data paths for transfer of operand data between functional units. The interconnect network includes first and second crossbar parts. The first crossbar part is configured to route result operands produced with the lowest characteristic latency to any other functional unit. The second crossbar part is configured to route result operands with higher characteristic latency relative to the lowest characteristic latency to the first crossbar part where such result operands are in turn routed to any functional unit. In another aspect, the functional units can be organized as multiple slots where each slot can produce multiple result operands of different characteristic latencies in the same cycle, and wherein each slot employs separate result registers for each characteristic latency present on the slot.

Type: Grant

Filed: June 23, 2014

Date of Patent: August 29, 2017

Assignee: MILL COMPUTING, INC.

Inventors: Roger Rawson Godard, Arthur David Kahlich, Sebastien Paul Maurice Mirolo, David Arthur Yost
Mechanism for instruction set based thread execution on a plurality of instruction sequencers

Patent number: 9720697

Abstract: In an embodiment, a method is provided. The method includes managing user-level threads on a first instruction sequencer in response to executing user-level instructions on a second instruction sequencer that is under control of an application level program. A first user-level thread is run on the second instruction sequencer and contains one or more user level instructions. A first user level instruction has at least 1) a field that makes reference to one or more instruction sequencers or 2) implicitly references with a pointer to code that specifically addresses one or more instruction sequencers when the code is executed.

Type: Grant

Filed: September 10, 2012

Date of Patent: August 1, 2017

Assignee: INTEL CORPORATION

Inventors: Hong Wang, John Shen, Ed Grochowski, James Paul Held, Bryant Bigbee, Shivnandan D. Kaushik, Gautham Chinya, Xiang Zou, Per Hammarlund, Xinmin Tian, Anil Aggarwal, Scott Dion Rodgers, Prashant Sethi, Baiju V. Patel, Richard Andrew Hankins
Extensible execution unit interface architecture with multiple decode logic and multiple execution units

Patent number: 9710274

Abstract: Various methods tightly couple together decode logic associated with multiple types of execution units and having varying priorities to enable instructions that are decoded as valid instructions for multiple types of execution units to be forwarded to a highest priority type of execution unit among the multiple types of execution units. Among other benefits, when an auxiliary execution unit is coupled to a general purpose processing core with the decode logic for the auxiliary execution unit tightly coupled with the decode logic for the general purpose processing core, the auxiliary execution unit may be used to effectively overlay new functionality for an existing instruction that is normally executed by the general purpose processing core, e.g., to patch a design flaw in the general purpose processing core or to provide improved performance for specialized applications.

Type: Grant

Filed: April 11, 2016

Date of Patent: July 18, 2017

Assignee: International Business Machines Corporation

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
Very-long instruction word (VLIW) processor and compiler for executing instructions in parallel

Patent number: 9697004

Abstract: A Very Long Instruction Word (VLIW) processor having an instruction set with a reduced size resulting in a small number of bits being necessary to specify registers. The VLIW processor includes a register file, and first through third operation units, and executes a very long instruction word. Further, the very long instruction word includes a register specifying field which specifies a least one of the registers in the register file and a plurality of instructions. The operand of each instruction includes bits src1, src2, and dst, which indicate whether or not the registers specified by the register specifying field are to be used as the source register and the destination register.

Type: Grant

Filed: April 8, 2014

Date of Patent: July 4, 2017

Assignee: SOCIONEXT INC.

Inventors: Takahiro Kageyama, Hideshi Nishida, Takeshi Tanaka, Kouji Nakajima
Parallel streaming of external data

Patent number: 9684671

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for streaming external data in parallel from a second distributed system to a first distributed system. One of the methods includes receiving a query that requests a join of first rows of a first table in a first distributed system with second rows of an external table, the external table representing data in a second distributed system. Each of the segment nodes communicates with a respective extension service that obtains fragments from one or more data nodes of the second distributed system according to location information for the respective fragments, and provides to the segment node a stream of data corresponding to second rows of the external table. Each of the segment nodes computes joined rows between the first rows of the first table and the stream of data corresponding to second rows of the external table.

Type: Grant

Filed: February 28, 2014

Date of Patent: June 20, 2017

Assignee: Pivotal Software, Inc.

Inventors: Dov Yaron Dorin, Alon Goldshuv, Alex Shacked
Method of placement and routing in a reconfiguration of a dynamically reconfigurable processor

Patent number: 9633160

Abstract: A method and system are provided for deriving a resultant compiled software code with increased compatibility for placement and routing of a dynamically reconfigurable processor.

Type: Grant

Filed: March 15, 2013

Date of Patent: April 25, 2017

Inventor: Robert Keith Mykland
VLIW processor, instruction structure, and instruction execution method

Patent number: 9606798

Abstract: A processor, includes a first comparison operation unit; a second comparison operation unit; a first operation unit; a second operation unit; a third operation unit; and a register, wherein the first comparison operation unit receives a first comparison operation signal, a first input signal, and a second input signal, performs a comparison operation indicated by the first comparison operation signal on the first input signal and the second input signal, and outputs a result of the comparison operation, the second comparison operation unit receives a second comparison operation signal, a third input signal, and a fourth input signal, performs a comparison operation indicated by the second comparison operation signal on the third input signal and the fourth input signal, and outputs a result of the comparison operation, the first operation unit receives the comparison result of the first comparison operation unit.

Type: Grant

Filed: January 6, 2016

Date of Patent: March 28, 2017

Assignee: Renesas Electronics Corporation

Inventor: Yuki Kobayashi
Extensible execution unit interface architecture with multiple decode logic and multiple execution units

Patent number: 9594562

Abstract: Various circuit arrangements tightly couple together decode logic associated with multiple types of execution units and having varying priorities to enable instructions that are decoded as valid instructions for multiple types of execution units to be forwarded to a highest priority type of execution unit among the multiple types of execution units. Among other benefits, when an auxiliary execution unit is coupled to a general purpose processing core with the decode logic for the auxiliary execution unit tightly coupled with the decode logic for the general purpose processing core, the auxiliary execution unit may be used to effectively overlay new functionality for an existing instruction that is normally executed by the general purpose processing core, e.g., to patch a design flaw in the general purpose processing core or to provide improved performance for specialized applications.

Type: Grant

Filed: April 11, 2016

Date of Patent: March 14, 2017

Assignee: International Business Machines Corporation

Inventors: Adam J. Muff, Paul E. Schardt, Robert A. Shearer, Matthew R. Tubbs
Causing an interrupt based on event count

Patent number: 9575766

Abstract: Some implementations provide techniques and arrangements for causing an interrupt in a processor in response to an occurrence of a number of events. A first event counter counts the occurrences of a type of event within the processor and outputs a signal to activate a second event counter in response to reaching a first predefined count. The second event counter counts the occurrences of the type of event within the processor and causes an interrupt of the processor in response to reaching a second predefined count.

Type: Grant

Filed: December 29, 2011

Date of Patent: February 21, 2017

Assignee: Intel Corporation

Inventors: Ahmad Yasin, Peggy J. Irelan, Ofer Levy, Emile Ziedan, Grant Zhou
Error recovery following speculative execution with an instruction processing pipeline

Patent number: 9519538

Abstract: An instruction processing pipeline having error detection and error recovery circuitry associated with one or more of the pipeline stages. If an error is detected within a signal value within that pipeline stage, then it can be repaired. Part of the error recovery may be to flush upstream program instructions from the instruction pipeline. When multi-threading, only those instructions from a thread including an instruction which has been lost as a consequence of the error recovery need be flushed from the instruction pipeline. The instruction pipeline may additionally/alternatively be provided with more than one main storage element associated with each signal value with these main storage elements used in an alternating fashion such that if a signal value has been erroneously captured and needs to be repaired, there is still available a main storage element to properly capture the signal value corresponding to the following program instruction.

Type: Grant

Filed: June 6, 2011

Date of Patent: December 13, 2016

Assignee: ARM Limited

Inventors: Emre Özer, Shidhartha Das, David Michael Bull
Tiered caching and migration in differing granularities

Patent number: 9471506

Abstract: For data processing in a computing storage environment by a processor device, the environment incorporating at least high-speed and lower-speed caches, and managed tiered levels of storage, groups of data segments are migrated between the tiered levels of storage such that clumped uniformly hot ones of the groups of data segments are migrated to use a Solid State Drive (SSD) portion of the tiered levels of storage; uniformly hot groups of data segments are determined using a first, largest granulated, heat map for a selected one of the group of the data segments; a second heat map, which is smaller than the first and having the largest granularity of the first heat map, is used to determine the clumped hot groups; and sparsely hot groups are determined when neither the first heat map nor the second heat map are hotter than the first and second predetermined thresholds, respectively.

Type: Grant

Filed: April 22, 2015

Date of Patent: October 18, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael T. Benhase, Lokesh M. Gupta, Cheng-Chung Song
Run-time code parallelization using out-of-order renaming with pre-allocation of physical registers

Patent number: 9430244

Abstract: A method includes processing a sequence of instructions of program code that are specified using one or more architectural registers, by a hardware-implemented pipeline that renames the architectural registers in the instructions so as to produce operations specified using one or more physical registers. At least first and second segments of the sequence of instructions are selected, wherein the second segment occurs later in the sequence than the first segment. One or more of the architectural registers in the instructions of the second segment are renamed, before completing renaming the architectural registers in the instructions of the first segment, by pre-allocating one or more of the physical registers to one or more of the architectural registers.

Type: Grant

Filed: October 28, 2015

Date of Patent: August 30, 2016

Assignee: CENTIPEDE SEMI LTD.

Inventors: Omri Tennenhaus, Alberto Mandler, Noam Mizrahi
Speculative finish of instruction execution in a processor core

Patent number: 9389867

Abstract: In a processor core, high latency operations are tracked in entries of a data structure associated with an execution unit of the processor core. In the execution unit, execution of an instruction dependent on a high latency operation tracked by an entry of the data structure is speculatively finished prior to completion of the high latency operation. Speculatively finishing the instruction includes reporting an identifier of the entry to completion logic of the processor core and removing the instruction from an execution pipeline of the execution unit. The completion logic records dependence of the instruction on the high latency operation and commits execution results of the instruction to an architected state of the processor only after successful completion of the high latency operation.

Type: Grant

Filed: August 31, 2015

Date of Patent: July 12, 2016

Assignee: International Business Machines Corporation

Inventors: Sundeep Chadha, Bryan Lloyd, Dung Q. Nguyen, David S. Ray, Benjamin W. Stolt
Speculative finish of instruction execution in a processor core

Patent number: 9384002

Abstract: In a processor core, high latency operations are tracked in entries of a data structure associated with an execution unit of the processor core. In the execution unit, execution of an instruction dependent on a high latency operation tracked by an entry of the data structure is speculatively finished prior to completion of the high latency operation. Speculatively finishing the instruction includes reporting an identifier of the entry to completion logic of the processor core and removing the instruction from an execution pipeline of the execution unit. The completion logic records dependence of the instruction on the high latency operation and commits execution results of the instruction to an architected state of the processor only after successful completion of the high latency operation.

Type: Grant

Filed: November 16, 2012

Date of Patent: July 5, 2016

Assignee: International Business Machines Corporation

Inventors: Sundeep Chadha, Bryan Lloyd, Dung Q. Nguyen, David S. Ray, Benjamin W. Stolt
Lock spin wait operation for multi-threaded applications in a multi-core computing environment

Patent number: 9378069

Abstract: A method, system and computer-usable medium are disclosed for a lock-spin-wait operation for managing multi-threaded applications in a multi-core computing environment. A target processor core, referred to as a “spin-wait core” (SWC), is assigned (or reserved) for primarily running spin-waiting threads. Threads operating in the multi-core computing environment that are identified as spin-waiting are then moved to a run queue associated with the SWC to acquire a lock. The spin-waiting threads are then allocated a lock response time that is less than the default lock response time of the operating system (OS) associated with the SWC. If a spin-waiting fails to acquire a lock within the allocated lock response time, the SWC is relinquished, ceding its availability for other spin-waiting threads in the run queue to acquire a lock. Once a spin-waiting thread acquires a lock, it is migrated to its original, or an available, processor core.

Type: Grant

Filed: March 5, 2014

Date of Patent: June 28, 2016

Assignee: International Business Machines Corporation

Inventors: Men-Chow Chiang, Ken V. Vu
Collecting and attaching a bug trace to a problem information technology ticket

Patent number: 9372777

Abstract: Methods and arrangements for enhancing a ticket relative to user interaction with a system. An information technology ticket related to user interaction with an information technology system is received, and a system trace is activated, wherein additional input related to the user interaction with the information technology system is accepted. Information derived from the trace of the information technology system is associated with the information technology ticket. Other variants and embodiments are broadly contemplated herein.

Type: Grant

Filed: February 28, 2013

Date of Patent: June 21, 2016

Assignee: International Business Machines Corporation

Inventors: Pankaj Dhoolia, Diptikalyan Saha, Ram Viswanathan
Microprocessor with integrated NOP slide detector

Patent number: 9330011

Abstract: A microprocessor includes an instruction cache and a hardware state machine configured to detect a no operation (NOP) slide by counting a continuous sequence of NOP instructions within a stream of instructions fetched from the instruction cache. The microprocessor is configured to suspend execution of the stream of instructions, and transfer control to another routine, in response to detecting the NOP slide.

Type: Grant

Filed: October 10, 2013

Date of Patent: May 3, 2016

Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.

Inventor: Terry Parks
Dynamic write port re-arbitration

Patent number: 9286069

Abstract: Within a processing pipeline 14, issue control circuitry 12 serves to arbitrate write port availability when floating point multiplication instructions are issued into a floating point pipeline 14. If not operating in a flush-to-zero mode, then depending upon the output operands generated denormal handling may or may not be required. A pessimistic assumption is made upon issue that denormal handling will be required and accordingly the write port reserved is a first predetermined number of processing cycles after the start cycle to take account of use of the denormal handling pipeline stage 20. Partway along the processing pipeline 14, state becomes available which indicates whether or not denormal handling is actually required. If denormal handling is not required and a write port is available one processing cycle earlier, then bypass circuitry 22 serves to bypass the denormal handling pipeline stage 20 such that the output operand will be written to the register bank 16 one processing cycle earlier.

Type: Grant

Filed: December 21, 2012

Date of Patent: March 15, 2016

Assignee: ARM Limited

Inventors: Cédric Denis Robert Airaud, Luca Scalabrino, Frederic Jean Denis Arsanto, Guillaume Schon
Redundant transactions for detection of timing sensitive errors

Patent number: 9251014

Abstract: A method for detecting a software-race condition in a program includes copying a state of a transaction of the program from a first core of a multi-core processor to at least one additional core of the multi-core processor, running the transaction, redundantly, on the first core and the at least one additional core given the state, outputting a result of the first core and the at least one additional core, and detecting a difference in the results between the first core and the at least one additional core, wherein the difference indicates the software-race condition.

Type: Grant

Filed: August 8, 2013

Date of Patent: February 2, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Harold W. Cain, III, David M. Daly, Michael C. Huang, Kattamuri Ekanadham, Jose E. Moreira, Mauricio J. Serrano
VLIW processor, instruction structure, and instruction execution method

Patent number: 9250898

Abstract: In a processor, a first operation unit outputs as a first operation result, an output of a first comparison operation unit, or an AND or OR of the output and a value already held in a register according to a first control signal. A second operation unit outputs, as a second operation result, an output of a second comparison operation unit, or an AND or OR of the output and a value already held in the register according to a second control signal. A third operation unit outputs, as an execution result, the first operation result, or an AND or OR of the first operation result and the second operation result to the register according to a third control signal. The register newly holds and outputs the execution result from the third operation unit.

Type: Grant

Filed: November 27, 2012

Date of Patent: February 2, 2016

Assignee: Renesas Electronics Corporation

Inventor: Yuki Kobayashi
Computing device with asynchronous auxiliary execution unit

Patent number: 9201801

Abstract: A computing device includes: an instruction cache storing primary execution unit instructions and auxiliary execution unit instructions in a sequential order; a primary execution unit configured to receive and execute the primary execution unit instructions from the instruction cache; an auxiliary execution unit configured to receive and execute only the auxiliary execution unit instructions from the instruction cache in a manner independent from and asynchronous to the primary execution unit; and completion circuitry configured to coordinate completion of the primary execution unit instructions by the primary execution unit and the auxiliary execution unit instructions according to the sequential order.

Type: Grant

Filed: September 15, 2010

Date of Patent: December 1, 2015

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bechara F. Boury, Michael Bryan Mitchell, Paul Michael Steinmetz, Kenichi Tsuchiya
Method and apparatus for fast decoding and enhancing execution speed of an instruction

Patent number: 9176738

Abstract: Method and apparatus for fast decoding of microinstructions are disclosed. An integrated circuit is disclosed wherein microinstructions are queued for execution in an execution unit having multiple pipelines where each pipeline is configured to execute a set of supported microinstructions. The execution unit receives microinstruction data including an operation code (opcode) or a complex opcode. The execution unit executes the microinstruction multiple times wherein the microinstruction is executed at least once to get an address value and at least once to get a result of an operation. The execution unit processes complex opcodes by utilizing both a load/store support and a simple opcode support by splitting the complex opcode into load/store and simple opcode components and creating an internal source/destination between the two components.

Type: Grant

Filed: January 12, 2011

Date of Patent: November 3, 2015

Assignee: Advanced Micro Devices, Inc.

Inventors: Ganesh Venkataramanan, Emil Talpes
Techniques for reducing memory access bandwidth in a graphics processing system based on destination alpha values

Patent number: 9087409

Abstract: This disclosure describes techniques for reducing memory access bandwidth in a graphics processing system based on destination alpha values. The techniques may include retrieving a destination alpha value from a bin buffer, the destination alpha value being generated in response to processing a first pixel associated with a first primitive. The techniques may further include determining, based on the destination alpha value, whether to perform an action that causes one or more texture values for a second pixel to not be retrieved from a texture buffer. In some examples, the action may include discarding the second pixel from a pixel processing pipeline prior to the second pixel arriving at a texture mapping stage of the pixel processing pipeline. The second pixel may be associated with a second primitive different than the first primitive.

Type: Grant

Filed: March 1, 2012

Date of Patent: July 21, 2015

Assignee: QUALCOMM Incorporated

Inventor: Andrew Gruber

prev 1 2 3 4 5 6 … next