Abstract: A Conditional Transaction End (CTEND) instruction is provided that allows a program executing in a nonconstrained transactional execution mode to inspect a storage location that is modified by either another central processing unit or the Input/Output subsystem. Based on the inspected data, transactional execution may be ended or aborted, or the decision to end/abort may be delayed, e.g., until a predefined event occurs. For instance, when the instruction executes, the processor is in a nonconstrained transaction execution mode, and the transaction nesting depth is one at the beginning of the instruction, a second operand of the instruction is inspected, and based on the inspected data, transaction execution may be ended or aborted, or the decision to end/abort may be delayed, e.g., until a predefined event occurs, such as the value of the second operand becomes a prespecified value or a time interval is exceeded.
Type:
Grant
Filed:
June 12, 2019
Date of Patent:
March 23, 2021
Assignee:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Dan F. Greiner, Christian Jacobi, Marcel Mitran, Donald W. Schmidt, Timothy J. Slegel
Abstract: A processor includes a decode unit to decode a packed data alignment plus compute instruction. The instruction is to indicate a first set of one or more source packed data operands that is to include first data elements, a second set of one or more source packed data operands that is to include second data elements, and at least one data element offset. An execution unit, in response to the instruction, is to store a result packed data operand that is to include result data elements that each have a value of an operation performed with a pair of a data element of the first set of source packed data operands and a data element of the second set of source packed data operands. The execution unit is to apply the at least one data element offset to at least a corresponding one of the first and second sets of source packed data operands. The at least one data element offset is to counteract any lack of correspondence between the data elements of each pair in the first and second sets of source packed data operands.
Type:
Grant
Filed:
April 6, 2018
Date of Patent:
March 2, 2021
Assignee:
Intel Corporation
Inventors:
Edwin Jan Van Dalen, Alexander Augusteijn, Martinus C. Wezelenburg, Steven Roos
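The packed data alignment plus compute abstract above can be illustrated with a minimal software sketch. It assumes that each offset is simply an element index added before the pairwise operation; the function name `align_and_compute`, its parameters, and the default operation (addition) are hypothetical and do not come from the patent.

```python
from operator import add
from typing import Callable, List, Sequence


def align_and_compute(
    first: Sequence[int],
    second: Sequence[int],
    offset_first: int,
    offset_second: int,
    count: int,
    op: Callable[[int, int], int] = add,
) -> List[int]:
    """Pair first[i + offset_first] with second[i + offset_second] and apply op.

    The offsets realign the two sources so that corresponding elements
    meet in the same result lane (an assumed reading of the abstract).
    """
    return [
        op(first[i + offset_first], second[i + offset_second])
        for i in range(count)
    ]


if __name__ == "__main__":
    a = [0, 1, 2, 3, 4, 5, 6, 7]
    b = [10, 20, 30, 40, 50, 60, 70, 80]
    # Elements 1..4 of a are paired with elements 3..6 of b.
    print(align_and_compute(a, b, offset_first=1, offset_second=3, count=4))
```

Passing the offsets as arguments, rather than shifting the inputs first, mirrors the idea of folding the alignment step into the compute instruction itself.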
Abstract: Branch prediction techniques are described that can improve the performance of pipelined microprocessors. A microprocessor with a hierarchical branch prediction structure is presented. The hierarchy of branch predictors includes: a multi-cycle predictor that provides very accurate branch predictions, but with a latency of multiple cycles; a small and simple branch predictor that can provide branch predictions for a sub-set of instructions with zero-cycle latency; and a fast, intermediate level branch predictor that provides relatively accurate branch prediction, while still having a low, but non-zero instruction prediction latency of only one cycle, for example. To improve operation, the higher accuracy, higher latency branch direction predictor and the fast, lower latency branch direction predictor can share a common target predictor.
Type:
Grant
Filed:
April 11, 2018
Date of Patent:
February 23, 2021
Assignee:
Futurewei Technologies, Inc.
Inventors:
Shiwen Hu, Wei Yu Chen, Michael Chow, Qian Wang, Yongbin Zhou, Lixia Yang, Ning Yang
Abstract: A computer system, processor, and method for processing information are disclosed that include determining whether an instruction is a designated instruction, determining whether an instruction following the designated instruction is a subsequent store instruction, and speculatively releasing the subsequent store instruction while the designated instruction is pending and before the subsequent store instruction is complete. Preferably, in response to determining that an instruction is the designated instruction, a speculative tail pointer in an instruction completion table (ICT) is initiated or advanced to look through the instructions in the ICT following the designated instruction.
Type:
Grant
Filed:
February 6, 2019
Date of Patent:
February 23, 2021
Assignee:
International Business Machines Corporation
Inventors:
Kenneth L. Ward, Hung Q. Le, Dung Q. Nguyen, Bryan Lloyd
Abstract: A computer processor is disclosed. The computer processor may comprise a vector unit comprising a vector register file comprising at least one register to hold a varying number of elements. The computer processor may further comprise processing logic configured to operate on the varying number of elements in the vector register file using one or more graphics processing instructions. The computer processor may be implemented as a monolithic integrated circuit.
Type:
Grant
Filed:
May 21, 2015
Date of Patent:
February 16, 2021
Assignee:
Optimum Semiconductor Technologies Inc.
Inventors:
Mayan Moudgill, Gary J. Nacer, C. John Glossner, Arthur Joseph Hoane, Vitaly Kalashnikov, Sitij Agrawal
Abstract: Provided is a data processing method including the operations of storing, in a register, a first immediate portion included in a first instruction, from among the first immediate portion and a second immediate portion that constitute an immediate value, which is an operand; determining the immediate value by catenating the second immediate portion included in a second instruction with the stored first immediate portion; and performing an operation by using a value indicated by the second instruction and the determined immediate value.
Type:
Grant
Filed:
October 14, 2015
Date of Patent:
February 9, 2021
Assignee:
Samsung Electronics Co., Ltd.
Inventors:
Ki-seok Kwon, Min-wook Ahn, Suk-jin Kim, Young-hwan Park
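The two-instruction immediate scheme in the abstract above can be sketched as follows. This is an illustration under stated assumptions: the stored first portion is taken to form the high-order bits, the second portion is assumed to be 16 bits wide, and the operation is assumed to be an addition. The class `ImmediateAssembler` and its method names are hypothetical.

```python
class ImmediateAssembler:
    """Toy model of building one immediate value from two instructions."""

    LOW_BITS = 16  # assumed width of the second immediate portion

    def __init__(self) -> None:
        # Register that holds the first immediate portion between instructions.
        self.prefix_register = 0

    def store_first_portion(self, first_portion: int) -> None:
        """First instruction: save its immediate portion in the register."""
        self.prefix_register = first_portion

    def build_immediate(self, second_portion: int) -> int:
        """Concatenate the stored portion (high bits) with the second portion."""
        mask = (1 << self.LOW_BITS) - 1
        return (self.prefix_register << self.LOW_BITS) | (second_portion & mask)

    def add_immediate(self, register_value: int, second_portion: int) -> int:
        """Second instruction: operate on its own value and the full immediate."""
        return register_value + self.build_immediate(second_portion)


if __name__ == "__main__":
    asm = ImmediateAssembler()
    asm.store_first_portion(0x1234)
    print(hex(asm.build_immediate(0x5678)))
```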
Abstract: An apparatus and method are described for generating performance metrics of a processor. For example, one embodiment of a processor comprises: one or more simultaneous multithreading cores to simultaneously execute multiple instruction threads; a plurality of performance monitor counters, each to maintain a count of events occurring as a result of the execution of the multiple instruction threads; and a performance monitor unit to generate a plurality of performance metric values using the event counts stored in the performance monitor counters and in response to receipt of a request from software for the performance metric values.
Type:
Grant
Filed:
December 30, 2016
Date of Patent:
February 2, 2021
Assignee:
Intel Corporation
Inventors:
Ahmad Yasin, Moshe Cohen, Jacob Jack Doweck
Abstract: A streaming engine employed in a digital data processor specifies a fixed read only data stream defined by plural nested loops. An address generator produces addresses of data elements. A stream head register stores data elements next to be supplied to functional units for use as operands. Upon a stream break instruction specifying one of the nested loops, the streaming engine ends a current iteration of the loop. If the specified loop was not the outermost loop, the streaming engine begins an iteration of a next outer loop. If the specified loop was the outermost nested loop, the streaming engine ends the stream. The streaming engine places a vector of data elements in order in lanes within a stream head register. A stream break instruction is operable upon a vector break.
Abstract: A computer processor may include a plurality of hardware threads. The computer processor may further include state processor logic for a state of a hardware thread. The state processor logic may include per thread logic that contains state that is replicated in each hardware thread of the plurality of hardware threads and common logic that is independent of each hardware thread of the plurality of hardware threads. The computer processor may further include single threaded mode logic to execute instructions in a single threaded mode from only one hardware thread of the plurality of hardware threads. The computer processor may further include second mode logic to execute instructions in a second mode from more than one hardware thread of the plurality of hardware threads simultaneously. The computer processor may further include switching mode logic to switch between the first mode and the second mode.
Type:
Grant
Filed:
May 16, 2016
Date of Patent:
February 2, 2021
Assignee:
Optimum Semiconductor Technologies Inc.
Inventors:
Mayan Moudgill, Gary Nacer, C. John Glossner, Arthur Joseph Hoane, Paul Hurtley, Murugappan Senthilvelan
Abstract: A conditional instruction end facility is provided that allows completion of an instruction to be delayed. In executing the machine instruction, an operand is obtained, and a determination is made as to whether the operand has a predetermined relationship with respect to a value. Based on determining that the operand does not have the predetermined relationship with respect to the value, the obtaining and the determining are repeated. Based on determining that the operand has the predetermined relationship with respect to the value, execution of the instruction is completed.
Type:
Grant
Filed:
July 17, 2019
Date of Patent:
January 26, 2021
Assignee:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Dan F. Greiner, Christian Jacobi, Marcel Mitran, Donald W. Schmidt, Timothy J. Slegel
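The obtain-and-compare loop described in the conditional instruction end abstract above can be modeled in software roughly as follows. The function `conditional_end`, its callable parameters, and the `max_polls` bound are hypothetical; the abstract itself does not state a poll limit, which is added here only so the sketch always terminates.

```python
from typing import Callable


def conditional_end(
    fetch_operand: Callable[[], int],
    has_relationship: Callable[[int, int], bool],
    value: int,
    max_polls: int = 1000,
) -> int:
    """Repeat obtaining the operand until it relates to value as required.

    Returns the number of polls needed. Raises TimeoutError when the
    assumed max_polls bound is reached without the relationship holding.
    """
    for polls in range(1, max_polls + 1):
        operand = fetch_operand()  # obtain the operand again on each pass
        if has_relationship(operand, value):
            return polls  # the instruction completes
    raise TimeoutError("relationship not reached within max_polls")


if __name__ == "__main__":
    samples = iter([0, 0, 5])
    print(conditional_end(lambda: next(samples), lambda a, b: a == b, 5))
```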
Abstract: Systems and methods for controlling machine operations are provided. A number of data entries are organized into a stack. Each data entry includes a type, a flag, a length, and a value or pointer entry. For each data entry in the stack, the type of data is determined from the type entry, the presence of an address or value is determined by the respective flag entry, and a length of the address or value is determined from the respective length entry. The data to be utilized or an address for the same at the electronic storage area is provided at the respective value or pointer entry.
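One way to picture the stack of data entries in the abstract above is the sketch below. It assumes the electronic storage area is a byte string and that the flag is a boolean meaning "the payload is an address"; `Entry`, `resolve`, and all field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Entry:
    type_code: str               # type entry: what kind of data this is
    is_pointer: bool             # flag entry: True if payload is an address
    length: int                  # length entry: number of bytes to use
    payload: Union[int, bytes]   # value-or-pointer entry


def resolve(entry: Entry, storage: bytes) -> bytes:
    """Return the data for an entry, following the pointer when flagged."""
    if entry.is_pointer:
        start = entry.payload
        return storage[start:start + entry.length]
    return entry.payload[:entry.length]


if __name__ == "__main__":
    storage = b"abcdefgh"
    stack = [
        Entry("text", False, 2, b"hi"),  # inline value
        Entry("text", True, 3, 2),       # address 2 in storage, 3 bytes
    ]
    for entry in stack:
        print(entry.type_code, resolve(entry, storage))
```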
Abstract: A processor includes an array of resistive processing units connected between row and column lines with a resistive element. A first single instruction, multiple data processing unit (SIMD) is connected to the row lines. A second SIMD is connected to the column lines. A first instruction issuer is connected to the first SIMD to issue instructions to the first SIMD, and a second instruction issuer is connected to the second SIMD to issue instructions to the second SIMD such that the processor is programmable and configurable for specific operations depending on an issued instruction set.
Type:
Grant
Filed:
October 30, 2015
Date of Patent:
January 26, 2021
Assignee:
International Business Machines Corporation
Abstract: Examples of the present disclosure provide apparatuses and methods for determining a vector population count in a memory. An example method comprises determining, using sensing circuitry, a vector population count of a number of fixed length elements of a vector stored in a memory array.
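As a software reference for the quantity named in the abstract above, the sketch below counts the set bits in each fixed-length element of a vector. It does not model the memory array or sensing circuitry; the function `vector_popcount` is hypothetical, and returning one count per element (rather than a single total) is an assumption.

```python
from typing import List, Sequence


def vector_popcount(elements: Sequence[int], element_bits: int) -> List[int]:
    """Count the 1 bits in each element, treating each as element_bits wide."""
    mask = (1 << element_bits) - 1
    return [bin(element & mask).count("1") for element in elements]


if __name__ == "__main__":
    counts = vector_popcount([0b1011, 0b0000, 0b1111], element_bits=4)
    print(counts, sum(counts))
```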
Abstract: Aspects include monitoring a number of instructions of a first type dispatched to a first shared port of an issue queue of a processor and determining whether the number of instructions of the first type dispatched to the first shared port exceeds a port selection threshold. An instruction of a third type is dispatched to a second shared port of the issue queue associated with a plurality of instructions of a second type based on determining that the number of instructions of the first type dispatched to the first shared port exceeds the port selection threshold. The instruction of the third type is dispatched to the first shared port of the issue queue associated with a plurality of instructions of the first type based on determining that the number of instructions of the first type dispatched to the first shared port does not exceed the port selection threshold.
Type:
Grant
Filed:
November 30, 2017
Date of Patent:
January 5, 2021
Assignee:
INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors:
Balaram Sinharoy, Joel A. Silberman, Brian W. Thompto
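The threshold-based port choice in the abstract above can be sketched as a small dispatcher. The class `IssueQueueDispatcher`, the port constants, and the string type labels are hypothetical, and the count is assumed to be cumulative because the abstract does not describe a monitoring window.

```python
FIRST_PORT = 0   # shared port associated with first-type instructions
SECOND_PORT = 1  # shared port associated with second-type instructions


class IssueQueueDispatcher:
    def __init__(self, port_selection_threshold: int) -> None:
        self.threshold = port_selection_threshold
        self.first_type_count = 0  # first-type instructions sent to FIRST_PORT

    def dispatch(self, instruction_type: str) -> int:
        """Return the shared port chosen for an instruction of the given type."""
        if instruction_type == "first":
            self.first_type_count += 1
            return FIRST_PORT
        if instruction_type == "second":
            return SECOND_PORT
        if instruction_type == "third":
            # Steer third-type work away from the first port once it is busy.
            if self.first_type_count > self.threshold:
                return SECOND_PORT
            return FIRST_PORT
        raise ValueError(f"unknown instruction type: {instruction_type}")


if __name__ == "__main__":
    dispatcher = IssueQueueDispatcher(port_selection_threshold=2)
    print(dispatcher.dispatch("third"))
    for _ in range(3):
        dispatcher.dispatch("first")
    print(dispatcher.dispatch("third"))
```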
Abstract: A method for a plurality of pipelines, each having a processing element having first and second inputs and first and second lines, wherein at least one of the pipelines includes first and second logic operable to select a respective line so that data is received at the first and second inputs respectively. A first mode is selected and for the at least one pipeline, the first and second lines of that pipeline are selected such that the processing element of that pipeline receives data via the first and second lines of that pipeline, the first line being capable of supplying data that is different to the second line. A second mode is selected and for the at least one pipeline a line of another pipeline is selected, the second line of the at least one pipeline is selected and the same data at the second line is supplied as the first line.
Abstract: An integrated circuit (IC) may include a scheduler for hardware acceleration. The scheduler may include a command queue having a plurality of slots and configured to store commands offloaded from a host processor for execution by compute units of the IC. The scheduler may include a status register having bit locations corresponding to the slots of the command queue. The scheduler may also include a controller coupled to the command queue and the status register. The controller may be configured to schedule the compute units of the IC to execute the commands stored in the slots of the command queue and update the bit locations of the status register to indicate which commands from the command queue are finished executing.
Type:
Grant
Filed:
May 24, 2018
Date of Patent:
December 29, 2020
Assignee:
Xilinx, Inc.
Inventors:
Soren T. Soe, Idris I. Tarwala, Umang Parekh, Sonal Santan, Hem C. Neema
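The command queue and status register pairing described in the abstract above might be modeled as in this sketch. It assumes that commands are plain Python callables run in slot order and that the host frees a slot by acknowledging it; the class `Scheduler` and its methods `enqueue`, `run_pending`, and `acknowledge` are hypothetical.

```python
from typing import Callable, List, Optional


class Scheduler:
    def __init__(self, num_slots: int) -> None:
        self.slots: List[Optional[Callable[[], None]]] = [None] * num_slots
        self.status = 0  # bit i is set when the command in slot i has finished

    def enqueue(self, command: Callable[[], None]) -> int:
        """Store a command in the first free slot and return the slot index."""
        for index, slot in enumerate(self.slots):
            if slot is None:
                self.slots[index] = command
                self.status &= ~(1 << index)
                return index
        raise RuntimeError("command queue full")

    def run_pending(self) -> None:
        """Run every stored command that is not yet marked finished."""
        for index, command in enumerate(self.slots):
            if command is not None and not (self.status >> index) & 1:
                command()  # stands in for a compute unit executing the command
                self.status |= 1 << index

    def acknowledge(self, index: int) -> None:
        """Host has seen the completion: clear the bit and free the slot."""
        self.status &= ~(1 << index)
        self.slots[index] = None


if __name__ == "__main__":
    scheduler = Scheduler(num_slots=4)
    log: List[str] = []
    scheduler.enqueue(lambda: log.append("a"))
    scheduler.enqueue(lambda: log.append("b"))
    scheduler.run_pending()
    print(log, bin(scheduler.status))
```

With one status bit per slot, the host can find out which commands have finished by reading a single register rather than inspecting each slot.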
Abstract: A processing system includes a processing pipeline which includes fetch circuitry for fetching instructions to be executed from a memory. Buffer control circuitry is responsive to a programmable trigger, such as explicit hint instructions delimiting an instruction burst, or predetermined configuration data specifying parameters of a burst together with a synchronising instruction, to trigger the buffer control circuitry to stall a stallable portion of the processing pipeline (e.g. issue circuitry), to accumulate within one or more buffers fetched instructions starting from a predetermined starting instruction, and, when those instructions have been accumulated, to restart the stallable portion of the pipeline.
Type:
Grant
Filed:
May 9, 2018
Date of Patent:
December 15, 2020
Assignee:
ARM LIMITED
Inventors:
Jatin Bhartia, Kauser Yakub Johar, Antony John Penton
Abstract: A method for scheduling micro-instructions, performed by a qualifier, is provided. The method includes the following steps: detecting a load write-back signal broadcast by a load execution unit; determining whether to trigger a load-detection counting logic according to content of the load write-back signal; determining whether an execution status of a load micro-instruction is cache hit when the triggered load-detection counting logic reaches a predetermined value; and driving a release circuit to remove the first micro-instruction in a reservation station queue when the execution status of the load micro-instruction is cache hit and the first micro-instruction has been dispatched to an arithmetic and logic unit for execution.
Abstract: An apparatus is provided comprising rewritable storage circuitry to store at least one mapping between at least one instruction identifier and a behaviour modification. Selection circuitry selects, from the rewritable storage circuitry, a selected mapping having an instruction identifier that identifies a received instruction. The received instruction causes a data processing unit to perform a default behaviour. Control circuitry causes the data processing unit to behave in accordance with the default behaviour modified by the behaviour modification.
Type:
Grant
Filed:
October 30, 2015
Date of Patent:
December 8, 2020
Assignee:
ARM Limited
Inventors:
Karel Hubertus Gerardus Walters, Adam Raymond Duley
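The rewritable mapping from instruction identifiers to behaviour modifications in the abstract above can be pictured with the following sketch. It assumes a modification is a function applied to the result of the default behaviour, which is only one possible reading; `ModifiableUnit`, `set_modification`, and the opcode strings are hypothetical.

```python
from typing import Callable, Dict


class ModifiableUnit:
    def __init__(self, defaults: Dict[str, Callable[..., int]]) -> None:
        self.defaults = dict(defaults)  # default behaviour per instruction
        # Rewritable storage: instruction identifier -> behaviour modification.
        self.modifications: Dict[str, Callable[[int], int]] = {}

    def set_modification(self, opcode: str, modification: Callable[[int], int]) -> None:
        """Write or overwrite the mapping for one instruction identifier."""
        self.modifications[opcode] = modification

    def execute(self, opcode: str, *operands: int) -> int:
        """Run the default behaviour, then apply any selected modification."""
        result = self.defaults[opcode](*operands)
        modification = self.modifications.get(opcode)
        return modification(result) if modification else result


if __name__ == "__main__":
    unit = ModifiableUnit({"add": lambda a, b: a + b})
    print(unit.execute("add", 250, 10))
    unit.set_modification("add", lambda r: min(r, 255))  # saturate at 255
    print(unit.execute("add", 250, 10))
```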
Abstract: A computer processor is disclosed. The computer processor comprises a vector unit comprising a vector register file comprising at least one vector register to hold a varying number of elements. The computer processor further comprises out-of-order issue logic that holds a pool of vector instructions, selects a vector instruction from the pool, and sends the vector instruction for execution. The vector instruction operates on the varying number of elements of the at least one vector register.
Type:
Grant
Filed:
May 21, 2015
Date of Patent:
November 24, 2020
Assignee:
Optimum Semiconductor Technologies Inc.
Inventors:
Mayan Moudgill, Gary J. Nacer, C. John Glossner, Arthur Joseph Hoane, Murugappan Senthilvelan, Pablo Balzola