Patents Examined by John Lindlof
-
Patent number: 9910670Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline includes an instruction fetch stage to fetch an instruction. The instruction format of the instruction specifies a first input vector, a second input vector and a third input operand. The instruction execution pipeline comprises an instruction decode stage to decode the instruction. The instruction execution pipeline includes a functional unit to execute the instruction. The functional unit includes a routing network to route a first contiguous group of elements from a first end of one of the input vectors to a second end of the instruction's resultant vector, and, route a second contiguous group of elements from a second end of the other of the input vectors to a first end of the instruction's resultant vector. The first and second ends are opposite vector ends. The first and second groups of contiguous elements are defined from the third input operand.Type: GrantFiled: July 9, 2014Date of Patent: March 6, 2018Assignee: INTEL CORPORATIONInventors: Mikhail Plotnikov, Igor Ermolaev
-
Patent number: 9880848Abstract: A processing core of a plurality of processing cores is configured to execute a speculative region of code as a single atomic memory transaction with respect one or more others of the plurality of processing cores. In response to determining an abort condition for an issued one of the plurality of program instructions and in response to determining that the issued program instruction is not part of a mispredicted execution path, the processing core is configured to abort an attempt to execute the speculative region of code.Type: GrantFiled: June 11, 2010Date of Patent: January 30, 2018Assignee: Advanced Micro Devices, Inc.Inventors: Jaewoong Chung, David S. Christie, Michael P. Hohmuth, Stephan Diestelhorst, Martin T. Pohlack, Luke Yen
-
Patent number: 9880839Abstract: A processor is described having an instruction execution pipeline. The instruction execution pipeline has an instruction fetch stage to fetch an instruction specifying multiple target resultant registers. The instruction execution pipeline has an instruction decode stage to decode the instruction. The instruction execution pipeline has a functional unit to prepare resultant content specific to each of the multiple target resultant registers. The instruction execution pipeline has a write-back stage to write back said resultant content specific to each of said multiple target resultant registers.Type: GrantFiled: April 24, 2014Date of Patent: January 30, 2018Assignee: INTEL CORPORATIONInventors: Wei-Yu Chen, Guei-Yuan Lueh, Subramaniam Maiyuran, Supratim Pal
-
Patent number: 9870226Abstract: A data processing apparatus includes a first execution mechanism, such as an out-of-order processing circuitry, and a second execution mechanism 6 such as an in-order processing circuitry. Switching control circuitry controls switching between which of the first execution circuitry and the second execution circuitry is active at a given time. Latency indicating signals indicative of the latency associated with a candidate switching operation to be performed are supplied to the switching control circuitry and used to control the switching operation. The control of the switching operation may be to accelerate the switching operation, prevent the switching operation, perform early architectural state data transfer or other possibilities.Type: GrantFiled: July 3, 2014Date of Patent: January 16, 2018Assignee: The Regents of the University of MichiganInventors: Shruti Padmanabha, Andrew Lukefahr, Reetuparna Das, Scott Mahlke
-
Patent number: 9870230Abstract: A data processing apparatus comprising a processor for executing a data processing process and a processor for executing a tuning process is disclosed. The data processing apparatus is arranged such that the tuning process which is a different process to the data processing process can access the parameters of speculative mechanisms of the data processing process and tune the parameters so that the mechanisms speculate differently and in this way the performance of this data processing process can be improved.Type: GrantFiled: June 12, 2009Date of Patent: January 16, 2018Assignee: ARM LimitedInventors: Simon Andrew Ford, Stephen John Hill
-
Patent number: 9830158Abstract: One embodiment of the present invention sets forth a technique for speculatively issuing instructions to allow a processing pipeline to continue to process some instructions during rollback of other instructions. A scheduler circuit issues instructions for execution assuming that, several cycles later, when the instructions reach multithreaded execution units, that dependencies between the instructions will be resolved, resources will be available, operand data will be available, and other conditions will not prevent execution of the instructions. When a rollback condition exists at the point of execution for an instruction for a particular thread group, the instruction is not dispatched to the multithreaded execution units. However, other instructions issued by the scheduler circuit for execution by different thread groups, and for which a rollback condition does not exist, are executed by the multithreaded execution units.Type: GrantFiled: November 4, 2011Date of Patent: November 28, 2017Assignee: NVIDIA CORPORATIONInventors: Jack Hilaire Choquette, Olivier Giroux, Robert J. Stoll, Xiaogang Qiu
-
Patent number: 9823924Abstract: A Vector Element Rotate and Insert Under Mask instruction. Each element of a second operand of the instruction is rotated in a specified direction by a specified number of bits. For each bit in a third operand of the instruction that is set to one, the corresponding bit of the rotated elements in the second operand replaces the corresponding bit in a first operand of the instruction.Type: GrantFiled: January 23, 2013Date of Patent: November 21, 2017Assignee: International Business Machines CorporationInventors: Jonathan D. Bradbury, Robert F. Enenkel, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9823926Abstract: A Vector Element Rotate and Insert Under Mask instruction. Each element of a second operand of the instruction is rotated in a specified direction by a specified number of bits. For each bit in a third operand of the instruction that is set to one, the corresponding bit of the rotated elements in the second operand replaces the corresponding bit in a first operand of the instruction.Type: GrantFiled: December 5, 2014Date of Patent: November 21, 2017Assignee: International Business Machines CorporationInventors: Jonathan D. Bradbury, Robert F. Enenkel, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9811450Abstract: A tester instruction generation unit generates a tester instruction for terminals of a plurality of devices connected to a tester based on an instruction of a user program and causes an instruction storage unit to store the tester instruction. A transfer mode setting unit sets a transfer mode to either a successive transfer mode or a batch transfer mode, based on the number of tester instructions in the instruction storage unit or an instruction of the user program. A transfer control unit transmits the tester instruction in the instruction storage unit to the tester in accordance with the set transfer mode.Type: GrantFiled: April 29, 2014Date of Patent: November 7, 2017Assignee: RENESAS ELECTRONICS CORPORATIONInventors: Yukikazu Matsuo, Yasuyuki Tanaka, Masaru Sugimoto, Kyosaku Nobunaga
-
Patent number: 9792123Abstract: Methods and indirect branch predictor logic units to predict the target addresses of indirect branch instructions. The method comprises storing in a table predicted target addresses for indirect branch instructions indexed by a combination of the indirect path history for previous indirect branch instructions and the taken/not-taken history for previous conditional branch instructions. When a new indirect branch instruction is received for prediction, the indirect path history and the taken/not-taken history are combined to generate an index for the indirect branch instruction. The generated index is then used to identify a predicted target address in the table. If the identified predicted target address is valid, then the target address of the indirect branch instruction is predicted to be the predicted target address.Type: GrantFiled: January 31, 2014Date of Patent: October 17, 2017Assignee: Imagination Technologies LImitedInventor: Manouk Manoukian
-
Patent number: 9772843Abstract: Processing of character data is facilitated. A Find Element Equal instruction is provided that compares data of multiple vectors for equality and provides an indication of equality, if equality exists. An index associated with the equal element is stored in a target vector register. Further, the same instruction, the Find Element Equal instruction, also searches a selected vector for null elements, also referred to as zero elements. A result of the instruction is dependent on whether the null search is provided, or just the compare.Type: GrantFiled: March 3, 2013Date of Patent: September 26, 2017Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Jonathan D. Bradbury, Michael K. Gschwind, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9766887Abstract: A processor fetches a multi-register gather instruction that includes a destination operand that specifies a destination vector register, and a source operand that identifies content that indicates multiple vector registers, a first set of indexes of each of the vector registers that each identifies a source data element, and a second set of indexes of the destination vector register for each identified source element. The instruction is decoded and executed, causing, for each of the first set of indexes of each of the vector registers, the source data element that corresponds to that index of that vector register to be stored in a set of destination data elements that correspond to the second set of identified indexes of the destination vector register for that source data element.Type: GrantFiled: December 23, 2011Date of Patent: September 19, 2017Assignee: Intel CorporationInventor: Ashish Jha
-
Patent number: 9753724Abstract: A data processing apparatus, method and computer program that perform an operation on one data element such as a register and conditionally select either that register or a further register on which no operation has been performed. The apparatus includes an instruction decoder configured to decode at least one conditional select instruction specifying a primary source register, a secondary source register, a destination register, a condition, and an operation to be performed on a data element from the secondary source register. Data processing operations are controlled by the instruction decoder and the data processor is responsive to the decoded at least one conditional select instruction where the condition does not have the predetermined outcome to form the resultant data element from the data element from the primary register and to store the resultant data element in the destination register.Type: GrantFiled: September 23, 2011Date of Patent: September 5, 2017Assignee: ARM LimitedInventors: Simon John Craske, Richard Roy Grisenthwaite, Nigel John Stephens
-
Patent number: 9753730Abstract: A data processing apparatus, method and computer program are described that are capable of decoding instructions from different instruction sets. The method comprising: receiving an instruction; if an operation code of said instruction is an operation code of an instruction from a base set of instructions decoding said instruction according to decode rules for said base set of instructions; and if said operation code of said instruction is an operation code of an instruction from at least one further set of instructions decoding said instruction according to a set of decode rules determined by an indicator value indicating which of said at least one further set of instructions is currently to be decoded.Type: GrantFiled: September 23, 2011Date of Patent: September 5, 2017Assignee: ARM LimitedInventor: Simon John Craske
-
Patent number: 9747132Abstract: A multi-core processor includes a plurality of former-stage cores that perform parallel processing using a plurality of pipelines covering a plurality of stages. In the pipelines, the former-stage cores perform stages ending with an instruction decode stage; stages starting with an instruction execution stage are executed by a latter-stage core. A dynamic load distribution block refers to decode results in the instruction decode stage and controls to assign the latter-stage core with a latter-stage-needed decode result being a decode result whose processing needs to be executed in the latter-stage core.Type: GrantFiled: April 4, 2014Date of Patent: August 29, 2017Assignee: DENSO CORPORATIONInventors: Hirofumi Yamamoto, Takeshi Kondo, Shinichirou Taguchi, Takatoshi Nomura, Daihan Wang, Tomoyoshi Funazaki, Yukoh Matsumoto
-
Patent number: 9727333Abstract: Processing of character data is facilitated. A Find Element Equal instruction is provided that compares data of multiple vectors for equality and provides an indication of equality, if equality exists. An index associated with the equal element is stored in a target vector register. Further, the same instruction, the Find Element Equal instruction, also searches a selected vector for null elements, also referred to as zero elements. A result of the instruction is dependent on whether the null search is provided, or just the compare.Type: GrantFiled: March 3, 2013Date of Patent: August 8, 2017Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Jonathan D. Bradbury, Michael K. Gschwind, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9715383Abstract: Processing of character data is facilitated. A Find Element Equal instruction is provided that compares data of multiple vectors for equality and provides an indication of equality, if equality exists. An index associated with the equal element is stored in a target vector register. Further, the same instruction, the Find Element Equal instruction, also searches a selected vector for null elements, also referred to as zero elements. A result of the instruction is dependent on whether the null search is provided, or just the compare.Type: GrantFiled: March 15, 2012Date of Patent: July 25, 2017Assignee: International Business Machines CorporationInventors: Jonathan D. Bradbury, Michael K. Gschwind, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9710266Abstract: A Load Count to Block Boundary instruction is provided that provides a distance from a specified memory address to a specified memory boundary. The memory boundary is a boundary that is not to be crossed in loading data. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary; or it may be dynamically determined.Type: GrantFiled: March 15, 2012Date of Patent: July 18, 2017Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Jonathan D. Bradbury, Michael K. Gschwind, Christian Jacobi, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9710267Abstract: A Load Count to Block Boundary instruction is provided that provides a distance from a specified memory address to a specified memory boundary. The memory boundary is a boundary that is not to be crossed in loading data. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary; or it may be dynamically determined.Type: GrantFiled: March 3, 2013Date of Patent: July 18, 2017Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventors: Jonathan D. Bradbury, Michael K. Gschwind, Christian Jacobi, Eric M. Schwarz, Timothy J. Slegel
-
Patent number: 9710268Abstract: Systems, methods, and apparatuses for reducing the load to load/store address latency in an out-of-order processor. When a producer load is detected in the processor pipeline, the processor predicts whether the producer load is going to hit in the store queue. If the producer load is predicted not to hit in the store queue, then a dependent load or store can be issued early. The result data of the producer load is then bypassed forward from the data cache directly to the address generation unit. This result data is then used to generate an address for the dependent load or store, reducing the latency of the dependent load or store by one clock cycle.Type: GrantFiled: April 29, 2014Date of Patent: July 18, 2017Assignee: Apple Inc.Inventors: Stephan G. Meier, Pradeep Kanapathipillai, Sandeep Gupta