Patents Examined by William Nguyen

Transaction abort instruction specifying a reason for abort

Patent number: 9996360

Abstract: A TRANSACTION ABORT instruction is used to abort a transaction that is executing in a computing environment. The TRANSACTION ABORT instruction includes at least one field used to specify a user-defined abort code that indicates the specific reason for aborting the transaction. Based on executing the TRANSACTION ABORT instruction, a condition code is provided that indicates whether re-execution of the transaction is recommended.

Type: Grant

Filed: August 9, 2016

Date of Patent: June 12, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dan F. Greiner, Christian Jacobi, Marcel Mitran, Timothy J. Slegel
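
The abort-code and condition-code behavior described in this abstract can be pictured with a small software model. This is a minimal sketch, not the patented hardware: `transaction_abort`, `run_transaction`, and `AbortResult` are hypothetical names, and the rule that the low bit of the abort code selects the retry hint (condition code 2 or 3) is an assumption made for illustration only.

```python
from dataclasses import dataclass


class TransactionAborted(Exception):
    """Raised inside a simulated transaction to abort it with a user-defined code."""

    def __init__(self, abort_code: int):
        super().__init__(f"transaction aborted, code {abort_code}")
        self.abort_code = abort_code


@dataclass
class AbortResult:
    abort_code: int      # user-defined reason carried by the instruction's field
    condition_code: int  # assumed encoding: 2 = retry may help, 3 = do not retry


def transaction_abort(abort_code: int) -> None:
    """Stand-in for the TRANSACTION ABORT instruction (hypothetical helper)."""
    raise TransactionAborted(abort_code)


def run_transaction(body, state: dict):
    """Run body on a private copy of state; commit only if no abort occurs."""
    working = dict(state)  # speculative copy, discarded on abort
    try:
        body(working)
    except TransactionAborted as abort:
        # Assumption of this sketch: the low bit of the code picks the hint.
        cc = 3 if abort.abort_code & 1 else 2
        return AbortResult(abort.abort_code, cc)
    state.clear()
    state.update(working)  # commit the speculative updates
    return None


def withdraw(amount):
    """Build a transaction body that aborts with code 0x101 on overdraft."""
    def body(s):
        s["balance"] -= amount
        if s["balance"] < 0:
            transaction_abort(0x101)
    return body
```

Calling `run_transaction(withdraw(50), {"balance": 20})` returns an `AbortResult` and leaves the balance untouched, so the caller can inspect both the reason and the retry hint before deciding what to do next.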
Transaction abort instruction specifying a reason for abort

Patent number: 9983883

Abstract: A TRANSACTION ABORT instruction is used to abort a transaction that is executing in a computing environment. The TRANSACTION ABORT instruction includes at least one field used to specify a user-defined abort code that indicates the specific reason for aborting the transaction. Based on executing the TRANSACTION ABORT instruction, a condition code is provided that indicates whether re-execution of the transaction is recommended.

Type: Grant

Filed: August 9, 2016

Date of Patent: May 29, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dan F. Greiner, Christian Jacobi, Marcel Mitran, Timothy J. Slegel

Computer processor employing double-ended instruction decoding

Patent number: 9959119

Abstract: A computer processor including an instruction buffer configured to store at least one variable-length instruction having a bit bundle bounded by a head end and a tail end with a plurality of slots each defining a corresponding operation, wherein the plurality of slots and corresponding operations are logically partitioned into a plurality of distinct blocks with a first group of blocks extending from the head end of the bit bundle toward the tail end of the bit bundle and a second group of blocks extending from the tail end of the bit bundle toward the head end of the bit bundle, wherein the second group of blocks includes a tail end block disposed adjacent the tail end of the bit bundle. A decode stage is operably coupled to the instruction buffer and configured to process a given variable-length instruction stored by the instruction buffer by decoding at least one operation of a particular block belonging to the first group of blocks in parallel with decoding at least one operation of the tail end block.

Type: Grant

Filed: May 29, 2014

Date of Patent: May 1, 2018

Assignee: MILL COMPUTING, INC.

Inventors: Roger Rawson Godard, Arthur David Kahlich, David Arthur Yost

Apparatus and method of improved permute instructions with multiple granularities

Patent number: 9946540

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

Type: Grant

Filed: May 22, 2017

Date of Patent: April 17, 2018

Assignee: INTEL CORPORATION

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

Tree-based thread management

Patent number: 9921847

Abstract: In one embodiment of the present invention, a streaming multiprocessor (SM) uses a tree of nodes to manage threads. Each node specifies a set of active threads and a program counter. Upon encountering a conditional instruction that causes an execution path to diverge, the SM creates child nodes corresponding to each of the divergent execution paths. Based on the conditional instruction, the SM assigns each active thread included in the parent node to at most one child node, and the SM temporarily discontinues executing instructions specified by the parent node. Instead, the SM concurrently executes instructions specified by the child nodes. After all the divergent paths reconverge to the parent path, the SM resumes executing instructions specified by the parent node. Advantageously, the disclosed techniques enable the SM to execute divergent paths in parallel, thereby reducing undesirable program behavior associated with conventional techniques that serialize divergent paths across thread groups.

Type: Grant

Filed: January 21, 2014

Date of Patent: March 20, 2018

Assignee: NVIDIA Corporation

Inventor: John Erik Lindholm
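
The node tree in this abstract can be modeled in a few lines. The sketch below is illustrative only: `Node`, `diverge`, and `reconverge` are hypothetical names, the two-way branch is an assumption (the abstract allows any number of divergent paths), and nothing here reflects how the hardware actually schedules the child nodes.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    active: frozenset            # thread ids active at this node
    pc: int                      # program counter shared by those threads
    parent: "Node | None" = None
    children: list = field(default_factory=list)
    suspended: bool = False      # True while child nodes execute instead


def diverge(node, predicate, taken_pc, fallthrough_pc):
    """Split node's active threads between child nodes at a conditional branch."""
    taken = frozenset(t for t in node.active if predicate(t))
    not_taken = node.active - taken
    for threads, pc in ((taken, taken_pc), (not_taken, fallthrough_pc)):
        if threads:  # every thread ends up in at most one child
            node.children.append(Node(threads, pc, parent=node))
    node.suspended = bool(node.children)
    return list(node.children)


def reconverge(child, join_pc):
    """Retire a child at the join point; resume the parent once none remain."""
    parent = child.parent
    parent.children.remove(child)
    if not parent.children:
        parent.pc = join_pc
        parent.suspended = False
    return parent
```

With a root node holding threads 0 to 3, `diverge(root, lambda t: t % 2 == 0, 10, 20)` yields one child for the even threads and one for the odd threads; the root stays suspended until `reconverge` has been called for both.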
Instruction output dependent on a random number-based selection or non-selection of a special command from a group of commands

Patent number: 9904616

Abstract: Generating instructions, in particular for mailbox verification in a simulation environment. A sequence of instructions is received, as well as selection data representative of a plurality of commands including a special command. Repeatedly selecting one of the plurality of commands and outputting an instruction based on the selected command. The outputting of an instruction includes outputting a next instruction in the sequence of instructions if the selected command is the special command, and outputting an instruction associated with the command if the selected command is not the special command.

Type: Grant

Filed: November 1, 2012

Date of Patent: February 27, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Joerg Deutschle, Ursel Hahn, Joerg Walter, Ernst-Dieter Weissenberger
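
The selection loop in this abstract reads naturally as a generator. The following is a rough sketch under stated assumptions: `SPECIAL`, `generate`, and the command table are hypothetical, commands are chosen uniformly at random, and stopping once the received sequence is exhausted is a rule added for the sketch, not something the abstract specifies.

```python
import random

SPECIAL = "NEXT_IN_SEQUENCE"  # hypothetical name for the special command


def generate(sequence, command_table, rng):
    """Yield instructions until the received sequence has been fully emitted.

    command_table maps each ordinary command to the instruction it produces;
    SPECIAL is added to the pool of selectable commands.
    """
    commands = list(command_table) + [SPECIAL]
    pending = iter(sequence)
    while True:
        choice = rng.choice(commands)
        if choice == SPECIAL:
            try:
                yield next(pending)      # next instruction of the sequence
            except StopIteration:
                return                   # sequence exhausted (sketch's stop rule)
        else:
            yield command_table[choice]  # instruction tied to the command
```

For example, `list(generate(["A", "B", "C"], {"noise1": "NOP", "noise2": "PING"}, random.Random(0)))` interleaves `NOP` and `PING` at random positions while keeping `A`, `B`, `C` in their original order.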
Heterogeneous magnetic memory architecture

Patent number: 9858111

Abstract: Technologies are generally described for systems, devices and methods relating to multicore processors. The multicore processors may include first and second tiles with first and second caches, respectively. The first cache may include first magnetoresistive random access memory (MRAM) cells with first storage characteristics. The second cache may include second MRAM cells with second storage characteristics different from the first storage characteristics. In some examples, an interconnect structure may be coupled to the first and second tiles and may be configured to provide communication between the first tile and the second tile. Methods for handling migration between tiles and cores are also described.

Type: Grant

Filed: June 18, 2014

Date of Patent: January 2, 2018

Assignee: EMPIRE TECHNOLOGY DEVELOPMENT LLC

Inventor: Yan Solihin

Tree-based thread management

Patent number: 9830161

Abstract: In one embodiment of the present invention, a streaming multiprocessor (SM) uses a tree of nodes to manage threads. Each node specifies a set of active threads and a program counter. Upon encountering a conditional instruction that causes an execution path to diverge, the SM creates child nodes corresponding to each of the divergent execution paths. Based on the conditional instruction, the SM assigns each active thread included in the parent node to at most one child node, and the SM temporarily discontinues executing instructions specified by the parent node. Instead, the SM concurrently executes instructions specified by the child nodes. After all the divergent paths reconverge to the parent path, the SM resumes executing instructions specified by the parent node. Advantageously, the disclosed techniques enable the SM to execute divergent paths in parallel, thereby reducing undesirable program behavior associated with conventional techniques that serialize divergent paths across thread groups.

Type: Grant

Filed: January 21, 2014

Date of Patent: November 28, 2017

Assignee: NVIDIA Corporation

Inventors: John Erik Lindholm, Michael C. Shebanow

Computer processor employing instructions with elided nop operations

Patent number: 9785441

Abstract: A computer processor that operates on distinct first and second instruction streams that have a predefined timed semantic relationship. At least one of the first and second instruction streams includes variable-length instructions having a header and associated bundle bounded by a head end and a tail end. An alignment hole within the bundle encodes information representing at least one nop operation. The computer processor includes first and second multi-stage instruction processing components configured to process in parallel the first and second instruction streams. At least one of the first and second multi-stage instruction processing components includes an instruction buffer operably coupled to a decode stage. The decode stage is configured to process a variable-length instruction by isolating and interpreting the alignment hole of the variable length instruction in order to initiate zero or more nop operations that follow the timed semantic relationship between the first and second instruction streams.

Type: Grant

Filed: May 29, 2014

Date of Patent: October 10, 2017

Assignee: Mill Computing, Inc.

Inventors: Roger Rawson Godard, Arthur David Kahlich, David Arthur Yost

Packed data operation mask register arithmetic combination processors, methods, systems, and instructions

Patent number: 9760371

Abstract: A method of an aspect includes receiving a packed data operation mask register arithmetic combination instruction. The packed data operation mask register arithmetic combination instruction indicates a first packed data operation mask register, indicates a second packed data operation mask register, and indicates a destination storage location. An arithmetic combination of at least a portion of bits of the first packed data operation mask register and at least a corresponding portion of bits of the second packed data operation mask register is stored in the destination storage location in response to the packed data operation mask register arithmetic combination instruction. Other methods, apparatus, systems, and instructions are disclosed.

Type: Grant

Filed: December 22, 2011

Date of Patent: September 12, 2017

Assignee: Intel Corporation

Inventors: Bret L. Toll, Robert Valentine, Jesus Corbal San Adrian, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney

Apparatus and method of improved permute instructions

Patent number: 9658850

Abstract: An apparatus is described having instruction execution logic circuitry. The instruction execution logic circuitry has input vector element routing circuitry to perform the following for each of three different instructions: for each of a plurality of output vector element locations, route into an output vector element location an input vector element from one of a plurality of input vector element locations that are available to source the output vector element. The output vector element and each of the input vector element locations are one of three available bit widths for the three different instructions. The apparatus further includes masking layer circuitry coupled to the input vector element routing circuitry to mask a data structure created by the input vector routing element circuitry. The masking layer circuitry is designed to mask at three different levels of granularity that correspond to the three available bit widths.

Type: Grant

Filed: December 23, 2011

Date of Patent: May 23, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

Apparatus and method of improved insert instructions

Patent number: 9619236

Abstract: An apparatus is described having instruction execution logic circuitry to execute first, second, third and fourth instructions. Both the first instruction and the second instruction insert a first group of input vector elements to one of multiple first non-overlapping sections of respective first and second resultant vectors. The first group has a first bit width. Each of the multiple first non-overlapping sections has the same bit width as the first group. Both the third instruction and the fourth instruction insert a second group of input vector elements to one of multiple second non-overlapping sections of respective third and fourth resultant vectors. The second group has a second bit width that is larger than said first bit width. Each of the multiple second non-overlapping sections has the same bit width as the second group.

Type: Grant

Filed: December 23, 2011

Date of Patent: April 11, 2017

Assignee: Intel Corporation

Inventors: Elmoustapha Ould-Ahmed-Vall, Robert Valentine, Jesus Corbal, Bret L. Toll, Mark J. Charney, Zeev Sperber, Amit Gradstein

Transaction abort instruction

Patent number: 9529598

Abstract: A TRANSACTION ABORT instruction is used to abort a transaction that is executing in a computing environment. The TRANSACTION ABORT instruction includes at least one field used to specify a user-defined abort code that indicates the specific reason for aborting the transaction. Based on executing the TRANSACTION ABORT instruction, a condition code is provided that indicates whether re-execution of the transaction is recommended.

Type: Grant

Filed: March 8, 2013

Date of Patent: December 27, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dan F. Greiner, Christian Jacobi, Marcel Mitran, Timothy J. Slegel

Transaction abort instruction

Patent number: 9436477

Abstract: A TRANSACTION ABORT instruction is used to abort a transaction that is executing in a computing environment. The TRANSACTION ABORT instruction includes at least one field used to specify a user-defined abort code that indicates the specific reason for aborting the transaction. Based on executing the TRANSACTION ABORT instruction, a condition code is provided that indicates whether re-execution of the transaction is recommended.

Type: Grant

Filed: June 15, 2012

Date of Patent: September 6, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dan F. Greiner, Christian Jacobi, Marcel M. Mitran, Timothy J. Slegel

Techniques for enabling bit-parallel wide string matching with a SIMD register

Patent number: 9424031

Abstract: Various embodiments are generally directed to overcoming limitations of vector registers in their use with bit-parallel string matching algorithms. An apparatus includes a processor element; and logic to receive a pattern comprising a first string of elements to employ in a string matching operation, instantiate a test bitmask in a first vector register of the processor element, the first vector register comprising multiple lanes, copy bit values at MSB bit positions of the multiple lanes of the first vector register to a first vector mask as a vector value, bit-shift the vector value as a scalar value, bit-shift the first vector register, employ the vector value of the first vector mask to selectively fill LSB bit positions of lanes of a second vector register of the processor element; and OR the second vector register into the first vector register. Other embodiments are described and claimed.

Type: Grant

Filed: March 13, 2013

Date of Patent: August 23, 2016

Assignee: INTEL CORPORATION

Inventors: Hariharan Thantry, Mani Azimi
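
The lane-crossing shift in this abstract (collect the lane MSBs into a mask, shift the mask as a scalar, shift the lanes, fill LSBs from the mask, OR the result back) can be emulated with plain integers. This is a sketch under assumptions: the 8-bit `LANE_BITS`, the list-of-lanes register model, and every function name are hypothetical, and Shift-And is used as one example of a bit-parallel matcher because the abstract does not name a specific algorithm. The pattern is assumed to be non-empty.

```python
LANE_BITS = 8  # assumed lane width for the sketch; real SIMD lanes are wider


def to_lanes(value, n_lanes):
    """Split an integer into lanes; lanes[0] is the least significant."""
    mask = (1 << LANE_BITS) - 1
    return [(value >> (i * LANE_BITS)) & mask for i in range(n_lanes)]


def from_lanes(lanes):
    """Reassemble lanes into a single integer."""
    return sum(lane << (i * LANE_BITS) for i, lane in enumerate(lanes))


def wide_shift_left_1(lanes):
    """Shift a multi-lane register left by one bit, carrying across lanes."""
    lane_mask = (1 << LANE_BITS) - 1
    # 1. Collect each lane's MSB into a scalar mask, one bit per lane.
    msb_mask = 0
    for i, lane in enumerate(lanes):
        msb_mask |= (lane >> (LANE_BITS - 1)) << i
    # 2. Shift the mask as a scalar: bit i now names the lane receiving a carry.
    msb_mask = (msb_mask << 1) & ((1 << len(lanes)) - 1)
    # 3. Shift every lane independently; bits leaving a lane are dropped here.
    shifted = [(lane << 1) & lane_mask for lane in lanes]
    # 4. Build a second register with a 1 in the LSB of each selected lane.
    carries = [(msb_mask >> i) & 1 for i in range(len(lanes))]
    # 5. OR the carry register into the shifted register.
    return [s | c for s, c in zip(shifted, carries)]


def shift_and_search(text, pattern):
    """Return start indexes of pattern in text using Shift-And over lanes."""
    n_lanes = -(-len(pattern) // LANE_BITS)  # ceiling division
    table = {}
    for j, ch in enumerate(pattern):
        table[ch] = table.get(ch, 0) | (1 << j)
    accept = 1 << (len(pattern) - 1)
    state = [0] * n_lanes
    hits = []
    for pos, ch in enumerate(text):
        state = wide_shift_left_1(state)
        state[0] |= 1  # a new match attempt may start at every position
        char_lanes = to_lanes(table.get(ch, 0), n_lanes)
        state = [s & c for s, c in zip(state, char_lanes)]
        if from_lanes(state) & accept:
            hits.append(pos - len(pattern) + 1)
    return hits
```

Because the pattern `"abracadabra"` is 11 characters long, its match state spans two 8-bit lanes, so every step of `shift_and_search` relies on the carry that `wide_shift_left_1` moves from the first lane into the second.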
Confidence-driven selective predication of processor instructions

Patent number: 9389868

Abstract: An apparatus includes a network interface, memory, and a processor. The processor is coupled with the network interface and memory. The processor is configured to determine that an instruction instance is a branch instruction instance. Responsive to a determination that an instruction instance is a branch instruction instance, the processor is configured to obtain a branch prediction for the branch instruction instance and a confidence value of the branch prediction. The processor is further configured to determine that the confidence for the branch prediction is low based on the confidence value, and responsive to such a determination, generate predicated instruction instances based on the branch instruction instance.

Type: Grant

Filed: November 1, 2012

Date of Patent: July 12, 2016

Assignee: International Business Machines Corporation

Inventor: Michael Karl Gschwind
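
The decision this abstract describes, predicating both paths only when the branch prediction has low confidence, can be sketched as a small planning function. The names `Prediction`, `plan_branch`, and `LOW_CONFIDENCE`, the integer confidence scale, and the `"p"` / `"!p"` predicate labels are all assumptions made for illustration.

```python
from dataclasses import dataclass

LOW_CONFIDENCE = 2  # hypothetical threshold on an assumed counter-style scale


@dataclass
class Prediction:
    taken: bool      # predicted branch direction
    confidence: int  # higher means the predictor trusts the direction more


def plan_branch(pred, taken_path, fallthrough_path):
    """Return (predicate, operation) pairs to issue for one branch instance."""
    if pred.confidence < LOW_CONFIDENCE:
        # Low confidence: emit both paths, each guarded by a predicate.
        return ([("p", op) for op in taken_path]
                + [("!p", op) for op in fallthrough_path])
    # Otherwise follow the prediction and issue only the chosen path.
    chosen = taken_path if pred.taken else fallthrough_path
    return [(None, op) for op in chosen]
```

The trade-off the sketch makes visible is that a low-confidence branch costs the work of both paths but avoids a misprediction flush, while a high-confidence branch issues a single path.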
Identifying a largest logical plane from a plurality of logical planes formed of compute nodes of a subcommunicator in a parallel computer

Patent number: 9390054

Abstract: In a parallel computer, a largest logical plane from a plurality of logical planes formed of compute nodes of a subcommunicator may be identified by: identifying, by each compute node of the subcommunicator, all logical planes that include the compute node; calculating, by each compute node for each identified logical plane that includes the compute node, an area of the identified logical plane; initiating, by a root node of the subcommunicator, a gather operation; receiving, by the root node from each compute node of the subcommunicator, each node's calculated areas as contribution data to the gather operation; and identifying, by the root node in dependence upon the received calculated areas, a logical plane of the subcommunicator having the greatest area.

Type: Grant

Filed: October 14, 2013

Date of Patent: July 12, 2016

Assignee: International Business Machines Corporation

Inventors: Kristan D. Davis, Daniel A. Faraj

Processor assist facility

Patent number: 9367323

Abstract: An operation is provided to signal a processor that action is to be taken to facilitate execution of a transaction that has aborted one or more times. The operation is specified within an instruction or is itself an instruction. The instruction is executed based on detecting an abort of the transaction, and includes a field indicating how many times the transaction has aborted. The processor uses this information to determine what action is to be taken.

Type: Grant

Filed: June 15, 2012

Date of Patent: June 14, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Dan F. Greiner, Christian Jacobi, Randall W. Philley, Peter J. Relson, Timothy J. Slegel
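
A retry loop that reports the abort count after each failure gives a feel for how such an assist might be used. This is only a sketch: the abstract does not say what action the processor takes, so the randomized, growing delay in `processor_assist` is an assumption, as are the names `Aborted`, `run_with_assist`, and the `delay` callback.

```python
import random


class Aborted(Exception):
    """Signals that one attempt of the simulated transaction aborted."""


def processor_assist(abort_count, rng):
    """Hypothetical assist: pick a delay that tends to grow with the abort count."""
    window = min(2 ** abort_count, 64)  # cap the window (assumed policy)
    return rng.randrange(window)


def run_with_assist(transaction, max_attempts=8, rng=None, delay=lambda units: None):
    """Retry transaction, passing the running abort count to the assist each time."""
    rng = rng or random.Random()
    aborts = 0
    while aborts < max_attempts:
        try:
            return transaction()
        except Aborted:
            aborts += 1
            delay(processor_assist(aborts, rng))
    raise RuntimeError("transaction kept aborting; a fallback path would go here")
```

Passing the count rather than a fixed hint lets the assist scale its response, which is the property the abstract emphasizes; the specific back-off policy here is interchangeable.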
Processor with hybrid pipeline capable of operating in out-of-order and in-order modes

Patent number: 9354884

Abstract: A method and circuit arrangement provide support for a hybrid pipeline that dynamically switches between out-of-order and in-order modes. The hybrid pipeline may selectively execute instructions from at least one instruction stream that require the high performance capabilities provided by out-of-order processing in the out-of-order mode. The hybrid pipeline may also execute instructions that have strict power requirements in the in-order mode where the in-order mode conserves more power compared to the out-of-order mode. Each stage in the hybrid pipeline may be activated and fully functional when the hybrid pipeline is in the out-of-order mode. However, stages in the hybrid pipeline not used for the in-order mode may be deactivated and bypassed by the instructions when the hybrid pipeline dynamically switches from the out-of-order mode to the in-order mode. The deactivated stages may then be reactivated when the hybrid pipeline dynamically switches from the in-order mode to the out-of-order mode.

Type: Grant

Filed: March 13, 2013

Date of Patent: May 31, 2016

Assignee: International Business Machines Corporation

Inventors: Miguel Comparan, Andrew D. Hilton, Hans M. Jacobson, Brian M. Rogers, Robert A. Shearer, Ken V. Vu, Alfred T. Watson, III

Enhanced loop streaming detector to drive logic optimization

Patent number: 9354875

Abstract: An enhanced loop streaming detection mechanism is provided in a processor to reduce power consumption. The processor includes a decoder to decode instructions in a loop into micro-operations, and a loop streaming detector to detect the presence of the loop in the micro-operations. The processor also includes a loop characteristic tracker unit to identify hardware components downstream from the decoder that are not to be used by the micro-operations in the loop, and to disable the identified hardware components. The processor also includes execution circuitry to execute the micro-operations in the loop with the identified hardware components disabled.

Type: Grant

Filed: December 27, 2012

Date of Patent: May 31, 2016

Assignee: Intel Corporation

Inventors: Matthew C. Merten, Justin M. Deinlein, Yury N. Ilin, Alexandre J. Farcy, Tong Li, Srikanth T. Srinivasan
