Operand Address Generation Patents (Class 711/214)

Program counter (PC)-relative load and store addressing for fused instructions

Patent number: 11392386

Abstract: Load store addressing can include a processor, which fuses two consecutive instructions determined to be prefix instructions and treats the two instructions as a single fused instruction. The prefix instruction of the fused instruction is auto-finished at dispatch time in an issue unit of the processor. A suffix instruction of the fused instruction and its fields and the prefix instruction's fields are issued from an issue queue of the issue unit, wherein an opcode of the suffix instruction is issued to a load store unit of the processor, and fields of the fused instruction are issued to the execution unit of the processor. The execution unit forms operands of the suffix instruction, at least one operand formed based on a current instruction address of the single fused instruction. The load store unit executes the suffix instruction using the operands formed by the execution unit.

Type: Grant

Filed: August 14, 2020

Date of Patent: July 19, 2022

Assignee: International Business Machines Corporation

Inventors: Nicholas R. Orzol, Christian Gerhard Zoellin, Brian W. Thompto, Dung Q. Nguyen, Niels Fricke, Sheldon Bernard Levenstein, Phillip G. Williams, Brian D. Barrick
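The operand formation described in the abstract above can be illustrated with a minimal sketch, assuming the prefix and suffix each carry part of a signed displacement that is added to the current instruction address (CIA). The field widths (18 and 16 bits), the 64-bit address space, and the function names are assumptions chosen for illustration; the abstract does not specify them.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign_bit) - sign_bit


def pc_relative_address(cia: int, prefix_disp: int, suffix_disp: int,
                        prefix_bits: int = 18, suffix_bits: int = 16) -> int:
    """Form an effective address from the fused instruction's own address.

    The displacement is the prefix field concatenated with the suffix field
    (hypothetical widths), sign-extended and added to the CIA.
    """
    high = (prefix_disp & ((1 << prefix_bits) - 1)) << suffix_bits
    low = suffix_disp & ((1 << suffix_bits) - 1)
    displacement = sign_extend(high | low, prefix_bits + suffix_bits)
    # Wrap to a 64-bit address space (an assumption).
    return (cia + displacement) & 0xFFFF_FFFF_FFFF_FFFF
```

For example, `pc_relative_address(0x1000, 0, 0x10)` yields `0x1010`, and an all-ones prefix field with suffix `0xFFF8` encodes a displacement of -8.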
Method and apparatus for predicting and scheduling copy instruction for software pipelined loops

Patent number: 11366646

Abstract: A method for scheduling instructions for execution on a computer system includes scanning a plurality of loop instructions that are modulo scheduled to identify a first instruction and a second instruction that both utilize a register of the computer system upon execution of the plurality of instructions. The loop has a first initiation interval. The first instruction defines a first value of the register in a first iteration of the loop and the second instruction redefines the value of the register to a second value in a subsequent iteration of the loop prior to a use of the first value in the first iteration of the loop. A copy instruction is inserted in the loop instructions to copy the first value prior to execution of the second instruction. A schedule is determined after the insertion of the one or more copy instructions giving a second initiation interval.

Type: Grant

Filed: January 23, 2020

Date of Patent: June 21, 2022

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Ehsan Amiri, Ning Xie
Splitting load hit store table for out-of-order processor

Patent number: 10942743

Abstract: According to one or more embodiments, an example computer-implemented method for executing one or more out-of-order instructions by a processing unit includes decoding an instruction to be executed, and based on a determination that the instruction is a store instruction, identifying a split load-hit-store (LHS) table for the store instruction, wherein a LHS table of the processing unit includes multiple split LHS tables. Identifying the split LHS table includes determining, for the store instruction, a first split LHS table by performing a mod operation using one or more operands from the store instruction, and adding one or more parameters of the store instruction in the first split LHS table by generating an ITAG for the store instruction. The method further includes dispatching the store instruction for execution to an issue queue with the ITAG.

Type: Grant

Filed: April 28, 2020

Date of Patent: March 9, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ehsan Fatehi, Richard J. Eickemeyer, Edmund J. Gieske
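The table selection in the abstract above can be sketched as follows. This is a toy model, not the patented design: the choice of operands fed to the mod operation (base register number plus displacement), the number of split tables, and the sequential ITAG counter are all assumptions made for illustration.

```python
class SplitLhsTables:
    """Toy model of a load-hit-store table divided into several split tables."""

    def __init__(self, num_tables: int = 4):
        self.num_tables = num_tables
        self.tables = [dict() for _ in range(num_tables)]
        self.next_itag = 0  # hypothetical sequential instruction tag

    def record_store(self, base_reg: int, displacement: int):
        """Pick a split table with a mod operation and file the store under a new ITAG."""
        index = (base_reg + displacement) % self.num_tables
        itag = self.next_itag
        self.next_itag += 1
        self.tables[index][itag] = (base_reg, displacement)
        return index, itag
```

A store dispatched with `record_store(5, 6)` lands in split table 3 with ITAG 0; a later load with the same operands only needs to search that one table.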
SM3 hash function message expansion processors, methods, systems, and instructions

Patent number: 10503510

Abstract: A processor includes a decode unit to receive an instruction to indicate a first source packed data operand and a second source packed data operand. The source operands each to include elements. The data elements to include information selected from messages and logical combinations of messages that is sufficient to evaluate: P1(Wj−16 XOR Wj−9 XOR (Wj−3 <<< 15)) XOR (Wj−13 <<< 7) XOR Wj−6. P1 is a permutation function, P1(X) = X XOR (X <<< 15) XOR (X <<< 23). Wj−16, Wj−9, Wj−3, Wj−13, and Wj−6 are messages associated with a compression function of an SM3 hash function. XOR is an exclusive OR operation. <<< is a rotate operation. An execution unit coupled with the decode unit that is operable, in response to the instruction, to store a result packed data in a destination storage location. The result packed data to include a Wj message to be input to a round j of the compression function.

Type: Grant

Filed: December 27, 2013

Date of Patent: December 10, 2019

Assignee: Intel Corporation

Inventors: Gilbert M. Wolrich, Vinodh Gopal, Kirk S. Yap, Wajdi K. Feghali, Sean Gulley
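The message expansion formula quoted in the abstract above can be written out in software as a reference sketch. It models only the arithmetic on 32-bit words, not the packed-data instruction itself; the function names are chosen here for illustration.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits (the <<< operation)."""
    x &= MASK32
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def p1(x: int) -> int:
    """Permutation P1(X) = X XOR (X <<< 15) XOR (X <<< 23)."""
    return x ^ rotl32(x, 15) ^ rotl32(x, 23)


def expand_message(block_words):
    """Extend the 16 words W0..W15 of a message block to W0..W67."""
    w = [word & MASK32 for word in block_words]
    if len(w) != 16:
        raise ValueError("expected 16 32-bit words")
    for j in range(16, 68):
        w.append(p1(w[j - 16] ^ w[j - 9] ^ rotl32(w[j - 3], 15))
                 ^ rotl32(w[j - 13], 7) ^ w[j - 6])
    return w
```

With only W0 set to 1, W16 reduces to P1(1), which is `0x00808001`.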
Functional unit for instruction execution pipeline capable of shifting different chunks of a packed data operand by different amounts

Patent number: 10496411

Abstract: A method is described that includes fetching an instruction. The method further includes decoding the instruction. The instruction specifies an operation, a first operand and a second operand. The method further includes fetching the first and second operands of the instruction. The first and second operands are each composed of a plurality of larger chunks having constituent elements. The method further includes performing the operation specified by the instruction including generating a resultant composed of a plurality of larger chunks having constituent elements. The generating of the resultant includes selecting for each element in the resultant a contiguous group of bits from a same positioned chunk of the first operand as the chunk of the element in the resultant, the contiguous group of bits being identified by a same positioned element of the second operand as the element in the resultant.

Type: Grant

Filed: December 20, 2017

Date of Patent: December 3, 2019

Assignee: Intel Corporation

Inventors: Tal Uliel, Robert Valentine
Processor with a full instruction set decoder and a partial instruction set decoder

Patent number: 10437596

Abstract: An apparatus and method for increasing performance in a processor or other instruction execution device while minimizing energy consumption. A processor includes a first execution pipeline and a second execution pipeline. The first execution pipeline includes a first decode unit and a first execution control unit coupled to the first decode unit. The first execution control unit is configured to control execution of all instructions executable by the processor. The second execution pipeline includes a second decode unit, and a second execution control unit coupled to the second decode unit. The second execution control unit is configured to control execution of a subset of the instructions executable via the first execution control unit.

Type: Grant

Filed: November 26, 2014

Date of Patent: October 8, 2019

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Christian Wiencke, Shrey Bhatia
Using location addressed storage as content addressed storage

Patent number: 10417181

Abstract: Some examples describe a method for using location addressed storage as content addressed storage (CAS). A checksum of a file may be generated during transition of the file to a retained state. The generated checksum, which may represent a content address of the file, may be stored in a database. The database may be queried with the content address of the file to retrieve a location address of the file corresponding to the content address of the file. The location address of the file is used to provide access to the file in the file system.

Type: Grant

Filed: July 22, 2014

Date of Patent: September 17, 2019

Assignee: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP

Inventors: Ramesh Kannan Karuppusamy, Rajkumar Kannan
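The lookup flow in the abstract above (checksum on retention, database keyed by content address, query returning a location address) can be sketched as below. SHA-256 and an in-memory dictionary stand in for the checksum and the database; both are assumptions, as are the class and method names.

```python
import hashlib


class ContentIndex:
    """Maps a content address (checksum) to a location address (file path)."""

    def __init__(self):
        self._location_by_digest = {}

    def retain(self, path: str, data: bytes) -> str:
        """Record the file's checksum when it transitions to the retained state."""
        digest = hashlib.sha256(data).hexdigest()
        self._location_by_digest[digest] = path
        return digest

    def locate(self, digest: str):
        """Return the location address for a content address, or None if unknown."""
        return self._location_by_digest.get(digest)
```

A caller retains a file once, keeps only the returned digest, and later resolves it back to a path with `locate`.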
Instructions and logic for get-multiple-vector-elements operations

Patent number: 10338920

Abstract: A processor includes an execution unit to execute instructions to get data elements of the same type from multiple data structures packed in vector registers. The execution unit includes logic to extract data elements from specific positions within each data structure dependent on an instruction encoding. A vector GET3 instruction encoding specifies that data elements be extracted from the first, second, or third position in each XYZ-type data structure. A vector GET4 instruction encoding specifies that data elements be extracted from the first, second, third, or fourth position in each XYZW-type data structure and that the extracted data elements be placed in the upper or lower half of a destination vector. The execution unit includes logic to place the extracted data elements in contiguous locations in the destination vector. The execution unit includes logic to store the destination vector to a destination vector register specified in the instruction.

Type: Grant

Filed: December 18, 2015

Date of Patent: July 2, 2019

Assignee: Intel Corporation

Inventor: Elmoustapha Ould-Ahmed-Vall
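The effect of the GET3 operation described above can be modeled on a flat Python list, assuming the source holds XYZ structures back to back. This sketch covers only the extraction into contiguous positions, not the vector register mechanics or the GET4 upper/lower half placement; `vector_get3` is a hypothetical name.

```python
def vector_get3(source, position: int):
    """Extract the element at `position` (0=X, 1=Y, 2=Z) from each packed XYZ structure."""
    if position not in (0, 1, 2):
        raise ValueError("position must be 0, 1, or 2")
    if len(source) % 3 != 0:
        raise ValueError("source must hold whole XYZ structures")
    # Every third element, starting at the requested position, packed contiguously.
    return list(source[position::3])
```

For three packed structures `[1, 2, 3, 4, 5, 6, 7, 8, 9]`, position 1 returns the Y components `[2, 5, 8]`.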
Multiplication-based method for stitching results of predicate evaluation in column stores

Patent number: 10296619

Abstract: A system joins predicate evaluated column bitmaps having varying lengths. The system includes a column unifier for querying column values with a predicate generating an indicator bit for each of the column values that is then joined with the respective column value. The system also includes a bitmap generator for creating a column-major linear bitmap from the column values and indicator bits. The column unifier also determines an offset between adjacent indicator bits. The system also includes a converter for multiplying the column-major linear bitmap with a multiplier to shift the indicator bits into consecutive positions in the linear bitmap.

Type: Grant

Filed: September 28, 2015

Date of Patent: May 21, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ronald J. Barber, Min-Soo Kim, Jae Gil Lee, Sam S. Lightstone, Guy M. Lohman, Lin Qiao, Vijayshankar Raman, Richard S. Sidle
Apparatus and method for instruction-based flop accounting

Patent number: 10228938

Abstract: An apparatus and method are described for floating point operation (FLOP) accounting. For example, one embodiment of a processor comprises: an instruction fetch unit to fetch instructions from system memory, the instructions including at least one masked vector floating point instruction to perform operations on a plurality of floating point data elements; a mask register to store a mask value associated with the masked vector floating point instruction; a decoder to decode the masked vector floating point instruction; and floating point operations (FLOP) accounting circuitry to read the mask register to determine a number of floating point operations to be performed during execution of the masked vector floating point instruction.

Type: Grant

Filed: December 30, 2016

Date of Patent: March 12, 2019

Assignee: Intel Corporation

Inventors: Karthik Raman, Ariel Slonim, Ady Tal
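The accounting idea in the abstract above amounts to counting the active lanes in the mask. A minimal sketch, assuming one mask bit per data element and a fixed number of floating point operations per active element (both assumptions; `count_flops` is a hypothetical name):

```python
def count_flops(mask: int, num_elements: int, flops_per_element: int = 1) -> int:
    """Count the operations a masked vector instruction performs from its mask value."""
    lane_bits = mask & ((1 << num_elements) - 1)  # ignore bits beyond the vector length
    active_lanes = bin(lane_bits).count("1")
    return active_lanes * flops_per_element
```

An 8-element instruction with mask `0b10110101` is charged 5 operations; a fused multiply-add could be modeled with `flops_per_element=2`.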
Source operand read suppression for graphics processors

Patent number: 10152452

Abstract: Techniques to suppress redundant reads to register addresses and to replicate read data are disclosed. The redundant reads are suppressed when multiple source operands specify the same register address to read. Additionally, the read data is replicated to a data stream or data location corresponding to the source operands where the data read was suppressed.

Type: Grant

Filed: May 29, 2015

Date of Patent: December 11, 2018

Assignee: INTEL CORPORATION

Inventors: Supratim Pal, Subramaniam Maiyuran, Mark C. Davis
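The suppression and replication steps in the abstract above can be sketched in software as follows. The register file is modeled as a dictionary and the read counter exists only to make the saving visible; both are illustrative assumptions rather than details from the patent.

```python
def read_operands(register_file, source_regs):
    """Read each distinct register once and replicate the value to every operand that names it."""
    fetched = {}
    reads = 0
    values = []
    for reg in source_regs:
        if reg not in fetched:
            fetched[reg] = register_file[reg]  # the only real read of this register
            reads += 1
        values.append(fetched[reg])  # replicated data for a suppressed read
    return values, reads
```

With sources `[3, 3, 4]`, the second read of register 3 is suppressed, so three operand values are delivered with two register reads.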
Pipelined data replication for disaster recovery

Patent number: 10152398

Abstract: Pipelined data replication for disaster recovery is disclosed. An example pipelined data replication method for disaster recovery disclosed herein comprises sending replicated first data from a primary processing environment to a secondary processing environment for backup by the secondary processing environment, the replicated first data being a replica of first data in the primary processing environment, processing the first data in the primary processing environment prior to the backup of the replicated first data by the secondary processing environment being confirmed, and preventing a result of the processing of the first data from being released by the primary processing environment until the backup of the replicated first data by the secondary processing environment is confirmed.

Type: Grant

Filed: August 2, 2012

Date of Patent: December 11, 2018

Assignees: AT&T Intellectual Property I, L.P., University of Massachusetts

Inventors: Kadangode K. Ramakrishnan, Horacio Andres Lagar-Cavilla, Prashant Shenoy, Jacobus Van der Merwe, Timothy Wood
Instruction to load data up to a dynamically determined memory boundary

Patent number: 9959118

Abstract: A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary is dynamically determined based on a specified type of boundary and one or more characteristics of the processor executing the instruction, such as cache line size or page size used by the processor.

Type: Grant

Filed: May 24, 2016

Date of Patent: May 1, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind, Christian Jacobi, Eric M. Schwarz, Timothy J. Slegel
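The byte-count rule behind a Load to Block Boundary operation can be sketched as below, assuming a power-of-two boundary and a 16-byte destination register (the register size and the function name are assumptions). How the boundary is chosen, for example from a cache line or page size, is left to the caller.

```python
def load_to_block_boundary(memory: bytes, address: int, boundary: int,
                           register_size: int = 16) -> bytes:
    """Load up to `register_size` bytes without crossing the next `boundary`."""
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    bytes_to_boundary = boundary - (address % boundary)
    count = min(register_size, bytes_to_boundary)
    return memory[address:address + count]
```

Starting at address 60 with a 64-byte boundary, only 4 bytes are loaded; starting at address 64, the full 16 bytes are loaded.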
Instruction to load data up to a specified memory boundary indicated by the instruction

Patent number: 9959117

Abstract: A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary.

Type: Grant

Filed: January 14, 2016

Date of Patent: May 1, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind, Christian Jacobi, Eric M. Schwarz, Timothy J. Slegel
Instruction to load data up to a dynamically determined memory boundary

Patent number: 9952862

Abstract: A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary is dynamically determined based on a specified type of boundary and one or more characteristics of the processor executing the instruction, such as cache line size or page size used by the processor.

Type: Grant

Filed: May 24, 2016

Date of Patent: April 24, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind, Christian Jacobi, Eric M. Schwarz, Timothy J. Slegel
Instruction to load data up to a specified memory boundary indicated by the instruction

Patent number: 9946542

Abstract: A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary.

Type: Grant

Filed: January 14, 2016

Date of Patent: April 17, 2018

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind, Christian Jacobi, Eric M. Schwarz, Timothy J. Slegel
Computer processor employing hardware-based pointer processing

Patent number: 9524163

Abstract: A computer processor is provided with execution logic that performs operations that utilize pointers stored in memory. In one aspect, each pointer is associated with a predefined number of event bits. The execution logic processes the event bits of a given pointer in conjunction with processing a predefined pointer-related operation involving the given pointer in order to selectively output an event-of-interest signal. In another aspect, each pointer is represented by an address field and a granularity field. The address field includes a chunk address and an offset. The granularity field represents granularity of the offset of the address field. The execution logic includes an address derivation unit that processes the granularity field of a base address for a given pointer in order to generate a valid address field for the derived pointer.

Type: Grant

Filed: October 15, 2014

Date of Patent: December 20, 2016

Assignee: Mill Computing, Inc.

Inventors: Roger Rawson Godard, Arthur David Kahlich
Instruction to load data up to a dynamically determined memory boundary

Patent number: 9471312

Abstract: A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary is dynamically determined based on a specified type of boundary and one or more characteristics of the processor executing the instruction, such as cache line size or page size used by the processor.

Type: Grant

Filed: March 3, 2013

Date of Patent: October 18, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind, Christian Jacobi, Eric M. Schwarz, Timothy J. Slegel
Instruction to load data up to a specified memory boundary indicated by the instruction

Patent number: 9459867

Abstract: A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary.

Type: Grant

Filed: March 15, 2012

Date of Patent: October 4, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind, Christian Jacobi, Eric M. Schwarz, Timothy J. Slegel
Instruction to load data up to a dynamically determined memory boundary

Patent number: 9459868

Abstract: A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary is dynamically determined based on a specified type of boundary and one or more characteristics of the processor executing the instruction, such as cache line size or page size used by the processor.

Type: Grant

Filed: March 15, 2012

Date of Patent: October 4, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind, Christian Jacobi, Eric M. Schwarz, Timothy J. Slegel
Instruction to load data up to a specified memory boundary indicated by the instruction

Patent number: 9383996

Abstract: A Load to Block Boundary instruction is provided that loads a variable number of bytes of data into a register while ensuring that a specified memory boundary is not crossed. The boundary may be specified a number of ways, including, but not limited to, a variable value in the instruction text, a fixed instruction text value encoded in the opcode, or a register based boundary.

Type: Grant

Filed: March 3, 2013

Date of Patent: July 5, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jonathan D. Bradbury, Michael K. Gschwind, Christian Jacobi, Eric M. Schwarz, Timothy J. Slegel
System, method, and computer program product for scheduling tasks associated with continuation thread blocks

Patent number: 9256623

Abstract: A system, method, and computer program product for scheduling tasks associated with continuation thread blocks. The method includes the steps of generating a first task metadata data structure in a memory, generating a second task metadata data structure in the memory, executing a first task corresponding to the first task metadata data structure in a processor, generating state information representing a continuation task related to the first task and storing the state information in the second task metadata data structure, executing the continuation task in the processor after the one or more child tasks have finished execution, and indicating that the first task has logically finished execution once the continuation task has finished execution. The second task metadata data structure is related to the first task metadata data structure, and at least one instruction in the first task causes one or more child tasks to be executed by the processor.

Type: Grant

Filed: May 8, 2013

Date of Patent: February 9, 2016

Assignee: NVIDIA Corporation

Inventors: Scott Ricketts, Luke David Durant, Brian Scott Pharris, Igor Sevastiyanov, Nicholas Wang
Method for cleaning cache of processor and associated processor

Patent number: 9158697

Abstract: A method for cleaning a cache of a processor includes: generating a specific command according to a request, wherein the specific command includes an operation command, a first field and a second field; obtaining an offset and a starting address according to the first field and the second field; selecting a specific segment from the cache according to the starting address and the offset; and cleaning data stored in the specific segment.

Type: Grant

Filed: December 2, 2012

Date of Patent: October 13, 2015

Assignee: Realtek Semiconductor Corp.

Inventors: Yen-Ju Lu, Ching-Yeh Yu, Chen-Tung Lin, Chao-Wei Huang
Method of Processing Data with an Array of Data Processors According to Application ID

Publication number: 20150026431

Abstract: A method wherein a plurality of data processors are associated with application IDs whereby the array processes a plurality of applications in parallel.

Type: Application

Filed: September 29, 2014

Publication date: January 22, 2015

Applicant: PACT XPP TECHNOLOGIES AG

Inventors: Martin Vorbach, Volker Baumgarte, Frank May, Armin Nuckel
Method and apparatus for realtime detection of heap memory corruption by buffer overruns

Patent number: 8930657

Abstract: One embodiment of the present invention relates to a heap overflow detection system that includes an arithmetic logic unit, a datapath, and address violation detection logic. The arithmetic logic unit is configured to receive an instruction having an opcode and an operand and to generate a final address and to generate a compare signal on the opcode indicating a heap memory access related instruction. The datapath is configured to provide the opcode and the operand to the arithmetic logic unit. The address violation detection logic determines whether a heap memory access is a violation according to the operand and the final address on receiving the compare signal from the arithmetic logic unit.

Type: Grant

Filed: July 18, 2011

Date of Patent: January 6, 2015

Assignee: Infineon Technologies AG

Inventor: Prakash Kalanjeri Balasubramanian
Serial flash memory and address transmission method thereof

Patent number: 8898439

Abstract: A serial flash memory and an address transmission method thereof. The serial flash memory selectively addresses a first memory space according to a first address length or addresses a second memory space according to a second address length longer than the first address length. If the first memory space is addressed according to the first address length, a first memory address is completely received within an address time duration so that data corresponding to the first memory address is initially outputted from a starting clock. In the address transmission method, if the second memory space is addressed according to the second address length, a portion of a second memory address is received within the address time duration. The other portion of the second memory address is received within a waiting time duration so that data corresponding to the second memory address is initially outputted from the starting clock.

Type: Grant

Filed: July 16, 2010

Date of Patent: November 25, 2014

Assignee: Macronix International Co., Ltd.

Inventors: Kuen-Long Chang, Yufe-Feng Lin, Chun-Hsiung Hung
Low access time indirect memory accesses

Patent number: 8880815

Abstract: An apparatus having a memory and a controller is disclosed. The controller may be configured to (i) receive a read request from a processor, the read request comprising a first value and a second value, (ii) where the read request is an indirect memory access, (a) generate a first address in response to the first value, (b) read data stored in the memory at the first address and (c) generate a second address in response to the second value and the data, (iii) where the read request is a direct memory access, generate the second address in response to the second value and (iv) read a requested data stored in the memory at the second address.

Type: Grant

Filed: February 20, 2012

Date of Patent: November 4, 2014

Assignee: Avago Technologies General IP (Singapore) Pte. Ltd.

Inventors: Nimrod Alexandron, Alexander Rabinovitch, Leonid Dubrovin
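The controller behavior listed in the abstract above can be sketched as one function. The abstract does not say how the addresses are generated from the values, so this sketch assumes the first value is used directly as the first address and the second address is the sum of the second value and the data read; memory is modeled as a list.

```python
def controller_read(memory, first_value: int, second_value: int, indirect: bool):
    """Serve a read request, resolving one level of indirection inside the controller."""
    if indirect:
        first_address = first_value                # assumption: value used as address
        data = memory[first_address]               # pointer or index stored in memory
        second_address = second_value + data       # assumption: simple addition
    else:
        second_address = second_value
    return memory[second_address]
```

Because both reads happen inside the controller, the processor issues a single request instead of a load followed by a dependent load.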
Method and apparatus for performing enhanced read and write operations in a FLASH memory system

Patent number: 8775772

Abstract: Methods and apparatus for enhanced READ and WRITE operations in a FLASH-based solid state storage system that includes a logical to physical translation table where the logical to physical translation table can include entries associating a logical block address with one or more data identifiers, where each data identifier is associated with a data string.

Type: Grant

Filed: December 21, 2009

Date of Patent: July 8, 2014

Assignee: International Business Machines Corporation

Inventors: James A. Fuxa, Lance W. Shelton, Justin C. Haggard
COALESCING ADJACENT GATHER/SCATTER OPERATIONS

Publication number: 20140181464

Abstract: According to one embodiment, a processor includes an instruction decoder to decode a first instruction to gather data elements from memory, the first instruction having a first operand specifying a first storage location and a second operand specifying a first memory address storing a plurality of data elements. The processor further includes an execution unit coupled to the instruction decoder, in response to the first instruction, to read a contiguous first and second of the data elements from a memory location based on the first memory address indicated by the second operand, and to store the first data element in a first entry of the first storage location and a second data element in a second entry of a second storage location corresponding to the first entry of the first storage location.

Type: Application

Filed: December 26, 2012

Publication date: June 26, 2014

Inventors: Andrew T. Forsyth, Brian J. Hickmann, Jonathan C. Hall, Christopher J. Hughes
RELATIVE ADDRESSING USAGE FOR CPU PERFORMANCE

Publication number: 20140173245

Abstract: The embodiments provide a computing device for incorporating data into code such that the data is relative to the code and, thereby, available for relative addressing. The computing device may include a code generator configured to receive source code from a source code database, and generate executable object code from the source code. The executable object code may include at least one instruction referencing data having an absolute address from a data source. Also, the computing device may include a data incorporator configured to transfer the data from the data source into the executable object code, where the transferred data is relative to the at least one instruction. Further, the computing device may include a relative addresser configured to adjust the at least one instruction to include a relative address for the transferred data including converting the absolute address to the relative address.

Type: Application

Filed: December 19, 2012

Publication date: June 19, 2014

Applicant: BMC SOFTWARE, INC.

Inventor: Mark P. Ruhe
STORE OPERATION WITH CONDITIONAL PUSH

Publication number: 20140143519

Abstract: According to one embodiment, a method for a store operation with a conditional push of a tag value to a queue is provided. The method includes configuring a queue that is accessible by an application, setting a value at an address in a memory device including a memory and a controller, receiving a request for an operation using the value at the address and performing the operation. The method also includes the controller writing a result of the operation to the address, thus changing the value at the address, the controller determining if the result of the operation meets a condition and the controller pushing a tag value to the queue based on the condition being met, where the tag value in the queue indicates to the application that the condition is met.

Type: Application

Filed: November 20, 2012

Publication date: May 22, 2014

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Philip Heidelberger, Burkhard Steinmacher-Burow
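The store-with-conditional-push flow in the abstract above can be modeled as follows. The operation is assumed to be an add, the condition is passed in as a predicate, and the queue is a `collections.deque`; these choices and all names are illustrative assumptions.

```python
from collections import deque


class MemoryController:
    """Toy controller that updates a value in memory and conditionally pushes a tag."""

    def __init__(self, queue: deque):
        self.memory = {}
        self.queue = queue  # queue visible to the application

    def store_add(self, address: int, operand: int, tag, condition) -> int:
        """Apply the operation, write the result back, and push `tag` if the condition holds."""
        result = self.memory.get(address, 0) + operand
        self.memory[address] = result
        if condition(result):
            self.queue.append(tag)
        return result
```

A counter initialized to 2 and decremented twice with the condition `lambda v: v == 0` pushes its tag only on the second store, so the application can wait on the queue instead of polling the counter.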
METHOD AND APPARATUS FOR ENCODING DATA ADDRESS

Publication number: 20140089633

Abstract: The present invention relates to the field of communication technologies and discloses a method and an apparatus for encoding a data address, so that attacks can be effectively prevented and resources and costs required to handle a bank conflict are reduced. In solutions provided by embodiments of the present invention, an exclusive-OR operation is performed on one or more bits of a received uncoded address by using multiple preset transform polynomials; and an encoded address is obtained according to a result of the exclusive-OR operation. The solutions provided by the embodiments of the present invention are applicable to designs that require a large-capacity DRAM, high performance and high reliability, and have an anti-attack demand.

Type: Application

Filed: November 27, 2013

Publication date: March 27, 2014

Applicant: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Chunlei Fan, Wenhua Du, Zixue Bi
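One common way to realize an XOR-based address transform of the kind described above is to treat each preset polynomial as a bit mask and compute one encoded bit as the parity of the selected address bits. The sketch below follows that interpretation, which is an assumption; the abstract does not give the polynomials or the exact mapping.

```python
def encode_address(address: int, polynomials) -> int:
    """Build an encoded address whose bit i is the XOR of the address bits selected by polynomials[i]."""
    encoded = 0
    for bit_index, mask in enumerate(polynomials):
        parity = bin(address & mask).count("1") & 1  # XOR of the selected bits
        encoded |= parity << bit_index
    return encoded
```

With the hypothetical masks `[0b0011, 0b1100, 0b0101, 0b1000]`, the address `0b1011` encodes to `0b1110`. Keeping the masks secret makes it harder for an attacker to craft addresses that all collide on one bank.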
Block driven computation using a caching policy specified in an operand data structure

Patent number: 8458439

Abstract: A processor has an associated memory hierarchy including a cache memory. The processor includes an instruction sequencing unit that fetches instructions for processing, an operand data structure including a plurality of entries corresponding to operands of operations to be performed by the processor, and a computation engine. A first entry among the plurality of entries in the operand data structure specifies a first caching policy for a first operand, and a second entry specifies a second caching policy for a second operand. The computation engine computes and stores operands in the memory hierarchy in accordance with the cache policies indicated within the operand data structure.

Type: Grant

Filed: December 16, 2008

Date of Patent: June 4, 2013

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Balaram Sinharoy
INSTRUCTION ADDRESS ADJUSTMENT IN RESPONSE TO LOGICALLY NON-SIGNIFICANT OPERATIONS

Publication number: 20130111186

Abstract: A method, apparatus, and program product execute instructions of an instruction stream and detect logically non-significant operations in the instruction stream. Then, based on that detection, a target or source address of a subsequent instruction is adjusted. In some instances, doing so enables a greater number of addresses, e.g., registers, to be accessed in a given number of bit positions within an instruction format.

Type: Application

Filed: October 26, 2011

Publication date: May 2, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Mark J. Hickey, Adam J. Muff, Matthew R. Tubbs, Charles D. Wait
Programmable signal processing circuit and method of interleaving

Patent number: 8433881

Abstract: A programmable signal processing circuit is used to (de-)interleave a data stream. Data from the signal stream is stored in a data memory (28) and read in a different sequence. The programmable signal processing circuit is used for computing addresses, for use in said storing and/or reading. The programmable signal processing circuit has an instruction set that contains an instruction to compute the addresses from preceding addresses that have been used for said storing and/or reading. In response to the instruction, the programmable signal processing circuit permutes positions of a plurality of bits from the old address operand and forms a bit of the new address result as a logic function of a combination of bits from the old address operand. Successive addresses are formed by means of repeated execution of a program loop that contains an address update instruction for computing the addresses.

Type: Grant

Filed: January 24, 2012

Date of Patent: April 30, 2013

Assignee: Intel Benelux B.V.

Inventors: Paulus W. F. Gruijters, Marcus M. G. Quax, Ingolf Held
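
One way to picture the address update instruction in the abstract above is a shift-register step: most bits of the old address move to new positions, and one new bit is a logic function of several old bits. The sketch below assumes a 4-bit address, a left shift as the permutation, and an XOR of the two highest bits as the logic function. These choices, and the name `update_address`, are illustrative assumptions and are not specified by the patent.

```python
def update_address(old: int) -> int:
    """Compute the next 4-bit address from the previous one.

    Bits 0..2 of the old address move up one position (the permutation),
    and the new bit 0 is the XOR of old bits 3 and 2 (the logic function).
    """
    feedback = ((old >> 3) ^ (old >> 2)) & 1
    return ((old << 1) & 0xF) | feedback


def address_sequence(start: int, count: int) -> list:
    """Model the program loop that repeatedly executes the update instruction."""
    addresses = []
    addr = start
    for _ in range(count):
        addresses.append(addr)
        addr = update_address(addr)
    return addresses


if __name__ == "__main__":
    print([f"{a:04b}" for a in address_sequence(0b0001, 15)])
```

With this particular choice the loop visits all 15 nonzero 4-bit addresses before repeating, so each memory location in that range is used exactly once per pass.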
Microprocessor and method for register addressing therein

Patent number: 8364934

Abstract: A microprocessor architecture comprising a microprocessor operably coupled to a plurality of registers and arranged to execute at least one instruction. The microprocessor is arranged to determine a class of data operand. The at least one instruction comprises one or more codes in a register specifier that indicates whether relative addressing or absolute addressing is used in accessing a register. In this manner, absolute and relative register addressing is supported within a single instruction word.

Type: Grant

Filed: July 11, 2006

Date of Patent: January 29, 2013

Assignee: Freescale Semiconductor, Inc.

Inventor: Martin Raubuch
Block driven computation with an address generation accelerator

Patent number: 8285971

Abstract: A processor includes at least one execution unit that executes instructions, at least one register file, coupled to the at least one execution unit, that buffers operands for access by the at least one execution unit, an instruction sequencing unit that fetches instructions for execution by the at least one execution unit, and an address generation accelerator. The address generation accelerator, responsive to an initiation signal received from the instruction sequencing unit, computes and outputs first and second effective addresses of operands of an operation.

Type: Grant

Filed: December 16, 2008

Date of Patent: October 9, 2012

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Balaram Sinharoy
Specifying an addressing relationship in an operand data structure

Patent number: 8281106

Abstract: A processor includes at least one execution unit that executes instructions, at least one register file, coupled to the at least one execution unit, that buffers operands for access by the at least one execution unit, and an instruction sequencing unit that fetches instructions for execution by the execution unit. The processor further includes an operand data structure and an address generation accelerator. The operand data structure specifies a first relationship between addresses of sequential accesses within a first address region and a second relationship between addresses of sequential accesses within a second address region. The address generation accelerator computes a first address of a first memory access in the first address region by reference to the first relationship and a second address of a second memory access in the second address region by reference to the second relationship.

Type: Grant

Filed: December 16, 2008

Date of Patent: October 2, 2012

Assignee: International Business Machines Corporation

Inventors: Ravi K. Arimilli, Balaram Sinharoy
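
As a rough illustration of the abstract above, the sketch below models an operand data structure whose entries each describe one address region, and an address generator that derives successive addresses from the stored relationship. It assumes the relationship is a constant stride between sequential accesses. The names `RegionEntry` and `generate_addresses` and the example values are hypothetical, and the patent may cover other kinds of relationships.

```python
from dataclasses import dataclass


@dataclass
class RegionEntry:
    """One entry of a hypothetical operand data structure."""
    base: int    # address of the first access in the region
    stride: int  # assumed relationship: byte distance between sequential accesses


def generate_addresses(entry: RegionEntry, count: int) -> list:
    """Compute the addresses of `count` sequential accesses for one region."""
    return [entry.base + i * entry.stride for i in range(count)]


if __name__ == "__main__":
    first_region = RegionEntry(base=0x1000, stride=8)     # e.g. unit-stride doubles
    second_region = RegionEntry(base=0x8000, stride=256)  # e.g. a column walk
    print([hex(a) for a in generate_addresses(first_region, 3)])
    print([hex(a) for a in generate_addresses(second_region, 3)])
```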
CIRCUIT MODULE DEVICE WITH ADDRESS GENERATION FUNCTIONS

Publication number: 20120216010

Abstract: The present invention relates generally to a kind of circuit module device with address generation functions, which comprises: A plurality of circuit modules, wherein, each circuit module is a control unit, one signal input end and one signal output end; and thereat, the said control unit has an address generation function; and the signal input ends are being electrically connected in series with signal output ends at a plurality of said circuit modules; a plurality of said circuit modules at least consist of one primary circuit module and one secondary circuit module, in which, the signal output end of said primary circuit module is being electrically connected to the signal input end of said secondary circuit module; and wherein, when signal input end of the said primary circuit module is receiving one primary addressing command, the control unit of said primary circuit module will respond to the said primary addressing command and generate one primary address, and then it will send out one secondary addres

Type: Application

Filed: February 23, 2011

Publication date: August 23, 2012

Inventors: Lin Cheng-Lung, Che-Chuan Lin
Address generation unit with pseudo sum to accelerate load/store operations

Patent number: 8171258

Abstract: In an embodiment, an address generation unit (AGU) is configured to generate a pseudo sum from an index portion of two or more operands. The pseudo sum may equal the index if the carry-in of the actual sum to the least significant bit of the index is a selected value (e.g. zero). The AGU may also include circuitry coupled to receive the operands and to generate the actual carry-in to the least significant bit of the index. The AGU may transmit the pseudo sum and the carry-in to a decode block for a memory array. The decode block may decode the pseudo sum into one or more one-hot vectors. The one-hot vectors may be input to muxes, and the one-hot vectors rotated by one position may be the other input. The actual carry-in may be the selection control of the mux.

Type: Grant

Filed: July 21, 2009

Date of Patent: May 1, 2012

Assignee: Apple Inc.

Inventors: Rajat Goel, Chen-Ju Hsieh
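
The pseudo-sum scheme in the abstract above can be modeled in software. The sketch assumes two operands, models the pseudo sum as a plain addition of the operands' index fields (ignoring any carry from the lower bits), and represents a one-hot vector as an integer with a single bit set. The function name `select_wordline` and the parameters `lo` and `width` are hypothetical; the patent's circuit may form the pseudo sum differently.

```python
def select_wordline(a: int, b: int, lo: int, width: int) -> int:
    """Return a one-hot vector selecting entry ((a + b) >> lo) mod 2**width."""
    size = 1 << width
    # Pseudo sum: add only the index fields, as if the carry-in were zero.
    pseudo = ((a >> lo) + (b >> lo)) & (size - 1)
    # Actual carry into the least significant bit of the index.
    low_mask = (1 << lo) - 1
    carry_in = ((a & low_mask) + (b & low_mask)) >> lo
    # Decode the pseudo sum, and also prepare the vector rotated by one position.
    one_hot = 1 << pseudo
    rotated = ((one_hot << 1) | (one_hot >> (size - 1))) & ((1 << size) - 1)
    # The mux: the carry-in picks between the two candidate vectors.
    return rotated if carry_in else one_hot


if __name__ == "__main__":
    a, b = 0b10111, 0b00110
    print(f"{select_wordline(a, b, lo=2, width=3):08b}")
```

The point of the arrangement is that the decode of the pseudo sum does not have to wait for the carry chain of the low-order bits; the carry only drives the final selection.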
Scalar precision float implementation on the “W” lane of vector unit

Patent number: 8169439

Abstract: Embodiments of the invention are generally related to image processing, and more specifically to vector units for supporting image processing. A combined vector/scalar unit is provided wherein one or more processing lanes of the vector unit are used for performing scalar operations. An integrated register file is also provided for storing vector and scalar data. Therefore, the transfer of data to memory to exchange data between independent vector and scalar units is obviated and a significant amount of chip area is saved.

Type: Grant

Filed: October 23, 2007

Date of Patent: May 1, 2012

Assignee: International Business Machines Corporation

Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
Storage device controller with a plurality of I/O processors requesting data from a plurality of stripe units of a logical volume

Patent number: 8099551

Abstract: Provided is a storage controller capable of improving the access performance to the storage device by preventing an I/O access request to the storage device from being concentrated on certain I/O processors among a plurality of I/O processors, and causing the plurality of I/O processors to issue the I/O access request in a well-balanced manner. With this storage control system, a plurality of stripe units are formed by striping the logical volume into a stripe size of an arbitrary storage capacity, and information regarding which I/O processor among the plurality of I/O processors will output the I/O request to which stripe unit among the plurality of stripe units is stored as the control information in the memory.

Type: Grant

Filed: September 24, 2010

Date of Patent: January 17, 2012

Assignee: Hitachi, Ltd.

Inventors: Naotaka Kobayashi, Kunihito Matsuki, Hiroshi Ogasawara, Youichi Gotoh
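
The control information described in the abstract above can be pictured as a table from stripe unit to I/O processor. The sketch below assumes a round-robin assignment, which is only one possible way to balance requests and is not stated in the patent. The names `build_control_info` and `processor_for` are hypothetical.

```python
def build_control_info(volume_size: int, stripe_size: int, num_processors: int) -> list:
    """Return a table whose entry i names the I/O processor for stripe unit i."""
    num_units = -(-volume_size // stripe_size)  # ceiling division
    # Assumed policy: hand out stripe units to processors in round-robin order.
    return [unit % num_processors for unit in range(num_units)]


def processor_for(offset: int, stripe_size: int, control_info: list) -> int:
    """Look up which I/O processor issues the request for a volume offset."""
    return control_info[offset // stripe_size]


if __name__ == "__main__":
    table = build_control_info(volume_size=1000, stripe_size=100, num_processors=4)
    print(table)
    print(processor_for(250, 100, table))
```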
Referencing a constant pool in a java virtual machine

Patent number: 8099723

Abstract: A method, apparatus, and computer instructions for referencing a constant pool. A determination is made as to whether a bytecode references the constant pool. A relative offset to the constant pool is identified for the bytecode, in response to the bytecode referencing the constant pool. The bytecode is then replaced with a new bytecode containing the relative offset. The relative offset is used to reference the constant pool.

Type: Grant

Filed: April 3, 2008

Date of Patent: January 17, 2012

Assignee: International Business Machines Corporation

Inventors: Peter Wiebe Burka, Graham Alan Chapman, Trent A. Gray-Donald, Karl Michael Taylor
Provision of extended addressing modes in a single instruction multiple data (SIMD) data processor

Patent number: 8060724

Abstract: Executing a first memory access instruction with update by an N-bit processor includes accessing at least one source register of a plurality of registers, wherein the accessing includes accessing a first register, wherein each register of the plurality of registers includes a main portion of N bits and an extension portion of M bits, wherein the main portion of the first register includes a first address operand. The execution of the first instruction further includes forming a memory access address using the first address operand; using the memory access address as an address for a memory access; producing an updated address operand; and writing the updated address operand to the main portion of the first register. The producing includes accessing an extension portion of a source register of the at least one source register to obtain modifying information and using the modifying information in the producing an updated address operand.

Type: Grant

Filed: August 15, 2008

Date of Patent: November 15, 2011

Assignee: Freescale Semiconductor, Inc.

Inventor: William C. Moyer
Method, system, and computer program product for out of order instruction address stride prefetch performance verification

Patent number: 7996203

Abstract: A method, system, and computer program product are provided for verifying out of order instruction address (IA) stride prefetch performance in a processor design having more than one level of cache hierarchies. Multiple instruction streams are generated and the instructions loop back to corresponding instruction addresses. The multiple instruction streams are dispatched to a processor and simulation application to process. When a particular instruction is being dispatched, the particular instruction's instruction address and operand address are recorded in the queue. The processor is monitored to determine if the processor executes fetch and prefetch commands in accordance with the simulation application. It is checked to determine if prefetch commands are issued for instructions having three or more strides.

Type: Grant

Filed: January 31, 2008

Date of Patent: August 9, 2011

Assignee: International Business Machines Corporation

Inventors: Wei-Yi Xiao, Dean G. Bair, Christopher A. Krygowski, Chung-Lung K. Shum
Row addressing

Patent number: 7933162

Abstract: Embodiments are provided that include a row decoder, including a row activation path, having a row address converter with an output coupled to an input of a section replacement detector. Further embodiments provide a method including mapping an external row address to an internal row address, wherein the internal row address comprises a section address, determining whether a section corresponding to the section address includes an error, and if the section includes an error, converting the internal row address to a redundant row address, wherein mapping the external row address to the internal row address is initiated prior to determining whether the section replacement should be performed. Further embodiments include a method for receiving a row address for a row in a memory section including a non-2^n number of normal rows and mapping the row address to a redundant row address by subtracting a value from the row address.

Type: Grant

Filed: May 22, 2008

Date of Patent: April 26, 2011

Assignee: Micron Technology, Inc.

Inventors: Takuya Nakanishi, Takumi Nasu, Yoshinori Fujiwara
Method and apparatus for a double width load using a single width load port

Patent number: 7882325

Abstract: A single micro-instruction to perform either an N-bit or a 2N-bit load is provided. A microprocessor having an N-bit load port performs either an N-bit load or a 2N-bit load in a single cycle with the same micro-instruction being used for both the N-bit and the 2N-bit load.

Type: Grant

Filed: December 21, 2007

Date of Patent: February 1, 2011

Assignee: Intel Corporation

Inventors: Zeev Sperber, Robert Valentine, Ehud Cohen, Doron Orenstien, Benny Eitan
Serial Memory Interface for Extended Address Space

Publication number: 20110016291

Abstract: An integrated circuit memory device has a memory array and control logic with at least a first addressing mode in which the instruction includes a first instruction code and an address of a first length; and a second addressing mode in which the instruction includes the first instruction code and an address of a second length. The first length of the address is different from the second length of the address.

Type: Application

Filed: June 10, 2010

Publication date: January 20, 2011

Applicant: Macronix International Co., Ltd.

Inventors: Yulan Kuo, Kuen-Long Chang, Chun-Hsiung Hung
Efficient on-chip accelerator interfaces to reduce software overhead

Patent number: 7827383

Abstract: In one embodiment, a processor comprises execution circuitry and a translation lookaside buffer (TLB) coupled to the execution circuitry. The execution circuitry is configured to execute a store instruction having a data operand; and the execution circuitry is configured to generate a virtual address as part of executing the store instruction. The TLB is coupled to receive the virtual address and configured to translate the virtual address to a first physical address. Additionally, the TLB is coupled to receive the data operand and to translate the data operand to a second physical address. A hardware accelerator is also contemplated in various embodiments, as is a processor coupled to the hardware accelerator, a method, and a computer readable medium storing instructions which, when executed, implement a portion of the method.

Type: Grant

Filed: March 9, 2007

Date of Patent: November 2, 2010

Assignee: Oracle America, Inc.

Inventors: Lawrence A. Spracklen, Santosh G. Abraham, Adam R. Talcott
Content addressable memory architecture

Patent number: 7793040

Abstract: A content addressable memory (CAM) architecture comprises two components, a small, fast on-chip cache memory that stores data that is likely needed in the immediate future, and an off-chip main memory in normal RAM. The CAM allows data to be stored with an associated tag that is of any size and identifies the data. Via tags, waves of data are launched into a machine's computational hardware and re-associated with related tags upon return. Tags may be generated so that related data values have adjacent storage locations, facilitating fast retrieval. Typically, the CAM emits only complete operand sets. By using tags to identify unique operand sets, computations can be allowed to proceed out of order, and be recollected later for further processing. This allows greater computational speed via multiple parallel processing units that compute large sets of operand sets, or by opportunistically fetching and executing operand sets as they become available.

Type: Grant

Filed: June 1, 2005

Date of Patent: September 7, 2010

Assignee: Microsoft Corporation

Inventor: Ray A. Bittner, Jr.
