Instruction Alignment Patents (Class 712/204)
  • Patent number: 7752494
    Abstract: Aligning execution point of duplicate copies of a user program by exchanging information about instructions executed. At least some of the exemplary embodiments may be a method of operating duplicate copies of a user program in a first and second processor, allowing at least one of the user programs to execute until retired instruction counter values in each processor are substantially the same, and then executing a number of instructions of each user program. Of the instructions executed, at least some of the instructions are decoded and the inputs of each decoded instruction determined (the decoding substantially simultaneously with executing in each processor).
    Type: Grant
    Filed: November 13, 2008
    Date of Patent: July 6, 2010
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Paul Del Vigna, Jr., Robert L. Jardine
  • Publication number: 20100138635
    Abstract: Systems, methods, and devices for managing endian-ness are disclosed. In one embodiment, a device is configured to selectively operate in one of a big-endian operating mode or a little-endian operating mode. The device may include a register in which the current endian mode of the device is indicated in at least two different bit positions within the register. The at least two different bit positions may be chosen such that a data bit in one of the bit positions would be read by a system if the device and system operate in the same endian mode, while a data bit in another of the chosen bit positions would be read by the system if the device and system are operating in different endian modes from one another. In some embodiments, the endian mode of the device may be controlled by a hardware input or a software input.
    Type: Application
    Filed: December 1, 2008
    Publication date: June 3, 2010
    Applicant: Micron Technology, Inc.
    Inventor: Harold B Noyes
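The mirrored-bit idea in the abstract above can be illustrated with a minimal sketch. It assumes a 32-bit register and that an endian mismatch shows up as a full byte swap; under those assumptions bit 0 and bit 24 trade places, so writing the mode to both lets the system read it from bit 0 either way. The function names and bit positions are hypothetical and are not taken from the application.

```python
BIG, LITTLE = 1, 0  # hypothetical encoding of the device's endian mode


def endian_status_register(device_mode: int) -> int:
    """Place the mode bit at bit 0 and at bit 24 of an assumed 32-bit register."""
    return (device_mode << 0) | (device_mode << 24)


def as_seen_by_system(register: int, same_endian: bool) -> int:
    """Model a mismatch in endian mode as a byte swap of the 32-bit value."""
    if same_endian:
        return register
    return int.from_bytes(register.to_bytes(4, "little"), "big")


def read_device_mode(register_as_seen: int) -> int:
    """The system always inspects bit 0, whichever way the bytes arrived."""
    return register_as_seen & 1
```

Because a byte swap maps bit 0 onto bit 24 and back, the value read at bit 0 is the device mode whether or not the two sides agree on endianness.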
  • Patent number: 7725659
    Abstract: A method of obtaining data, comprising at least one sector, for use by at least a first thread wherein each processor cycle is allocated to at least one thread, includes the steps of: requesting data for at least a first thread; upon receipt of at least a first sector of the data, determining whether the at least first sector is aligned with the at least first thread, wherein a given sector is aligned with a given thread when a processor cycle in which the given sector will be written is allocated to the given thread; responsive to a determination that the at least first sector is aligned with the at least first thread, bypassing the at least first sector, wherein bypassing a sector comprises reading the sector while it is being written; and responsive to a determination that the at least first sector is not aligned with the at least first thread, delaying the writing of the at least first sector until the occurrence of a processor cycle allocated to the at least first thread by retaining the at least first s
    Type: Grant
    Filed: September 5, 2007
    Date of Patent: May 25, 2010
    Assignee: International Business Machines Corporation
    Inventors: Michael Karl Gschwind, Hans Mikael Jacobson, Robert Alan Philhower
  • Patent number: 7694300
    Abstract: Described herein is an implementation of a technology for the construction, identification, and/or optimization of operating-system processes. At least one implementation, described herein, constructs an operating-system process having the contents as defined by a process manifest. Once constructed, the operating-system process is unalterable.
    Type: Grant
    Filed: April 29, 2005
    Date of Patent: April 6, 2010
    Assignee: Microsoft Corporation
    Inventors: Galen C. Hunt, James R. Larus, John D. DeTreville, Michael B. Jones, Trishul A. Chilimbi
  • Patent number: 7685407
    Abstract: The present invention is to provide a semiconductor device that can correctly switch endians on the outside even if the endian of a parallel interface is not recognized on the outside. The semiconductor device includes a switching circuit and a first register. The switching circuit switches between whether a parallel interface with the outside is to be used as a big endian or a little endian. A first register holds control data of the switching circuit. The switching circuit regards the parallel interface as the little endian when first predetermined control information, that is unchanged in the values of specific bit positions even if its high-order and low-order bit positions are transposed, is supplied to the first register, and regards the parallel interface as the big endian when second predetermined control information, that is unchanged in the values of specific bit positions even if its high-order and low-order bit positions are transposed, is supplied to the first register.
    Type: Grant
    Filed: May 31, 2006
    Date of Patent: March 23, 2010
    Assignee: Renesas Technology Corp.
    Inventors: Goro Sakamaki, Yuri Azuma
  • Publication number: 20100050026
    Abstract: A pipeline operation processor comprises a pipeline processing unit and an instruction insertion controller which inserts an instruction when access to an operation memory is requested, and corrects control information by reference to control information of stages. When a control program is in execution, on receiving an access request instruction requesting for access to the operation memory, the instruction insertion controller inserts an NOP instruction from the instruction decoding unit in place of the access request instruction. The access request instruction is executed while the pipeline processing unit executes no operation, and subsequently, the pipeline processing is continued.
    Type: Application
    Filed: June 4, 2009
    Publication date: February 25, 2010
    Inventor: Motohiko Okabe
  • Patent number: 7640417
    Abstract: Methods and apparatus relating to speculatively decoding instruction lengths in order to increase instruction throughput are described. In an embodiment, instructions are speculatively decoded within a pipelined microprocessor architecture such that up to four instruction lengths may be decoded within a maximum of two processor clock cycles. Other embodiments are also disclosed.
    Type: Grant
    Filed: October 1, 2007
    Date of Patent: December 29, 2009
    Assignee: Intel Corporation
    Inventor: Venkateswara Rao Madduri
  • Publication number: 20090300328
    Abstract: An apparatus for receiving one or more protocol data units (PDUs) from a word aligned queue including a media access control (MAC) physical-layer (PHY) coprocessor (MPC) logically residing between a physical-layer controller and a media access controller (MAC) processor. The MPC is configured to access a reception physical-layer queue storing a burst, such that the reception physical-layer queue includes a plurality of word lines. The burst includes one or more PDUs that each occupy one or more word lines of the reception physical-layer queue, such that a particular word line stores a portion of a first PDU and a portion of second PDU. The MPC is also configured to receive from the reception physical-layer queue the first PDU including the portion of the first PDU stored in the selected word line.
    Type: Application
    Filed: May 27, 2008
    Publication date: December 3, 2009
    Applicant: Fujitsu Limited
    Inventors: Kartik Raju, Mehmet Un
  • Patent number: 7624251
    Abstract: One embodiment of the present invention provides a processor that is configured to execute load-swapped-partial instructions. An instruction fetch unit within the processor is configured to fetch the load-swapped-partial instruction to be executed. Note that the load-swapped-partial instruction specifies a source address in a memory, which is possibly an unaligned address. Furthermore, an execution unit within the processor is configured to execute the load-swapped-partial instruction. This involves loading a partial-vector-sized datum from a naturally-aligned memory region encompassing the source address.
    Type: Grant
    Filed: January 18, 2007
    Date of Patent: November 24, 2009
    Assignee: Apple Inc.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
  • Patent number: 7620787
    Abstract: A processing device includes an optimizer to migrate objects from an external memory of a network processing device to a local memory device to registers connected to a processor. The optimizer further aligns and eliminates redundant initialization code of the objects.
    Type: Grant
    Filed: January 26, 2006
    Date of Patent: November 17, 2009
    Assignee: Intel Corporation
    Inventors: Jiangang Zhuang, Jinquan Dai, Long Li
  • Patent number: 7620797
    Abstract: One embodiment of the present invention provides a processor which is configured to execute load-swapped instructions, which are possibly directed to an unaligned source address. The processor is configured to execute the load-swapped instruction by loading a vector from a naturally-aligned memory region encompassing the source address, and in doing so rotating the bytes of the vector to cause the byte at the specified source address to reside at the least-significant byte position within the vector for a little-endian memory transaction, or causing said byte to be positioned at the most-significant byte position within the vector for a big-endian memory transaction.
    Type: Grant
    Filed: November 1, 2006
    Date of Patent: November 17, 2009
    Assignee: Apple Inc.
    Inventors: Jeffry E. Gonion, Keith E. Diefendorff
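The rotation described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: a 16-byte vector, memory modeled as a Python bytes object, and the loaded vector returned as an integer. The name load_swapped and the constant VECTOR_BYTES are hypothetical, not the patent's instruction definition.

```python
VECTOR_BYTES = 16  # assumed vector width in bytes


def load_swapped(memory: bytes, addr: int, endian: str = "little") -> int:
    """Load the naturally aligned vector containing addr, then rotate its bytes.

    For "little", the byte at addr ends up in the least-significant byte
    position; for "big", it ends up in the most-significant byte position.
    """
    if endian not in ("little", "big"):
        raise ValueError("endian must be 'little' or 'big'")
    base = addr - (addr % VECTOR_BYTES)  # naturally aligned region start
    region = memory[base:base + VECTOR_BYTES]
    if len(region) != VECTOR_BYTES:
        raise ValueError("aligned region falls outside memory")

    bits = 8 * VECTOR_BYTES
    mask = (1 << bits) - 1
    shift = 8 * (addr - base)
    value = int.from_bytes(region, endian)
    if endian == "little":
        # Rotate right so the addressed byte lands in bits 7..0.
        return ((value >> shift) | (value << (bits - shift))) & mask
    # Rotate left so the addressed byte lands in the top eight bits.
    return ((value << shift) | (value >> (bits - shift))) & mask
```

Only the aligned region is ever read, which is the point of the technique: an unaligned source address never triggers an unaligned memory access.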
  • Patent number: 7610466
    Abstract: Various load and store instructions may be used to transfer multiple vector elements between registers in a register file and memory. A cnt parameter may be used to indicate a total number of elements to be transferred to or from memory, and an rcnt parameter may be used to indicate a maximum number of vector elements that may be transferred to or from a single register within a register file. Also, the instructions may use a variety of different addressing modes. The memory element size may be specified independently from the register element size such that source and destination sizes may differ within an instruction. With some instructions, a vector stream may be initiated and conditionally enqueued or dequeued. Truncation or rounding fields may be provided such that source data elements may be truncated or rounded when transferred. Also, source data elements may be sign- or unsigned-extended when transferred.
    Type: Grant
    Filed: September 5, 2003
    Date of Patent: October 27, 2009
    Assignee: Freescale Semiconductor, Inc.
    Inventor: William C. Moyer
  • Publication number: 20090249032
    Abstract: An information apparatus comprises: a barrel shifter composed of a bidirectional 1-bit shifter, . . . , and a bidirectional 24-bit shifter which are connected in series; a control unit for outputting an endian conversion control signal SE indicating one of a shift operation and endian conversion; an endian conversion unit for generating data by endian conversion using data obtained by performing a shift operation in the bidirectional 8-bit shifter and the bidirectional 24-bit shifter; and a selector for selecting, when the endian conversion control signal SE indicates a shift operation, data outputted from the bidirectional 24-bit shifter, and selecting, when the endian conversion control signal SE indicates endian conversion, the data outputted from the endian conversion unit.
    Type: Application
    Filed: February 25, 2009
    Publication date: October 1, 2009
    Applicant: Panasonic Corporation
    Inventors: Takashi Nishihara, Toshifumi Hamaguchi
  • Patent number: 7587535
    Abstract: When data is transferred to an access destination in a different endian format, a transfer start address is aligned based on a transfer bus width, and a transfer size is adjusted according to the transfer bus width and a transfer address. Thus, it becomes possible to perform burst transfer in the access destination. Accordingly, when burst transfer to an access destination in a different endian format is performed with a data width smaller than the transfer bus width, this prevents the problem in which burst transfer cannot be performed because address conversion causes the data access to no longer be in ascending order.
    Type: Grant
    Filed: April 20, 2007
    Date of Patent: September 8, 2009
    Assignee: Panasonic Corporation
    Inventor: Takatsugu Sawai
  • Patent number: 7587532
    Abstract: A method and apparatus for adaptive buffer sizing adjusts the size of the buffer using a “high water mark” that is set to different levels for different system conditions. The high water mark is used by the buffer logic as an indication of when to assert the buffer “Full” flag. In turn, the full flag is used by the instruction fetch logic as an indication of when to stop fetching further instructions.
    Type: Grant
    Filed: January 31, 2005
    Date of Patent: September 8, 2009
    Assignee: Texas Instruments Incorporated
    Inventors: Jeffrey L. Nye, Sam B. Sandbote
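A minimal sketch of the high-water-mark mechanism in the abstract above, assuming a buffer with a fixed physical capacity whose "Full" flag trips at an adjustable level. The class FetchBuffer and the helper fetch_until_full are hypothetical names used only for illustration.

```python
from collections import deque
from typing import Iterator


class FetchBuffer:
    """Instruction buffer whose Full flag is driven by a high water mark."""

    def __init__(self, capacity: int, high_water_mark: int) -> None:
        self.capacity = capacity  # physical slots (assumed fixed)
        self._entries: deque = deque()
        self.set_high_water_mark(high_water_mark)

    def set_high_water_mark(self, level: int) -> None:
        """Change the effective buffer size for the current system condition."""
        if not 0 < level <= self.capacity:
            raise ValueError("high water mark must lie within the capacity")
        self.high_water_mark = level

    @property
    def full(self) -> bool:
        # Full is asserted at the high water mark, not at physical capacity.
        return len(self._entries) >= self.high_water_mark

    def push(self, instruction) -> None:
        if len(self._entries) >= self.capacity:
            raise OverflowError("buffer is physically full")
        self._entries.append(instruction)

    def pop(self):
        return self._entries.popleft()


def fetch_until_full(buffer: FetchBuffer, stream: Iterator) -> int:
    """Fetch logic: keep fetching until the buffer asserts Full."""
    fetched = 0
    while not buffer.full:
        buffer.push(next(stream))
        fetched += 1
    return fetched
```

Raising the mark lets the fetch logic run further ahead; lowering it makes the same physical buffer behave as a smaller one.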
  • Publication number: 20090217001
    Abstract: The present invention provides a system and method for managing load and store operations necessary for reading from and writing to memory or I/O in a superscalar RISC architecture environment. To perform this task, a load store unit is provided whose main purpose is to make load requests out of order whenever possible to get the load data back for use by an instruction execution unit as quickly as possible. A load operation can only be performed out of order if there are no address collisions and no write pendings. An address collision occurs when a read is requested at a memory location where an older instruction will be writing. Write pending refers to the case where an older instruction requests a store operation, but the store address has not yet been calculated. The data cache unit returns 8 bytes of unaligned data. The load/store unit aligns this data properly before it is returned to the instruction execution unit.
    Type: Application
    Filed: May 6, 2009
    Publication date: August 27, 2009
    Inventors: Cheryl D. Senter, Johannes Wang
  • Patent number: 7577777
    Abstract: A computer system providing endian information and a method of data transmission thereof are disclosed. The method of data transmission in the computer system of the present invention comprises: reading endian information stored in a base address register of peripheral devices; deciding whether the endian information of the computer system is identical with endian information of the peripheral devices; byte-swapping data of the peripheral devices when the endian information of the computer system is different from the endian information of the peripheral devices, and transmitting the byte-swapped data to a system bus of the computer system; and transmitting the data of the peripheral devices to the system bus when the endian information of the computer system is identical with the endian information of the peripheral devices.
    Type: Grant
    Filed: October 24, 2007
    Date of Patent: August 18, 2009
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Jeong-Ju Lee
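The compare-then-swap flow in the abstract above might look like the sketch below. It assumes the endian information has already been read from the peripheral's base address register and is passed in as a string, and that swapping happens per 4-byte word. The function name and parameters are hypothetical.

```python
def transfer_to_system_bus(data: bytes, device_endian: str,
                           system_endian: str, word_size: int = 4) -> bytes:
    """Return the bytes to place on the system bus.

    Data passes through unchanged when both sides share an endian mode;
    otherwise each word is byte-swapped first.
    """
    if device_endian == system_endian:
        return data
    if len(data) % word_size:
        raise ValueError("data length must be a multiple of the word size")
    # Reverse the byte order inside each word, keeping the word order.
    return b"".join(data[i:i + word_size][::-1]
                    for i in range(0, len(data), word_size))
```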
  • Patent number: 7568070
    Abstract: A fixed number of variable-length instructions are stored in each line of an instruction cache. The variable-length instructions are aligned along predetermined boundaries. Since the length of each instruction in the line, and hence the span of memory the instructions occupy, is not known, the address of the next following instruction is calculated and stored with the cache line. Ascertaining the instruction boundaries, aligning the instructions, and calculating the next fetch address are performed in a predecoder prior to placing the instructions in the cache.
    Type: Grant
    Filed: July 29, 2005
    Date of Patent: July 28, 2009
    Assignee: QUALCOMM Incorporated
    Inventors: Jeffrey Todd Bridges, James Norris Dieffenderfer, Rodney Wayne Smith, Thomas Andrew Sartorius
  • Publication number: 20090187739
    Abstract: Load and store operations in computer systems are extended to provide for Stream Load and Store and Masked Load and Store. In Stream operations, a CPU executes a Stream instruction that indicates, by appropriate arguments, a first address in memory or a first register in a register file from whence to begin reading data entities, and a first address or register from whence to begin storing the entities, and a number of entities to be read and written. In Masked Load and Masked Store operations stored masks are used to indicate patterns relative to first addresses and registers for loading and storing. Bit-string vector methods are taught for masks.
    Type: Application
    Filed: March 26, 2009
    Publication date: July 23, 2009
    Inventors: Mario Nemirovsky, Enrique Musoll, Narendra Sankar, Stephen Melvin
  • Patent number: 7565510
    Abstract: A load/store unit includes a Top register for storing a value retained before loading to a load destination register and a saved register capable of storing data retained to the Top register. When an unaligned instruction evaluation unit determines that a load instruction issued from an instruction decode unit is an unaligned instruction, data stored to the Top register are stored to the saved register in order to make the Top register available to subsequent load instructions issued from the instruction decode unit.
    Type: Grant
    Filed: April 25, 2006
    Date of Patent: July 21, 2009
    Assignee: NEC Electronics Corporation
    Inventor: Shuichi Kunie
  • Publication number: 20090182981
    Abstract: A rotate then operate instruction having a T bit is fetched and executed wherein a first operand in a first register is rotated by an amount and a Boolean operation is performed on a selected portion of the rotated first operand and a second operand of a second register. If the T bit is ‘0’ the selected portion of the result of the Boolean operation is inserted into corresponding bits of a second operand of a second register. If the T bit is ‘1’, in addition to the inserted bits, the bits other than the selected portion of the rotated first operand are saved in the second register.
    Type: Application
    Filed: January 11, 2008
    Publication date: July 16, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Timothy J. Slegel, Joachim von Buttlar
  • Publication number: 20090182982
    Abstract: A rotate then operate instruction having a Z bit is fetched and executed wherein a first operand in a first register is rotated by an amount. If the Z bit is ‘0’ the selected portion of the result of the Boolean operation is inserted into corresponding bits of a second operand of a second register. If the Z bit is ‘1’, in addition to the inserted bits, bits other than the inserted bits of the second operand are set to zeros.
    Type: Application
    Filed: January 11, 2008
    Publication date: July 16, 2009
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Timothy J. Slegel, Joachim von Buttlar
  • Patent number: 7525457
    Abstract: A computer implemented method converts a data set of a first type to a data set type of a second type. The method includes casting up a first data set of a first type to a prescribed data set type that is large enough to encompass a data set of a second type. The method then includes casting down the casted up first data set from the prescribed data set type to the second data set of the second data set type.
    Type: Grant
    Filed: January 12, 2007
    Date of Patent: April 28, 2009
    Assignee: Star Bridge Systems, Inc.
    Inventor: Kent L. Gilson
  • Publication number: 20090063818
    Abstract: A method of obtaining data, comprising at least one sector, for use by at least a first thread wherein each processor cycle is allocated to at least one thread, includes the steps of: requesting data for at least a first thread; upon receipt of at least a first sector of the data, determining whether the at least first sector is aligned with the at least first thread, wherein a given sector is aligned with a given thread when a processor cycle in which the given sector will be written is allocated to the given thread; responsive to a determination that the at least first sector is aligned with the at least first thread, bypassing the at least first sector, wherein bypassing a sector comprises reading the sector while it is being written; and responsive to a determination that the at least first sector is not aligned with the at least first thread, delaying the writing of the at least first sector until the occurrence of a processor cycle allocated to the at least first thread by retaining the at least first s
    Type: Application
    Filed: September 5, 2007
    Publication date: March 5, 2009
    Inventors: Michael Karl Gschwind, Hans Mikael Jacobson, Robert Alan Philhower
  • Publication number: 20090037694
    Abstract: Embodiments of the invention provide logic within the store data path between a processor and a memory array. The logic may be configured to misalign vector data as it is stored to memory. By misaligning vector data as it is stored to memory, memory bandwidth may be maximized while processing bandwidth required to store vector data misaligned is minimized. Furthermore, embodiments of the invention provide logic within the load data path which allows vector data which is stored misaligned to be aligned as it is loaded into a vector register. By aligning misaligned vector data as it is loaded into a vector register, memory bandwidth may be maximized while processing bandwidth required to align misaligned vector data may be minimized.
    Type: Application
    Filed: July 31, 2007
    Publication date: February 5, 2009
    Inventors: David Arnold Luick, Eric Oliver Mejdrich, Adam James Muff
  • Patent number: 7480783
    Abstract: Disclosed are systems for loading an unaligned word from a specified unaligned word address in a memory, the unaligned word comprising a plurality of indexed portions crossing a word boundary, a method of operating the system comprising: loading a first aligned word commencing at an aligned word address rounded from the specified unaligned word address; identifying an index representing the location of the unaligned word address relative to the aligned word address; loading a second aligned word commencing at an aligned word address rounded from a second unaligned word address; and combining indexed portions of the first and second aligned words using the identified index to construct the unaligned word.
    Type: Grant
    Filed: August 19, 2004
    Date of Patent: January 20, 2009
    Assignees: STMicroelectronics Limited, Hewlett-Packard Company
    Inventors: Mark O. Homewood, Paolo Faraboschi
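The steps listed in the abstract above (two aligned loads, an index, and a combine) can be sketched as follows. The sketch assumes 4-byte words, byte-sized indexed portions, and memory modeled as a bytes object; WORD_SIZE and both function names are hypothetical.

```python
WORD_SIZE = 4  # bytes per aligned word (assumed)


def load_aligned_word(memory: bytes, aligned_addr: int) -> bytes:
    """Model of a load that only accepts word-aligned addresses."""
    if aligned_addr % WORD_SIZE:
        raise ValueError("address is not word aligned")
    return memory[aligned_addr:aligned_addr + WORD_SIZE]


def load_unaligned_word(memory: bytes, addr: int) -> bytes:
    """Build the word starting at addr from at most two aligned loads."""
    first_addr = addr - (addr % WORD_SIZE)  # round down to a word boundary
    index = addr - first_addr               # offset of addr within that word
    first = load_aligned_word(memory, first_addr)
    if index == 0:
        return first                        # already aligned, one load suffices
    # The second aligned word is the one holding the tail of the unaligned word.
    second = load_aligned_word(memory, first_addr + WORD_SIZE)
    # Combine the upper portions of the first word with the lower portions
    # of the second word, selected by the index.
    return first[index:] + second[:index]
```

For example, with memory holding the bytes 0 through 15, a load at address 5 takes bytes 5 to 7 from the first aligned word and byte 8 from the second.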
  • Patent number: 7473293
    Abstract: A conversion table converts a packed instruction (pre-conversion code) contained in the instruction code fetched from an instruction memory into a plurality of instruction codes (converted codes). An instruction decoder decodes the plurality of the instruction codes converted by a conversion table. A plurality of ALUs perform the operation in accordance with the decoding result of the instruction decoder. Therefore, the number of instructions that can be executed in parallel per cycle may be increased while at the same time the capacity of the instruction memory is reduced.
    Type: Grant
    Filed: September 14, 2006
    Date of Patent: January 6, 2009
    Assignee: Renesas Technology Corp.
    Inventor: Masami Nakajima
  • Patent number: 7467327
    Abstract: A method and system of aligning execution point of duplicate copies of a user program by exchanging information about instructions executed. At least some of the exemplary embodiments may be a method comprising operating duplicate copies of a user program in a first and second processor, allowing at least one of the user programs to execute until retired instruction counter values in each processor are substantially the same, and then executing a number of instructions of each user program. Of the instructions executed, at least some of the instructions are decoded and the inputs of each decoded instruction determined (the decoding substantially simultaneously with executing in each processor).
    Type: Grant
    Filed: January 25, 2005
    Date of Patent: December 16, 2008
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: Paul Del Vigna, Jr., Robert L. Jardine
  • Publication number: 20080270756
    Abstract: A decimal floating point finite number in a decimal floating point format is composed from the number in a different format. A decimal floating point format includes fields to hold information relating to the sign, exponent and significand of the decimal floating point finite number. Other decimal floating point data, including infinities and NaNs (not a number), are also composed. Decimal floating point data are also decomposed from the decimal floating point format to a different format. For composition and decomposition, one or more instructions may be employed, including a shift significand instruction.
    Type: Application
    Filed: April 26, 2007
    Publication date: October 30, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Shawn D. Lundvall, Eric M. Schwarz, Ronald M. Smith, Phil C. Yeh
  • Patent number: 7444488
    Abstract: A method and a programmable unit for bit field shifting in a memory device in a programmable unit as a result of the execution of an instruction, in which a bit segment is shifted within a first memory unit to a second memory unit, are presented. The bit segment is read with a first bit length from a first bit field in the first memory unit starting at a first start point. The bit segment that has been read is stored in the first bit field in the second memory unit starting at a second start point. The first or the second start point is updated by a predetermined value and the updated start point is stored for subsequent method steps.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: October 28, 2008
    Assignee: Infineon Technologies
    Inventors: Xiaoning Nie, Thomas Wahl
  • Patent number: 7437537
    Abstract: In an instruction execution pipeline, the misalignment of memory access instructions is predicted. Based on the prediction, an additional micro-operation is generated in the pipeline prior to the effective address generation of the memory access instruction. The additional micro-operation accesses the memory falling across a predetermined address boundary. Predicting the misalignment and generating a micro-operation early in the pipeline ensures that sufficient pipeline control resources are available to generate and track the additional micro-operation, avoiding a pipeline flush if the resources are not available at the time of effective address generation. The misalignment prediction may employ known conditional branch prediction techniques, such as a flag, a bimodal counter, a local predictor, a global predictor, and combined predictors. A misalignment predictor may be enabled or biased by a memory access instruction flag or misaligned instruction type.
    Type: Grant
    Filed: February 17, 2005
    Date of Patent: October 14, 2008
    Assignee: QUALCOMM Incorporated
    Inventors: Jeffrey Todd Bridges, Victor Roberts Augsburg, James Norris Dieffenderfer, Thomas Andrew Sartorius
  • Publication number: 20080229066
    Abstract: A system, method, and computer program product are provided for performing scalar operations using a SIMD data parallel execution unit. With the mechanisms of the illustrative embodiments, scalar operations in application code are identified that may be executed using vector operations in a SIMD data parallel execution unit. The scalar operations are converted, such as by a static or dynamic compiler, into one or more vector load instructions and one or more vector computation instructions. In addition, control words may be generated to adjust the alignment of the scalar values for the scalar operation within the vector registers to which these scalar values are loaded using the vector load instructions. The alignment amounts for adjusting the scalar values within the vector registers may be statically or dynamically determined.
    Type: Application
    Filed: May 28, 2008
    Publication date: September 18, 2008
    Applicant: International Business Machines Corporation
    Inventor: Michael K. Gschwind
  • Publication number: 20080222391
    Abstract: An apparatus and method for optimizing scalar code executed on a single instruction multiple data (SIMD) engine is provided that aligns the slots of SIMD registers. With the apparatus and method, a compiler is provided that parses source code and, for each statement in the program, generates an expression tree. The compiler inspects all storage inputs to scalar operations in the expression tree to determine their alignment in the SIMD registers. This alignment is propagated up the expression tree from the leaves. When the alignments of two operands in the expression tree are the same, the resulting alignment is the shared value. When the alignments of two operands in the expression tree are different, one operand is shifted. For shifted operands, a shift operation is inserted in the expression tree. The executable code is then generated for the expression tree and shifts are inserted where indicated.
    Type: Application
    Filed: May 27, 2008
    Publication date: September 11, 2008
    Applicant: International Business Machines Corporation
    Inventors: Alexandre E. Eichenberger, John Kevin Patrick O'Brien
  • Patent number: 7412584
    Abstract: Systems and methods are disclosed for aligning data in memory access and other applications. In one embodiment a system is provided that includes a memory unit, a shifter, and control logic operable to route data from the memory unit to the shifter and to send an indication to the shifter of an amount by which the data is to be shifted. In one embodiment, the control logic provides support for speculative execution. The control logic may also permit multiplexing of big endian and little endian data alignment operations, and/or multiplexing of data alignment operations with non-data alignment operations. In one embodiment, the memory unit, shifter, and control logic are integrated within a processing unit, such as a microengine in a network processor.
    Type: Grant
    Filed: May 3, 2004
    Date of Patent: August 12, 2008
    Assignee: Intel Corporation
    Inventors: Jose S. Niell, Gilbert M. Wolrich, Thomas L. Dmukauskas, Mark B. Rosenbluth
  • Patent number: 7404019
    Abstract: A method for providing endianness control in a data processing system includes initiating an access which accesses a peripheral, providing a first endianness control that corresponds to the peripheral, and completing the access using the endianness control to affect the endianness order of the information transferred during the access. In one embodiment, the first endianness control overrides a default endianness corresponding to the access. The default endianness may be provided by a master endianness control corresponding to a master requesting the current access. A data processing system includes a first bus master, first and second peripherals, first endianness control corresponding to the first peripheral and second endianness control corresponding to the second peripheral, and control circuitry which uses the first endianness control to control endianness for an access between the first bus master and the first peripheral. In one embodiment, the data processing system may include multiple masters.
    Type: Grant
    Filed: May 26, 2004
    Date of Patent: July 22, 2008
    Assignee: Freescale Semiconductor, Inc.
    Inventors: William C. Moyer, Michael D. Fitzsimmons
  • Patent number: 7404042
    Abstract: A fetch section of a processor comprises an instruction cache and a pipeline of several stages for obtaining instructions. Instructions may cross cache line boundaries. The pipeline stages process two addresses to recover a complete boundary crossing instruction. During such processing, if the second piece of the instruction is not in the cache, the fetch with regard to the first line is invalidated and recycled. On this first pass, processing of the address for the second part of the instruction is treated as a pre-fetch request to load instruction data to the cache from higher level memory, without passing any of that data to the later stages of the processor. When the first line address passes through the fetch stages again, the second line address follows in the normal order, and both pieces of the instruction can be fetched from the cache and combined in the normal manner.
    Type: Grant
    Filed: May 18, 2005
    Date of Patent: July 22, 2008
    Assignee: QUALCOMM Incorporated
    Inventors: Brian Michael Stempel, Jeffrey Todd Bridges, Rodney Wayne Smith, Thomas Andrew Sartorius
  • Publication number: 20080162879
    Abstract: In some embodiments, a method includes receiving a sequence of instructions in a processing system, determining whether an instruction in the sequence is a type to be aligned, and if the instruction is a type to be aligned, aligning the instruction. In some embodiments, a method includes receiving an instruction in a processing system and executing the instruction unless the instruction is a first type of instruction. In some embodiments, an apparatus includes circuitry to receive an instruction and to execute the instruction unless the instruction is a first type of instruction. In some embodiments, a system includes circuitry to receive an instruction and to execute the instruction unless the instruction is a first type of instruction, and a memory unit to store the instruction.
    Type: Application
    Filed: December 30, 2006
    Publication date: July 3, 2008
    Inventor: Hong Jiang
  • Publication number: 20080162880
    Abstract: A system and method for extracting complex, variable length computer instructions from a stream of complex instructions each subdivided into a variable number of instruction bytes, and aligning instruction bytes of individual ones of the complex instructions. The system receives a portion of the stream of complex instructions and extracts a first set of instruction bytes starting with the first instruction bytes, using an extract shifter. The set of instruction bytes is then passed to an align latch where the bytes are aligned and output to a next instruction detector. The next instruction detector determines the end of the first instruction based on said set of instruction bytes. An extract shifter is used to extract and provide the next set of instruction bytes to an align shifter which aligns and outputs the next instruction. The process is then repeated for the remaining instruction bytes in the stream of complex instructions.
    Type: Application
    Filed: March 11, 2008
    Publication date: July 3, 2008
    Applicant: Transmeta Corporation
    Inventors: Brett Coon, Yoshiyuki Miyayama, Le Trong Nguyen, Johannes Wang
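The extract, find-end, and repeat loop described in this abstract can be sketched in software. This is only an illustrative sketch, not the patented hardware: the length encoding (the low two bits of the first byte, plus one) and the name `split_instructions` are assumptions made up for the example.

```python
def split_instructions(stream: bytes) -> list[bytes]:
    """Split a byte stream into variable-length instructions.

    Hypothetical encoding (an assumption for this sketch): the low two
    bits of an instruction's first byte hold its length minus one, so
    instructions are 1 to 4 bytes long.
    """
    instructions = []
    pos = 0
    while pos < len(stream):
        # "Next instruction detector": find where this instruction ends.
        length = (stream[pos] & 0x03) + 1
        if pos + length > len(stream):
            break  # incomplete trailing instruction; wait for more bytes
        # "Extract and align": slice out the bytes of this instruction.
        instructions.append(stream[pos:pos + length])
        pos += length
    return instructions
```

The key dependency is visible in the loop: the start of each instruction is known only after the length of the previous one has been determined.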
  • Patent number: 7392337
    Abstract: A system for implementing a memory subsystem command interface, the system including a cascaded interconnect system including one or more memory modules, a memory controller and a memory bus. The memory controller generates a data frame that includes a plurality of commands. The memory controller and the memory module are interconnected by a packetized multi-transfer interface via the memory bus and the frame is transmitted to the memory modules via the memory bus.
    Type: Grant
    Filed: July 20, 2007
    Date of Patent: June 24, 2008
    Assignee: International Business Machines Corporation
    Inventors: Kevin C. Gower, Warren E. Maule
  • Patent number: 7392366
    Abstract: A multithreaded processor, fetch control for a multithreaded processor and a method of fetching in the multithreaded processor. Processor event and use (EU) signs are monitored for downstream pipeline conditions indicating pipeline execution thread states. Instruction cache fetches are skipped for any thread that is incapable of receiving fetched cache contents, e.g., because the thread is full or stalled. Also, consecutive fetches may be selected for the same thread, e.g., on a branch mis-predict. Thus, the processor avoids wasting power on unnecessary or place keeper fetches.
    Type: Grant
    Filed: September 16, 2005
    Date of Patent: June 24, 2008
    Assignee: International Business Machines Corp.
    Inventors: Pradip Bose, Alper Buyuktosunoglu, Richard J. Eickemeyer, Lee E. Eisen, Philip G. Emma, John B. Griswell, Zhigang Hu, Hung Q. Le, Douglas R. Logan, Balaram Sinharoy
  • Publication number: 20080140992
    Abstract: A computing system may support an endian toggle register (ETR) and the endianness of the endian toggle register may be designated using a set endian bit (SEB) or a clear endian bit (CEB) instruction. An endian conversion is performed on the data that is moved into and moved out of the ETR. However, if the destination memory is an endian toggle disabled memory, the contents of the ETR may be transferred to the endian toggle disabled memory without performing the endian conversion. A compiler supported on the computing system may comprise an endian storage class to perform endian conversion, transparently, using high-level languages.
    Type: Application
    Filed: October 16, 2007
    Publication date: June 12, 2008
    Inventor: Gurumurthy Rajaram
  • Patent number: 7386706
    Abstract: A system and software for improving the performance of processors by incorporating an execution unit operable to decode and execute single instructions in an instruction set comprising (a) group instructions that operate on a plurality of data elements in partitioned fields of a register to produce a catenated result, (b) aligned memory operations that move data between memory and register where the memory operand is aligned, and (c) unaligned memory operations where the memory operand is unaligned.
    Type: Grant
    Filed: November 20, 2003
    Date of Patent: June 10, 2008
    Assignee: Microunity Systems Engineering, Inc.
    Inventors: Craig Hansen, John Moussouris
  • Patent number: 7360059
    Abstract: In one embodiment, a digital signal processor includes look ahead logic to decrease the number of bubbles inserted in the processing pipeline. The processor receives data containing instructions in a plurality of buffers and decodes the size of a first instruction. The beginning of a second instruction is determined based on the size of the first instruction. The size of the second instruction is decoded and the processor determines whether loading the second instruction will deplete one of the plurality of buffers.
    Type: Grant
    Filed: February 3, 2006
    Date of Patent: April 15, 2008
    Assignee: Analog Devices, Inc.
    Inventors: Thomas Tomazin, William C. Anderson, Charles P. Roth, Kayla Chalmers, Juan G. Revilla, Ravi P. Singh
  • Patent number: 7353371
    Abstract: A method and device to copy data fields from one or more source packets to one or more result packets. In a SET function, adjacent data fields in a source packet are copied to respective destination data fields in a result packet governed by a field locator packet. In an ESET function, data fields in respective source packets are copied to adjacent data fields in a result packet governed by a field locator packet. In an EXTRACT function, data fields in a source packet are copied to adjacent data fields in a result packet governed by a field locator packet. In a SCATTER function, adjacent data fields in a source packet are copied to data fields in respective result packets governed by a field locator packet.
    Type: Grant
    Filed: December 5, 2002
    Date of Patent: April 1, 2008
    Assignee: Intel Corporation
    Inventors: Corey Gee, Bapi Vinnakota
  • Patent number: 7343473
    Abstract: A system and method for extracting complex, variable length computer instructions from a stream of complex instructions each subdivided into a variable number of instruction bytes, and aligning instruction bytes of individual ones of the complex instructions. The system receives a portion of the stream of complex instructions and extracts a first set of instruction bytes starting with the first instruction bytes, using an extract shifter. The set of instruction bytes is then passed to an align latch where the bytes are aligned and output to a next instruction detector. The next instruction detector determines the end of the first instruction based on said set of instruction bytes. An extract shifter is used to extract and provide the next set of instruction bytes to an align shifter which aligns and outputs the next instruction. The process is then repeated for the remaining instruction bytes in the stream of complex instructions.
    Type: Grant
    Filed: June 28, 2005
    Date of Patent: March 11, 2008
    Assignee: Transmeta Corporation
    Inventors: Brett Coon, Yoshiyuki Miyayama, Le Trong Nguyen, Johannes Wang
  • Patent number: 7334066
    Abstract: A computer system providing endian information and a method of data transmission thereof are disclosed. The method of data transmission in the computer system of the present invention comprises: reading endian information stored in a base address register of peripheral devices; deciding whether the endian information of the computer system is identical with endian information of the peripheral devices; byte-swapping data of the peripheral devices when the endian information of the computer system is different from the endian information of the peripheral devices, and transmitting the byte-swapped data to a system bus of the computer system; and transmitting the data of the peripheral devices to the system bus when the endian information of the computer system is identical with the endian information of the peripheral devices.
    Type: Grant
    Filed: June 27, 2003
    Date of Patent: February 19, 2008
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Jeong-Ju Lee
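The compare-then-swap decision in this abstract can be illustrated with a short sketch. It assumes 32-bit words and uses boolean flags in place of the endian information read from the base address register; the names `swap_bytes_32` and `transfer_word` are hypothetical and do not come from the patent.

```python
import struct


def swap_bytes_32(word: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    # Pack as big-endian, then reinterpret the same bytes as little-endian.
    return struct.unpack("<I", struct.pack(">I", word & 0xFFFFFFFF))[0]


def transfer_word(word: int, system_big_endian: bool, device_big_endian: bool) -> int:
    """Return the word as it would be placed on the system bus.

    The data is byte-swapped only when the system and the peripheral
    disagree on endianness; otherwise it passes through unchanged.
    """
    if system_big_endian != device_big_endian:
        return swap_bytes_32(word)
    return word
```

For example, `transfer_word(0x11223344, True, False)` yields `0x44332211`, while matching endianness leaves the word unchanged.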
  • Publication number: 20080040576
    Abstract: In a variable-length instruction set wherein the length of each instruction is a multiple of a minimum instruction length granularity, an indication of the last granularity (i.e., the end) of a taken branch instruction is stored in a branch target address cache (BTAC). If a branch instruction that later hits in the BTAC is predicted taken, previously fetched instructions are flushed from the pipeline beginning immediately past the indicated end of the branch instruction. This technique saves BTAC space by avoiding the need to store the length of the branch instruction in the BTAC, and improves performance by eliminating the necessity of calculating where to begin flushing (based on the length of the branch instruction).
    Type: Application
    Filed: August 9, 2006
    Publication date: February 14, 2008
    Inventors: Brian Michael Stempel, Rodney Wayne Smith
  • Patent number: 7328433
    Abstract: Methods and apparatus for reducing memory latency in a software application are disclosed. A disclosed system uses one or more helper threads to prefetch variables for a main thread to reduce performance bottlenecks due to memory latency and/or a cache miss. A performance analysis tool is used to profile the software application's resource usage and identifies areas in the software application experiencing performance bottlenecks. Compiler-runtime instructions are generated into the software application to create and manage the helper thread. The helper thread prefetches data in the identified areas of the software application experiencing performance bottlenecks. A counting mechanism is inserted into the helper thread and a counting mechanism is inserted into the main thread to coordinate the execution of the helper thread with the main thread and to help ensure the prefetched data is not removed from the cache before the main thread is able to take advantage of the prefetched data.
    Type: Grant
    Filed: October 2, 2003
    Date of Patent: February 5, 2008
    Assignee: Intel Corporation
    Inventors: Xinmin Tian, Shih-wei Liao, Hong Wang, Milind Girkar, John Shen, Perry Wang, Grant Haab, Gerolf Hoflehner, Daniel Lavery, Hideki Saito, Sanjiv Shah, Dongkeun Kim
  • Patent number: 7305542
    Abstract: Speculatively decoding instruction lengths in order to increase instruction throughput. Instructions are speculatively decoded within a pipelined microprocessor architecture such that up to four instruction lengths may be decoded within a maximum of two processor clock cycles.
    Type: Grant
    Filed: June 25, 2002
    Date of Patent: December 4, 2007
    Assignee: Intel Corporation
    Inventor: Venkateswara Rao Madduri
  • Patent number: 7302552
    Abstract: A processor is described including a plurality of data path elements which independently perform in parallel different data processing operations. Program instructions are provided which are decoded to generate control signals for controlling the data path elements. Multiple instruction sets are supported with the same data processing operation to be performed by the same data path element being differently encoded within different instructions of different instruction sets. This enables code compaction when little parallelism may be achieved and full parallelism to be specified when this is possible.
    Type: Grant
    Filed: October 14, 2004
    Date of Patent: November 27, 2007
    Assignee: Arm Limited
    Inventors: Jan Guffens, Ludwig Callewaert, Koenraad Van Nieuwenhove