Sequential Patents (Class 712/8)
  • Patent number: 11829756
    Abstract: A vector cumulative sum circuit can include a set of input registers, a carry-forward data source, a set of output registers, and a network of adder circuits coupling the input registers to the output registers such that the output value in a given output register is the sum of a value provided by the carry-forward data source and the input values from all of the input registers (in logical order) up to (and including) the corresponding input register. The value in the last output register can be carried forward to enable cumulative summing of a larger number of input values. The vector cumulative sum circuit can be implemented in a programmable processor, and a vector cumulative sum instruction can be defined in the instruction set. Using the vector cumulative sum circuit and instruction, filtering operations can be accelerated.
    Type: Grant
    Filed: September 24, 2021
    Date of Patent: November 28, 2023
    Assignee: Apple Inc.
    Inventors: On Wa Yeung, Seydou N. Ba
  • Patent number: 11249767
    Abstract: An information handling system may load first data from a location information area of a first memory, specifying a plurality of locations of metadata for a plurality of stages of basic input/output system (BIOS) initialization. The information handling system may then load first metadata for a first stage of BIOS initialization from a first metadata location of the plurality of locations specified by the first data. The first metadata may contain information for indexing first initialization data located at a first initialization data location. The information handling system may then index the first initialization data of the first initialization data location based, at least in part, on the first metadata. The information handling system may then perform the first stage of BIOS initialization based, at least in part, on the first initialization data.
    Type: Grant
    Filed: February 5, 2019
    Date of Patent: February 15, 2022
    Assignee: Dell Products L.P.
    Inventors: Shekar Babu Suryanarayana, Sumanth Vidyadhara, Anand Prakash Joshi
  • Patent number: 10838719
    Abstract: Examples of a carry chain for performing an operation on operands each including elements of a selectable size is provided. Advantageously, the carry chain adapts to elements of different sizes. The carry chain determines a mask based on a selected size of an element. The carry chain selects, based on the mask, whether to carry a partial result of an operation performed on corresponding first portions of a first operand and a second operand into a next operation. The next operation is performed on corresponding second portions of the first operand and the second operand, and, based on the selection, the partial result of the operation. The carry chain stores, in a memory, a result formed from outputs of the operation and the next operation.
    Type: Grant
    Filed: November 13, 2015
    Date of Patent: November 17, 2020
    Assignee: Marvell Asia Pte, Ltd
    Inventor: David Kravitz
  • Patent number: 10831478
    Abstract: A Sort Lists instruction is provided to perform a sort and/or a merge operation. The instruction is an architected machine instruction of an instruction set architecture and is executed by a general-purpose processor of the computing environment. The executing includes sorting a plurality of input lists to obtain one or more sorted output lists, which are output.
    Type: Grant
    Filed: November 6, 2018
    Date of Patent: November 10, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce C. Giamei, Martin Recktenwald, Donald W. Schmidt, Timothy Slegel, Aditya N. Puranik, Mark S. Farrell, Christian Jacobi, Jonathan D. Bradbury, Christian Zoellin
  • Patent number: 10831503
    Abstract: Saving and restoring machine state between multiple executions of an instruction. A determination is made that processing of an operation of an instruction executing on a processor has been interrupted prior to completion. Based on determining that the processing of the operation has been interrupted, current metadata of the processor is extracted. The metadata is stored in a location associated with the instruction and used to re-execute the instruction to resume forward processing of the instruction from where it was interrupted.
    Type: Grant
    Filed: November 6, 2018
    Date of Patent: November 10, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Bruce C. Giamei, Martin Recktenwald, Donald W. Schmidt, Timothy Slegel, Aditya N. Puranik, Mark S. Farrell, Christian Jacobi, Jonathan D. Bradbury, Christian Zoellin
  • Patent number: 10809978
    Abstract: A merge sort accelerator (MSA) includes a pre-processing stage configured to receive an input vector and generate a pre-processing output vector based on a pre-processing instruction and the input vector. The MSA also includes a merge sort network having multiple sorting stages configured to be selectively enabled. The merge sort network is configured to receive the pre-processing output vector and generate a sorted output vector based on a sorting instruction and the pre-processing output vector. The MSA includes an accumulator stage configured to receive the sorted output vector and update an accumulator vector based on the accumulator instruction and the sorted output vector. The MSA also includes a post-processing stage configured to receive the accumulator vector and generate a post-processing output vector based on a post-processing instruction and the accumulator vector.
    Type: Grant
    Filed: June 1, 2018
    Date of Patent: October 20, 2020
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Arthur John Redfern, Asheesh Bhardwaj, Tarek Aziz Lahlou, William Franklin Leven
  • Patent number: 10620956
    Abstract: An instruction defined to be a looping instruction that repeats a plurality of times to perform an operation on a defined amount of data is obtained. The looping instruction is expanded into a sequence of operations. The sequence of operations is a non-looping sequence of operations to perform the operation on the defined amount of data.
    Type: Grant
    Filed: March 3, 2017
    Date of Patent: April 14, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael K. Gschwind
  • Patent number: 9250899
    Abstract: There is provided a multi-bit storage cell for a register file. The storage cell includes a first set of storage elements for a vector slice. Each storage element respectively corresponds to a particular one of a plurality of thread sets for the vector slice. The storage cell includes a second set of storage elements for a scalar slice. Each storage element in the second set respectively corresponds to a particular one of at least one thread set for the scalar slice. The storage cell includes at least one selection circuit for selecting, for an instruction issued by a thread, a particular one of the storage elements from any of the first set and the second set based upon the instruction being a vector instruction or a scalar instruction and based upon a corresponding set from among the pluralities of thread sets to which the thread belongs.
    Type: Grant
    Filed: June 13, 2007
    Date of Patent: February 2, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Michael Gschwind
  • Patent number: 9038073
    Abstract: Efficient data processing apparatus and methods include hardware components which are pre-programmed by software. Each hardware component triggers the other to complete its tasks. After the final pre-programmed hardware task is complete, the hardware component issues a software interrupt.
    Type: Grant
    Filed: August 13, 2009
    Date of Patent: May 19, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Mathias Kohlenz, Irfan Anwar Khan, Sathyanarayan Madhusudan, Shailesh Maheshwari, Srividhya Krishnamoorthy, Sandeep Urgaonkar, Thomas Klingenbrunn, Tim Tynghuei Liou, Idreas Mir
  • Publication number: 20150019837
    Abstract: A data processor includes: a plurality of controllers that process data; a program memory that stores a standby instruction and a data processing instruction at a plurality of addresses respectively; and a queue that stores different execution start addresses for the plurality of controllers, wherein after the plurality of controllers sequentially access the queue, the plurality of controllers acquire the different execution start addresses from the queue in an order of the sequential access, start execution of instructions from the acquired different execution start addresses in the program memory, and execute the data processing instruction and execute the standby instruction the number of times different for each of the controllers.
    Type: Application
    Filed: September 29, 2014
    Publication date: January 15, 2015
    Inventors: Toshiya Otomo, Koichiro Yamashita, Takahisa Suzuki, Hiromasa Yamauchi, Koji Kurihara, Yuta Teranishi
  • Patent number: 8935509
    Abstract: A Baseboard Management Controller (BMC) controlling method includes the steps of dividing a memory of a BMC into an original region and customized region, in which the original region includes at least one original sensor data record (SDR) and original platform event filter (PEF) corresponding to each other; providing an instruction set to at least one external system, in which the external system manages at least one customized SDR and customized PEF corresponding to each other in the customized region through the instruction set; polling the original SDR in the original region and the customized SDR in the customized region; determining whether values of the SDRs obtained through polling conform to a plurality of critical values individually corresponding to the SDRs; and obtaining a processing policy according to the corresponding PEF when at least one value of the SDR does not conform to the corresponding critical value.
    Type: Grant
    Filed: February 24, 2011
    Date of Patent: January 13, 2015
    Assignee: Inventec Corporation
    Inventors: Chih Wei Chen, Hsiao Fen Lu
  • Publication number: 20140189295
    Abstract: A machine readable storage medium containing program code is described that when processed by a processor causes a method to be performed. The method includes creating a resultant rolled version of an input vector by forming a first intermediate vector, forming a second intermediate vector and forming a resultant rolled version of an input vector. The first intermediate vector is formed by barrel rolling elements of the input vector along a first of two lanes defined by an upper half and a lower half of the input vector. The second intermediate vector is formed by barrel rolling elements of the input vector along a second of the two lanes. The resultant rolled version of the input vector is formed by incorporating upper portions of one of the intermediate vector's upper and lower halves as upper portions of the resultant's upper and lower halves and incorporating lower portions of the other intermediate vector's upper and lower halves as lower portions of the resultant's upper and lower halves.
    Type: Application
    Filed: December 29, 2012
    Publication date: July 3, 2014
    Inventors: Tal ULIEL, Boris BOLSHEM, ELMOUSTAPHA OULD-AHMED-VALL
  • Patent number: 8094768
    Abstract: The present invention discloses a novel multi-channel timing recovery scheme that utilizes a shared CORDIC to accurately compute the phase for each tone. Then a hardware-based linear combiner module is used to reconstruct the best phase estimate from multiple phase measurements. The firmware monitors the noise variance for the pilot tones and determines the corresponding weight for each tone to ensure that the minimum phase jitter noise is achieved through the linear combiner. Then a hardware-based second-order timing recovery control loop generates the frequency reference signal for VCXO or DCXO. A single sequentially controlled multiplier is used for all multiplications in the control loop.
    Type: Grant
    Filed: December 21, 2006
    Date of Patent: January 10, 2012
    Assignee: Triductor Technology (Suzhou) Inc.
    Inventor: Yaolong Tan
  • Patent number: 7870159
    Abstract: A computer program product and associated algorithm for sorting S sequences of binary bits. The S sequences may be integers, floating point numbers, or character strings. The algorithm is executed by a processor of a computer system. Each sequence includes contiguous fields of bits. The algorithm executes program code at nodes of a linked execution structure in a sequential order with respect to the nodes. The algorithm executes a masking of the contiguous fields of the S sequences in accordance with a mask whose content is keyed to the field being masked. The sequential order of execution of the nodes is a function of an ordering of masking results of the masking. Each sequence, or a pointer to each sequence, is outputted to an array in the memory device whenever the masking places the sequence in a leaf node of the nodal linked execution structure.
    Type: Grant
    Filed: January 2, 2008
    Date of Patent: January 11, 2011
    Assignee: International Business Machines Corporation
    Inventor: Dennis J. Carroll
  • Patent number: 7818539
    Abstract: A processor implements conditional vector operations in which, for example, an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed are divided into two groups based on whether or not they satisfy a given condition by e.g., steering each to one of the two index vectors. Once the data have been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication. Other examples of conditional operations include combining one or more input vectors into a single output vector based on a condition vector, conditional vector switching, conditional vector combining, and conditional vector load balancing.
    Type: Grant
    Filed: August 28, 2006
    Date of Patent: October 19, 2010
    Assignees: The Board of Trustees of the Leland Stanford Junior University, The Massachusetts Institute of Technology
    Inventors: Scott Rixner, John D. Owens, Ujval J. Kapasi, William J. Dally
  • Patent number: 7770024
    Abstract: A method, system and computer program product for computing a message authentication code for data in storage of a computing environment. An instruction specifies a unit of storage for which an authentication code is to be computed. An computing operation computes an authentication code for the unit of storage. A register is used for providing a cryptographic key for use in the computing to the authentication code. Further, the register may be used in a chaining operation.
    Type: Grant
    Filed: February 12, 2008
    Date of Patent: August 3, 2010
    Assignee: International Business Machines Corporation
    Inventors: Shawn D. Lundvall, Ronald M. Smith, Sr., Phil Chi-Chung Yeh
  • Patent number: 7735090
    Abstract: A method, apparatus and article of manufacture to dynamically modify, terminate, or replace software components and connections (i.e., contracts) between components in a running assembly. Information about the component and contracts between components in a running assembly is used to determine an allowable sequence of management commands to transition the assembly of components from a current state to a specified goal state. At the same time, other components may continue to perform an operational workflow.
    Type: Grant
    Filed: December 15, 2005
    Date of Patent: June 8, 2010
    Assignee: International Business Machines Corporation
    Inventors: James E. Carey, Scott N. Gerard
  • Patent number: 7694158
    Abstract: A multi-processing system-on-chip including a cluster of processors having respective CPUs is operated by: defining a master CPU within the respective CPUs to coordinate operation of said multi-processing system, running on the CPU a cluster manager agent. The cluster manager agent is adapted to dynamically migrate software processes between the CPUs of said plurality and change power settings therein.
    Type: Grant
    Filed: April 18, 2006
    Date of Patent: April 6, 2010
    Assignee: STMicroelectronics S.R.L.
    Inventors: Diego Melpignano, David Siorpaes, Paolo Zambotti, Antonio Borneo
  • Patent number: 7467138
    Abstract: A method and associated algorithm for in-place sorting S sequences of binary bits stored contiguously in an array within a memory device of a computer system prior to the sorting. Each sequence includes contiguous fields of bits. The algorithm is executed by a processor of a computer system. The in-place sorting executes program code at each node of a linked execution structure. Each node includes a segment of the array. The program code is executed in a hierarchical sequence with respect to the nodes. Executing program code at each node includes: dividing the segment of the node into groups of sequences based on a mask field having a mask width, wherein each group has a unique mask value of the mask field; and in-place rearranging the sequences in the segment, wherein the rearranging results in each group including only those sequences having the unique mask value of the group.
    Type: Grant
    Filed: December 14, 2004
    Date of Patent: December 16, 2008
    Assignee: International Business Machines Corporation
    Inventor: Dennis J. Carroll
  • Patent number: 7460989
    Abstract: A method is provided, wherein a virtual internal master clock is used in connection with a RISC CPU. The RISC CPU comprises a number of concurrently operating function units, wherein each unit runs according to its own clocks, including multiple-stage totally unsynchronized clocks, in order to process a stream of instructions. The method includes the steps of generating a virtual model master clock having a clock cycle, and initializing each of the function units at the beginning of respectively corresponding processing cycles. The method further includes operating each function unit during a respectively corresponding processing cycle to carry out a task with respect to one of the instructions, in order to produce a result. Respective results are all evaluated in synchronization, by means of the master clock. This enables the instruction processing operation to be modeled using a sequential computer language, such as C or C++.
    Type: Grant
    Filed: October 14, 2004
    Date of Patent: December 2, 2008
    Assignee: International Business Machines Corporation
    Inventor: Oliver Keren Ban
  • Patent number: 7444488
    Abstract: A method and a programmable unit for bit field shifting in a memory device in a programmable unit as a result of the execution of an instruction, in which a bit segment is shifted within a first memory unit to a second memory unit, are presented. The bit segment is read with a first bit length from a first bit field in the first memory unit starting at a first start point. The bit segment that has been read is stored in the first bit field in the second memory unit starting at a second start point. The first or the second start points is updated by a predetermined value and the updated start point is stored for subsequent method steps.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: October 28, 2008
    Assignee: Infineon Technologies
    Inventors: Xiaoning Nie, Thomas Wahl
  • Patent number: 7356710
    Abstract: A method, system and computer program product for computing a message authentication code for data in storage of a computing environment. An instruction specifies a unit of storage for which an authentication code is to be computed. An computing operation computes an authentication code for the unit of storage. A register is used for providing a cryptographic key for use in the computing to the authentication code. Further, the register may be used in a chaining operation.
    Type: Grant
    Filed: May 12, 2003
    Date of Patent: April 8, 2008
    Assignee: International Business Machines Corporation
    Inventors: Shawn D. Lundvall, Ronald M. Smith, Sr., Phil Chi-Chung Yeh
  • Patent number: 7237086
    Abstract: A customization program for use in customizing a baseboard management controller used for monitoring operation of various computer system components is disclosed. A user interacts with the customization program to customize the baseboard management controller based on a configuration of components specified for the baseboard of the computer system. The customization program provides a user interface having a repository of icons and a design page. The icons represent various components that may be connected, either directly or indirectly, to the baseboard. The design page is used for constructing a model representing the specified configuration of components. As a user drags icons onto the design page, the model is updated to reflect selection of the components corresponding to these icons. Further, the customization program creates a configuration file that identifies and describes each of the selected components.
    Type: Grant
    Filed: November 26, 2003
    Date of Patent: June 26, 2007
    Assignee: American Megatrends, Inc.
    Inventors: Govind A. Kothandapani, Bakka Ravinder Reddy
  • Patent number: 7231261
    Abstract: In order to automatically calculate an operational sequence of processes that determine an output value from at least one input value, a multitude of processes (P1–P8), whose inputs are provided with at least one of the attributes: input value of the same calculation cycle (PRE), input value of the preceding calculation cycle (POST), input value from any calculation cycle (ANY), are arranged in such a manner that a process, which does not have any input with the attribute input value of the same calculation cycle (PRE), is determined as the first process of a calculation cycle and, in successive analogous steps, determines a quantity of possible sequences.
    Type: Grant
    Filed: January 16, 2003
    Date of Patent: June 12, 2007
    Assignee: Siemens Aktiengesellschaft
    Inventors: Lutz Berentroth, Stefan Hoelzl, Helmut Wellnhofer
  • Patent number: 7100026
    Abstract: A processor implements conditional vector operations in which, for example, an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed are divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data have been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication. Other examples of conditional operations include combining one or more input vectors into a single output vector based on a condition vector, conditional vector switching, conditional vector combining, and conditional vector load balancing.
    Type: Grant
    Filed: May 30, 2001
    Date of Patent: August 29, 2006
    Assignees: The Massachusetts Institute of Technology, The Board of Trustees of the Leland Stanford Junior University
    Inventors: William J. Dally, Scott Rixner, John D. Owens, Ujval J. Kapasi
  • Patent number: 7000093
    Abstract: A cellular automaton cache memory architecture. On a micro-processor that is also capable of executing general-purpose instructions, a cache memory is provided to store instructions and data for use by the processor. The cache memory is further capable of storing data representing a first state of a cellular automaton at a first time step, where the data is organized in cells. A cellular automaton prefetch unit prefetches data associated with a cell to be updated and a neighborhood buffer stores the prefetched data. A cellular automaton update unit provides data from the neighborhood buffer to an update engine. The update engine includes a microprocessor execution unit capable of executing at least some general purpose microprocessor instructions and updates at least some of the selected cells according to an update rule and a state of any associated neighborhood cells to provide a state of the cellular automaton at a second time step.
    Type: Grant
    Filed: December 19, 2001
    Date of Patent: February 14, 2006
    Assignee: Intel Corporation
    Inventor: John W. Mates
  • Patent number: 6963341
    Abstract: The present invention provides efficient ways to implement scan conversion and matrix transpose operations using vector multiplex operations in a SIMD processor. The present method provides a very fast and flexible way to implement different scan conversions, such as zigzag conversion, and matrix transpose for 2×2, 4×4, 8×8 blocks commonly used by all video compression and decompression algorithms.
    Type: Grant
    Filed: May 20, 2003
    Date of Patent: November 8, 2005
    Inventor: Tibet Mimar
  • Patent number: 6954927
    Abstract: A method for optimizing a software pipelineable loop in a software code is provided. The loop comprises one or more pipelined stages and one or more loop operations. The method comprises evaluating an initiation interval time (IN) for a pipelined stage of the loop. A loop operation time latency (Tld) and a number of loop operations (Np) from the pipelined stages to peel based on IN and Tld is then determined. The loop operation is peeled Np times and copied before the loop in the software code. A vector of registers is allocated and the results of the peeled loop operations and a result of an original loop operation is assigned to the vector of registers. Memory addresses for the results of the peeled loop operations and original loop operation are also assigned.
    Type: Grant
    Filed: October 4, 2001
    Date of Patent: October 11, 2005
    Assignee: Elbrus International
    Inventor: Alexander Y. Ostanevich
  • Patent number: 6934938
    Abstract: A method for producing a formatted description of a computation representable by a data-flow graph and computer for performing a computation so described. A source instruction is generated for each input of the data-flow graph, a computational instruction is generated for each node of the data-flow graph, and a sink instruction is generated for each output of the data-flow graph. The computational instruction for a node includes a descriptor of an operation performed at the node and a descriptor of each instruction that produces an input to the node. The formatted description is a sequential instruction list comprising source instructions, computational instructions and sink instructions. Each instruction has an instruction identifier and the descriptor of each instruction that produces an input to the node is the instruction identifier. The computer is directed by a program of instructions to implement a computation representable by a data-flow graph.
    Type: Grant
    Filed: June 28, 2002
    Date of Patent: August 23, 2005
    Assignee: Motorola, Inc.
    Inventors: Philip E. May, Kent Donald Moat, Raymond B. Essick, IV, Silviu Chiricescu, Brian Geoffrey Lucas, James M. Norris, Michael Allen Schuette, Ali Saidi
  • Patent number: 6721813
    Abstract: A computer system is presented which implements a system and method for tracking the progress of posted write transactions. In one embodiment, the computer system includes a processing subsystem and an input/output (I/O) subsystem. The processing subsystem includes multiple processing nodes interconnected via coherent communication links. Each processing node may include a processor preferably executing software instructions. The I/O subsystem includes one or more I/O nodes. Each I/O node may embody one or more I/O functions (e.g., modem, sound card, etc.). The multiple processing nodes may include a first processing node and a second processing node, wherein the first processing node includes a host bridge, and wherein a memory is coupled to the second processing node. An I/O node may generate a non-coherent write transaction to store data within the second processing node's memory, wherein the non-coherent write transaction is a posted write transaction.
    Type: Grant
    Filed: January 30, 2001
    Date of Patent: April 13, 2004
    Assignee: Advanced Micro Devices, Inc.
    Inventors: Jonathan M. Owen, Mark D. Hummel, James B. Keller
  • Patent number: 6557048
    Abstract: A computer system is presented which implements a system and method for ordering input/output (I/O) memory operations. In one embodiment, the computer system includes a processing subsystem and an I/O subsystem. The processing subsystem includes multiple processing nodes interconnected via coherent communication links. Each processing node may include a processor executing software instructions. The I/O subsystem includes one or more I/O nodes serially coupled via non-coherent communication links. Each I/O node may embody one or more I/O functions (e.g., modem, sound card, etc.). One of the processing nodes includes a host bridge which translates packets moving between the processing subsystem and the I/O subsystem. One of the I/O nodes is coupled to the processing node including the host bridges. The I/O node coupled to the processing node produces and/or provides transactions having destinations or targets within the processing subsystem to the processing node including the host bridge.
    Type: Grant
    Filed: November 1, 1999
    Date of Patent: April 29, 2003
    Assignee: Advanced Micro Devices, Inc.
    Inventors: James B. Keller, Derrick R. Meyer, Dale E. Gulick, Larry D. Hewitt
  • Patent number: 6446193
    Abstract: A method and apparatus for reducing instruction cycles in a digital signal processor wherein the processor includes a multiplier unit, an adder, a memory, and at least one pair of first and second accumulators. The accumulators include respective guard, high and low parts. The method and apparatus enable vectoring the respective first and second high parts from the accumulators to define a single vectored register responsive to a single instruction cycle and processing the data in the vectored register.
    Type: Grant
    Filed: September 8, 1997
    Date of Patent: September 3, 2002
    Assignee: Agere Systems Guardian Corp.
    Inventors: Mazhar M. Alidina, Sivanand Simanapalli, Larry R. Tate
  • Publication number: 20020112142
    Abstract: A technique for handling a conditional move instruction in an out-of-order data processor. The technique involves detecting a conditional move instruction within an instruction stream, and generating multiple instructions according to the detected conditional move instruction. The technique further involves replacing the conditional move instruction within the instruction stream with the generated multiple instructions. The generated multiple instructions are generated such that each of the generated multiple instructions executes using no more than two input ports of an execution unit. The generated multiple instructions include a first generated instruction that produces a condition result indicating whether a condition exists, and a second generated instruction that inputs the condition result as a portion of an operand which identifies a register of the out-of-order data processor.
    Type: Application
    Filed: November 18, 1998
    Publication date: August 15, 2002
    Inventors: JOEL SPRINGER EMER, BRUCE EDWARDS, DANIEL LAWRENCE LEIBHOLZ, EDWARD J. MCLELLAN, DERRICK R. MEYER
  • Patent number: 6401194
    Abstract: A vector processor provides a data path divided into smaller slices of data, with each slice processed in parallel with the other slices. Furthermore, an execution unit provides smaller arithmetic and functional units chained together to execute more complex microprocessor instructions requiring multiple cycles by sharing single-cycle operations, thereby reducing both costs and size of the microprocessor. One embodiment handles 288-bit data widths using 36-bit data path slices. Another embodiment executes integer multiply and multiply-and-accumulate and floating point add/subtract and multiply operations using single-cycle arithmetic logic units. Other embodiments support 8-bit, 9-bit, 16-bit, and 32-bit integer data types and 32-bit floating data types.
    Type: Grant
    Filed: January 28, 1997
    Date of Patent: June 4, 2002
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Le Trong Nguyen, Heonchul Park, Roney S. Wong, Ted Nguyen, Edward H. Yu
  • Patent number: 6324600
    Abstract: A method and an apparatus for controlling movement of data between any host and any network including a set of devices in a computing system environment having a main memory with a queuing mechanism having a plurality of queues capable of being shared between a plurality of independent processes running on at least one host and at least one I/O adapter. A finite-state machine (FSM) is provided in the main memory and the FSM is divided into two disjoint sets of states, one of which represents state-values processed by the host and set by the adapter, and said other set represents state-values processed by the adapter and set by said host. Using each of these set of states free-running, non-deadlocking processes are provided within the host and the adapter so that the processes sequence circularly and continuously through a vector related to the FSMs.
    Type: Grant
    Filed: February 19, 1999
    Date of Patent: November 27, 2001
    Assignee: International Business Machines Corporation
    Inventors: Frank W. Brice, Richard P. Tarcza, Leslie W. Wyman
  • Patent number: 6295597
    Abstract: An apparatus and a method for extended-precision vector arithmetic capable of extremely long precision (i.e., precision to as many bits as a user desires or is limited to due to memory, disk-storage, or other resource constraints). Vector carry-out bits can be used as vector carry-in bits for successive operations. In performing add or subtract operations on integers that are longer than the word size of the computer, the operands a broken into word-sized parts which are used as operands. A vector of long-integer numbers is thus broken into a series of sub-vectors, each having word-sized elements. Vector add or subtract operations are performed successively on the sub-vectors, starting with the lowest-order sub-vectors. Carry-out (or borrow-out) bits from a first vector operation are used as carry-in (or borrow-in) bits for a successive vector operation.
    Type: Grant
    Filed: August 11, 1998
    Date of Patent: September 25, 2001
    Assignee: Cray, Inc.
    Inventors: David Resnick, William T. Moore
  • Patent number: 6269435
    Abstract: A processor implements conditional vector operations in which an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed is divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data has been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication.
    Type: Grant
    Filed: September 14, 1998
    Date of Patent: July 31, 2001
    Assignees: The Board of Trustees of the Leland Stanford Junior University, The Massachusetts Institute of Technology
    Inventors: William J. Dally, Scott Whitney Rixner, John Owens, Ujval J. Kapasi
  • Patent number: 6202141
    Abstract: A vector multiplication mechanism is provided that partitions vector multiplication operation into even and odd paths. In an odd path, odd data elements of first and second source vectors are selected, and multiplication operation is performed between each of the selected odd data elements of the first source vector and corresponding one of the selected odd data elements of the second source vector. In an even path, even data elements of the source vectors are selected, and multiplication operation is performed between each of the selected even data elements of the first source vector and corresponding one of the selected even data elements of the second source vector. Elements of resultant data of the two paths are merged together in a merge operation. The vector multiplication mechanism of the present invention preferably uses a single general-purpose register to store the resultant data of the odd path and the even path.
    Type: Grant
    Filed: June 16, 1998
    Date of Patent: March 13, 2001
    Assignee: International Business Machines Corporation
    Inventors: Keith Everett Diefendorff, Pradeep Kumar Dubey, Ronald Ray Hochsprung, Brett Olsson, Hunter Ledbetter Scales, III
  • Patent number: 6073158
    Abstract: A system and method for time slicing multiple received data streams utilizing multiple processors in such a manner as to ensure that all processors are running at full capability and are efficiently timesharing a global memory storage area. The received data streams are each divided into fixed portions called spans. The invention is operable for sequencing the movement of the time-sliced spans between the processors, adjusting the scheduling of particular ones of the time-sliced spans as a function of either processor availability or maintenance of real-time transmission of the received real-time time-sliced data streams.
    Type: Grant
    Filed: July 29, 1993
    Date of Patent: June 6, 2000
    Assignee: Cirrus Logic, Inc.
    Inventors: Robert Marshall Nally, John Charles Schafer
  • Patent number: 6061777
    Abstract: One aspect of the invention relates to a method for operating a processor. In one version of the invention, the method includes the steps of dispatching an instruction; determining a presently architected RMAP entry for the architectural register targeted by the dispatched instruction; selecting the RMAP entries which are associated with physical registers that contain operands for the dispatched instruction; updating a use indicator in the selected RMAP entries; determining whether the dispatched instruction is interruptible; and updating an architectural indicator and a historical indicator in the presently architected RMAP entry if the dispatched instruction is uninterruptible.
    Type: Grant
    Filed: October 28, 1997
    Date of Patent: May 9, 2000
    Assignee: International Business Machines Corporation
    Inventors: Hoichi Cheong, Paul Joseph Jordan, Hung Qui Le, Soummya Mallick
  • Patent number: 6023752
    Abstract: A program driver means is disclosed that allows for the exchange of inforion between a NTDS device and a device having a bus topology, especially a VMEbus. The program driver utilizes chain commands which are fully programmable at the user level. The processor itself is programmed at the register level to assure the fastest data rate possible (32 bit access) across the VMEbus. The processor driver is invisible to the user.
    Type: Grant
    Filed: November 25, 1997
    Date of Patent: February 8, 2000
    Assignee: The United States of America as represented by the Secretary of the Navy
    Inventor: William M. Huttle