Sequential Patents (Class 712/8)

Vector cumulative sum instruction and circuit for implementing filtering operations

Patent number: 11829756

Abstract: A vector cumulative sum circuit can include a set of input registers, a carry-forward data source, a set of output registers, and a network of adder circuits coupling the input registers to the output registers such that the output value in a given output register is the sum of a value provided by the carry-forward data source and the input values from all of the input registers (in logical order) up to (and including) the corresponding input register. The value in the last output register can be carried forward to enable cumulative summing of a larger number of input values. The vector cumulative sum circuit can be implemented in a programmable processor, and a vector cumulative sum instruction can be defined in the instruction set. Using the vector cumulative sum circuit and instruction, filtering operations can be accelerated.
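
The behaviour described in this abstract can be modelled in software. What follows is a minimal Python sketch, not the patented circuit: the names `vector_cumsum` and `chunked_cumsum` and the vector width of 4 are assumptions made for illustration. Each output is the carry-forward value plus the sum of all inputs up to and including the matching position, and the last output is carried into the next chunk.

```python
from typing import List, Tuple


def vector_cumsum(inputs: List[int], carry_in: int = 0) -> Tuple[List[int], int]:
    """Return (outputs, carry_out), where outputs[i] = carry_in + sum(inputs[:i + 1])."""
    outputs = []
    running = carry_in
    for value in inputs:
        running += value
        outputs.append(running)
    # The last output doubles as the carry-forward value for the next chunk.
    carry_out = outputs[-1] if outputs else carry_in
    return outputs, carry_out


def chunked_cumsum(data: List[int], width: int = 4) -> List[int]:
    """Cumulative sum over a long sequence, one fixed-width vector at a time."""
    result: List[int] = []
    carry = 0
    for start in range(0, len(data), width):
        out, carry = vector_cumsum(data[start:start + width], carry)
        result.extend(out)
    return result
```

Calling `chunked_cumsum` on a sequence longer than `width` shows how the carried value lets a fixed-width unit produce a running sum over an arbitrarily long input.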

Type: Grant

Filed: September 24, 2021

Date of Patent: November 28, 2023

Assignee: Apple Inc.

Inventors: On Wa Yeung, Seydou N. Ba
Boot assist zero overhead flash extended file system

Patent number: 11249767

Abstract: An information handling system may load first data from a location information area of a first memory, specifying a plurality of locations of metadata for a plurality of stages of basic input/output system (BIOS) initialization. The information handling system may then load first metadata for a first stage of BIOS initialization from a first metadata location of the plurality of locations specified by the first data. The first metadata may contain information for indexing first initialization data located at a first initialization data location. The information handling system may then index the first initialization data of the first initialization data location based, at least in part, on the first metadata. The information handling system may then perform the first stage of BIOS initialization based, at least in part, on the first initialization data.

Type: Grant

Filed: February 5, 2019

Date of Patent: February 15, 2022

Assignee: Dell Products L.P.

Inventors: Shekar Babu Suryanarayana, Sumanth Vidyadhara, Anand Prakash Joshi
Carry chain for SIMD operations

Patent number: 10838719

Abstract: Examples of a carry chain for performing an operation on operands each including elements of a selectable size are provided. Advantageously, the carry chain adapts to elements of different sizes. The carry chain determines a mask based on a selected size of an element. The carry chain selects, based on the mask, whether to carry a partial result of an operation performed on corresponding first portions of a first operand and a second operand into a next operation. The next operation is performed on corresponding second portions of the first operand and the second operand, and, based on the selection, the partial result of the operation. The carry chain stores, in a memory, a result formed from outputs of the operation and the next operation.
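
One way to picture the mask-controlled carry is a lane-wise addition over fixed-size portions. The sketch below is a software model under stated assumptions, not the patented hardware: it assumes addition as the operation, 8-bit portions, and element sizes that are multiples of the portion size. The function names `carry_mask` and `simd_add` are hypothetical.

```python
def carry_mask(element_bits, total_bits=64, portion_bits=8):
    """mask[i] is True when the carry out of portion i may enter portion i + 1."""
    portions = total_bits // portion_bits
    per_element = element_bits // portion_bits
    # A carry is blocked wherever portion i is the last portion of an element.
    return [(i + 1) % per_element != 0 for i in range(portions - 1)]


def simd_add(a, b, element_bits, total_bits=64, portion_bits=8):
    """Add two packed operands lane by lane, gating carries with the mask."""
    mask = carry_mask(element_bits, total_bits, portion_bits)
    portion_max = (1 << portion_bits) - 1
    result = 0
    carry = 0
    for i in range(total_bits // portion_bits):
        shift = i * portion_bits
        pa = (a >> shift) & portion_max
        pb = (b >> shift) & portion_max
        partial = pa + pb + carry
        result |= (partial & portion_max) << shift
        carry_out = partial >> portion_bits
        # Forward the carry only if the mask allows it across this boundary.
        carry = carry_out if i < len(mask) and mask[i] else 0
    return result
```

With the same operands, `simd_add(0xFFFFFFFF, 1, 32)` wraps within the low 32-bit element, while `simd_add(0xFFFFFFFF, 1, 64)` lets the carry ripple into the upper half.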

Type: Grant

Filed: November 13, 2015

Date of Patent: November 17, 2020

Assignee: Marvell Asia Pte, Ltd

Inventor: David Kravitz
Sort and merge instruction for a general-purpose processor

Patent number: 10831478

Abstract: A Sort Lists instruction is provided to perform a sort and/or a merge operation. The instruction is an architected machine instruction of an instruction set architecture and is executed by a general-purpose processor of the computing environment. The executing includes sorting a plurality of input lists to obtain one or more sorted output lists, which are output.

Type: Grant

Filed: November 6, 2018

Date of Patent: November 10, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce C. Giamei, Martin Recktenwald, Donald W. Schmidt, Timothy Slegel, Aditya N. Puranik, Mark S. Farrell, Christian Jacobi, Jonathan D. Bradbury, Christian Zoellin
Saving and restoring machine state between multiple executions of an instruction

Patent number: 10831503

Abstract: Saving and restoring machine state between multiple executions of an instruction. A determination is made that processing of an operation of an instruction executing on a processor has been interrupted prior to completion. Based on determining that the processing of the operation has been interrupted, current metadata of the processor is extracted. The metadata is stored in a location associated with the instruction and used to re-execute the instruction to resume forward processing of the instruction from where it was interrupted.

Type: Grant

Filed: November 6, 2018

Date of Patent: November 10, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Bruce C. Giamei, Martin Recktenwald, Donald W. Schmidt, Timothy Slegel, Aditya N. Puranik, Mark S. Farrell, Christian Jacobi, Jonathan D. Bradbury, Christian Zoellin
Merge sort accelerator

Patent number: 10809978

Abstract: A merge sort accelerator (MSA) includes a pre-processing stage configured to receive an input vector and generate a pre-processing output vector based on a pre-processing instruction and the input vector. The MSA also includes a merge sort network having multiple sorting stages configured to be selectively enabled. The merge sort network is configured to receive the pre-processing output vector and generate a sorted output vector based on a sorting instruction and the pre-processing output vector. The MSA includes an accumulator stage configured to receive the sorted output vector and update an accumulator vector based on the accumulator instruction and the sorted output vector. The MSA also includes a post-processing stage configured to receive the accumulator vector and generate a post-processing output vector based on a post-processing instruction and the accumulator vector.

Type: Grant

Filed: June 1, 2018

Date of Patent: October 20, 2020

Assignee: TEXAS INSTRUMENTS INCORPORATED

Inventors: Arthur John Redfern, Asheesh Bhardwaj, Tarek Aziz Lahlou, William Franklin Leven
Search string processing via inline decode-based micro-operations expansion

Patent number: 10620956

Abstract: An instruction defined to be a looping instruction that repeats a plurality of times to perform an operation on a defined amount of data is obtained. The looping instruction is expanded into a sequence of operations. The sequence of operations is a non-looping sequence of operations to perform the operation on the defined amount of data.
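
The expansion step can be illustrated with a toy decoder. This is a hedged sketch only: the micro-op tuple layout `(op, offset, size)`, the 8-byte chunk size, and the name `expand_looping_instruction` are assumptions, not details taken from the patent.

```python
def expand_looping_instruction(op, length, chunk=8):
    """Expand a looping instruction into a flat, non-looping list of micro-ops.

    Each micro-op covers one fixed-size slice of the defined amount of data,
    so the sequence needs no branch back to the start.
    """
    micro_ops = []
    offset = 0
    while offset < length:
        size = min(chunk, length - offset)
        micro_ops.append((op, offset, size))
        offset += size
    return micro_ops
```

For a 20-byte operand and 8-byte chunks, the expansion yields three micro-ops, the last covering the 4-byte remainder.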

Type: Grant

Filed: March 3, 2017

Date of Patent: April 14, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael K. Gschwind
Method and apparatus for spatial register partitioning with a multi-bit cell register file

Patent number: 9250899

Abstract: There is provided a multi-bit storage cell for a register file. The storage cell includes a first set of storage elements for a vector slice. Each storage element respectively corresponds to a particular one of a plurality of thread sets for the vector slice. The storage cell includes a second set of storage elements for a scalar slice. Each storage element in the second set respectively corresponds to a particular one of at least one thread set for the scalar slice. The storage cell includes at least one selection circuit for selecting, for an instruction issued by a thread, a particular one of the storage elements from any of the first set and the second set based upon the instruction being a vector instruction or a scalar instruction and based upon a corresponding set from among the pluralities of thread sets to which the thread belongs.

Type: Grant

Filed: June 13, 2007

Date of Patent: February 2, 2016

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Michael Gschwind
Data mover moving data to accelerator for processing and returning result data based on instruction received from a processor utilizing software and hardware interrupts

Patent number: 9038073

Abstract: Efficient data processing apparatus and methods include hardware components which are pre-programmed by software. Each hardware component triggers the other to complete its tasks. After the final pre-programmed hardware task is complete, the hardware component issues a software interrupt.

Type: Grant

Filed: August 13, 2009

Date of Patent: May 19, 2015

Assignee: QUALCOMM Incorporated

Inventors: Mathias Kohlenz, Irfan Anwar Khan, Sathyanarayan Madhusudan, Shailesh Maheshwari, Srividhya Krishnamoorthy, Sandeep Urgaonkar, Thomas Klingenbrunn, Tim Tynghuei Liou, Idreas Mir
Data processor

Publication number: 20150019837

Abstract: A data processor includes: a plurality of controllers that process data; a program memory that stores a standby instruction and a data processing instruction at a plurality of addresses respectively; and a queue that stores different execution start addresses for the plurality of controllers, wherein after the plurality of controllers sequentially access the queue, the plurality of controllers acquire the different execution start addresses from the queue in an order of the sequential access, start execution of instructions from the acquired different execution start addresses in the program memory, and execute the data processing instruction and execute the standby instruction the number of times different for each of the controllers.

Type: Application

Filed: September 29, 2014

Publication date: January 15, 2015

Inventors: Toshiya Otomo, Koichiro Yamashita, Takahisa Suzuki, Hiromasa Yamauchi, Koji Kurihara, Yuta Teranishi
Method for controlling BMC having customized SDR

Patent number: 8935509

Abstract: A Baseboard Management Controller (BMC) controlling method includes the steps of dividing a memory of a BMC into an original region and customized region, in which the original region includes at least one original sensor data record (SDR) and original platform event filter (PEF) corresponding to each other; providing an instruction set to at least one external system, in which the external system manages at least one customized SDR and customized PEF corresponding to each other in the customized region through the instruction set; polling the original SDR in the original region and the customized SDR in the customized region; determining whether values of the SDRs obtained through polling conform to a plurality of critical values individually corresponding to the SDRs; and obtaining a processing policy according to the corresponding PEF when at least one value of the SDR does not conform to the corresponding critical value.

Type: Grant

Filed: February 24, 2011

Date of Patent: January 13, 2015

Assignee: Inventec Corporation

Inventors: Chih Wei Chen, Hsiao Fen Lu
Apparatus and Method of Efficient Vector Roll Operation

Publication number: 20140189295

Abstract: A machine readable storage medium containing program code is described that when processed by a processor causes a method to be performed. The method includes creating a resultant rolled version of an input vector by forming a first intermediate vector, forming a second intermediate vector and forming a resultant rolled version of an input vector. The first intermediate vector is formed by barrel rolling elements of the input vector along a first of two lanes defined by an upper half and a lower half of the input vector. The second intermediate vector is formed by barrel rolling elements of the input vector along a second of the two lanes. The resultant rolled version of the input vector is formed by incorporating upper portions of one of the intermediate vector's upper and lower halves as upper portions of the resultant's upper and lower halves and incorporating lower portions of the other intermediate vector's upper and lower halves as lower portions of the resultant's upper and lower halves.

Type: Application

Filed: December 29, 2012

Publication date: July 3, 2014

Inventors: Tal Uliel, Boris Bolshem, Elmoustapha Ould-Ahmed-Vall
Multi-channel timing recovery system

Patent number: 8094768

Abstract: The present invention discloses a novel multi-channel timing recovery scheme that utilizes a shared CORDIC to accurately compute the phase for each tone. Then a hardware-based linear combiner module is used to reconstruct the best phase estimate from multiple phase measurements. The firmware monitors the noise variance for the pilot tones and determines the corresponding weight for each tone to ensure that the minimum phase jitter noise is achieved through the linear combiner. Then a hardware-based second-order timing recovery control loop generates the frequency reference signal for VCXO or DCXO. A single sequentially controlled multiplier is used for all multiplications in the control loop.

Type: Grant

Filed: December 21, 2006

Date of Patent: January 10, 2012

Assignee: Triductor Technology (Suzhou) Inc.

Inventor: Yaolong Tan
Algorithm for sorting bit sequences in linear complexity

Patent number: 7870159

Abstract: A computer program product and associated algorithm for sorting S sequences of binary bits. The S sequences may be integers, floating point numbers, or character strings. The algorithm is executed by a processor of a computer system. Each sequence includes contiguous fields of bits. The algorithm executes program code at nodes of a linked execution structure in a sequential order with respect to the nodes. The algorithm executes a masking of the contiguous fields of the S sequences in accordance with a mask whose content is keyed to the field being masked. The sequential order of execution of the nodes is a function of an ordering of masking results of the masking. Each sequence, or a pointer to each sequence, is outputted to an array in the memory device whenever the masking places the sequence in a leaf node of the nodal linked execution structure.

Type: Grant

Filed: January 2, 2008

Date of Patent: January 11, 2011

Assignee: International Business Machines Corporation

Inventor: Dennis J. Carroll
System and method for performing efficient conditional vector operations for data parallel architectures involving both input and conditional vector values

Patent number: 7818539

Abstract: A processor implements conditional vector operations in which, for example, an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed are divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of the two index vectors. Once the data have been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication. Other examples of conditional operations include combining one or more input vectors into a single output vector based on a condition vector, conditional vector switching, conditional vector combining, and conditional vector load balancing.
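
The splitting idea can be sketched in a few lines of Python. This is an illustrative model, not the processor's implementation: `split_indices` and `conditional_apply` are hypothetical names, and the per-group loops stand in for branch-free vector passes.

```python
def split_indices(condition_vector):
    """Steer each position into one of two index vectors based on its condition."""
    true_idx = [i for i, c in enumerate(condition_vector) if c]
    false_idx = [i for i, c in enumerate(condition_vector) if not c]
    return true_idx, false_idx


def conditional_apply(input_vector, condition_vector, on_true, on_false):
    """Process each group in its own pass, with no per-element branch inside a pass."""
    if len(input_vector) != len(condition_vector):
        raise ValueError("input and condition vectors must have equal length")
    true_idx, false_idx = split_indices(condition_vector)
    output = list(input_vector)
    for i in true_idx:       # first pass: elements that satisfy the condition
        output[i] = on_true(input_vector[i])
    for i in false_idx:      # second pass: elements that do not
        output[i] = on_false(input_vector[i])
    return output
```

Once the index vectors exist, each pass applies a single operation to every element it touches, which is the property the abstract relies on to avoid branch latency.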

Type: Grant

Filed: August 28, 2006

Date of Patent: October 19, 2010

Assignees: The Board of Trustees of the Leland Stanford Junior University, The Massachusetts Institute of Technology

Inventors: Scott Rixner, John D. Owens, Ujval J. Kapasi, William J. Dally
Security message authentication instruction

Patent number: 7770024

Abstract: A method, system and computer program product for computing a message authentication code for data in storage of a computing environment. An instruction specifies a unit of storage for which an authentication code is to be computed. A computing operation computes an authentication code for the unit of storage. A register is used for providing a cryptographic key for use in computing the authentication code. Further, the register may be used in a chaining operation.

Type: Grant

Filed: February 12, 2008

Date of Patent: August 3, 2010

Assignee: International Business Machines Corporation

Inventors: Shawn D. Lundvall, Ronald M. Smith, Sr., Phil Chi-Chung Yeh
On demand software contract modification and termination in running component assemblies

Patent number: 7735090

Abstract: A method, apparatus and article of manufacture to dynamically modify, terminate, or replace software components and connections (i.e., contracts) between components in a running assembly. Information about the component and contracts between components in a running assembly is used to determine an allowable sequence of management commands to transition the assembly of components from a current state to a specified goal state. At the same time, other components may continue to perform an operational workflow.

Type: Grant

Filed: December 15, 2005

Date of Patent: June 8, 2010

Assignee: International Business Machines Corporation

Inventors: James E. Carey, Scott N. Gerard
Parallel processing method and system, for instance for supporting embedded cluster platforms, computer program product therefor

Patent number: 7694158

Abstract: A multi-processing system-on-chip including a cluster of processors having respective CPUs is operated by defining a master CPU within the respective CPUs to coordinate operation of said multi-processing system and running a cluster manager agent on the master CPU. The cluster manager agent is adapted to dynamically migrate software processes between the CPUs of said plurality and change power settings therein.

Type: Grant

Filed: April 18, 2006

Date of Patent: April 6, 2010

Assignee: STMicroelectronics S.R.L.

Inventors: Diego Melpignano, David Siorpaes, Paolo Zambotti, Antonio Borneo
Algorithm for sorting bit sequences in linear complexity

Patent number: 7467138

Abstract: A method and associated algorithm for in-place sorting S sequences of binary bits stored contiguously in an array within a memory device of a computer system prior to the sorting. Each sequence includes contiguous fields of bits. The algorithm is executed by a processor of a computer system. The in-place sorting executes program code at each node of a linked execution structure. Each node includes a segment of the array. The program code is executed in a hierarchical sequence with respect to the nodes. Executing program code at each node includes: dividing the segment of the node into groups of sequences based on a mask field having a mask width, wherein each group has a unique mask value of the mask field; and in-place rearranging the sequences in the segment, wherein the rearranging results in each group including only those sequences having the unique mask value of the group.
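
The node-by-node grouping described here resembles an in-place most-significant-field radix sort. The sketch below uses that well-known technique (often called American flag sort) as a stand-in; it is not claimed to be the patented algorithm. It assumes non-negative integer keys below `2**key_bits`, a `key_bits` value that is a multiple of `mask_width`, and the hypothetical name `inplace_mask_sort`.

```python
def inplace_mask_sort(arr, key_bits=32, mask_width=4, lo=0, hi=None, shift=None):
    """Sort arr[lo:hi] in place, one mask field at a time, most significant first."""
    if hi is None:
        hi = len(arr)
    if shift is None:
        shift = key_bits - mask_width
    if hi - lo < 2 or shift < 0:
        return
    mask = (1 << mask_width) - 1
    # Count how many sequences fall into each group (one group per mask value).
    counts = [0] * (1 << mask_width)
    for i in range(lo, hi):
        counts[(arr[i] >> shift) & mask] += 1
    starts, pos = [], lo
    for c in counts:
        starts.append(pos)
        pos += c
    ends = [s + c for s, c in zip(starts, counts)]
    next_free = starts[:]
    # Rearrange in place: swap each misplaced sequence into its own group.
    for g in range(len(counts)):
        while next_free[g] < ends[g]:
            value = arr[next_free[g]]
            dest = (value >> shift) & mask
            if dest == g:
                next_free[g] += 1
            else:
                j = next_free[dest]
                arr[next_free[g]], arr[j] = arr[j], value
                next_free[dest] += 1
    # Each group is a child node: recurse on the next, less significant field.
    for g in range(len(counts)):
        inplace_mask_sort(arr, key_bits, mask_width, starts[g], ends[g], shift - mask_width)
```

The work per level is linear in the segment length, and the number of levels is fixed by `key_bits // mask_width`, which is where the linear-complexity claim in the title comes from.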

Type: Grant

Filed: December 14, 2004

Date of Patent: December 16, 2008

Assignee: International Business Machines Corporation

Inventor: Dennis J. Carroll
Method and apparatus for modeling multiple concurrently dispatched instruction streams in super scalar CPU with a sequential language

Patent number: 7460989

Abstract: A method is provided, wherein a virtual internal master clock is used in connection with a RISC CPU. The RISC CPU comprises a number of concurrently operating function units, wherein each unit runs according to its own clocks, including multiple-stage totally unsynchronized clocks, in order to process a stream of instructions. The method includes the steps of generating a virtual model master clock having a clock cycle, and initializing each of the function units at the beginning of respectively corresponding processing cycles. The method further includes operating each function unit during a respectively corresponding processing cycle to carry out a task with respect to one of the instructions, in order to produce a result. Respective results are all evaluated in synchronization, by means of the master clock. This enables the instruction processing operation to be modeled using a sequential computer language, such as C or C++.

Type: Grant

Filed: October 14, 2004

Date of Patent: December 2, 2008

Assignee: International Business Machines Corporation

Inventor: Oliver Keren Ban
Method and programmable unit for bit field shifting

Patent number: 7444488

Abstract: A method and a programmable unit for bit field shifting in a memory device in a programmable unit as a result of the execution of an instruction, in which a bit segment is shifted within a first memory unit to a second memory unit, are presented. The bit segment is read with a first bit length from a first bit field in the first memory unit starting at a first start point. The bit segment that has been read is stored in the first bit field in the second memory unit starting at a second start point. The first or the second start point is updated by a predetermined value and the updated start point is stored for subsequent method steps.

Type: Grant

Filed: September 30, 2005

Date of Patent: October 28, 2008

Assignee: Infineon Technologies

Inventors: Xiaoning Nie, Thomas Wahl
Security message authentication control instruction

Patent number: 7356710

Abstract: A method, system and computer program product for computing a message authentication code for data in storage of a computing environment. An instruction specifies a unit of storage for which an authentication code is to be computed. A computing operation computes an authentication code for the unit of storage. A register is used for providing a cryptographic key for use in computing the authentication code. Further, the register may be used in a chaining operation.

Type: Grant

Filed: May 12, 2003

Date of Patent: April 8, 2008

Assignee: International Business Machines Corporation

Inventors: Shawn D. Lundvall, Ronald M. Smith, Sr., Phil Chi-Chung Yeh
Configuring a management module through a graphical user interface for use in a computer system

Patent number: 7237086

Abstract: A customization program for use in customizing a baseboard management controller used for monitoring operation of various computer system components is disclosed. A user interacts with the customization program to customize the baseboard management controller based on a configuration of components specified for the baseboard of the computer system. The customization program provides a user interface having a repository of icons and a design page. The icons represent various components that may be connected, either directly or indirectly, to the baseboard. The design page is used for constructing a model representing the specified configuration of components. As a user drags icons onto the design page, the model is updated to reflect selection of the components corresponding to these icons. Further, the customization program creates a configuration file that identifies and describes each of the selected components.

Type: Grant

Filed: November 26, 2003

Date of Patent: June 26, 2007

Assignee: American Megatrends, Inc.

Inventors: Govind A. Kothandapani, Bakka Ravinder Reddy
Method for automatically obtaining an operational sequence of processes and a tool for performing such method

Patent number: 7231261

Abstract: In order to automatically calculate an operational sequence of processes that determine an output value from at least one input value, a multitude of processes (P1–P8), whose inputs are provided with at least one of the attributes: input value of the same calculation cycle (PRE), input value of the preceding calculation cycle (POST), input value from any calculation cycle (ANY), are arranged in such a manner that a process, which does not have any input with the attribute input value of the same calculation cycle (PRE), is determined as the first process of a calculation cycle and, in successive analogous steps, determines a quantity of possible sequences.
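
The ordering rule can be sketched as a dependency sort that considers only same-cycle (PRE) inputs. This is an illustrative reading of the abstract, not the patented tool: it returns one valid sequence rather than the full set of possible sequences, it assumes every producer named in an input is itself a process in the table, and the name `order_processes` is hypothetical.

```python
def order_processes(inputs):
    """Return one valid execution order for a calculation cycle.

    inputs maps each process name to a list of (producer, attribute) pairs,
    where attribute is "PRE", "POST", or "ANY". Only PRE inputs constrain
    the order within a cycle.
    """
    pre_deps = {
        proc: {src for src, attr in ins if attr == "PRE"}
        for proc, ins in inputs.items()
    }
    order, done = [], set()
    while len(order) < len(pre_deps):
        # A process is ready once all of its same-cycle inputs are computed.
        ready = sorted(p for p, deps in pre_deps.items()
                       if p not in done and deps <= done)
        if not ready:
            raise ValueError("cycle among same-cycle (PRE) inputs")
        order.extend(ready)
        done.update(ready)
    return order
```

A process with no PRE inputs is scheduled in the first round, matching the abstract's rule for choosing the first process of a calculation cycle.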

Type: Grant

Filed: January 16, 2003

Date of Patent: June 12, 2007

Assignee: Siemens Aktiengesellschaft

Inventors: Lutz Berentroth, Stefan Hoelzl, Helmut Wellnhofer
System and method for performing efficient conditional vector operations for data parallel architectures involving both input and conditional vector values

Patent number: 7100026

Abstract: A processor implements conditional vector operations in which, for example, an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed are divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data have been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication. Other examples of conditional operations include combining one or more input vectors into a single output vector based on a condition vector, conditional vector switching, conditional vector combining, and conditional vector load balancing.

Type: Grant

Filed: May 30, 2001

Date of Patent: August 29, 2006

Assignees: The Massachusetts Institute of Technology, The Board of Trustees of the Leland Stanford Junior University

Inventors: William J. Dally, Scott Rixner, John D. Owens, Ujval J. Kapasi
Cellular automaton processing microprocessor prefetching data in neighborhood buffer

Patent number: 7000093

Abstract: A cellular automaton cache memory architecture. On a micro-processor that is also capable of executing general-purpose instructions, a cache memory is provided to store instructions and data for use by the processor. The cache memory is further capable of storing data representing a first state of a cellular automaton at a first time step, where the data is organized in cells. A cellular automaton prefetch unit prefetches data associated with a cell to be updated and a neighborhood buffer stores the prefetched data. A cellular automaton update unit provides data from the neighborhood buffer to an update engine. The update engine includes a microprocessor execution unit capable of executing at least some general purpose microprocessor instructions and updates at least some of the selected cells according to an update rule and a state of any associated neighborhood cells to provide a state of the cellular automaton at a second time step.

Type: Grant

Filed: December 19, 2001

Date of Patent: February 14, 2006

Assignee: Intel Corporation

Inventor: John W. Mates
Fast and flexible scan conversion and matrix transpose in a SIMD processor

Patent number: 6963341

Abstract: The present invention provides efficient ways to implement scan conversion and matrix transpose operations using vector multiplex operations in a SIMD processor. The present method provides a very fast and flexible way to implement different scan conversions, such as zigzag conversion, and matrix transpose for 2×2, 4×4, 8×8 blocks commonly used by all video compression and decompression algorithms.
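
Both operations mentioned in this abstract can be expressed as a single index permutation applied to a flattened block. The sketch below models that idea in Python; treating a "vector multiplex" as a gather by index table is an assumption, and the names `zigzag_indices`, `transpose_indices`, and `permute` are hypothetical.

```python
def zigzag_indices(n):
    """Index table for the zigzag scan of an n x n block stored row-major."""
    order = []
    for s in range(2 * n - 1):
        # All cells on anti-diagonal s, listed with the row index increasing.
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        if s % 2 == 0:
            diag.reverse()  # even diagonals are walked from bottom-left upward
        order.extend(r * n + c for r, c in diag)
    return order


def transpose_indices(n):
    """Index table that transposes an n x n block stored row-major."""
    return [c * n + r for r in range(n) for c in range(n)]


def permute(vector, indices):
    """Gather: output[k] = vector[indices[k]]."""
    return [vector[i] for i in indices]
```

Because the index tables depend only on the block size, they can be computed once and reused for every block.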

Type: Grant

Filed: May 20, 2003

Date of Patent: November 8, 2005

Inventor: Tibet Mimar
Hardware supported software pipelined loop prologue optimization

Patent number: 6954927

Abstract: A method for optimizing a software pipelineable loop in a software code is provided. The loop comprises one or more pipelined stages and one or more loop operations. The method comprises evaluating an initiation interval time (IN) for a pipelined stage of the loop. A loop operation time latency (Tld) and a number of loop operations (Np) from the pipelined stages to peel based on IN and Tld are then determined. The loop operation is peeled Np times and copied before the loop in the software code. A vector of registers is allocated and the results of the peeled loop operations and a result of an original loop operation are assigned to the vector of registers. Memory addresses for the results of the peeled loop operations and original loop operation are also assigned.

Type: Grant

Filed: October 4, 2001

Date of Patent: October 11, 2005

Assignee: Elbrus International

Inventor: Alexander Y. Ostanevich
Method of programming linear graphs for streaming vector computation

Patent number: 6934938

Abstract: A method for producing a formatted description of a computation representable by a data-flow graph and computer for performing a computation so described. A source instruction is generated for each input of the data-flow graph, a computational instruction is generated for each node of the data-flow graph, and a sink instruction is generated for each output of the data-flow graph. The computational instruction for a node includes a descriptor of an operation performed at the node and a descriptor of each instruction that produces an input to the node. The formatted description is a sequential instruction list comprising source instructions, computational instructions and sink instructions. Each instruction has an instruction identifier and the descriptor of each instruction that produces an input to the node is the instruction identifier. The computer is directed by a program of instructions to implement a computation representable by a data-flow graph.
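
The sequential instruction list can be illustrated with a small interpreter. This is a sketch under stated assumptions, not the format defined in the patent: the tuple layout `(identifier, kind, *operands)`, the operation names in `OPS`, and the function `run_linear_graph` are all hypothetical.

```python
import operator

# Hypothetical operation table; the patent does not specify these names.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def run_linear_graph(program, inputs):
    """Evaluate a data-flow graph given as a sequential instruction list.

    Each instruction is (identifier, kind, *operands):
      source: operand is the name of a graph input
      sink:   operands are an output name and the identifier feeding it
      other:  kind names an operation; operands are producer identifiers
    """
    values = {}
    outputs = {}
    for ident, kind, *args in program:
        if kind == "source":
            values[ident] = inputs[args[0]]
        elif kind == "sink":
            outputs[args[0]] = values[args[1]]
        else:
            values[ident] = OPS[kind](*(values[a] for a in args))
    return outputs
```

For example, the list `[(0, "source", "x"), (1, "source", "y"), (2, "add", 0, 1), (3, "mul", 2, 0), (4, "sink", "out", 3)]` describes the graph `out = (x + y) * x`, with each computational instruction referring to its inputs by instruction identifier.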

Type: Grant

Filed: June 28, 2002

Date of Patent: August 23, 2005

Assignee: Motorola, Inc.

Inventors: Philip E. May, Kent Donald Moat, Raymond B. Essick, IV, Silviu Chiricescu, Brian Geoffrey Lucas, James M. Norris, Michael Allen Schuette, Ali Saidi
Computer system implementing a system and method for tracking the progress of posted write transactions

Patent number: 6721813

Abstract: A computer system is presented which implements a system and method for tracking the progress of posted write transactions. In one embodiment, the computer system includes a processing subsystem and an input/output (I/O) subsystem. The processing subsystem includes multiple processing nodes interconnected via coherent communication links. Each processing node may include a processor preferably executing software instructions. The I/O subsystem includes one or more I/O nodes. Each I/O node may embody one or more I/O functions (e.g., modem, sound card, etc.). The multiple processing nodes may include a first processing node and a second processing node, wherein the first processing node includes a host bridge, and wherein a memory is coupled to the second processing node. An I/O node may generate a non-coherent write transaction to store data within the second processing node's memory, wherein the non-coherent write transaction is a posted write transaction.

Type: Grant

Filed: January 30, 2001

Date of Patent: April 13, 2004

Assignee: Advanced Micro Devices, Inc.

Inventors: Jonathan M. Owen, Mark D. Hummel, James B. Keller
Computer system implementing a system and method for ordering input/output (IO) memory operations within a coherent portion thereof

Patent number: 6557048

Abstract: A computer system is presented which implements a system and method for ordering input/output (I/O) memory operations. In one embodiment, the computer system includes a processing subsystem and an I/O subsystem. The processing subsystem includes multiple processing nodes interconnected via coherent communication links. Each processing node may include a processor executing software instructions. The I/O subsystem includes one or more I/O nodes serially coupled via non-coherent communication links. Each I/O node may embody one or more I/O functions (e.g., modem, sound card, etc.). One of the processing nodes includes a host bridge which translates packets moving between the processing subsystem and the I/O subsystem. One of the I/O nodes is coupled to the processing node including the host bridge. The I/O node coupled to the processing node produces and/or provides transactions having destinations or targets within the processing subsystem to the processing node including the host bridge.

Type: Grant

Filed: November 1, 1999

Date of Patent: April 29, 2003

Assignee: Advanced Micro Devices, Inc.

Inventors: James B. Keller, Derrick R. Meyer, Dale E. Gulick, Larry D. Hewitt
Method and apparatus for single cycle processing of data associated with separate accumulators in a dual multiply-accumulate architecture

Patent number: 6446193

Abstract: A method and apparatus for reducing instruction cycles in a digital signal processor wherein the processor includes a multiplier unit, an adder, a memory, and at least one pair of first and second accumulators. The accumulators include respective guard, high and low parts. The method and apparatus enable vectoring the respective first and second high parts from the accumulators to define a single vectored register responsive to a single instruction cycle and processing the data in the vectored register.
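As a rough illustration of the "vectoring" step described in this abstract, the sketch below packs the high parts of two accumulators into one register value. The field widths (16-bit high and low parts beneath the guard bits), the packing order, and the function names are assumptions made for the example; the abstract does not specify them.

```python
# Assumed layout (not from the abstract): each accumulator is
# [guard | high (16 bits) | low (16 bits)], guard bits on top.
HIGH_BITS = 16
LOW_BITS = 16
HIGH_MASK = (1 << HIGH_BITS) - 1


def high_part(acc):
    """Extract the high part of an accumulator, dropping guard and low parts."""
    return (acc >> LOW_BITS) & HIGH_MASK


def vector_high_parts(acc_first, acc_second):
    """Pack both high parts into one vectored register value.

    Placing the first accumulator in the upper half is an assumption.
    """
    return (high_part(acc_first) << HIGH_BITS) | high_part(acc_second)


def split_vectored(reg):
    """Recover the two high parts from the vectored register."""
    return (reg >> HIGH_BITS) & HIGH_MASK, reg & HIGH_MASK
```

In hardware the packing would happen in a single instruction cycle; the Python version only models the resulting data layout.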

Type: Grant

Filed: September 8, 1997

Date of Patent: September 3, 2002

Assignee: Agere Systems Guardian Corp.

Inventors: Mazhar M. Alidina, Sivanand Simanapalli, Larry R. Tate
Implementation of a conditional move instruction in an out-of-order processor

Publication number: 20020112142

Abstract: A technique for handling a conditional move instruction in an out-of-order data processor. The technique involves detecting a conditional move instruction within an instruction stream, and generating multiple instructions according to the detected conditional move instruction. The technique further involves replacing the conditional move instruction within the instruction stream with the generated multiple instructions. The generated multiple instructions are generated such that each of the generated multiple instructions executes using no more than two input ports of an execution unit. The generated multiple instructions include a first generated instruction that produces a condition result indicating whether a condition exists, and a second generated instruction that inputs the condition result as a portion of an operand which identifies a register of the out-of-order data processor.

Type: Application

Filed: November 18, 1998

Publication date: August 15, 2002

Inventors: Joel Springer Emer, Bruce Edwards, Daniel Lawrence Leibholz, Edward J. McLellan, Derrick R. Meyer
Execution unit for processing a data stream independently and in parallel

Patent number: 6401194

Abstract: A vector processor provides a data path divided into smaller slices of data, with each slice processed in parallel with the other slices. Furthermore, an execution unit provides smaller arithmetic and functional units chained together to execute more complex microprocessor instructions requiring multiple cycles by sharing single-cycle operations, thereby reducing both costs and size of the microprocessor. One embodiment handles 288-bit data widths using 36-bit data path slices. Another embodiment executes integer multiply and multiply-and-accumulate and floating point add/subtract and multiply operations using single-cycle arithmetic logic units. Other embodiments support 8-bit, 9-bit, 16-bit, and 32-bit integer data types and 32-bit floating data types.
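The slicing idea can be sketched in software. The example below splits a 288-bit value into eight 36-bit slices and adds 9-bit elements within each slice independently. The function names and the choice of a wrapping 9-bit add are assumptions for illustration; real hardware would process the slices in parallel rather than in a loop.

```python
WIDTH = 288       # data path width from the abstract
SLICE_BITS = 36   # slice width from the abstract


def split_slices(value, width=WIDTH, slice_bits=SLICE_BITS):
    """Split a wide value into equal slices, lowest-order slice first."""
    mask = (1 << slice_bits) - 1
    return [(value >> (i * slice_bits)) & mask for i in range(width // slice_bits)]


def join_slices(slices, slice_bits=SLICE_BITS):
    """Reassemble slices (lowest-order first) into one wide value."""
    value = 0
    for i, s in enumerate(slices):
        value |= s << (i * slice_bits)
    return value


def add_lanes(a_slice, b_slice, lane_bits=9, slice_bits=SLICE_BITS):
    """Add the elements packed in one slice; each lane wraps independently."""
    lane_mask = (1 << lane_bits) - 1
    out = 0
    for lane in range(slice_bits // lane_bits):
        shift = lane * lane_bits
        total = ((a_slice >> shift) & lane_mask) + ((b_slice >> shift) & lane_mask)
        out |= (total & lane_mask) << shift  # discard carry out of the lane
    return out


def sliced_add(a, b):
    """Element-wise add of two 288-bit vectors, one slice at a time."""
    pairs = zip(split_slices(a), split_slices(b))
    return join_slices([add_lanes(x, y) for x, y in pairs])
```

Because no slice depends on another slice's result, each call to `add_lanes` could run concurrently, which is the property the abstract relies on.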

Type: Grant

Filed: January 28, 1997

Date of Patent: June 4, 2002

Assignee: Samsung Electronics Co., Ltd.

Inventors: Le Trong Nguyen, Heonchul Park, Roney S. Wong, Ted Nguyen, Edward H. Yu
System for controlling movement of data in virtual environment using queued direct input/output device and utilizing finite state machine in main memory with two disjoint sets of states representing host and adapter states

Patent number: 6324600

Abstract: A method and an apparatus for controlling movement of data between any host and any network including a set of devices in a computing system environment having a main memory with a queuing mechanism having a plurality of queues capable of being shared between a plurality of independent processes running on at least one host and at least one I/O adapter. A finite-state machine (FSM) is provided in the main memory, and the FSM is divided into two disjoint sets of states: one set represents state values processed by the host and set by the adapter, and the other set represents state values processed by the adapter and set by the host. Using each of these sets of states, free-running, non-deadlocking processes are provided within the host and the adapter so that the processes sequence circularly and continuously through a vector related to the FSMs.

Type: Grant

Filed: February 19, 1999

Date of Patent: November 27, 2001

Assignee: International Business Machines Corporation

Inventors: Frank W. Brice, Richard P. Tarcza, Leslie W. Wyman
Apparatus and method for improved vector processing to support extended-length integer arithmetic

Patent number: 6295597

Abstract: An apparatus and a method for extended-precision vector arithmetic capable of extremely long precision (i.e., precision to as many bits as a user desires or is limited to due to memory, disk-storage, or other resource constraints). Vector carry-out bits can be used as vector carry-in bits for successive operations. In performing add or subtract operations on integers that are longer than the word size of the computer, the operands are broken into word-sized parts which are used as operands. A vector of long-integer numbers is thus broken into a series of sub-vectors, each having word-sized elements. Vector add or subtract operations are performed successively on the sub-vectors, starting with the lowest-order sub-vectors. Carry-out (or borrow-out) bits from a first vector operation are used as carry-in (or borrow-in) bits for a successive vector operation.
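The carry-chaining scheme in this abstract can be modelled in a few lines. The sketch below assumes a 64-bit word size and uses hypothetical function names; it breaks each long integer into word-sized parts, adds the sub-vectors from lowest order upward, and feeds each carry-out vector into the next add as the carry-in vector.

```python
WORD_BITS = 64  # assumed machine word size
WORD_MASK = (1 << WORD_BITS) - 1


def to_subvectors(numbers, n_words):
    """Sub-vector k holds word k (lowest-order first) of every long integer."""
    return [[(x >> (k * WORD_BITS)) & WORD_MASK for x in numbers]
            for k in range(n_words)]


def vector_add_with_carry(a_words, b_words, carry_in):
    """One vector add: returns per-element word sums and a carry-out vector."""
    sums, carry_out = [], []
    for a, b, c in zip(a_words, b_words, carry_in):
        total = a + b + c
        sums.append(total & WORD_MASK)
        carry_out.append(total >> WORD_BITS)  # 0 or 1
    return sums, carry_out


def long_vector_add(a_numbers, b_numbers, n_words):
    """Add two vectors of long integers, word by word, chaining the carries."""
    a_sub = to_subvectors(a_numbers, n_words)
    b_sub = to_subvectors(b_numbers, n_words)
    carry = [0] * len(a_numbers)
    result_sub = []
    for k in range(n_words):  # lowest-order sub-vector first
        sums, carry = vector_add_with_carry(a_sub[k], b_sub[k], carry)
        result_sub.append(sums)
    results = [sum(result_sub[k][i] << (k * WORD_BITS) for k in range(n_words))
               for i in range(len(a_numbers))]
    return results, carry  # final carry vector flags overflow per element
```

Subtraction would follow the same pattern with borrow bits in place of carry bits.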

Type: Grant

Filed: August 11, 1998

Date of Patent: September 25, 2001

Assignee: Cray, Inc.

Inventors: David Resnick, William T. Moore
System and method for implementing conditional vector operations in which an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector

Patent number: 6269435

Abstract: A processor implements conditional vector operations in which an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed is divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data has been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication.
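A minimal software sketch of the idea follows, assuming the segregation is done by steering element positions into two index vectors, as the abstract suggests. The function names are hypothetical, and the loops stand in for what would be full-rate vector operations on each group.

```python
def split_by_condition(condition):
    """Steer each element position into one of two index vectors."""
    true_idx, false_idx = [], []
    for i, flag in enumerate(condition):
        (true_idx if flag else false_idx).append(i)
    return true_idx, false_idx


def conditional_apply(data, condition, if_true, if_false):
    """Process each group with its own operation, with no per-element test."""
    true_idx, false_idx = split_by_condition(condition)
    out = list(data)
    for i in true_idx:    # every element in this group satisfies the condition
        out[i] = if_true(data[i])
    for i in false_idx:   # every element in this group fails the condition
        out[i] = if_false(data[i])
    return out
```

For example, with `data = [1, -2, 3, -4]` and a condition vector marking the negative elements, one group can be negated and the other scaled, and neither processing loop contains a conditional.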

Type: Grant

Filed: September 14, 1998

Date of Patent: July 31, 2001

Assignees: The Board of Trustees of the Leland Stanford Junior University, The Massachusetts Institute of Technology

Inventors: William J. Dally, Scott Whitney Rixner, John Owens, Ujval J. Kapasi
Method and apparatus for performing vector operation using separate multiplication on odd and even data elements of source vectors

Patent number: 6202141

Abstract: A vector multiplication mechanism is provided that partitions a vector multiplication operation into even and odd paths. In the odd path, odd data elements of first and second source vectors are selected, and a multiplication operation is performed between each of the selected odd data elements of the first source vector and the corresponding one of the selected odd data elements of the second source vector. In the even path, even data elements of the source vectors are selected, and a multiplication operation is performed between each of the selected even data elements of the first source vector and the corresponding one of the selected even data elements of the second source vector. Elements of resultant data of the two paths are merged together in a merge operation. The vector multiplication mechanism of the present invention preferably uses a single general-purpose register to store the resultant data of the odd path and the even path.
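The even/odd partitioning can be illustrated with list slicing. This is only a sketch: it assumes zero-based element numbering (so "even" means indices 0, 2, 4, ...), equal-length source vectors, and an interleaving merge; the function name is hypothetical.

```python
def odd_even_multiply(a, b):
    """Multiply two source vectors via separate even and odd paths, then merge."""
    if len(a) != len(b):
        raise ValueError("source vectors must have the same length")
    # Even path: elements at indices 0, 2, 4, ...
    even = [x * y for x, y in zip(a[0::2], b[0::2])]
    # Odd path: elements at indices 1, 3, 5, ...
    odd = [x * y for x, y in zip(a[1::2], b[1::2])]
    # Merge: interleave the two result streams back into element order.
    merged = [0] * len(a)
    merged[0::2] = even
    merged[1::2] = odd
    return even, odd, merged
```

One motivation for splitting the work this way is that each product is wider than its inputs, so each path produces only half as many results and they fit in a register of the original vector width.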

Type: Grant

Filed: June 16, 1998

Date of Patent: March 13, 2001

Assignee: International Business Machines Corporation

Inventors: Keith Everett Diefendorff, Pradeep Kumar Dubey, Ronald Ray Hochsprung, Brett Olsson, Hunter Ledbetter Scales, III
System and method for processing multiple received signal sources

Patent number: 6073158

Abstract: A system and method for time slicing multiple received data streams utilizing multiple processors in such a manner as to ensure that all processors are running at full capability and are efficiently timesharing a global memory storage area. The received data streams are each divided into fixed portions called spans. The invention is operable for sequencing the movement of the time-sliced spans between the processors, adjusting the scheduling of particular ones of the time-sliced spans as a function of either processor availability or maintenance of real-time transmission of the received real-time time-sliced data streams.

Type: Grant

Filed: July 29, 1993

Date of Patent: June 6, 2000

Assignee: Cirrus Logic, Inc.

Inventors: Robert Marshall Nally, John Charles Schafer
Apparatus and method for reducing the number of rename registers required in the operation of a processor

Patent number: 6061777

Abstract: One aspect of the invention relates to a method for operating a processor. In one version of the invention, the method includes the steps of dispatching an instruction; determining a presently architected RMAP entry for the architectural register targeted by the dispatched instruction; selecting the RMAP entries which are associated with physical registers that contain operands for the dispatched instruction; updating a use indicator in the selected RMAP entries; determining whether the dispatched instruction is interruptible; and updating an architectural indicator and a historical indicator in the presently architected RMAP entry if the dispatched instruction is uninterruptible.

Type: Grant

Filed: October 28, 1997

Date of Patent: May 9, 2000

Assignee: International Business Machines Corporation

Inventors: Hoichi Cheong, Paul Joseph Jordan, Hung Qui Le, Soummya Mallick
Digital data apparatus for transferring data between NTDS and bus topology data buses

Patent number: 6023752

Abstract: A program driver means is disclosed that allows for the exchange of information between an NTDS device and a device having a bus topology, especially a VMEbus. The program driver utilizes chain commands which are fully programmable at the user level. The processor itself is programmed at the register level to assure the fastest data rate possible (32-bit access) across the VMEbus. The processor driver is invisible to the user.

Type: Grant

Filed: November 25, 1997

Date of Patent: February 8, 2000

Assignee: The United States of America as represented by the Secretary of the Navy

Inventor: William M. Huttle