Sequential Patents (Class 712/8)
-
Patent number: 11829756
Abstract: A vector cumulative sum circuit can include a set of input registers, a carry-forward data source, a set of output registers, and a network of adder circuits coupling the input registers to the output registers such that the output value in a given output register is the sum of a value provided by the carry-forward data source and the input values from all of the input registers (in logical order) up to (and including) the corresponding input register. The value in the last output register can be carried forward to enable cumulative summing of a larger number of input values. The vector cumulative sum circuit can be implemented in a programmable processor, and a vector cumulative sum instruction can be defined in the instruction set. Using the vector cumulative sum circuit and instruction, filtering operations can be accelerated.
Type: Grant
Filed: September 24, 2021
Date of Patent: November 28, 2023
Assignee: Apple Inc.
Inventors: On Wa Yeung, Seydou N. Ba
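The carry-forward behavior described in this abstract can be illustrated in software. The following is only a behavioral sketch, not the patented circuit: the function names `vector_cumsum` and `cumsum_long` and the register width of 4 are assumptions made for the example.

```python
from typing import List, Tuple


def vector_cumsum(inputs: List[int], carry_in: int = 0) -> Tuple[List[int], int]:
    """Model one vector cumulative-sum step.

    Output i is carry_in plus the sum of inputs[0..i]. The second return
    value is the carry-forward (the value of the last output register).
    """
    outputs = []
    running = carry_in
    for value in inputs:
        running += value
        outputs.append(running)
    return outputs, running


def cumsum_long(values: List[int], width: int = 4) -> List[int]:
    """Cumulative sum of a long sequence, `width` elements at a time.

    `width` stands in for the number of input registers (hypothetical).
    """
    carry = 0
    result: List[int] = []
    for start in range(0, len(values), width):
        chunk, carry = vector_cumsum(values[start:start + width], carry)
        result.extend(chunk)
    return result


if __name__ == "__main__":
    print(cumsum_long([1, 2, 3, 4, 5, 6, 7, 8, 9], width=4))
```

Each call to `vector_cumsum` plays the role of one instruction issue; feeding its carry-out into the next call is what lets a fixed-width unit handle an arbitrarily long input.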
-
Patent number: 11249767
Abstract: An information handling system may load first data from a location information area of a first memory, specifying a plurality of locations of metadata for a plurality of stages of basic input/output system (BIOS) initialization. The information handling system may then load first metadata for a first stage of BIOS initialization from a first metadata location of the plurality of locations specified by the first data. The first metadata may contain information for indexing first initialization data located at a first initialization data location. The information handling system may then index the first initialization data of the first initialization data location based, at least in part, on the first metadata. The information handling system may then perform the first stage of BIOS initialization based, at least in part, on the first initialization data.
Type: Grant
Filed: February 5, 2019
Date of Patent: February 15, 2022
Assignee: Dell Products L.P.
Inventors: Shekar Babu Suryanarayana, Sumanth Vidyadhara, Anand Prakash Joshi
-
Patent number: 10838719
Abstract: Examples of a carry chain for performing an operation on operands each including elements of a selectable size are provided. Advantageously, the carry chain adapts to elements of different sizes. The carry chain determines a mask based on a selected size of an element. The carry chain selects, based on the mask, whether to carry a partial result of an operation performed on corresponding first portions of a first operand and a second operand into a next operation. The next operation is performed on corresponding second portions of the first operand and the second operand, and, based on the selection, the partial result of the operation. The carry chain stores, in a memory, a result formed from outputs of the operation and the next operation.
Type: Grant
Filed: November 13, 2015
Date of Patent: November 17, 2020
Assignee: Marvell Asia Pte, Ltd
Inventor: David Kravitz
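A software model can make the masked carry chain in this abstract concrete. This is a sketch under assumptions not stated in the abstract: a 32-bit operand split into four 8-bit portions, element sizes of 8, 16, or 32 bits, addition as the operation, and the hypothetical names `carry_mask` and `packed_add`.

```python
PORTION_BITS = 8   # assumed width of one portion of an operand
NUM_PORTIONS = 4   # assumed operand width: 4 * 8 = 32 bits
PORTION_MASK = (1 << PORTION_BITS) - 1


def carry_mask(element_bits: int) -> int:
    """Bit i is set when a carry out of portion i may enter portion i + 1."""
    if element_bits not in (8, 16, 32):
        raise ValueError("element size must be 8, 16, or 32 bits")
    portions_per_element = element_bits // PORTION_BITS
    mask = 0
    for i in range(NUM_PORTIONS - 1):
        # Block the carry wherever portion i is the top of an element.
        if (i + 1) % portions_per_element != 0:
            mask |= 1 << i
    return mask


def packed_add(a: int, b: int, element_bits: int) -> int:
    """Add two packed operands element-wise, wrapping within each element."""
    mask = carry_mask(element_bits)
    result = 0
    carry = 0
    for i in range(NUM_PORTIONS):
        shift = i * PORTION_BITS
        total = ((a >> shift) & PORTION_MASK) + ((b >> shift) & PORTION_MASK) + carry
        result |= (total & PORTION_MASK) << shift
        # The mask selects whether the partial result's carry is forwarded.
        carry = (total >> PORTION_BITS) if (mask >> i) & 1 else 0
    return result


if __name__ == "__main__":
    print(hex(packed_add(0x0000FFFF, 0x00000001, 16)))  # carry blocked at bit 16
    print(hex(packed_add(0x0000FFFF, 0x00000001, 32)))  # carry propagates
```

The same adder loop serves every element size; only the mask changes, which is the adaptation the abstract describes.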
-
Patent number: 10831478
Abstract: A Sort Lists instruction is provided to perform a sort and/or a merge operation. The instruction is an architected machine instruction of an instruction set architecture and is executed by a general-purpose processor of the computing environment. The executing includes sorting a plurality of input lists to obtain one or more sorted output lists, which are output.
Type: Grant
Filed: November 6, 2018
Date of Patent: November 10, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bruce C. Giamei, Martin Recktenwald, Donald W. Schmidt, Timothy Slegel, Aditya N. Puranik, Mark S. Farrell, Christian Jacobi, Jonathan D. Bradbury, Christian Zoellin
-
Patent number: 10831503
Abstract: Saving and restoring machine state between multiple executions of an instruction. A determination is made that processing of an operation of an instruction executing on a processor has been interrupted prior to completion. Based on determining that the processing of the operation has been interrupted, current metadata of the processor is extracted. The metadata is stored in a location associated with the instruction and used to re-execute the instruction to resume forward processing of the instruction from where it was interrupted.
Type: Grant
Filed: November 6, 2018
Date of Patent: November 10, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Bruce C. Giamei, Martin Recktenwald, Donald W. Schmidt, Timothy Slegel, Aditya N. Puranik, Mark S. Farrell, Christian Jacobi, Jonathan D. Bradbury, Christian Zoellin
-
Patent number: 10809978
Abstract: A merge sort accelerator (MSA) includes a pre-processing stage configured to receive an input vector and generate a pre-processing output vector based on a pre-processing instruction and the input vector. The MSA also includes a merge sort network having multiple sorting stages configured to be selectively enabled. The merge sort network is configured to receive the pre-processing output vector and generate a sorted output vector based on a sorting instruction and the pre-processing output vector. The MSA includes an accumulator stage configured to receive the sorted output vector and update an accumulator vector based on the accumulator instruction and the sorted output vector. The MSA also includes a post-processing stage configured to receive the accumulator vector and generate a post-processing output vector based on a post-processing instruction and the accumulator vector.
Type: Grant
Filed: June 1, 2018
Date of Patent: October 20, 2020
Assignee: TEXAS INSTRUMENTS INCORPORATED
Inventors: Arthur John Redfern, Asheesh Bhardwaj, Tarek Aziz Lahlou, William Franklin Leven
-
Patent number: 10620956
Abstract: An instruction defined to be a looping instruction that repeats a plurality of times to perform an operation on a defined amount of data is obtained. The looping instruction is expanded into a sequence of operations. The sequence of operations is a non-looping sequence of operations to perform the operation on the defined amount of data.
Type: Grant
Filed: March 3, 2017
Date of Patent: April 14, 2020
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Michael K. Gschwind
-
Patent number: 9250899
Abstract: There is provided a multi-bit storage cell for a register file. The storage cell includes a first set of storage elements for a vector slice. Each storage element respectively corresponds to a particular one of a plurality of thread sets for the vector slice. The storage cell includes a second set of storage elements for a scalar slice. Each storage element in the second set respectively corresponds to a particular one of at least one thread set for the scalar slice. The storage cell includes at least one selection circuit for selecting, for an instruction issued by a thread, a particular one of the storage elements from any of the first set and the second set based upon the instruction being a vector instruction or a scalar instruction and based upon a corresponding set from among the pluralities of thread sets to which the thread belongs.
Type: Grant
Filed: June 13, 2007
Date of Patent: February 2, 2016
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventor: Michael Gschwind
-
Patent number: 9038073
Abstract: Efficient data processing apparatus and methods include hardware components which are pre-programmed by software. Each hardware component triggers the other to complete its tasks. After the final pre-programmed hardware task is complete, the hardware component issues a software interrupt.
Type: Grant
Filed: August 13, 2009
Date of Patent: May 19, 2015
Assignee: QUALCOMM Incorporated
Inventors: Mathias Kohlenz, Irfan Anwar Khan, Sathyanarayan Madhusudan, Shailesh Maheshwari, Srividhya Krishnamoorthy, Sandeep Urgaonkar, Thomas Klingenbrunn, Tim Tynghuei Liou, Idreas Mir
-
Publication number: 20150019837
Abstract: A data processor includes: a plurality of controllers that process data; a program memory that stores a standby instruction and a data processing instruction at a plurality of addresses respectively; and a queue that stores different execution start addresses for the plurality of controllers, wherein after the plurality of controllers sequentially access the queue, the plurality of controllers acquire the different execution start addresses from the queue in an order of the sequential access, start execution of instructions from the acquired different execution start addresses in the program memory, and execute the data processing instruction and execute the standby instruction the number of times different for each of the controllers.
Type: Application
Filed: September 29, 2014
Publication date: January 15, 2015
Inventors: Toshiya Otomo, Koichiro Yamashita, Takahisa Suzuki, Hiromasa Yamauchi, Koji Kurihara, Yuta Teranishi
-
Patent number: 8935509
Abstract: A Baseboard Management Controller (BMC) controlling method includes the steps of dividing a memory of a BMC into an original region and customized region, in which the original region includes at least one original sensor data record (SDR) and original platform event filter (PEF) corresponding to each other; providing an instruction set to at least one external system, in which the external system manages at least one customized SDR and customized PEF corresponding to each other in the customized region through the instruction set; polling the original SDR in the original region and the customized SDR in the customized region; determining whether values of the SDRs obtained through polling conform to a plurality of critical values individually corresponding to the SDRs; and obtaining a processing policy according to the corresponding PEF when at least one value of the SDR does not conform to the corresponding critical value.
Type: Grant
Filed: February 24, 2011
Date of Patent: January 13, 2015
Assignee: Inventec Corporation
Inventors: Chih Wei Chen, Hsiao Fen Lu
-
Publication number: 20140189295
Abstract: A machine readable storage medium containing program code is described that when processed by a processor causes a method to be performed. The method includes creating a resultant rolled version of an input vector by forming a first intermediate vector, forming a second intermediate vector and forming a resultant rolled version of an input vector. The first intermediate vector is formed by barrel rolling elements of the input vector along a first of two lanes defined by an upper half and a lower half of the input vector. The second intermediate vector is formed by barrel rolling elements of the input vector along a second of the two lanes. The resultant rolled version of the input vector is formed by incorporating upper portions of one of the intermediate vector's upper and lower halves as upper portions of the resultant's upper and lower halves and incorporating lower portions of the other intermediate vector's upper and lower halves as lower portions of the resultant's upper and lower halves.
Type: Application
Filed: December 29, 2012
Publication date: July 3, 2014
Inventors: Tal ULIEL, Boris BOLSHEM, ELMOUSTAPHA OULD-AHMED-VALL
-
Patent number: 8094768
Abstract: The present invention discloses a novel multi-channel timing recovery scheme that utilizes a shared CORDIC to accurately compute the phase for each tone. Then a hardware-based linear combiner module is used to reconstruct the best phase estimate from multiple phase measurements. The firmware monitors the noise variance for the pilot tones and determines the corresponding weight for each tone to ensure that the minimum phase jitter noise is achieved through the linear combiner. Then a hardware-based second-order timing recovery control loop generates the frequency reference signal for VCXO or DCXO. A single sequentially controlled multiplier is used for all multiplications in the control loop.
Type: Grant
Filed: December 21, 2006
Date of Patent: January 10, 2012
Assignee: Triductor Technology (Suzhou) Inc.
Inventor: Yaolong Tan
-
Patent number: 7870159
Abstract: A computer program product and associated algorithm for sorting S sequences of binary bits. The S sequences may be integers, floating point numbers, or character strings. The algorithm is executed by a processor of a computer system. Each sequence includes contiguous fields of bits. The algorithm executes program code at nodes of a linked execution structure in a sequential order with respect to the nodes. The algorithm executes a masking of the contiguous fields of the S sequences in accordance with a mask whose content is keyed to the field being masked. The sequential order of execution of the nodes is a function of an ordering of masking results of the masking. Each sequence, or a pointer to each sequence, is outputted to an array in the memory device whenever the masking places the sequence in a leaf node of the nodal linked execution structure.
Type: Grant
Filed: January 2, 2008
Date of Patent: January 11, 2011
Assignee: International Business Machines Corporation
Inventor: Dennis J. Carroll
-
Patent number: 7818539
Abstract: A processor implements conditional vector operations in which, for example, an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed are divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data have been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication. Other examples of conditional operations include combining one or more input vectors into a single output vector based on a condition vector, conditional vector switching, conditional vector combining, and conditional vector load balancing.
Type: Grant
Filed: August 28, 2006
Date of Patent: October 19, 2010
Assignees: The Board of Trustees of the Leland Stanford Junior University, The Massachusetts Institute of Technology
Inventors: Scott Rixner, John D. Owens, Ujval J. Kapasi, William J. Dally
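The idea of steering data into index vectors so that later processing needs no per-element branch can be sketched as follows. This is an illustrative software analogue only; the function names and the example condition (negative versus non-negative values) are assumptions, not taken from the patent.

```python
from typing import List, Sequence, Tuple


def split_by_condition(condition: Sequence[bool]) -> Tuple[List[int], List[int]]:
    """Steer each element index into one of two index vectors."""
    true_idx = [i for i, flag in enumerate(condition) if flag]
    false_idx = [i for i, flag in enumerate(condition) if not flag]
    return true_idx, false_idx


def process(data: Sequence[int]) -> List[int]:
    """Negate negative values and double the rest (hypothetical workload).

    After the split, each loop body is uniform: no condition is tested
    inside it, so each group could be handed to a vector unit as-is.
    """
    condition = [x < 0 for x in data]
    neg_idx, nonneg_idx = split_by_condition(condition)
    out = [0] * len(data)
    for i in neg_idx:
        out[i] = -data[i]
    for i in nonneg_idx:
        out[i] = data[i] * 2
    return out


if __name__ == "__main__":
    print(process([3, -1, 4, -5]))
```

The condition is evaluated once, up front, as a vector; the cost of the split buys two branch-free inner loops.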
-
Patent number: 7770024
Abstract: A method, system and computer program product for computing a message authentication code for data in storage of a computing environment. An instruction specifies a unit of storage for which an authentication code is to be computed. A computing operation computes an authentication code for the unit of storage. A register is used for providing a cryptographic key for use in computing the authentication code. Further, the register may be used in a chaining operation.
Type: Grant
Filed: February 12, 2008
Date of Patent: August 3, 2010
Assignee: International Business Machines Corporation
Inventors: Shawn D. Lundvall, Ronald M. Smith, Sr., Phil Chi-Chung Yeh
-
Patent number: 7735090
Abstract: A method, apparatus and article of manufacture to dynamically modify, terminate, or replace software components and connections (i.e., contracts) between components in a running assembly. Information about the component and contracts between components in a running assembly is used to determine an allowable sequence of management commands to transition the assembly of components from a current state to a specified goal state. At the same time, other components may continue to perform an operational workflow.
Type: Grant
Filed: December 15, 2005
Date of Patent: June 8, 2010
Assignee: International Business Machines Corporation
Inventors: James E. Carey, Scott N. Gerard
-
Patent number: 7694158
Abstract: A multi-processing system-on-chip including a cluster of processors having respective CPUs is operated by: defining a master CPU within the respective CPUs to coordinate operation of said multi-processing system, and running a cluster manager agent on the master CPU. The cluster manager agent is adapted to dynamically migrate software processes between the CPUs of said plurality and change power settings therein.
Type: Grant
Filed: April 18, 2006
Date of Patent: April 6, 2010
Assignee: STMicroelectronics S.R.L.
Inventors: Diego Melpignano, David Siorpaes, Paolo Zambotti, Antonio Borneo
-
Patent number: 7467138
Abstract: A method and associated algorithm for in-place sorting S sequences of binary bits stored contiguously in an array within a memory device of a computer system prior to the sorting. Each sequence includes contiguous fields of bits. The algorithm is executed by a processor of a computer system. The in-place sorting executes program code at each node of a linked execution structure. Each node includes a segment of the array. The program code is executed in a hierarchical sequence with respect to the nodes. Executing program code at each node includes: dividing the segment of the node into groups of sequences based on a mask field having a mask width, wherein each group has a unique mask value of the mask field; and in-place rearranging the sequences in the segment, wherein the rearranging results in each group including only those sequences having the unique mask value of the group.
Type: Grant
Filed: December 14, 2004
Date of Patent: December 16, 2008
Assignee: International Business Machines Corporation
Inventor: Dennis J. Carroll
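The grouping-by-mask-field step in this abstract resembles a well-known technique, in-place most-significant-digit radix sort (often called American flag sort). The sketch below uses that standard technique as a stand-in; it is not claimed to be the patented algorithm. It assumes non-negative integer keys below `2**key_bits` and a `key_bits` that is a multiple of `mask_width`; all names are hypothetical.

```python
from typing import List, Optional


def inplace_field_sort(arr: List[int], key_bits: int = 16, mask_width: int = 4,
                       lo: int = 0, hi: Optional[int] = None,
                       shift: Optional[int] = None) -> None:
    """Sort arr[lo:hi] in place, one mask field at a time, high field first."""
    if hi is None:
        hi = len(arr)
    if shift is None:
        shift = key_bits - mask_width
    if hi - lo < 2 or shift < 0:
        return
    radix = 1 << mask_width
    mask = radix - 1

    # Count how many sequences carry each mask value in this segment.
    counts = [0] * radix
    for i in range(lo, hi):
        counts[(arr[i] >> shift) & mask] += 1

    # Each mask value owns one contiguous group inside the segment.
    starts = [0] * radix
    pos = lo
    for v in range(radix):
        starts[v] = pos
        pos += counts[v]
    ends = [starts[v] + counts[v] for v in range(radix)]

    # Swap every sequence into the group that matches its mask value.
    next_free = starts[:]
    for v in range(radix):
        while next_free[v] < ends[v]:
            value = arr[next_free[v]]
            group = (value >> shift) & mask
            if group == v:
                next_free[v] += 1
            else:
                j = next_free[group]
                arr[next_free[v]], arr[j] = arr[j], value
                next_free[group] += 1

    # Child nodes: refine each group using the next lower field.
    for v in range(radix):
        inplace_field_sort(arr, key_bits, mask_width,
                           starts[v], ends[v], shift - mask_width)


if __name__ == "__main__":
    data = [0x3A2F, 0x0001, 0xFFFF, 0x3A2E, 0x1000, 0x0001]
    inplace_field_sort(data)
    print([hex(x) for x in data])
```

Each recursive call corresponds to one node of the execution structure operating on its own segment of the array; no auxiliary copy of the data is made, only per-node counters.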
-
Patent number: 7460989
Abstract: A method is provided, wherein a virtual internal master clock is used in connection with a RISC CPU. The RISC CPU comprises a number of concurrently operating function units, wherein each unit runs according to its own clocks, including multiple-stage totally unsynchronized clocks, in order to process a stream of instructions. The method includes the steps of generating a virtual model master clock having a clock cycle, and initializing each of the function units at the beginning of respectively corresponding processing cycles. The method further includes operating each function unit during a respectively corresponding processing cycle to carry out a task with respect to one of the instructions, in order to produce a result. Respective results are all evaluated in synchronization, by means of the master clock. This enables the instruction processing operation to be modeled using a sequential computer language, such as C or C++.
Type: Grant
Filed: October 14, 2004
Date of Patent: December 2, 2008
Assignee: International Business Machines Corporation
Inventor: Oliver Keren Ban
-
Patent number: 7444488
Abstract: A method and a programmable unit for bit field shifting in a memory device in a programmable unit as a result of the execution of an instruction, in which a bit segment is shifted within a first memory unit to a second memory unit, are presented. The bit segment is read with a first bit length from a first bit field in the first memory unit starting at a first start point. The bit segment that has been read is stored in the first bit field in the second memory unit starting at a second start point. The first or the second start point is updated by a predetermined value and the updated start point is stored for subsequent method steps.
Type: Grant
Filed: September 30, 2005
Date of Patent: October 28, 2008
Assignee: Infineon Technologies
Inventors: Xiaoning Nie, Thomas Wahl
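The read, store, and update steps in this abstract can be modeled with integer bit operations. This is a rough sketch with assumed details: memory units are modeled as non-negative Python integers, the updated start point is the source one, and the update step equals the segment length. The function name `move_bit_segment` is hypothetical.

```python
from typing import Tuple


def move_bit_segment(src: int, dst: int, src_start: int, dst_start: int,
                     length: int) -> Tuple[int, int]:
    """Copy `length` bits from src (at src_start) into dst (at dst_start).

    Returns the new destination word and the updated source start point,
    which the caller keeps for the next step.
    """
    seg_mask = (1 << length) - 1
    segment = (src >> src_start) & seg_mask            # read the bit segment
    cleared = dst & ~(seg_mask << dst_start)           # clear the target field
    new_dst = cleared | (segment << dst_start)         # store the segment
    return new_dst, src_start + length                 # update the start point


if __name__ == "__main__":
    packed = 0b110_101_011  # three 3-bit fields packed into one word
    start = 0
    fields = []
    for _ in range(3):
        word, start = move_bit_segment(packed, 0, start, 0, 3)
        fields.append(word)
    print([bin(f) for f in fields])
```

Because the start point is carried from one call to the next, a sequence of identical calls walks through the packed word without the caller recomputing offsets.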
-
Patent number: 7356710
Abstract: A method, system and computer program product for computing a message authentication code for data in storage of a computing environment. An instruction specifies a unit of storage for which an authentication code is to be computed. A computing operation computes an authentication code for the unit of storage. A register is used for providing a cryptographic key for use in computing the authentication code. Further, the register may be used in a chaining operation.
Type: Grant
Filed: May 12, 2003
Date of Patent: April 8, 2008
Assignee: International Business Machines Corporation
Inventors: Shawn D. Lundvall, Ronald M. Smith, Sr., Phil Chi-Chung Yeh
-
Patent number: 7237086
Abstract: A customization program for use in customizing a baseboard management controller used for monitoring operation of various computer system components is disclosed. A user interacts with the customization program to customize the baseboard management controller based on a configuration of components specified for the baseboard of the computer system. The customization program provides a user interface having a repository of icons and a design page. The icons represent various components that may be connected, either directly or indirectly, to the baseboard. The design page is used for constructing a model representing the specified configuration of components. As a user drags icons onto the design page, the model is updated to reflect selection of the components corresponding to these icons. Further, the customization program creates a configuration file that identifies and describes each of the selected components.
Type: Grant
Filed: November 26, 2003
Date of Patent: June 26, 2007
Assignee: American Megatrends, Inc.
Inventors: Govind A. Kothandapani, Bakka Ravinder Reddy
-
Patent number: 7231261
Abstract: In order to automatically calculate an operational sequence of processes that determine an output value from at least one input value, a multitude of processes (P1–P8), whose inputs are provided with at least one of the attributes: input value of the same calculation cycle (PRE), input value of the preceding calculation cycle (POST), input value from any calculation cycle (ANY), are arranged in such a manner that a process, which does not have any input with the attribute input value of the same calculation cycle (PRE), is determined as the first process of a calculation cycle and, in successive analogous steps, determines a quantity of possible sequences.
Type: Grant
Filed: January 16, 2003
Date of Patent: June 12, 2007
Assignee: Siemens Aktiengesellschaft
Inventors: Lutz Berentroth, Stefan Hoelzl, Helmut Wellnhofer
-
Patent number: 7100026
Abstract: A processor implements conditional vector operations in which, for example, an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed are divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data have been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication. Other examples of conditional operations include combining one or more input vectors into a single output vector based on a condition vector, conditional vector switching, conditional vector combining, and conditional vector load balancing.
Type: Grant
Filed: May 30, 2001
Date of Patent: August 29, 2006
Assignees: The Massachusetts Institute of Technology, The Board of Trustees of the Leland Stanford Junior University
Inventors: William J. Dally, Scott Rixner, John D. Owens, Ujval J. Kapasi
-
Patent number: 7000093
Abstract: A cellular automaton cache memory architecture. On a micro-processor that is also capable of executing general-purpose instructions, a cache memory is provided to store instructions and data for use by the processor. The cache memory is further capable of storing data representing a first state of a cellular automaton at a first time step, where the data is organized in cells. A cellular automaton prefetch unit prefetches data associated with a cell to be updated and a neighborhood buffer stores the prefetched data. A cellular automaton update unit provides data from the neighborhood buffer to an update engine. The update engine includes a microprocessor execution unit capable of executing at least some general purpose microprocessor instructions and updates at least some of the selected cells according to an update rule and a state of any associated neighborhood cells to provide a state of the cellular automaton at a second time step.
Type: Grant
Filed: December 19, 2001
Date of Patent: February 14, 2006
Assignee: Intel Corporation
Inventor: John W. Mates
-
Patent number: 6963341
Abstract: The present invention provides efficient ways to implement scan conversion and matrix transpose operations using vector multiplex operations in a SIMD processor. The present method provides a very fast and flexible way to implement different scan conversions, such as zigzag conversion, and matrix transpose for 2×2, 4×4, 8×8 blocks commonly used by all video compression and decompression algorithms.
Type: Grant
Filed: May 20, 2003
Date of Patent: November 8, 2005
Inventor: Tibet Mimar
-
Patent number: 6954927
Abstract: A method for optimizing a software pipelineable loop in a software code is provided. The loop comprises one or more pipelined stages and one or more loop operations. The method comprises evaluating an initiation interval time (IN) for a pipelined stage of the loop. A loop operation time latency (Tld) and a number of loop operations (Np) from the pipelined stages to peel based on IN and Tld is then determined. The loop operation is peeled Np times and copied before the loop in the software code. A vector of registers is allocated and the results of the peeled loop operations and a result of an original loop operation is assigned to the vector of registers. Memory addresses for the results of the peeled loop operations and original loop operation are also assigned.
Type: Grant
Filed: October 4, 2001
Date of Patent: October 11, 2005
Assignee: Elbrus International
Inventor: Alexander Y. Ostanevich
-
Patent number: 6934938
Abstract: A method for producing a formatted description of a computation representable by a data-flow graph and computer for performing a computation so described. A source instruction is generated for each input of the data-flow graph, a computational instruction is generated for each node of the data-flow graph, and a sink instruction is generated for each output of the data-flow graph. The computational instruction for a node includes a descriptor of an operation performed at the node and a descriptor of each instruction that produces an input to the node. The formatted description is a sequential instruction list comprising source instructions, computational instructions and sink instructions. Each instruction has an instruction identifier and the descriptor of each instruction that produces an input to the node is the instruction identifier. The computer is directed by a program of instructions to implement a computation representable by a data-flow graph.
Type: Grant
Filed: June 28, 2002
Date of Patent: August 23, 2005
Assignee: Motorola, Inc.
Inventors: Philip E. May, Kent Donald Moat, Raymond B. Essick, IV, Silviu Chiricescu, Brian Geoffrey Lucas, James M. Norris, Michael Allen Schuette, Ali Saidi
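The source, computational, and sink instructions described in this abstract can be illustrated by linearizing a small data-flow graph. The encoding below is an assumption made for the example (tuples, list position as the instruction identifier, two-operand nodes only), and `linearize` and `run` are hypothetical names rather than anything defined by the patent.

```python
import operator
from typing import Dict, List, Sequence, Tuple

# Assumed set of node operations for the example.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def linearize(inputs: Sequence[str],
              nodes: Dict[str, Tuple[str, List[str]]],
              outputs: Sequence[str]) -> List[tuple]:
    """Turn a data-flow graph into a sequential instruction list.

    An instruction's identifier is its position in the list; operands of a
    computational instruction are the identifiers of their producers.
    """
    program: List[tuple] = []
    ident: Dict[str, int] = {}
    for name in inputs:
        ident[name] = len(program)
        program.append(("source", name))
    pending = dict(nodes)
    while pending:
        ready = [n for n, (_, args) in pending.items()
                 if all(a in ident for a in args)]
        if not ready:
            raise ValueError("graph has a cycle or an undefined operand")
        for n in ready:
            op, args = pending.pop(n)
            ident[n] = len(program)
            program.append((op, tuple(ident[a] for a in args)))
    for name in outputs:
        program.append(("sink", ident[name]))
    return program


def run(program: List[tuple], input_values: Dict[str, int]) -> List[int]:
    """Execute the list in order; results[i] holds instruction i's value."""
    results: List[object] = []
    sinks: List[int] = []
    for instr in program:
        kind = instr[0]
        if kind == "source":
            results.append(input_values[instr[1]])
        elif kind == "sink":
            sinks.append(results[instr[1]])
            results.append(None)
        else:
            a, b = (results[i] for i in instr[1])
            results.append(OPS[kind](a, b))
    return sinks


if __name__ == "__main__":
    # y = (a + b) * (a - b)
    graph = {"s": ("add", ["a", "b"]), "d": ("sub", ["a", "b"]),
             "y": ("mul", ["s", "d"])}
    prog = linearize(["a", "b"], graph, ["y"])
    print(prog)
    print(run(prog, {"a": 5, "b": 3}))
```

Because every operand refers to an earlier instruction by identifier, the list can be executed in a single forward pass without consulting the original graph.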
-
Patent number: 6721813
Abstract: A computer system is presented which implements a system and method for tracking the progress of posted write transactions. In one embodiment, the computer system includes a processing subsystem and an input/output (I/O) subsystem. The processing subsystem includes multiple processing nodes interconnected via coherent communication links. Each processing node may include a processor preferably executing software instructions. The I/O subsystem includes one or more I/O nodes. Each I/O node may embody one or more I/O functions (e.g., modem, sound card, etc.). The multiple processing nodes may include a first processing node and a second processing node, wherein the first processing node includes a host bridge, and wherein a memory is coupled to the second processing node. An I/O node may generate a non-coherent write transaction to store data within the second processing node's memory, wherein the non-coherent write transaction is a posted write transaction.
Type: Grant
Filed: January 30, 2001
Date of Patent: April 13, 2004
Assignee: Advanced Micro Devices, Inc.
Inventors: Jonathan M. Owen, Mark D. Hummel, James B. Keller
-
Patent number: 6557048
Abstract: A computer system is presented which implements a system and method for ordering input/output (I/O) memory operations. In one embodiment, the computer system includes a processing subsystem and an I/O subsystem. The processing subsystem includes multiple processing nodes interconnected via coherent communication links. Each processing node may include a processor executing software instructions. The I/O subsystem includes one or more I/O nodes serially coupled via non-coherent communication links. Each I/O node may embody one or more I/O functions (e.g., modem, sound card, etc.). One of the processing nodes includes a host bridge which translates packets moving between the processing subsystem and the I/O subsystem. One of the I/O nodes is coupled to the processing node including the host bridges. The I/O node coupled to the processing node produces and/or provides transactions having destinations or targets within the processing subsystem to the processing node including the host bridge.
Type: Grant
Filed: November 1, 1999
Date of Patent: April 29, 2003
Assignee: Advanced Micro Devices, Inc.
Inventors: James B. Keller, Derrick R. Meyer, Dale E. Gulick, Larry D. Hewitt
-
Patent number: 6446193
Abstract: A method and apparatus for reducing instruction cycles in a digital signal processor wherein the processor includes a multiplier unit, an adder, a memory, and at least one pair of first and second accumulators. The accumulators include respective guard, high and low parts. The method and apparatus enable vectoring the respective first and second high parts from the accumulators to define a single vectored register responsive to a single instruction cycle and processing the data in the vectored register.
Type: Grant
Filed: September 8, 1997
Date of Patent: September 3, 2002
Assignee: Agere Systems Guardian Corp.
Inventors: Mazhar M. Alidina, Sivanand Simanapalli, Larry R. Tate
-
Publication number: 20020112142
Abstract: A technique for handling a conditional move instruction in an out-of-order data processor. The technique involves detecting a conditional move instruction within an instruction stream, and generating multiple instructions according to the detected conditional move instruction. The technique further involves replacing the conditional move instruction within the instruction stream with the generated multiple instructions. The generated multiple instructions are generated such that each of the generated multiple instructions executes using no more than two input ports of an execution unit. The generated multiple instructions include a first generated instruction that produces a condition result indicating whether a condition exists, and a second generated instruction that inputs the condition result as a portion of an operand which identifies a register of the out-of-order data processor.
Type: Application
Filed: November 18, 1998
Publication date: August 15, 2002
Inventors: JOEL SPRINGER EMER, BRUCE EDWARDS, DANIEL LAWRENCE LEIBHOLZ, EDWARD J. MCLELLAN, DERRICK R. MEYER
-
Patent number: 6401194
Abstract: A vector processor provides a data path divided into smaller slices of data, with each slice processed in parallel with the other slices. Furthermore, an execution unit provides smaller arithmetic and functional units chained together to execute more complex microprocessor instructions requiring multiple cycles by sharing single-cycle operations, thereby reducing both costs and size of the microprocessor. One embodiment handles 288-bit data widths using 36-bit data path slices. Another embodiment executes integer multiply and multiply-and-accumulate and floating point add/subtract and multiply operations using single-cycle arithmetic logic units. Other embodiments support 8-bit, 9-bit, 16-bit, and 32-bit integer data types and 32-bit floating data types.
Type: Grant
Filed: January 28, 1997
Date of Patent: June 4, 2002
Assignee: Samsung Electronics Co., Ltd.
Inventors: Le Trong Nguyen, Heonchul Park, Roney S. Wong, Ted Nguyen, Edward H. Yu
-
Patent number: 6324600Abstract: A method and an apparatus for controlling movement of data between any host and any network including a set of devices in a computing system environment having a main memory with a queuing mechanism having a plurality of queues capable of being shared between a plurality of independent processes running on at least one host and at least one I/O adapter. A finite-state machine (FSM) is provided in the main memory and the FSM is divided into two disjoint sets of states, one of which represents state-values processed by the host and set by the adapter, and the other of which represents state-values processed by the adapter and set by the host. Using each of these sets of states, free-running, non-deadlocking processes are provided within the host and the adapter so that the processes sequence circularly and continuously through a vector related to the FSMs.Type: GrantFiled: February 19, 1999Date of Patent: November 27, 2001Assignee: International Business Machines CorporationInventors: Frank W. Brice, Richard P. Tarcza, Leslie W. Wyman
-
Patent number: 6295597Abstract: An apparatus and a method for extended-precision vector arithmetic capable of extremely long precision (i.e., precision to as many bits as a user desires or is limited to due to memory, disk-storage, or other resource constraints). Vector carry-out bits can be used as vector carry-in bits for successive operations. In performing add or subtract operations on integers that are longer than the word size of the computer, the operands are broken into word-sized parts which are used as operands. A vector of long-integer numbers is thus broken into a series of sub-vectors, each having word-sized elements. Vector add or subtract operations are performed successively on the sub-vectors, starting with the lowest-order sub-vectors. Carry-out (or borrow-out) bits from a first vector operation are used as carry-in (or borrow-in) bits for a successive vector operation.Type: GrantFiled: August 11, 1998Date of Patent: September 25, 2001Assignee: Cray, Inc.Inventors: David Resnick, William T. Moore
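The carry-chaining idea in the abstract above can be illustrated with a minimal Python sketch. It is not the patented hardware: the 64-bit word size and the names `split_words`, `vector_add_with_carry`, and `long_vector_add` are assumptions made for this example only. Each long integer is split into word-sized parts, and the sub-vectors are added lowest-order first, with the carry-out vector of one step used as the carry-in vector of the next.

```python
WORD_BITS = 64                    # assumed machine word size (illustrative)
MASK = (1 << WORD_BITS) - 1


def split_words(value, n_words):
    """Break a non-negative integer into n_words word-sized parts, lowest order first."""
    return [(value >> (WORD_BITS * i)) & MASK for i in range(n_words)]


def join_words(words):
    """Reassemble word-sized parts (lowest order first) into one integer."""
    return sum(w << (WORD_BITS * i) for i, w in enumerate(words))


def vector_add_with_carry(a, b, carry_in):
    """Element-wise add of two sub-vectors plus a carry-in vector.

    Returns (sums, carry_out), where each sum is truncated to one word.
    """
    sums, carry_out = [], []
    for x, y, c in zip(a, b, carry_in):
        total = x + y + c
        sums.append(total & MASK)
        carry_out.append(total >> WORD_BITS)
    return sums, carry_out


def long_vector_add(xs, ys, n_words):
    """Add two vectors of long integers, modulo 2**(WORD_BITS * n_words)."""
    xw = [split_words(x, n_words) for x in xs]
    yw = [split_words(y, n_words) for y in ys]
    carry = [0] * len(xs)                 # no carry into the lowest-order words
    result_words = [[] for _ in xs]
    for k in range(n_words):              # lowest-order sub-vector first
        sub_a = [w[k] for w in xw]
        sub_b = [w[k] for w in yw]
        sums, carry = vector_add_with_carry(sub_a, sub_b, carry)
        for r, s in zip(result_words, sums):
            r.append(s)
    return [join_words(r) for r in result_words]
```

Subtraction with borrow-in and borrow-out bits would follow the same pattern, with the per-element add replaced by a subtract.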
-
Patent number: 6269435Abstract: A processor implements conditional vector operations in which an input vector containing multiple operands to be used in conditional operations is divided into two or more output vectors based on a condition vector. Each output vector can then be processed at full processor efficiency without cycles wasted due to branch latency. Data to be processed is divided into two groups based on whether or not they satisfy a given condition by, e.g., steering each to one of two index vectors. Once the data has been segregated in this way, subsequent processing can be performed without conditional operations, processor cycles wasted due to branch latency, incorrect speculation or execution of unnecessary instructions due to predication.Type: GrantFiled: September 14, 1998Date of Patent: July 31, 2001Assignees: The Board of Trustees of the Leland Stanford Junior University, The Massachusetts Institute of TechnologyInventors: William J. Dally, Scott Whitney Rixner, John Owens, Ujval J. Kapasi
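As a rough software analogy for the abstract above, the sketch below steers element indices into two index vectors according to a condition vector and then processes each group in its own loop with no per-element test. The function names `split_by_condition` and `process_split`, and the use of Python lists in place of hardware vectors, are assumptions for illustration and are not taken from the patent.

```python
def split_by_condition(condition):
    """Steer each element index into one of two index vectors."""
    true_idx, false_idx = [], []
    for i, flag in enumerate(condition):
        (true_idx if flag else false_idx).append(i)
    return true_idx, false_idx


def process_split(data, condition, on_true, on_false):
    """Apply on_true / on_false to the two groups; neither loop tests the condition."""
    true_idx, false_idx = split_by_condition(condition)
    out = [None] * len(data)
    for i in true_idx:        # every element here satisfied the condition
        out[i] = on_true(data[i])
    for i in false_idx:       # every element here did not
        out[i] = on_false(data[i])
    return out
```

For example, with `data = [3, -1, 4, -5]` and a condition vector marking the non-negative elements, the two index vectors are `[0, 2]` and `[1, 3]`, and each can be handed to a separate straight-line kernel.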
-
Patent number: 6202141Abstract: A vector multiplication mechanism is provided that partitions vector multiplication operation into even and odd paths. In an odd path, odd data elements of first and second source vectors are selected, and multiplication operation is performed between each of the selected odd data elements of the first source vector and corresponding one of the selected odd data elements of the second source vector. In an even path, even data elements of the source vectors are selected, and multiplication operation is performed between each of the selected even data elements of the first source vector and corresponding one of the selected even data elements of the second source vector. Elements of resultant data of the two paths are merged together in a merge operation. The vector multiplication mechanism of the present invention preferably uses a single general-purpose register to store the resultant data of the odd path and the even path.Type: GrantFiled: June 16, 1998Date of Patent: March 13, 2001Assignee: International Business Machines CorporationInventors: Keith Everett Diefendorff, Pradeep Kumar Dubey, Ronald Ray Hochsprung, Brett Olsson, Hunter Ledbetter Scales, III
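The even/odd partitioning described in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: elements are numbered from 0, the source vectors have equal even length, and the merge is modeled as a simple interleave, which the abstract does not specify. The names `multiply_even`, `multiply_odd`, and `merge_paths` are hypothetical.

```python
def multiply_even(a, b):
    """Even path: multiply corresponding even-indexed elements."""
    return [a[i] * b[i] for i in range(0, len(a), 2)]


def multiply_odd(a, b):
    """Odd path: multiply corresponding odd-indexed elements."""
    return [a[i] * b[i] for i in range(1, len(a), 2)]


def merge_paths(even, odd):
    """Interleave the two partial results back into element order (assumed merge)."""
    merged = []
    for e, o in zip(even, odd):
        merged.extend([e, o])
    return merged


def vector_multiply(a, b):
    """Full element-wise product computed via the even and odd paths."""
    if len(a) != len(b) or len(a) % 2:
        raise ValueError("source vectors must have the same even length")
    return merge_paths(multiply_even(a, b), multiply_odd(a, b))
```

Splitting the work this way means each path produces only half as many products, which is one plausible reason a design would leave room for results wider than the source elements.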
-
Patent number: 6073158Abstract: A system and method for time slicing multiple received data streams utilizing multiple processors in such a manner as to ensure that all processors are running at full capability and are efficiently timesharing a global memory storage area. The received data streams are each divided into fixed portions called spans. The invention is operable for sequencing the movement of the time-sliced spans between the processors, adjusting the scheduling of particular ones of the time-sliced spans as a function of either processor availability or maintenance of real-time transmission of the received real-time time-sliced data streams.Type: GrantFiled: July 29, 1993Date of Patent: June 6, 2000Assignee: Cirrus Logic, Inc.Inventors: Robert Marshall Nally, John Charles Schafer
-
Patent number: 6061777Abstract: One aspect of the invention relates to a method for operating a processor. In one version of the invention, the method includes the steps of dispatching an instruction; determining a presently architected RMAP entry for the architectural register targeted by the dispatched instruction; selecting the RMAP entries which are associated with physical registers that contain operands for the dispatched instruction; updating a use indicator in the selected RMAP entries; determining whether the dispatched instruction is interruptible; and updating an architectural indicator and a historical indicator in the presently architected RMAP entry if the dispatched instruction is uninterruptible.Type: GrantFiled: October 28, 1997Date of Patent: May 9, 2000Assignee: International Business Machines CorporationInventors: Hoichi Cheong, Paul Joseph Jordan, Hung Qui Le, Soummya Mallick
-
Patent number: 6023752Abstract: A program driver means is disclosed that allows for the exchange of information between a NTDS device and a device having a bus topology, especially a VMEbus. The program driver utilizes chain commands which are fully programmable at the user level. The processor itself is programmed at the register level to assure the fastest data rate possible (32 bit access) across the VMEbus. The processor driver is invisible to the user.Type: GrantFiled: November 25, 1997Date of Patent: February 8, 2000Assignee: The United States of America as represented by the Secretary of the NavyInventor: William M. Huttle