Parallel Patents (Class 708/507)

Floating-point decomposition circuitry with dynamic precision

Patent number: 12197887

Abstract: Circuitry for decomposing block floating-point numbers into lower precision floating-point numbers is provided. The circuitry may include a high precision storage circuit configured to provide high precision floating-point numbers, input selectors configured to receive the high precision floating-point numbers from the high precision storage circuit and to generate corresponding lower precision floating-point components with adjusted exponents, and a low precision block floating-point vector circuit configured to combine the various lower precision floating-point components generated by the input selectors. The lower precision floating-point components may be processed spatially or over multiple iterations over time.

Type: Grant

Filed: March 13, 2020

Date of Patent: January 14, 2025

Assignee: Altera Corporation

Inventors: Roberto DiCecco, Joshua Fender, Shane O'Connell
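
As a rough illustration of the decomposition described in the abstract of patent 12197887, the sketch below splits one high precision mantissa into several lower precision components, each with an adjusted exponent, and recombines them. The function names, the unsigned 16-bit mantissa, and the 4-bit component width are assumptions made for illustration; the abstract does not specify any of them.

```python
import math

def decompose(mantissa, exponent, high_bits=16, low_bits=4):
    """Split value = mantissa * 2**exponent into low-precision components.

    Assumes an unsigned mantissa of high_bits bits (a hypothetical format).
    Returns a list of (component_mantissa, component_exponent) pairs.
    """
    mask = (1 << low_bits) - 1
    components = []
    for shift in range(0, high_bits, low_bits):
        chunk = (mantissa >> shift) & mask
        # Each chunk keeps its weight by adding its bit offset to the exponent.
        components.append((chunk, exponent + shift))
    return components

def recombine(components):
    """Sum the components, spatially or over several passes, into one value."""
    return sum(math.ldexp(m, e) for m, e in components)
```

With these assumed widths, `decompose(0xBEEF, -8)` yields four 4-bit components whose sum equals the original value.
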
Reusing adjacent SIMD unit for fast wide result generation

Patent number: 11269651

Abstract: A system for processing instructions with extended results includes a first instruction execution unit having a first result bus for execution of processor instructions. The system further includes a second instruction execution unit having a second result bus for execution of processor instructions. The first instruction execution unit is configured to selectively send a portion of results calculated by the first instruction execution unit to the second instruction execution unit during execution of a processor instruction if the second instruction execution unit is not used for executing the processor instruction and if the received processor instruction produces a result having a data width greater than the width of the first result bus. The second instruction execution unit is configured to receive the portion of results calculated by the first instruction execution unit and put the received results on the second result bus.

Type: Grant

Filed: September 10, 2019

Date of Patent: March 8, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Michael Klein, Nicol Hofmann, Cedric Lichtenau, Osher Yifrach
Combining of several execution units to compute a single wide scalar result

Patent number: 10275391

Abstract: A circuit includes reconfigurable units that are reconfigurable to compute a combined result. A first intermediate result of a first reconfigurable unit of the reconfigurable units is exchanged with a second intermediate result of the second reconfigurable unit of the reconfigurable units. The first reconfigurable unit computes a first portion of the combined result utilizing the second intermediate result. The second reconfigurable unit of the reconfigurable units computes a second portion of the combined result utilizing the first intermediate result.

Type: Grant

Filed: January 23, 2017

Date of Patent: April 30, 2019

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Nicol Hofmann, Michael Klein, Cédric Lichtenau
Integrated circuits with specialized processing blocks for performing floating-point fast fourier transforms and complex multiplication

Patent number: 10140091

Abstract: Integrated circuits with specialized processing blocks are provided. A specialized processing block may include one real addition stage and one real multiplier stage. The multiplier stage may simultaneously feed its output to the addition stage and directly to an adjacent specialized processing block. The addition stage may also produce sum and difference outputs in parallel. A group of four such specialized processing blocks may be connected in a chain to implement a radix-2 fast Fourier transform (FFT) butterfly. Multiple radix-2 butterflies may be stacked to form yet higher order radix butterflies. If desired, the specialized processing block may also be used to implement a complex multiply operation. Three or four specialized processing blocks may be chained together and along with one or more adders outside the specialized processing blocks, real and imaginary portions of a complex product can be generated.

Type: Grant

Filed: September 27, 2016

Date of Patent: November 27, 2018

Assignee: Altera Corporation

Inventor: Martin Langhammer
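
The arithmetic behind the abstract of patent 10140091 can be sketched in a few lines: a radix-2 butterfly produces a sum and a difference, and a complex product can be formed with either four or three real multiplications. The three-multiplier form below is a standard identity and is an assumption here; the abstract only says that three or four blocks plus external adders may be used, not which formulation applies. All function names are hypothetical.

```python
def butterfly(x, y, w):
    """Radix-2 butterfly: return (x + w*y, x - w*y) for complex x, y, twiddle w."""
    t = w * y
    return x + t, x - t

def complex_mul_4(a, b, c, d):
    """(a + bi)(c + di) using four real multiplications."""
    return a * c - b * d, a * d + b * c

def complex_mul_3(a, b, c, d):
    """Same product using three real multiplications and extra additions."""
    k1 = c * (a + b)
    k2 = a * (d - c)
    k3 = b * (c + d)
    return k1 - k3, k1 + k2
```

Both multiply routines return the pair (real part, imaginary part); the three-multiplier version trades one multiplier for additional adders.
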
Methods and apparatus for performing product series operations in multiplier accumulator blocks

Patent number: 10037192

Abstract: A specialized processing block on an integrated circuit includes a first and second arithmetic operator stage, an output coupled to another specialized processing block, and configurable interconnect circuitry which may be configured to route signals throughout the specialized processing block, including in and out of the first and second arithmetic operator stages. The configurable interconnect circuitry may further include multiplexer circuitry to route selected signals. The output of the specialized processing block that is coupled to another specialized processing block, together with the configurable interconnect circuitry, reduces the need to use resources outside the specialized processing block when implementing mathematical functions that require the use of more than one specialized processing block. Examples of such mathematical functions include scaled product sum operations and Horner's rule.

Type: Grant

Filed: October 21, 2015

Date of Patent: July 31, 2018

Assignee: Altera Corporation

Inventor: Martin Langhammer
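
Horner's rule, named in the abstract of patent 10037192, evaluates a polynomial as a chain of multiply-add steps, which is why it maps naturally onto cascaded multiplier-accumulator blocks. The following is a minimal software sketch; the function name and the highest-degree-first coefficient order are assumptions made for illustration.

```python
def horner(coefficients, x):
    """Evaluate a polynomial whose coefficients run from highest degree to lowest."""
    acc = 0
    for c in coefficients:
        # One multiply-add per coefficient; each step could map to one block.
        acc = acc * x + c
    return acc
```

For example, `horner([1, 2, 3], 2)` evaluates x^2 + 2x + 3 at x = 2.
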
Double-precision floating-point operation

Patent number: 10007487

Abstract: Systems and methods for using single-precision floating-point operation digital signal processing (DSP) blocks in conjunction to perform double-precision floating-point operations.

Type: Grant

Filed: June 30, 2016

Date of Patent: June 26, 2018

Assignee: Altera Corporation

Inventor: Tomasz Sebastian Czajkowski
Device and method for finding error location

Patent number: 10003360

Abstract: An electronic device for finding error locations in a codeword includes a plurality of power control units configured to find error locations in the codeword. The plurality of power control units are coupled in parallel. Each of the plurality of power control units includes a plurality of corresponding input control circuits to individually turn on or off the corresponding power control unit.

Type: Grant

Filed: October 3, 2016

Date of Patent: June 19, 2018

Assignee: Macronix International Co., Ltd.

Inventor: Kuan Chieh Wang
Programmable device using fixed and configurable logic to implement recursive trees

Patent number: 9600278

Abstract: A specialized processing block on a programmable integrated circuit device includes a first floating-point arithmetic operator stage, and a floating-point adder stage having at least one floating-point binary adder. Configurable interconnect within the specialized processing block routes signals into and out of each of the first floating-point arithmetic operator stage and the floating-point adder stage. The block has a plurality of block inputs, at least one block output, a direct-connect input for connection to a first other instance of the specialized processing block, and a direct-connect output for connection to a second other instance of the specialized processing block. A plurality of instances of the specialized processing block are together configurable as a binary or ternary recursive adder tree.

Type: Grant

Filed: July 15, 2013

Date of Patent: March 21, 2017

Assignee: Altera Corporation

Inventor: Martin Langhammer
Floating-point error propagation in dataflow

Publication number: 20130173682

Abstract: A process for propagating an error in a floating-point calculation is disclosed. A floating-point error occurring from the floating-point arithmetic calculation is trapped, and a special value is generated. Information regarding the error is stored as a payload of the special value. Program operations are resumed with the special value applied to further calculations dependent on the floating-point arithmetic calculation.

Type: Application

Filed: December 28, 2011

Publication date: July 4, 2013

Applicant: MICROSOFT CORPORATION

Inventor: Marko Radmilac
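
The idea in the abstract of publication 20130173682, storing error information as the payload of a special value, can be illustrated with an IEEE 754 double-precision quiet NaN, whose low 51 fraction bits are free to carry data. This sketch only shows how to encode and decode such a payload; the helper names are hypothetical, and whether a payload survives later arithmetic is not guaranteed on every platform.

```python
import math
import struct

QUIET_NAN_BITS = 0x7FF8000000000000  # exponent all ones, quiet bit set
PAYLOAD_MASK = (1 << 51) - 1         # fraction bits below the quiet bit

def make_nan(payload):
    """Build a quiet NaN carrying an integer payload (assumed error code)."""
    bits = QUIET_NAN_BITS | (payload & PAYLOAD_MASK)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def payload_of(value):
    """Read the payload back out of a NaN's bit pattern."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return bits & PAYLOAD_MASK
```
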
Compact filter design

Patent number: 8295412

Abstract: An apparatus and method for signal detection in which a digital sample stream is fed round robin into a plurality of buffers, which are sequentially compared with a reference signal to determine a match. A processor determines the chronological order of the samples in each bit of each buffer, and directs a bitwise comparison of the signal in each buffer with the reference to determine a match, e.g., by correlation. The apparatus and method are preferably implemented with a Field-Programmable Gate Array (FPGA). This scheme permits real-time correlation of a data stream with a reference without use of shift registers or a significant number of dedicated logic blocks.

Type: Grant

Filed: September 30, 2010

Date of Patent: October 23, 2012

Assignee: The United States of America as represented by the Secretary of the Navy

Inventor: Jeremy R. O'Neal
Systems, methods, and apparatus for recursive quantum computing algorithms

Patent number: 8244650

Abstract: A recursive approach to quantum computing employs an initial solution, determines intermediate solutions, evaluates the intermediate solutions and repeats using the intermediate solution, if the intermediate solution does not satisfy solution criteria. A best one of the intermediate solutions may be employed in the recursion.

Type: Grant

Filed: June 9, 2008

Date of Patent: August 14, 2012

Assignee: D-Wave Systems Inc.

Inventor: Geordie Rose
Multipurpose functional unit with double-precision and filtering operations

Patent number: 8051123

Abstract: A multipurpose arithmetic functional unit selectively performs planar attribute interpolation, unary function approximation, double-precision arithmetic, and/or arbitrary filtering functions such as texture filtering, bilinear filtering, or anisotropic filtering by iterating through a multi-step multiplication operation with partial products (partial results) accumulated in an accumulation register. Shared multiplier and adder circuits are advantageously used to implement the product and sum operations for unary function approximation and planar interpolation; the same multipliers and adders are also leveraged to implement double-precision multiplication and addition.

Type: Grant

Filed: December 15, 2006

Date of Patent: November 1, 2011

Assignee: NVIDIA Corporation

Inventors: Stuart Oberman, Ming Y. Siu
Multi-gigabit per second concurrent encryption in block cipher modes

Patent number: 7885405

Abstract: One embodiment is a system adapted to encrypt one or more packets of plaintext data in cipher-block chaining (CBC) mode. The system includes a plurality of digital logic components connected in series, where respective components are operative to process one or more rounds of a block cipher algorithm. A plurality of N bit registers are respectively coupled to the plurality of digital logic components. An XOR component receives blocks of plaintext data and blocks of ciphertext data, and XORs blocks of plaintext data for respective plaintext packets with previously encrypted blocks of ciphertext data for those plaintext packets. The XOR component iteratively feeds the XOR'd blocks of data into a first of the plurality of the digital logic components. In addition, a circuit component is operative to selectively pass blocks of ciphertext data fed back from an output of a final logic component to the XOR component.

Type: Grant

Filed: June 4, 2004

Date of Patent: February 8, 2011

Assignee: GlobalFoundries, Inc.

Inventor: William Hock Soon Bong
Methods and apparatus for parallel execution of a process

Patent number: 7814462

Abstract: In one embodiment, a process may be performed in parallel on a parallel server by defining a data type that may be used to reference data stored on the parallel server and overloading a previously-defined operation, such that when the overloaded operation is called, a command is sent to the parallel server to manipulate the data stored on the parallel server. In some embodiments, the previously-defined operation that is overloaded may be an operation of an operating system. Further, in some embodiments, when the data stored on the parallel server is no longer needed, a command may be sent to the parallel server to reallocate the memory used to store the data.

Type: Grant

Filed: August 31, 2005

Date of Patent: October 12, 2010

Assignees: Massachusetts Institute of Technology, The Regents of the University of California

Inventors: Parry Jones Reginald Husbands, Long Yin Choy, Alan Edelman, Eckart Jansen, Viral B. Shah
Apparatus and method for integer to floating-point format conversion

Patent number: 7774393

Abstract: An apparatus and method for integer to floating-point format conversion. A processor may include an adder configured to perform addition of respective mantissas of two floating-point operands to produce a sum, where a smaller-exponent one of the floating-point operands has a respective exponent less than or equal to a respective exponent of a larger-exponent one of the floating-point operands. The processor may further include an alignment shifter coupled to the adder and configured, in a first mode of operation, to align the floating-point operands prior to the addition by shifting the respective mantissa of the smaller-exponent operand towards a least-significant bit position. The alignment shifter may be further configured, in a second mode of operation, to normalize an integer operand by shifting the integer operand towards a most-significant bit position. The second mode of operation may be active during execution of an instruction to convert the integer operand to floating-point format.

Type: Grant

Filed: June 30, 2004

Date of Patent: August 10, 2010

Assignee: Oracle America, Inc.

Inventors: Jeffrey S. Brooks, Sadar U. Ahmed
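
The second mode of operation in the abstract of patent 7774393 normalizes an integer by shifting it toward the most-significant bit. A software sketch of that step is shown below; the 32-bit width, the unsigned input, and the function name are assumptions, and rounding into a narrower significand is left out.

```python
import math

def normalize_integer(n, width=32):
    """Shift an unsigned integer left until its leading one reaches the top bit.

    Returns (significand, exponent), where exponent is the unbiased power of
    two of the leading one in the original integer.
    """
    if not 0 < n < (1 << width):
        raise ValueError("n must be a nonzero unsigned integer of the given width")
    shift = width - n.bit_length()   # number of leading zeros
    significand = n << shift         # leading one now at bit width-1
    exponent = (width - 1) - shift
    return significand, exponent
```

The original integer is recovered as `significand * 2**(exponent - (width - 1))`.
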
Efficient parallel cyclic redundancy check calculation using modulo-2 multiplications

Patent number: 7627802

Abstract: A system and method for cyclic redundancy checks (CRC) having a CRC polynomial of width (W) for use in a digital signal processing system is disclosed. The system includes receiving a message ($\vec{m}$) and decomposing that message ($\vec{m}$) into a series of smaller blocks ($\vec{b}_i$). Each block ($\vec{b}_i$) is of size (M) and is related to a unit vector ($\vec{e}_i$). A summation operation on the blocks ($\vec{b}_i$) given by $\mathrm{CRC}(\vec{b}) = \sum_i b_i \cdot \mathrm{CRC}(\vec{e}_i)$ is performed. Each CRC of the unit vectors ($\mathrm{CRC}(\vec{e}_i)$) is stored in a lookup table. The lookup table is tagged by the “one” bits of the message block. An exclusive OR (XOR) operation is performed on each tagged row of the lookup table to calculate the CRC of the message.

Type: Grant

Filed: August 15, 2006

Date of Patent: December 1, 2009

Assignee: Samsung Electronics Co., Ltd.

Inventors: Eran Pisek, Jasmin Oz
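
The lookup-table approach in the abstract of patent 7627802 relies on the CRC being linear over modulo-2 arithmetic: the CRC of a block is the XOR of the CRCs of the unit vectors selected by its one bits. The sketch below demonstrates this for a single block, assuming a plain CRC (zero initial value, no final XOR), an 8-bit polynomial 0x07, and a 16-bit block; these parameters and the function names are illustrative and not taken from the patent.

```python
def crc_plain(block, nbits, poly=0x07, width=8):
    """Bit-serial CRC of an nbits-wide block, MSB first, zero init, no final XOR."""
    reg = 0
    mask = (1 << width) - 1
    for i in reversed(range(nbits)):
        feedback = ((reg >> (width - 1)) & 1) ^ ((block >> i) & 1)
        reg = (reg << 1) & mask
        if feedback:
            reg ^= poly
    return reg

def build_table(nbits, poly=0x07, width=8):
    """Precompute the CRC of each unit vector (a block with a single one bit)."""
    return [crc_plain(1 << i, nbits, poly, width) for i in range(nbits)]

def crc_from_table(block, table):
    """XOR together the table rows tagged by the one bits of the block."""
    crc = 0
    for i, entry in enumerate(table):
        if (block >> i) & 1:
            crc ^= entry
    return crc
```

The table has one row per bit position of the block, so its size grows with the block size M rather than with the message length.
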
Multiple prime number generation using a parallel prime number search algorithm

Patent number: 7120248

Abstract: A process for searching in parallel for a plurality of prime number values simultaneously includes the steps of: randomly generating a plurality of k random odd numbers (wherein k is preferably more than 2, but could also be one or more) expressed as $n_{0,0}, n_{1,0}, \ldots, n_{(k-1),0}$, each number providing a prime number candidate; determining a plurality of y additional odd numbers based on each one of the randomly generated odd numbers $n_{0,0}, n_{1,0}$, . . .

Type: Grant

Filed: March 26, 2001

Date of Patent: October 10, 2006

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: W. Dale Hopkins, Thomas W. Collins, Steven W. Wierenga, Ruth A. Wang
Decoder for error correcting block codes

Patent number: 6637002

Abstract: A decoder for decoding block error correction codes is described. The decoder includes a first search circuit to find roots of an error location polynomial corresponding to an error location and a second search circuit to find roots of an error location polynomial corresponding to an error location. A multiplexer is fed by the first search circuit and the second search circuit to produce an error location from the error location polynomial.

Type: Grant

Filed: October 21, 1998

Date of Patent: October 21, 2003

Assignee: Maxtor Corporation

Inventors: Lih-Jyh Weng, Ba-Zhong Shen, Shih Mo, Chung Chang
Method and apparatus to enable job streaming for a set of commonly shared resources

Patent number: 6570670

Abstract: A method and apparatus for prioritizing the use of a multifunctional printing system's basic processing resources to permit job streaming. The printing system employs a controller with an improved job contention manager (JCM). A plurality of basic resources of the printing system are each provided with a queue. One or more job services, at desired times, signal the JCM to carry out a sub-job of a given job. The signal for each of the sub-jobs includes information about the respective sub-job's job service and its priority. Responsive to the signal from the job service, the JCM adds a corresponding basic resource sub-job to the queue of each basic resource that the sub-job will require. A first of the sub-jobs is placed in an “Active” state, ready for processing, if the first sub-job is at the top of the queues of all the basic resources required to perform the first sub-job.

Type: Grant

Filed: November 29, 1999

Date of Patent: May 27, 2003

Assignee: Xerox Corporation

Inventors: David L. Salgado, Rodney L Turmon, Nicholas M. Lamendola
Execution unit for processing a data stream independently and in parallel

Patent number: 6401194

Abstract: A vector processor provides a data path divided into smaller slices of data, with each slice processed in parallel with the other slices. Furthermore, an execution unit provides smaller arithmetic and functional units chained together to execute more complex microprocessor instructions requiring multiple cycles by sharing single-cycle operations, thereby reducing both costs and size of the microprocessor. One embodiment handles 288-bit data widths using 36-bit data path slices. Another embodiment executes integer multiply and multiply-and-accumulate and floating point add/subtract and multiply operations using single-cycle arithmetic logic units. Other embodiments support 8-bit, 9-bit, 16-bit, and 32-bit integer data types and 32-bit floating data types.

Type: Grant

Filed: January 28, 1997

Date of Patent: June 4, 2002

Assignee: Samsung Electronics Co., Ltd.

Inventors: Le Trong Nguyen, Heonchul Park, Roney S. Wong, Ted Nguyen, Edward H. Yu
Data processor and data processing system

Patent number: 6327605

Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.

Type: Grant

Filed: March 19, 2001

Date of Patent: December 4, 2001

Assignee: Hitachi, Ltd.

Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
Data processor and data processing system

Patent number: 6243732

Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.

Type: Grant

Filed: January 7, 2000

Date of Patent: June 5, 2001

Assignee: Hitachi, Ltd.

Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
Data processor and data processing system

Patent number: 6038582

Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.

Type: Grant

Filed: October 15, 1997

Date of Patent: March 14, 2000

Assignee: Hitachi, Ltd.

Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka