Matrix Array Patents (Class 708/514)

Apparatus and method for remote atomic floating point operations

Patent number: 12663990

Abstract: Embodiments relate to atomic memory operations with floating point values. An example processor comprises: a control register to store rounding mode and denormal mode control bits to indicate a rounding mode and denormal value processing mode; fetch circuitry to fetch a remote atomic operation (RAO) floating point (FP) instruction comprising a memory location to store at least one FP result value; decode circuitry to decode the instruction and scheduling circuitry to offload the instruction to an execution engine external to the logical processor or to schedule the decoded instruction for local execution, wherein an indication of the rounding mode and denormal value processing mode is communicated to the execution engine, the indication to be communicated by a transfer of one or more of the rounding mode control bits and denormal mode control bits to a storage coupled to the execution engine, or to be communicated in the message.

Type: Grant

Filed: March 28, 2024

Date of Patent: June 23, 2026

Assignee: Intel Corporation

Inventors: Christopher J. Hughes, Jonas Svennebring, Doddaballapur Jayasimha, Lingxiang Xiang, David Koufaty, Samantika Sury
Calculating device

Patent number: 12175253

Abstract: According to one embodiment, a calculating device includes a first memory, a second memory, a third memory, a first arithmetic module, a second arithmetic module, a first conductive line electrically connecting a first output terminal of the first memory and a first input terminal of the first arithmetic module, a second conductive line electrically connecting a second output terminal of the first memory and a first input terminal of the second arithmetic module, a third conductive line electrically connecting a first output terminal of the second memory and a second input terminal of the second arithmetic module, a fourth conductive line electrically connecting a first output terminal of the third memory and a third input terminal of the second arithmetic module, and a fifth conductive line electrically connecting a first output terminal of the second arithmetic module and a second input terminal of the first arithmetic module.

Type: Grant

Filed: March 21, 2023

Date of Patent: December 24, 2024

Assignee: Kabushiki Kaisha Toshiba

Inventors: Kosuke Tatsumura, Hayato Goto
Multi-bit scan chain with error-bit generator

Patent number: 12068745

Abstract: Various implementations described herein are directed to a device having a scan chain that receives a multi-bit input, provides a multi-bit output, and provides a multi-bit multiplexer output based on the multi-bit input and the multi-bit output. The device may have an error-bit generator that receives the multi-bit multiplexer output, receives a portion of the multi-bit input, receives a portion of the multi-bit output, and provides an error-bit output based on the multi-bit multiplexer output, the portion of the multi-bit input, and the portion of the multi-bit output.

Type: Grant

Filed: August 31, 2021

Date of Patent: August 20, 2024

Assignee: Arm Limited

Inventors: Anil Kumar Baratam, Yves Thomas Laplanche
Performing matrix multiplication in hardware

Patent number: 11989258

Abstract: Methods, systems, and apparatus for performing a matrix multiplication using a hardware circuit are described. An example method begins by obtaining an input activation value and a weight input value in a first floating point format. The input activation value and the weight input value are multiplied to generate a product value in a second floating point format that has higher precision than the first floating point format. A partial sum value is obtained in a third floating point format that has a higher precision than the first floating point format. The partial sum value and the product value are combined to generate an updated partial sum value that has the third floating point format.

Type: Grant

Filed: November 9, 2020

Date of Patent: May 21, 2024

Assignee: Google LLC

Inventors: Andrew Everett Phelps, Norman Paul Jouppi
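
Note: the multiply-accumulate flow in the abstract above can be sketched in software as follows. This is only an illustration, not the patented circuit: it assumes the first format is IEEE 754 half precision and that the second and third formats are both single precision (the abstract names no concrete formats), and the helper names `to_fp16`, `to_fp32`, `dot_mixed` and `dot_fp16_only` are hypothetical.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dot_mixed(activations, weights):
    """Inputs in fp16 (assumed), products and partial sums kept in fp32 (assumed)."""
    partial_sum = to_fp32(0.0)
    for a, w in zip(activations, weights):
        product = to_fp32(to_fp16(a) * to_fp16(w))    # higher-precision product
        partial_sum = to_fp32(partial_sum + product)  # higher-precision partial sum
    return partial_sum


def dot_fp16_only(activations, weights):
    """Baseline for comparison: everything, including the accumulator, in fp16."""
    partial_sum = to_fp16(0.0)
    for a, w in zip(activations, weights):
        product = to_fp16(to_fp16(a) * to_fp16(w))
        partial_sum = to_fp16(partial_sum + product)
    return partial_sum


acts = [4096.0, 1.0, 1.0, 1.0, 1.0]
wts = [1.0] * 5
print(dot_mixed(acts, wts), dot_fp16_only(acts, wts))
```

With these inputs the half-precision accumulator cannot absorb the small addends once it reaches 4096, because neighbouring half-precision values there are 4 apart, while the wider accumulator keeps them.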
Reconfigurable digital signal processing (DSP) vector engine

Patent number: 11983530

Abstract: Systems and methods described herein may relate to providing a dynamically configurable circuitry able to process data associated with a variety of matrix dimensions using one or more complex number operations, one or more real number operations, or both. Configurations may be applied to the configurable circuitry to program the configurable circuitry for a next operation. The configurable circuitry may process data according to a variety of operations based at least in part on operation of a repeated processing element coupled in a compute network of processing elements.

Type: Grant

Filed: March 27, 2020

Date of Patent: May 14, 2024

Assignee: Intel Corporation

Inventors: Sumeet Singh Nagi, Farhana Sheikh, Scott Jeremy Weber, Uneeb Yaqub Rathore
Multi-memory on-chip computational network

Patent number: 11741345

Abstract: Provided are systems, methods, and integrated circuits for a neural network processing system. In various implementations, the system can include a first array of processing engines coupled to a first set of memory banks and a second array of processing engines coupled to a second set of memory banks. The first and second sets of memory banks can store all the weight values for a neural network, where the weight values are stored before any input data is received. Upon receiving input data, the system performs a task defined for the neural network. Performing the task can include computing an intermediate result using the first array of processing engines, copying the intermediate result to the second set of memory banks, and computing a final result using the second array of processing engines, where the final result corresponds to an outcome of performing the task.

Type: Grant

Filed: September 25, 2020

Date of Patent: August 29, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Randy Huang, Ron Diamant
Instructions for vector multiplication of unsigned words with rounding

Patent number: 11704124

Abstract: Disclosed embodiments relate to executing a vector multiplication instruction. In one example, a processor includes fetch circuitry to fetch the vector multiplication instruction having fields for an opcode, first and second source identifiers, and a destination identifier, decode circuitry to decode the fetched instruction, execution circuitry to, on each of a plurality of corresponding pairs of fixed-sized elements of the identified first and second sources, execute the decoded instruction to generate a double-sized product of each pair of fixed-sized elements, the double-sized product being represented by at least twice a number of bits of the fixed size, and generate an unsigned fixed-sized result by rounding the most significant fixed-sized portion of the double-sized product to fit into the identified destination.

Type: Grant

Filed: January 11, 2022

Date of Patent: July 18, 2023

Assignee: Intel Corporation

Inventors: Venkateswara R. Madduri, Carl Murray, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Robert Valentine, Jesus Corbal
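
Note: the element-wise behaviour described in the abstract above can be sketched as follows. This is a rough model only: it assumes 16-bit elements and round-to-nearest by adding half a unit before truncation, neither of which is stated in the abstract, and the function name `mul_unsigned_round_high` is hypothetical.

```python
def mul_unsigned_round_high(src1, src2, bits=16):
    """Pairwise unsigned multiply, keeping the rounded most significant half.

    Assumptions (not from the abstract): `bits`-wide elements and
    round-half-up on the discarded low half.
    """
    mask = (1 << bits) - 1
    half = 1 << (bits - 1)  # half of one unit in the last place of the high part
    result = []
    for x, y in zip(src1, src2):
        product = (x & mask) * (y & mask)          # double-sized product (2 * bits wide)
        result.append((product + half) >> bits)    # rounded high half, fits in `bits`
    return result


print(mul_unsigned_round_high([0xFFFF, 0x8000, 0x0100], [0xFFFF, 0x8000, 0x0180]))
```

The largest possible product plus the rounding increment still fits in twice the element width, so the rounded high half never overflows the destination element in this model.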
Matrix multiplication system and method

Patent number: 11120101

Abstract: The present disclosure advantageously provides a system and method for efficiently multiplying matrices with elements that have a value of 0. A bitmap is generated for each matrix. Each bitmap includes a bit position for each matrix element. The value of each bit is set to 0 when the value of the corresponding matrix element is 0, and to 1 when the value of the corresponding matrix element is not 0. Each matrix is compressed into a compressed matrix, which will have fewer elements with a value of 0 than the original matrix. Each bitmap is then adjusted based on the corresponding compressed matrix. The compressed matrices are then multiplied to generate an output matrix. For each element i,j in the output matrix, a dot product of the ith row of the first compressed matrix and the jth column of the second compressed matrix is calculated based on the bitmaps.

Type: Grant

Filed: September 27, 2019

Date of Patent: September 14, 2021

Assignee: Arm Limited

Inventors: Zhi-Gang Liu, Matthew Mattina, Paul Nicholas Whatmough
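
Note: the idea of using bitmaps to skip zero elements during a dot product can be sketched as follows. This is a simplified variant, not the compression scheme of the patent: it assumes every row of the first matrix and every column of the second is compressed independently with all zeros removed, and the names `compress`, `sparse_dot` and `bitmap_matmul` are hypothetical.

```python
def compress(vec):
    """Return (bitmap, values); bit k of bitmap is 1 iff vec[k] is non-zero."""
    bitmap, values = 0, []
    for k, v in enumerate(vec):
        if v != 0:
            bitmap |= 1 << k
            values.append(v)
    return bitmap, values


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def sparse_dot(bm_a, vals_a, bm_b, vals_b):
    """Dot product that only visits positions where both bitmaps have a 1."""
    common = bm_a & bm_b
    total = 0
    while common:
        low = common & -common   # lowest position that is non-zero in both operands
        below = low - 1          # mask of all positions before it
        # The index into each compressed vector is the count of earlier non-zeros.
        total += vals_a[popcount(bm_a & below)] * vals_b[popcount(bm_b & below)]
        common ^= low
    return total


def bitmap_matmul(A, B):
    rows = [compress(r) for r in A]
    cols = [compress(c) for c in zip(*B)]
    return [[sparse_dot(*r, *c) for c in cols] for r in rows]


print(bitmap_matmul([[1, 0, 2], [0, 0, 3]], [[0, 4], [5, 0], [6, 0]]))
```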
Floating point processor prototype of multi-channel data

Patent number: 11010130

Abstract: The present invention discloses a floating point processor prototype of multi-channel data. An architecture comprises the following steps: arranging structural data, semi-structured data and unstructured data into a three-way array; decomposing the three-way array into a matrix pattern of a second-order tensor by using higher-order singular value decomposition; and converting the matrix pattern into a sparse domain to conduct block floating point quantization. A floating point processor prototype of multi-channel data is built.

Type: Grant

Filed: August 9, 2017

Date of Patent: May 18, 2021

Assignee: Shanghai DataCenter Science Co., LTD

Inventors: Jun Zhang, Ke Xu, Xiaofeng Chen
System, method and apparatus for data manipulation

Patent number: 10970201

Abstract: A system, apparatus and method for utilizing a transpose function to generate a two-dimensional array from three-dimensional input data. The use of the transpose function reduces redundant elements in the resultant two-dimensional array thereby increasing efficiency and decreasing power consumption.

Type: Grant

Filed: October 24, 2018

Date of Patent: April 6, 2021

Assignee: Arm Limited

Inventor: Paul Nicholas Whatmough
Performing matrix multiplication in hardware

Patent number: 10831862

Abstract: Methods, systems, and apparatus for performing a matrix multiplication using a hardware circuit are described. An example method begins by obtaining an input activation value and a weight input value in a first floating point format. The input activation value and the weight input value are multiplied to generate a product value in a second floating point format that has higher precision than the first floating point format. A partial sum value is obtained in a third floating point format that has a higher precision than the first floating point format. The partial sum value and the product value are combined to generate an updated partial sum value that has the third floating point format.

Type: Grant

Filed: March 20, 2020

Date of Patent: November 10, 2020

Assignee: Google LLC

Inventors: Andrew Everett Phelps, Norman Paul Jouppi
Multi-memory on-chip computational network

Patent number: 10803379

Abstract: Provided are systems, methods, and integrated circuits for a neural network processing system. In various implementations, the system can include a first array of processing engines coupled to a first set of memory banks and a second array of processing engines coupled to a second set of memory banks. The first and second sets of memory banks can store all the weight values for a neural network, where the weight values are stored before any input data is received. Upon receiving input data, the system performs a task defined for the neural network. Performing the task can include computing an intermediate result using the first array of processing engines, copying the intermediate result to the second set of memory banks, and computing a final result using the second array of processing engines, where the final result corresponds to an outcome of performing the task.

Type: Grant

Filed: December 12, 2017

Date of Patent: October 13, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Randy Huang, Ron Diamant
Matrix multiplication on a systolic array

Patent number: 10769238

Abstract: Techniques facilitating matrix multiplication on a systolic array are provided. A computer-implemented method can comprise populating, by a system operatively coupled to a processor, respective first registers of one or more processing elements of a systolic array structure with respective input data bits of a first data matrix. The one or more processing elements can comprise a first processing element that comprises a first input data bit of the first data matrix and a first activation bit of a second data matrix. The method can also include determining, by the system, at the first processing element, a first partial sum of a third data matrix. Further, the method can include streaming, by the system, the first partial sum of the third data matrix from the first processing element.

Type: Grant

Filed: September 19, 2019

Date of Patent: September 8, 2020

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Chia-Yu Chen, Jungwook Choi, Kailash Gopalakrishnan, Victor Han, Vijayalakshmi Srinivasan, Jintao Zhang
Matrix normal/transpose read and a reconfigurable data processor including same

Patent number: 10768899

Abstract: A configurable circuit configurable according to the data width of elements of a matrix is described that includes a memory array, logic to write a matrix to the memory array having elements with a data width which can be specified using configuration data, logic for a transpose read of the matrix as-written and logic for normal read of the matrix as-written. The memory array includes first and second read ports operable in parallel. Transpose read logic and normal read logic can be coupled to the first and second read ports, respectively, allowing transpose and normal read of a matrix simultaneously.

Type: Grant

Filed: January 29, 2019

Date of Patent: September 8, 2020

Assignee: SambaNova Systems, Inc.

Inventors: David Alan Koeplinger, Raghu Prabhakar, Ram Sivaramakrishnan, David Brian Jackson, Mark Luttrell
Performing matrix multiplication in hardware

Patent number: 10621269

Abstract: Methods, systems, and apparatus for performing a matrix multiplication using a hardware circuit are described. An example method begins by obtaining an input activation value and a weight input value in a first floating point format. The input activation value and the weight input value are multiplied to generate a product value in a second floating point format that has higher precision than the first floating point format. A partial sum value is obtained in a third floating point format that has a higher precision than the first floating point format. The partial sum value and the product value are combined to generate an updated partial sum value that has the third floating point format.

Type: Grant

Filed: May 17, 2018

Date of Patent: April 14, 2020

Assignee: Google LLC

Inventors: Andrew Everett Phelps, Norman Paul Jouppi
Memory management for sparse matrix multiplication

Patent number: 10452744

Abstract: Techniques related to memory management for sparse matrix multiplication are disclosed. Computing device(s) may perform a method for multiplying a row of a first sparse matrix with a second sparse matrix to generate a product matrix row. A compressed representation of the second sparse matrix is stored in main memory. The compressed representation comprises a values array that stores non-zero value(s). Tile(s) corresponding to row(s) of second sparse matrix are loaded into scratchpad memory. The tile(s) comprise set(s) of non-zero value(s) of the values array. A particular partition of an uncompressed representation of the product matrix row is generated in the scratchpad memory. The particular partition corresponds to a partition of the second sparse matrix comprising non-zero value(s) included in the tile(s). When a particular tile is determined to comprise non-zero value(s) that are required to generate the particular partition, the particular tile is loaded into the scratchpad memory.

Type: Grant

Filed: March 27, 2017

Date of Patent: October 22, 2019

Assignee: Oracle International Corporation

Inventors: Sandeep R. Agrawal, Sam Idicula, Nipun Agarwal
State estimation processor and state estimation system

Patent number: 10191739

Abstract: A microprocessor unit (MPU) connected to external sensors is provided with an interface unit that acquires detection information acquired by the external sensors and a digital signal processor (DSP) that estimates the state of a target object on the basis of the detection information acquired by the interface unit and generates state information. The DSP is provided with SIMD-type arithmetic processing circuitry that processes multiple pieces of information with one command and is provided with single precision floating point computing units. The interface unit outputs the state information generated by the DSP to an externally provided main processor. Therefore, power consumption can be reduced.

Type: Grant

Filed: September 22, 2016

Date of Patent: January 29, 2019

Assignee: MEGACHIPS CORPORATION

Inventors: Mahito Matsumoto, Tomoshige Kato, Takehiro Yoshimura, Takio Yamaoka, Yusuke Sasaki, Shingo Hamaguchi
Transforming character delimited values

Patent number: 9619152

Abstract: Techniques for transforming character delimited values are presented herein. An input module may be configured to read a set of character delimited values. A generation module may be configured to generate, in real-time, a synchronization block for the set of values that includes a nibble for each value in the set of values. The nibbles may represent either a byte size of the associated value or may be a flag representing a predetermined value. An output module may be configured to sequentially output the synchronization block and the set of values to a binary data output stream for output in a device dependent byte order according to the respective byte sizes of the values in the set of values.

Type: Grant

Filed: December 19, 2014

Date of Patent: April 11, 2017

Assignee: eBay Inc.

Inventors: Gang Ye, Thennarasu Ponnusamy, Belinda Liu, Enlin Wang, Mallikarjun Bhaigond, Amit Desai, Xin Zhuang, Preeta Joshi, Hong-Yen Nguyen
Redundant execution for reliability in a super FMA ALU

Patent number: 9329936

Abstract: A system, processor and method to increase computational reliability by using underutilized portions of a data path with a SuperFMA ALU. The method allows the reuse of underutilized hardware to implement spatial redundancy by using detection during the dispatch stage to determine if the operation may be executed by redundant hardware in the ALU. During execution, if a determination is made that the correct conditions exist as determined by the redundant execution modes, the SuperFMA ALU performs the operation with redundant execution and compares the results for a match in order to generate a computational result. The method to increase computational reliability by using redundant execution is advantageous because the hardware cost of adding support for redundant execution is low and the complexity of implementation of the disclosed method is minimal due to the reuse of existing hardware.

Type: Grant

Filed: December 31, 2012

Date of Patent: May 3, 2016

Assignee: Intel Corporation

Inventor: Brian J. Hickman
Matrix calculation method, program, and system

Patent number: 9098460

Abstract: A matrix calculation system for calculating funny matrix multiplication (FMM) of a matrix A and a matrix B, including: sequentially calculating a permutation of indices {ai} in which values are arranged in a non-decreasing order with respect to each i-th row where i=1 to the number of rows of the matrix A; storing a value, which is greater than expected as a value of a matrix, for C[i, j] with respect to each j-th column where j=1 to the number of columns of the matrix A in the i-th row; sequentially calculating a permutation of indices {bj} in which values are arranged in a non-decreasing order with respect to each j-th column where j=1 to the number of columns of the matrix B; and setting the values of C[i, j], which are i and j components of the matrix C.

Type: Grant

Filed: August 22, 2012

Date of Patent: August 4, 2015

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Hiroki Yanagisawa
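
Note: as a point of reference for the abstract above, a naive version of the product can be sketched as follows. It assumes that "funny matrix multiplication" denotes the min-plus product, C[i, j] = min over k of (A[i, k] + B[k, j]), as the term is commonly used; the abstract itself does not spell out the definition. The sketch does not implement the sorted-index method of the patent, and the name `funny_matmul` is hypothetical.

```python
import math


def funny_matmul(A, B):
    """Naive min-plus product: C[i][j] = min_k (A[i][k] + B[k][j]).

    Assumption: this is the operation meant by FMM. Each C[i][j] starts at
    infinity, i.e. a value greater than any expected matrix entry.
    """
    n, m, p = len(A), len(B), len(B[0])
    C = [[math.inf] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            a_ik = A[i][k]
            for j in range(p):
                candidate = a_ik + B[k][j]
                if candidate < C[i][j]:
                    C[i][j] = candidate
    return C


print(funny_matmul([[1, 5], [2, 0]], [[3, 1], [0, 4]]))
```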
System and method to implement a matrix multiply unit of a broadband processor

Patent number: 8943119

Abstract: A system and a method are configured to improve the performance of general-purpose processors by implementing a functional unit that computes the product of a matrix operand with a vector operand, producing a vector result. The functional unit fully utilizes the entire resources of a 128 b by 128 b multiplier regardless of the operand size, as the number of elements of the matrix and vector operands increase as operand size is reduced. The unit performs both fixed-point and floating-point multiplications and additions with the highest-possible intermediate accuracy with modest resources.

Type: Grant

Filed: May 2, 2012

Date of Patent: January 27, 2015

Assignee: Microunity Systems Engineering, Inc.

Inventors: Craig Hansen, Bruce Bateman, John Moussouris
Hardware architecture and scheduling for high performance and low resource solution for QR decomposition

Patent number: 8782115

Abstract: A matrix decomposition circuit is described. In one implementation, the matrix decomposition circuit includes a processing element to process a plurality of processing cells and a scheduler coupled to the processing element, where the scheduler instructs the processing element to process only required processing cells of the plurality of processing cells. In one specific implementation, the required processing cells are processing cells with non-zero inputs.

Type: Grant

Filed: April 18, 2008

Date of Patent: July 15, 2014

Assignee: Altera Corporation

Inventor: Kulwinder Dhanoa
System and method for implementing elliptic curve scalar multiplication in cryptography

Patent number: 8649508

Abstract: A system and method for implementing the Elliptic Curve scalar multiplication method in cryptography, where the Double Base Number System is expressed in decreasing order of exponents and further on using it to determine Elliptic curve scalar multiplication over a finite elliptic curve.

Type: Grant

Filed: September 29, 2008

Date of Patent: February 11, 2014

Assignee: Tata Consultancy Services Ltd.

Inventor: Natarajan Vijayarangan
Precision measurement of waveforms

Patent number: 8620976

Abstract: A machine-implemented method for computerized digital signal processing including obtaining a digital signal from data storage or from conversion of an analog signal, and determining, from the digital signal, one or more measuring matrices. Each measuring matrix has a plurality of cells, and each cell has an amplitude corresponding to the signal energy in a frequency bin for a time slice. Cells in each measuring matrix having maximum amplitudes along a time slice and/or frequency bin are identified as maximum cells. Maxima that coincide in time and frequency are identified, and a correlated maxima matrix, called a “Precision Measuring Matrix”, is constructed showing the coinciding maxima, and the adjacent marked maxima are linked into partial chains.

Type: Grant

Filed: May 11, 2011

Date of Patent: December 31, 2013

Assignee: Paul Reed Smith Guitars Limited Partnership

Inventors: Paul Reed Smith, Frederick M. Slay, Ernestine M. Smith
Simplified equalization for correlated channels in OFDMA

Patent number: 8612502

Abstract: Systems and methodologies are described that facilitate equalization of received signals in a wireless communication environment. Multiple transmit and/or receive antennas can utilize MIMO technology to enhance performance. A single tile of transmitted data, including a set of modulation symbols, can be received at multiple receive antennas, resulting in multiple tiles of received modulation symbols. Corresponding modulation symbols from multiple received tiles can be processed as a function of channel and interference estimates to generate a single equalized modulation symbol. Typically, the equalization process is computationally expensive. However, the channels are highly correlated. This correlation is reflected in the channel estimates and can be utilized to reduce complex equalization operations. In particular, a subset of the equalizers can be generated based upon the equalizer function and the remainder can be generated using interpolation. In addition, the equalizer function itself can be simplified.

Type: Grant

Filed: March 20, 2008

Date of Patent: December 17, 2013

Assignee: QUALCOMM Incorporated

Inventors: Petru Cristian Budianu, Hermanth Sampath, Alexei Gorokhov, Dhananjay A. Gore
Optimized corner turns for local storage and bandwidth reduction

Patent number: 8554820

Abstract: A block matrix multiplication mechanism is provided for reversing the visitation order of blocks at corner turns when performing a block matrix multiplication operation in a data processing system. By reversing the visitation order, the mechanism eliminates a block load at the corner turns. In accordance with the illustrative embodiment, a corner turn is referred to as a “bounce” corner turn and results in a serpentine patterned processing order of the matrix blocks. The mechanism allows the data processing system to perform a block matrix multiplication operation with a maximum of three block transfers per time step. Therefore, the mechanism reduces the maximum throughput required and increases performance. In addition, the mechanism also reduces the number of multi-buffered local store buffers.

Type: Grant

Filed: April 20, 2012

Date of Patent: October 8, 2013

Assignee: International Business Machines Corporation

Inventors: Daniel A. Brokenshire, John A. Gunnels, Michael D. Kistler
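
Note: the serpentine processing order mentioned in the abstract above can be sketched as follows. The sketch only illustrates the visitation pattern over a grid of blocks; which operand blocks stay resident and which are transferred at each step is not modelled, and the name `serpentine_order` is hypothetical.

```python
def serpentine_order(block_rows, block_cols):
    """Visit blocks row by row, reversing direction at the end of each row."""
    order = []
    for r in range(block_rows):
        cols = range(block_cols)
        if r % 2 == 1:
            cols = reversed(cols)  # "bounce" back instead of jumping to column 0
        for c in cols:
            order.append((r, c))
    return order


print(serpentine_order(2, 3))
```

Because the direction flips at every row end, consecutive blocks in this order always share a row or a column index, which is what lets a block already held locally be reused at the turn.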
Optimized corner turns for local storage and bandwidth reduction

Patent number: 8533251

Abstract: A block matrix multiplication mechanism is provided for reversing the visitation order of blocks at corner turns when performing a block matrix multiplication operation in a data processing system. By reversing the visitation order, the mechanism eliminates a block load at the corner turns. In accordance with the illustrative embodiment, a corner turn is referred to as a “bounce” corner turn and results in a serpentine patterned processing order of the matrix blocks. The mechanism allows the data processing system to perform a block matrix multiplication operation with a maximum of three block transfers per time step. Therefore, the mechanism reduces the maximum throughput required and increases performance. In addition, the mechanism also reduces the number of multi-buffered local store buffers.

Type: Grant

Filed: May 23, 2008

Date of Patent: September 10, 2013

Assignee: International Business Machines Corporation

Inventors: Daniel A. Brokenshire, John A. Gunnels, Michael D. Kistler
Row-vector norm comparison method and row-vector norm comparison apparatus for inverse matrix

Patent number: 8521799

Abstract: Disclosed are a row-vector norm comparison method and a row-vector norm comparison apparatus for an inverse matrix. A row-vector norm comparison apparatus includes: an input matrix processing module that receives and combines constituent elements of a matrix; a cofactor operation module that multiplexes the combination result of the constituent elements to calculate factors constituting an adjoint matrix; a square calculation module that squares the calculated factors; a summation module that selects a predetermined number of factors among the squared factors and sums the selected factors to calculate the norms of row vectors in an inverse matrix; and a norm comparison module that outputs a comparison result of the calculated norms of the row vectors.

Type: Grant

Filed: June 30, 2008

Date of Patent: August 27, 2013

Assignees: Samsung Electronics Co., Ltd., Electronics and Telecommunications Research Institute

Inventors: Young Ha Lee, Seung Jae Bahng, Youn-Ok Park
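
Note: the comparison described in the abstract above relies on the fact that the inverse equals the adjoint (adjugate) matrix divided by the determinant, so the row norms of the inverse are ordered the same way as the row norms of the adjugate and no division is needed. A minimal sketch follows, assuming a 3x3 real matrix and summing all squared entries of each row; the names `adjugate3` and `inverse_row_norm_ranking` are hypothetical.

```python
def adjugate3(M):
    """Adjugate (transposed cofactor matrix) of a 3x3 matrix."""
    def minor(r, c):
        rows = [i for i in range(3) if i != r]
        cols = [j for j in range(3) if j != c]
        return (M[rows[0]][cols[0]] * M[rows[1]][cols[1]]
                - M[rows[0]][cols[1]] * M[rows[1]][cols[0]])
    # Entry (i, j) of the adjugate is the cofactor of element (j, i).
    return [[(-1) ** (i + j) * minor(j, i) for j in range(3)] for i in range(3)]


def inverse_row_norm_ranking(M):
    """Row indices of the inverse of M, ordered from smallest to largest norm.

    Uses squared norms of the adjugate rows; the common factor 1/det(M)**2
    does not change the ordering, so the inverse is never formed.
    """
    adj = adjugate3(M)
    squared_norms = [sum(v * v for v in row) for row in adj]
    return sorted(range(3), key=lambda i: squared_norms[i])


print(inverse_row_norm_ranking([[1, 0, 0], [0, 2, 0], [0, 0, 4]]))
```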
Method and apparatus for harmonic balance using direct solution of HB jacobian

Patent number: 8473533

Abstract: A system, computer-readable storage medium, and method directly solves non-linear systems that have the HB Jacobian as the coefficient matrix. The direct solve method can be used to efficiently simulate non-linear circuits in RF or microwave applications. Additionally, the direct solve method can be applied to Fourier envelope applications. Furthermore, the direct solve method can be used together with preconditioners to provide a more efficient iterative solve technique.

Type: Grant

Filed: June 17, 2010

Date of Patent: June 25, 2013

Assignee: Berkeley Design Automation, Inc.

Inventors: Amit Mehrotra, Abhishek Somani
Non-negative matrix factorization as a feature selection tool for maximum margin classifiers

Patent number: 8412757

Abstract: Non-negative matrix factorization (NMF) is combined with identification of a maximum margin classifier by minimizing a cost function that contains a generative component and a discriminative component. The relative weighting between the generative component and the discriminative component is adjusted during subsequent iterations such that initially, when confidence is low, the generative model is favored. As the iterations proceed, confidence increases and the weight of the discriminative component is steadily increased until it equals that of the generative model. Preferably, the cost function to be minimized is: $\min_{F,G \ge 0} \|X - FG\|^2 + \gamma \left( \|w\|^2 + C \sum_{i=1}^{n} L(y_i, w \cdot g_i + b) \right)$.

Type: Grant

Filed: December 9, 2009

Date of Patent: April 2, 2013

Assignee: Seiko Epson Corporation

Inventors: Mithun Das Gupta, Jing Xiao
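
Note: evaluating a cost of the kind described in the abstract above, a reconstruction error for X ≈ FG plus a weighted max-margin term over the columns g_i of G, can be sketched as follows. Several points are assumptions rather than statements from the listing: the weighting symbol is called `gamma`, the loss L is taken to be the hinge loss, and the function name `nmf_margin_cost` is hypothetical.

```python
def nmf_margin_cost(X, F, G, w, b, y, gamma, C):
    """Reconstruction error plus gamma times a soft-margin classifier objective.

    X is m x n, F is m x r, G is r x n; column i of G is the feature vector g_i
    with label y[i] in {-1, +1}. Hinge loss is an assumed choice for L.
    """
    m, n, r = len(X), len(X[0]), len(G)

    # Generative component: squared Frobenius norm of X - FG.
    reconstruction = 0.0
    for p in range(m):
        for q in range(n):
            approx = sum(F[p][k] * G[k][q] for k in range(r))
            reconstruction += (X[p][q] - approx) ** 2

    # Discriminative component: ||w||^2 + C * sum of per-sample losses.
    margin_term = sum(wk * wk for wk in w)
    for i in range(n):
        score = sum(w[k] * G[k][i] for k in range(r)) + b
        margin_term += C * max(0.0, 1.0 - y[i] * score)

    return reconstruction + gamma * margin_term


print(nmf_margin_cost([[1, 2]], [[1]], [[1, 2]], [1], 0, [1, -1], 0.5, 2))
```

Raising `gamma` across iterations shifts weight from the generative term toward the discriminative term, matching the schedule the abstract describes.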
HARDWARE FOR PERFORMING ARITHMETIC OPERATIONS

Publication number: 20130073599

Abstract: Hardware for performing sequences of arithmetic operations. The hardware comprises a scheduler operable to generate a schedule of instructions from a bitmap denoting whether an entry in a matrix is zero or not. An arithmetic circuit is provided which is configured to perform arithmetic operations on the matrix in accordance with the schedule.

Type: Application

Filed: January 7, 2011

Publication date: March 21, 2013

Applicant: LINEAR ALGEBRA TECHNOLOGIES, LIMITED

Inventor: David Maloney
Vector SIMD processor

Patent number: 8341204

Abstract: A data processor whose level of operation parallelism is enhanced by composing floating-point inner product execution units to be compatible with single instruction multiple data (SIMD) and thereby enhancing the operation processing capability is made possible. An operating system that can significantly enhance the level of operation parallelism per instruction while maintaining the efficiency of the floating-point length-4 vector inner product execution units is to be implemented. The floating-point length-4 vector inner product execution units are defined in the minimum width (32 bits for single precision) even where an extensive operating system becomes available, and the inner product execution units are composed to be compatible with SIMD. The mutually augmenting effects of the inner product execution units and the SIMD-compatible composition enhance the level of operation parallelism dramatically.

Type: Grant

Filed: July 2, 2009

Date of Patent: December 25, 2012

Assignee: Renesas Electronics Corporation

Inventors: Fumio Arakawa, Tetsuya Yamada
Signal separating device, signal separating method, information recording medium, and program

Patent number: 8285773

Abstract: A signal separating device includes an iterative estimator, a repeating calculator, a result output unit, and a repetition controller. The repeating calculator repeatedly causes the iterative estimator to iteratively perform independent component analysis on an observed signal matrix, and to further perform independent component analysis on the source signal matrix obtained as a result. The result output unit outputs the product of the respective mixing matrices obtained during each repetition as a mixing matrix with respect to the observed signal matrix, while also outputting the source signal matrix obtained during the final repetition as a source signal matrix with respect to the observed signal matrix. The repetition controller causes the repeating calculator to repeat the calculation control until all mixing matrices and all source signal matrices satisfy a convergence condition. The iterative estimator may perform a fixed number of iterations, or perform iterations until convergence is obtained.

Type: Grant

Filed: April 27, 2007

Date of Patent: October 9, 2012

Assignee: Riken

Inventors: Andrzej Cichocki, Rafal Zdunek, Shunichi Amari, Gen Hori, Ken Umeno
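
The abstract of patent 8285773 states that the overall mixing matrix is the product of the mixing matrices obtained in each repetition. The following minimal sketch illustrates only that composition step, not the independent component analysis itself. The helper names (`matmul`, `compose_mixing`) and all numeric values are hypothetical and chosen for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def compose_mixing(mixing_matrices):
    """Return the product A1 * A2 * ... * An of per-repetition mixing matrices."""
    result = mixing_matrices[0]
    for m in mixing_matrices[1:]:
        result = matmul(result, m)
    return result


# Hypothetical values: two repetitions, each yielding a 2x2 mixing matrix.
a1 = [[1, 2], [0, 1]]
a2 = [[2, 0], [1, 1]]
s2 = [[1, 0, 2], [3, 1, 0]]  # source signals from the final repetition

# Observed signals implied by the two stages: X = A1 * (A2 * S2).
x = matmul(a1, matmul(a2, s2))

# Single mixing matrix with respect to the observed signals: A = A1 * A2.
overall = compose_mixing([a1, a2])
```

Because matrix multiplication is associative, `matmul(overall, s2)` reproduces `x`, which is why the product of the per-repetition mixing matrices can be reported together with the final source signal matrix.
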
System and method to implement a matrix multiply unit of a broadband processor

Patent number: 8195735

Abstract: The present invention provides a system and method for improving the performance of general-purpose processors by implementing a functional unit that computes the product of a matrix operand with a vector operand, producing a vector result. The functional unit fully utilizes the entire resources of a 128b by 128b multiplier regardless of the operand size, as the number of elements of the matrix and vector operands increases as operand size is reduced. The unit performs both fixed-point and floating-point multiplications and additions with the highest-possible intermediate accuracy with modest resources.

Type: Grant

Filed: December 9, 2008

Date of Patent: June 5, 2012

Assignee: Microunity Systems Engineering, Inc.

Inventors: Craig Hansen, Bruce Bateman, John Moussouris
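
The abstract of patent 8195735 says the multiplier stays fully used because the element count grows as the element size shrinks. The sketch below works through that arithmetic under a stated assumption: a 128-bit vector operand holds n = 128 / size elements and the matrix operand is n by n. That layout is an assumption made for illustration, not a detail taken from the patent, and the function names are hypothetical.

```python
MULTIPLIER_BITS = 128  # multiplier width stated in the abstract


def elements_per_operand(element_bits):
    """Number of elements of the given size that fit in one 128-bit operand."""
    if MULTIPLIER_BITS % element_bits != 0:
        raise ValueError("element size must divide the multiplier width")
    return MULTIPLIER_BITS // element_bits


def partial_product_bits(element_bits):
    """Multiplier area used by an n x n matrix times an n-element vector.

    Assumes n*n element multiplies, each element_bits by element_bits wide.
    """
    n = elements_per_operand(element_bits)
    return n * n * element_bits * element_bits


def matvec(matrix, vector):
    """Matrix-vector product: one inner product per matrix row."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]
```

Under this assumption, `partial_product_bits` returns 128 * 128 for every element size that divides 128, since (128 / s)^2 * s^2 = 128^2. Halving the element size doubles the vector length and quadruples the number of multiplies, each a quarter of the size.
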
Adaptive multi-levels dictionaries and singular value decomposition techniques for autonomic problem determination

Patent number: 8055607

Abstract: A system and method for autonomic problem determination. Events and problems associated with the events are received from a computing resource and are expressed as entries in an event-problem matrix. Expert knowledge is expressed as entries in one or more multi-level structure dictionaries. The system and method enables dynamic interaction between the events in the matrix and the current dictionaries with its entries being updated continuously to maximize correlation among the events and problems. The index of each term in the dictionary is used to calculate the weight of each event in the matrix wherein events having frequent association with a specific problem will be given a higher weight in the matrix. Using singular value decomposition (SVD), the weighted events enable an accelerated and accurate convergence to a set of specific associated problems.

Type: Grant

Filed: March 3, 2008

Date of Patent: November 8, 2011

Assignee: International Business Machines Corporation

Inventors: Hoi Y. Chan, Thomas Y. Kwok
Primitives for fast secure hash functions and stream ciphers

Patent number: 7933404

Abstract: Techniques are disclosed to enable efficient implementation of secure hash functions and/or stream ciphers. More specifically, a family of graphs is described that has relatively large girth, large claw, and/or rapid mixing properties. The graphs are suitable for construction of cryptographic primitives such as collision resistant hash functions and stream ciphers, which allow efficient software implementation.

Type: Grant

Filed: October 16, 2007

Date of Patent: April 26, 2011

Assignee: Microsoft Corporation

Inventors: Ramarathnam Venkatesan, Matthew Cary
Digital Signal Processor Having Instruction Set With One Or More Non-Linear Complex Functions

Publication number: 20100138468

Abstract: Methods and apparatus are provided for a digital signal processor having an instruction set with one or more non-linear complex functions. A method is provided for a processor. One or more non-linear complex software instructions are obtained from a program. The non-linear complex software instructions have at least one complex number as an input. One or more non-linear complex functions are applied from a predefined instruction set to the at least one complex number. An output is generated comprised of one complex number or two real numbers. A functional unit can implement the one or more non-linear complex functions. In one embodiment, a vector-based digital signal processor is disclosed that processes a complex vector comprised of a plurality of complex numbers. The processor can process the plurality of complex numbers in parallel.

Type: Application

Filed: November 28, 2008

Publication date: June 3, 2010

Inventors: Kameran Azadet, Jian-Guo Chen, Samer Hijazi, Joseph Williams
Vector SIMD processor

Patent number: 7567996

Abstract: A data processor whose level of operation parallelism is enhanced by composing floating-point inner product execution units to be compatible with single instruction multiple data (SIMD) and thereby enhancing the operation processing capability is made possible. An operating system that can significantly enhance the level of operation parallelism per instruction while maintaining the efficiency of the floating-point length-4 vector inner product execution units is to be implemented. The floating-point length-4 vector inner product execution units are defined in the minimum width (32 bits for single precision) even where an extensive operating system becomes available, and the inner product execution units are composed to be compatible with SIMD. The mutually augmenting effects of the inner product execution units and SIMD-compatible composition enhance the level of operation parallelism dramatically.

Type: Grant

Filed: August 29, 2005

Date of Patent: July 28, 2009

Assignee: Renesas Technology Corp.

Inventors: Fumio Arakawa, Tetsuya Yamada
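
The Vector SIMD processor abstracts describe length-4 inner product units replicated in SIMD fashion. As a rough software analogy only, the sketch below models each lane as one length-4 inner product and applies the same operation across several lanes. The function names and the lane count are assumptions for illustration and say nothing about the hardware organization claimed in the patents.

```python
def inner4(a, b):
    """Length-4 inner product: a[0]*b[0] + a[1]*b[1] + a[2]*b[2] + a[3]*b[3]."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("both operands must have exactly four elements")
    return sum(x * y for x, y in zip(a, b))


def simd_inner4(lanes_a, lanes_b):
    """Apply the same length-4 inner product to every lane (one result per lane)."""
    if len(lanes_a) != len(lanes_b):
        raise ValueError("lane counts must match")
    return [inner4(a, b) for a, b in zip(lanes_a, lanes_b)]


def matvec4(matrix, vector):
    """4x4 matrix times length-4 vector, expressed as four lanes sharing one vector."""
    return simd_inner4(matrix, [vector] * 4)
```

With four lanes, a single call covers a 4x4 matrix-vector product, since each matrix row paired with the shared vector is one inner product.
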
SYSTEM AND METHOD TO IMPLEMENT A MATRIX MULTIPLY UNIT OF A BROADBAND PROCESSOR

Publication number: 20090094309

Abstract: The present invention provides a system and method for improving the performance of general-purpose processors by implementing a functional unit that computes the product of a matrix operand with a vector operand, producing a vector result. The functional unit fully utilizes the entire resources of a 128b by 128b multiplier regardless of the operand size, as the number of elements of the matrix and vector operands increases as operand size is reduced. The unit performs both fixed-point and floating-point multiplications and additions with the highest-possible intermediate accuracy with modest resources.

Type: Application

Filed: December 9, 2008

Publication date: April 9, 2009

Applicant: MICROUNITY SYSTEMS ENGINEERING, INC.

Inventors: Craig Hansen, Bruce Bateman, John Moussouris
Integrated conversion method and apparatus

Publication number: 20080104161

Abstract: An integrated transformation apparatus is provided. The apparatus includes a first multiplexer, a second multiplexer, and a transformation unit. The first multiplexer retrieves point data from columns or rows of a multi-dimensional matrix and input data. The second multiplexer retrieves transformation coefficients corresponding to the point data. The transformation unit transforms data blocks of the multi-dimensional matrix to a plurality of sub data blocks according to the input data, the point data, and the transformation coefficients.

Type: Application

Filed: August 24, 2007

Publication date: May 1, 2008

Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE

Inventors: Yi-Jung Wang, Guo-Zua Wu, Chih-Chi Chang, Oscal Tzyh Chiang Chen
Apparatus and method for isolating noise effects in a signal

Patent number: 7363200

Abstract: A matrix includes samples associated with a first signal and samples associated with a second signal. The second signal includes a first portion associated with the first signal and a second portion associated with at least one disturbance, such as white noise or colored noise. A projection of the matrix is produced using canonical QR-decomposition. Canonical QR-decomposition of the matrix produces an orthogonal matrix and an upper triangular matrix, where each value in the diagonal of the upper triangular matrix is greater than or equal to zero. The projection at least substantially separates the first portion of the second signal from the second portion of the second signal.

Type: Grant

Filed: February 5, 2004

Date of Patent: April 22, 2008

Assignee: Honeywell International Inc.

Inventor: Joseph Z. Lu
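
The abstract of patent 7363200 describes a QR decomposition in which every diagonal value of the upper triangular matrix is greater than or equal to zero. The sketch below shows one way to obtain a factorization with that property, using modified Gram-Schmidt on a matrix with linearly independent columns. This is a swapped-in textbook technique chosen for illustration, not necessarily the procedure the patent uses, and it does not cover the projection step. The function name `qr_nonneg` is hypothetical.

```python
import math


def qr_nonneg(a):
    """QR factorization by modified Gram-Schmidt.

    `a` is a list of rows with linearly independent columns.
    Returns (q, r) where q has orthonormal columns, r is upper triangular,
    and every diagonal entry of r is a column norm, hence non-negative.
    """
    rows, cols = len(a), len(a[0])
    # Work column by column on a copy of the input.
    v = [[a[i][j] for i in range(rows)] for j in range(cols)]
    q_cols = []
    r = [[0.0] * cols for _ in range(cols)]
    for j in range(cols):
        # Remove the components along the already-computed orthonormal columns.
        for k in range(j):
            r[k][j] = sum(q_cols[k][i] * v[j][i] for i in range(rows))
            v[j] = [v[j][i] - r[k][j] * q_cols[k][i] for i in range(rows)]
        r[j][j] = math.sqrt(sum(x * x for x in v[j]))
        if r[j][j] == 0.0:
            raise ValueError("columns are linearly dependent")
        q_cols.append([x / r[j][j] for x in v[j]])
    q = [[q_cols[j][i] for j in range(cols)] for i in range(rows)]
    return q, r
```

Library QR routines may return negative diagonal entries in R. In that case the same form can be reached by flipping the sign of each such row of R together with the matching column of Q, which leaves the product unchanged.
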
Method of watermarking digital data

Patent number: 7137005

Abstract: A method of introducing a non-perceptional signal (watermark) to digital media data is disclosed. The method is based on the representation of source digital data using a special matrix, insertion of a digital watermark into the special matrix to receive the watermarked matrix, and generation of the watermarked data using the source data and the watermarked matrix. In addition, watermark detection of the watermarked data is performed by calculating the special matrix from the watermarked data.

Type: Grant

Filed: March 27, 2002

Date of Patent: November 14, 2006

Assignee: LG Electronics Inc.

Inventors: Mikhail Anatolyevich Sall, Alexander Leonidovich Mayboroda, Viktor Vikrorovich Redkov, Anatoly Igorevich Tikhotsky
Method and apparatus for matrix reordering and electronic circuit simulation

Patent number: 7089159

Abstract: A matrix reordering method performs reordering of elements of a coefficient matrix created based on coefficients of linear simultaneous equations whose solutions are to be produced by parallel processing of processors of a computer in accordance with Gaussian elimination. Herein, degrees corresponding to numbers of non-zero elements are calculated with respect to all pivots included in the coefficient matrix. Then, a first pivot whose degree is under a threshold (mindeg+α) is selected from among the pivots of the coefficient matrix, while a second pivot whose critical path length is minimum is also selected from among the pivots of the coefficient matrix. Replacement of elements is performed between the first pivot and second pivot to complete reordering with respect to the first pivot. In addition, non-zero elements, which are newly produced by the Gaussian elimination of the first pivot, are added to the coefficient matrix.

Type: Grant

Filed: April 2, 2001

Date of Patent: August 8, 2006

Assignee: NEC Electronics Corporation

Inventor: Koutaro Hachiya
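
The abstract of patent 7089159 selects pivots whose degree falls under a threshold built from the minimum degree plus a margin. The sketch below illustrates only that filtering step under assumptions that are not taken from the patent: the degree of a pivot is counted as the number of non-zero off-diagonal entries in its row, the margin is a parameter called `alpha`, and the comparison is inclusive. Critical path length and the element replacement are not modeled.

```python
def pivot_degrees(matrix):
    """Degree of each pivot, assumed here to be the count of non-zero
    off-diagonal entries in the pivot's row."""
    return [sum(1 for j, x in enumerate(row) if j != i and x != 0)
            for i, row in enumerate(matrix)]


def candidate_pivots(matrix, alpha):
    """Indices of pivots whose degree is at most min(degree) + alpha."""
    degrees = pivot_degrees(matrix)
    threshold = min(degrees) + alpha
    return [i for i, d in enumerate(degrees) if d <= threshold]


# Hypothetical sparse coefficient matrix for illustration.
coefficients = [
    [4, 1, 0, 1],
    [1, 3, 0, 0],
    [0, 0, 2, 1],
    [1, 0, 1, 5],
]
```

With `alpha` set to 0 only the minimum-degree pivots qualify. A larger margin widens the candidate set, which leaves more freedom to pick among low-degree pivots by a second criterion.
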
Vector SIMD processor

Patent number: 7028066

Abstract: A data processor whose level of operation parallelism is enhanced by composing floating-point inner product execution units to be compatible with single instruction multiple data (SIMD) and thereby enhancing the operation processing capability is made possible. An operating system that can significantly enhance the level of operation parallelism per instruction while maintaining the efficiency of the floating-point length-4 vector inner product execution units is to be implemented. The floating-point length-4 vector inner product execution units are defined in the minimum width (32 bits for single precision) even where an extensive operating system becomes available, and the inner product execution units are composed to be compatible with SIMD. The mutually augmenting effects of the inner product execution units and SIMD-compatible composition enhance the level of operation parallelism dramatically.

Type: Grant

Filed: March 8, 2001

Date of Patent: April 11, 2006

Assignee: Renesas Technology Corp.

Inventors: Fumio Arakawa, Tetsuya Yamada
Batch-based method and tool for graphical manipulation of workflows

Patent number: 7010760

Abstract: An autofill algorithm provides tools for defining and automatically executing batch based procedures in an adaptive hierarchical workflow environment, and may be suitable for a large variety of applications including laboratory procedure planning, execution, documentation, as well as driving robotic apparatus.

Type: Grant

Filed: March 11, 2004

Date of Patent: March 7, 2006

Assignee: Teranode Corporation

Inventors: Lawrence F. Arnstein, Zheng Li, John M. Hill, Michael R. Kellen, Christophe Poulain, Neil A. Fanger, Kuang Chen
Null-line based radial interpolation of gridded data

Patent number: 6820074

Abstract: A method and software are disclosed for processing data values of a data array at equally spaced locations in two dimensions where the desired data values are nulls in the data array. The method and software first searches for linear ranges of contiguous nulls, and then performs incidental interpolation of all points in such range.

Type: Grant

Filed: July 7, 1999

Date of Patent: November 16, 2004

Assignee: Landmark Graphics Corporation

Inventor: Anne L. Simpson
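
The abstract of patent 6820074 describes searching for linear ranges of contiguous nulls and then interpolating the points in each range. The sketch below illustrates that two-step idea along a single grid row, using `None` for a null and straight-line interpolation between the known values that bound each run. It covers one direction only, whereas the title refers to radial interpolation, and the function names and the choice of linear interpolation are assumptions made for illustration.

```python
def find_null_runs(row):
    """Return (start, end) index pairs, end exclusive, for each run of
    contiguous None values in the row."""
    runs = []
    i = 0
    while i < len(row):
        if row[i] is None:
            j = i
            while j < len(row) and row[j] is None:
                j += 1
            runs.append((i, j))
            i = j
        else:
            i += 1
    return runs


def fill_row(row):
    """Fill each null run that is bounded by known values on both sides."""
    filled = list(row)
    for start, end in find_null_runs(row):
        if start == 0 or end == len(row):
            continue  # no known value on one side, so leave the run as nulls
        left, right = row[start - 1], row[end]
        span = end - start + 1  # number of grid steps from left to right
        for k in range(start, end):
            t = (k - start + 1) / span
            filled[k] = left + (right - left) * t
    return filled
```

Locating the runs first means every point in a run is filled from the same pair of bounding values, rather than being revisited point by point.
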
Electromagnetic wave analyzer and program for same

Patent number: 6662125

Abstract: An electromagnetic wave analyzer and program which can handle non-uniform cells with smaller computation errors. A given computational domain is divided into a plurality of cells for the purpose of finite difference approximation. For each space point, a cell size identification unit identifies the uniformity of surrounding cells. When the surrounding cells are identified as being uniform in size, a first calculation unit calculates electromagnetic field components at that space point with a first calculation method. When the surrounding cells are identified as being non-uniform in size, a second calculation unit calculates the same with a second calculation method which has smaller computational errors than the first calculation method. A data output unit then outputs the calculated electromagnetic field values.

Type: Grant

Filed: December 20, 2001

Date of Patent: December 9, 2003

Assignee: Fujitsu Limited

Inventor: Takefumi Namiki
Floating point addition pipeline including extreme value, comparison and accumulate functions

Publication number: 20010051969

Abstract: A multimedia execution unit configured to perform vectored floating point and integer instructions. The execution unit may include an add/subtract pipeline having far and close data paths. The far path is configured to handle effective addition operations and effective subtraction operations for operands having an absolute exponent difference greater than one. The close path is configured to handle effective subtraction operations for operands having an absolute exponent difference less than or equal to one. The close path is configured to generate two output values, wherein one output value is the first input operand plus an inverted version of the second input operand, while the second output value is equal to the first output value plus one. Selection of the first or second output value in the close path effectuates the round-to-nearest operation for the output of the adder.

Type: Application

Filed: February 6, 2001

Publication date: December 13, 2001

Inventors: Stuart F. Oberman, Norbert Juffa, Fred Weber, Krishnan Ramani, Ravi Krishna Cherukuri
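
The abstract of publication 20010051969 splits the add/subtract pipeline into a far path and a close path based on the effective operation and the exponent difference. The sketch below models only that selection rule in software. It uses `math.frexp` to read exponents and ignores zeros, subnormals, infinities, and NaNs. The function name `select_path` and its signature are hypothetical.

```python
import math


def select_path(a, b, subtract=False):
    """Return "close" or "far" for the operation a + b (or a - b).

    Close path: effective subtraction with absolute exponent difference <= 1.
    Far path: every effective addition, and effective subtraction with
    absolute exponent difference > 1.
    """
    sign_a = math.copysign(1.0, a)
    sign_b = math.copysign(1.0, b)
    if subtract:
        sign_b = -sign_b
    # The operation is an effective subtraction when the signs finally differ.
    effective_subtraction = sign_a != sign_b
    exp_a = math.frexp(a)[1]
    exp_b = math.frexp(b)[1]
    if effective_subtraction and abs(exp_a - exp_b) <= 1:
        return "close"
    return "far"
```

For example, `select_path(1.0, 1.5, subtract=True)` takes the close path because the operands share an exponent, while `select_path(1.0, 8.0, subtract=True)` takes the far path. Only the close-path cases can cancel many leading bits, which is the usual motivation for handling them separately.
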
Vector SIMD processor

Publication number: 20010021941

Abstract: A data processor whose level of operation parallelism is enhanced by composing floating-point inner product execution units to be compatible with SIMD and thereby enhancing the operation processing capability is made possible. An operating system that can significantly enhance the level of operation parallelism per instruction while maintaining the efficiency of the floating-point length-4 vector inner product execution units is to be implemented. The floating-point length-4 vector inner product execution units are defined in the minimum width (32 bits for single precision) even where an extensive operating system becomes available, and the inner product execution units are composed to be compatible with SIMD. The mutually augmenting effects of the inner product execution units and SIMD-compatible composition enhance the level of operation parallelism dramatically.

Type: Application

Filed: March 8, 2001

Publication date: September 13, 2001

Inventors: Fumio Arakawa, Tetsuya Yamada
Data processor and data processing system

Publication number: 20010011291

Abstract: A data processor includes an arithmetic portion incorporated in a floating point unit, in which the arithmetic portion includes a plurality of multipliers supplied mantissa part of floating point number from respectively different data input signal line group and performing mutual multiplication of supplied mantissa parts, an aligner receiving outputs of respective multipliers and performing alignment shift, an exponent processing portion for generating number of alignment shift of the aligner and an exponent before normalization on the basis of generation an exponent part of the floating point number, a multi-input adder and the exponent before normalization, reducing scale of the circuit and performing inner product operation and the like with the floating point numbers in high speed and high accuracy.

Type: Application

Filed: March 19, 2001

Publication date: August 2, 2001

Inventors: Fumio Arakawa, Norio Nakagawa, Tetsuya Yamada, Yonetaro Totsuka
