Matrix Array Patents (Class 708/520)

QR decomposition in an integrated circuit device

Patent number: 8812576

Abstract: Circuitry for performing QR decomposition of an input matrix includes multiplication/addition circuitry for performing multiplication and addition/subtraction operations on a plurality of inputs, division/square-root circuitry for performing division and square-root operations on an output of the multiplication/addition circuitry, a first memory for storing the input matrix, a second memory for storing a selected vector of the input matrix, and a selector for inputting to the multiplication/addition circuitry any one or more of a vector of the input matrix, the selected vector, and an output of the division/square-root circuitry. On respective successive passes, a respective vector of the input matrix is read from a first memory into a second memory, and elements of a respective vector of an R matrix of the QR decomposition are computed and the respective vector of the input matrix in the first memory is replaced with the respective vector of the R matrix.

Type: Grant

Filed: September 12, 2011

Date of Patent: August 19, 2014

Assignee: Altera Corporation

Inventor: Volker Mauer
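
Editor's sketch for the entry above (patent 8812576): the abstract describes successive passes in which one vector of the input matrix is selected, elements of R are computed with multiply/add and divide/square-root operations, and the stored vectors are updated. A minimal software analogue, assuming a modified Gram-Schmidt formulation (the abstract does not name the algorithm) and a hypothetical function name `mgs_qr`, might look like this:

```python
import math

def mgs_qr(a):
    """Modified Gram-Schmidt QR of a full-column-rank matrix (list of rows).

    Assumption: this is an illustrative software version, not the patented circuit.
    """
    m, n = len(a), len(a[0])
    # Work column by column, since each pass selects one column vector.
    cols = [[a[i][j] for i in range(m)] for j in range(n)]
    r = [[0.0] * n for _ in range(n)]
    for k in range(n):
        # Multiply/add followed by square root: norm of the selected column.
        r[k][k] = math.sqrt(sum(x * x for x in cols[k]))
        # Division: normalise the selected column to obtain column k of Q.
        cols[k] = [x / r[k][k] for x in cols[k]]
        for j in range(k + 1, n):
            # Dot product gives R[k][j]; then remove that component.
            r[k][j] = sum(q * x for q, x in zip(cols[k], cols[j]))
            cols[j] = [x - r[k][j] * q for q, x in zip(cols[k], cols[j])]
    q = [[cols[j][i] for j in range(n)] for i in range(m)]
    return q, r
```

Calling `mgs_qr(a)` on a matrix with linearly independent columns returns `q` with orthonormal columns and an upper-triangular `r` whose product reproduces `a` up to rounding.
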
Ordering generating method and storage medium, and shared memory scalar parallel computer

Patent number: 8805912

Abstract: When a Cholesky decomposition or a modified Cholesky decomposition is performed on a sparse symmetric positive definite matrix using a shared memory parallel computer, the discrete space in a problem presented by the linear simultaneous equations expressed by the sparse matrix is recursively sectioned into two sectioned areas and a sectional plane between the areas. The sectioning operation is stopped when the number of nodes configuring the sectional plane reaches the width of a super node. Each time the recursively halving process is performed, a number is sequentially assigned to the node in the sectioned area in order from a farther node from the sectional plane. The node in the sectional plane is numbered after assigning a number to the sectioned area each time the recursively halving process is performed.

Type: Grant

Filed: June 21, 2011

Date of Patent: August 12, 2014

Assignee: Fujitsu Limited

Inventor: Makoto Nakanishi
Low order multiple signal classification (MUSIC) method for high spectral resolution signal detection

Patent number: 8799345

Abstract: A new approach for applying the multiple signal classification (MUSIC) method for high spectral resolution signal detection is described. The new approach uses a lower order covariance matrix, or, alternately, an autocorrelation matrix, to calculate only the number of eigenvalues and associated eigenvectors actually needed to solve for the number of signals sought.

Type: Grant

Filed: August 24, 2009

Date of Patent: August 5, 2014

Assignee: The United States of America as represented by the Secretary of the Air Force

Inventors: Lihyeh Liou, David M. Lin, James B. Tsui
Hardware architecture and scheduling for high performance and low resource solution for QR decomposition

Patent number: 8782115

Abstract: A matrix decomposition circuit is described. In one implementation, the matrix decomposition circuit includes a processing element to process a plurality of processing cells and a scheduler coupled to the processing element, where the scheduler instructs the processing element to process only required processing cells of the plurality of processing cells. In one specific implementation, the required processing cells are processing cells with non-zero inputs.

Type: Grant

Filed: April 18, 2008

Date of Patent: July 15, 2014

Assignee: Altera Corporation

Inventor: Kulwinder Dhanoa
Circuits and methods for calculating a cholesky decomposition of a matrix

Patent number: 8775496

Abstract: Approaches for Cholesky decomposition of a matrix are described. A first circuit is configured to generate an inverse square root of an input value. A second circuit is configured to generate a product of a value output by the first circuit and provided at a first input and a value provided at a second input. A third circuit is configured to generate a difference between a value provided at the first input and a value provided at the second input of the third circuit. The first input of the third circuit is coupled to the output of the second circuit. A control circuit is configured to iteratively distribute a plurality of values of the matrix and the outputs of the first, second, and third circuits to the inputs of the first, second, and third circuits such that the Cholesky decomposition of the matrix is output by the third circuit.

Type: Grant

Filed: July 29, 2011

Date of Patent: July 8, 2014

Assignee: Xilinx, Inc.

Inventors: Kaushik Barman, Raghavendar M. Rao
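
Editor's sketch for the entry above (patent 8775496): the abstract builds the Cholesky factor from three primitives, an inverse square root, a multiplier, and a subtractor. The following is a minimal software illustration of that arithmetic only; the function name `cholesky_inv_sqrt` and the loop order are assumptions, and the control-circuit scheduling is not modelled.

```python
import math

def cholesky_inv_sqrt(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive definite a.

    Uses only subtraction, multiplication, and one inverse square root per column.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Subtractor: remove contributions of earlier columns from the diagonal.
        d = a[j][j]
        for k in range(j):
            d -= l[j][k] * l[j][k]
        inv = 1.0 / math.sqrt(d)      # inverse square root unit
        l[j][j] = d * inv             # sqrt(d) obtained as d * (1 / sqrt(d))
        for i in range(j + 1, n):
            s = a[i][j]
            for k in range(j):
                s -= l[i][k] * l[j][k]
            l[i][j] = s * inv         # multiplier replaces a division
    return l
```

Scaling by the inverse square root means no divider is needed below the diagonal, which appears to be the motivation for the three-circuit split.
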
Compression system and method for accelerating sparse matrix computations

Patent number: 8775495

Abstract: The present invention involves a sparse matrix processing system and method which uses sparse matrices that are compressed to reduce memory traffic and improve performance of computations using sparse matrices.

Type: Grant

Filed: February 12, 2007

Date of Patent: July 8, 2014

Assignee: Indiana University Research and Technology

Inventors: Andrew Lumsdaine, Jeremiah Willcock
Efficient Algorithm to Bit Matrix Symmetry

Publication number: 20140188969

Abstract: An algorithm that maintains the symmetry of a symmetric bit matrix stored in computer memory without having to process all of the elements of a transpose column by considering only the elements changed in a row. The algorithm operates on groups of bits forming rows of the matrix rather than processing the individual bit elements of the matrix. Instead of checking whether each bit needs to be modified, the algorithm toggles only the column bits that are the transpose elements of modified row elements, thereby taking advantage of the existing symmetry to eliminate unnecessary conditional operations. As a result, the algorithm modifies the matrix on a row-by-row basis and makes changes to only those column bits that correspond to modified row elements without having to check the value of the transpose column elements that do not require modification.

Type: Application

Filed: December 28, 2012

Publication date: July 3, 2014

Applicant: LSI CORPORATION

Inventors: Deepti P. Chotai, Shankar T. More
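
Editor's sketch for the entry above (publication 20140188969): the idea of toggling only the transpose bits of the row elements that actually changed can be illustrated as follows. The representation (one Python integer per row, bit `c` of `rows[r]` holding element (r, c)) and the names `set_row` and `is_symmetric` are assumptions made for this example.

```python
def set_row(rows, i, new_bits):
    """Overwrite row i of a symmetric bit matrix and restore symmetry."""
    changed = rows[i] ^ new_bits   # bits of row i that differ from the new value
    rows[i] = new_bits
    c = 0
    while changed:
        # Only columns whose row element changed are touched. The diagonal
        # element (i, i) is its own transpose and is already up to date.
        if changed & 1 and c != i:
            rows[c] ^= 1 << i      # toggle the transpose element (c, i)
        changed >>= 1
        c += 1

def is_symmetric(rows):
    n = len(rows)
    return all((rows[r] >> c) & 1 == (rows[c] >> r) & 1
               for r in range(n) for c in range(n))
```

Because the matrix is symmetric before the update, an XOR toggle is sufficient: the old transpose bit equals the old row bit, so flipping it yields the new row bit without reading it first.
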
Matrix operations in an integrated circuit device

Patent number: 8762443

Abstract: Matrix operations circuitry for performing operations on submatrices of an input matrix includes a first working memory in which individual ones of the submatrices are operated on. The first working memory has a first submatrix size. The matrix operations circuitry also includes a second working memory in which a collection of the submatrices, that have been operated on in the first working memory, is operated on. The second working memory has an optimum burst size, and the first submatrix size is matched to the optimum burst size.

Type: Grant

Filed: November 15, 2011

Date of Patent: June 24, 2014

Assignee: Altera Corporation

Inventor: Brian L. Kurtz
DATA TRANSFORMATION DEVICE, DATA TRANSFORMATION METHOD, AND PROGRAM

Publication number: 20140164466

Abstract: A data transformation device defines a first square submatrix of an m order (m≥2) including elements (n, n) in the matrixes A and F being detA?1 and A=GHnHn−1 . . . H1=GF, calculates a first element in the matrix Hi based on elements in a lowest order row of the first square submatrix, defines a second square submatrix of a (m+1) order, and calculates a second element in the matrix Hi based on elements in a lowest order row of the second square submatrix and the first element. The data transformation device calculates all elements in the matrix Hi by iterating processing on the second element until the second square submatrix becomes an n order and calculates elements of the matrix G using elements of matrix A and matrix H1. Then, variable transformation is performed to solve a linear system including n variables and n equations.

Type: Application

Filed: December 11, 2013

Publication date: June 12, 2014

Inventor: Isamu RYU
SYSTEM, METHOD, AND COMPUTER PROGRAM PRODUCT FOR TRANSPOSING A MATRIX

Publication number: 20140149480

Abstract: A system, method, and computer program product are provided for transposing a matrix. In use, a matrix is identified. Additionally, the matrix is transposed utilizing row-wise operations and column-wise operations, where the row-wise operations and the column-wise operations are performed independently.

Type: Application

Filed: October 24, 2013

Publication date: May 29, 2014

Applicant: NVIDIA Corporation

Inventors: Bryan Christopher Catanzaro, Manjunath Kudlur
Methods for efficient state transition matrix based LFSR computations

Patent number: 8719323

Abstract: A method for efficient state transition matrix based LFSR computations is disclosed. A polynomial associated with a linear feedback shift register is defined. This polynomial is used to generate a single step state transition matrix. The single step state transition matrix is then modified into a more general k-step state transition matrix. The resultant combined matrix is reduced in size and can be multiplied by a state input vector, ultimately producing a plurality of next state-input vectors thereby providing improved efficiency in computing a LFSR.

Type: Grant

Filed: October 22, 2010

Date of Patent: May 6, 2014

Assignee: LSI Corporation

Inventor: Meng-Lin Yu
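
Editor's sketch for the entry above (patent 8719323): the step from a single-step state transition matrix to a k-step matrix can be shown with plain matrix powers over GF(2). This simplified example covers an autonomous Fibonacci-style LFSR without the input bits or the size reduction that the abstract mentions; the tap convention and all function names are assumptions.

```python
def one_step_matrix(n, taps):
    """Single-step transition matrix T so that next_state = T * state over GF(2)."""
    t = [[0] * n for _ in range(n)]
    for c in taps:
        t[0][c] = 1            # feedback bit is the XOR of the tapped positions
    for r in range(1, n):
        t[r][r - 1] = 1        # every other bit shifts along by one position
    return t

def mat_mul_gf2(a, b):
    n = len(a)
    return [[sum(a[i][k] & b[k][j] for k in range(n)) & 1 for j in range(n)]
            for i in range(n)]

def k_step_matrix(t, k):
    """T raised to the k-th power over GF(2), by repeated squaring."""
    n = len(t)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    while k:
        if k & 1:
            result = mat_mul_gf2(result, t)
        t = mat_mul_gf2(t, t)
        k >>= 1
    return result

def apply_gf2(m, state):
    return [sum(a & s for a, s in zip(row, state)) & 1 for row in m]

def step(state, taps):
    """Reference bit-serial update, used to check the matrix form."""
    fb = 0
    for c in taps:
        fb ^= state[c]
    return [fb] + state[:-1]
```

Multiplying a state by `k_step_matrix(t, k)` advances the register k positions in one matrix-vector product, which is the kind of saving the abstract attributes to the k-step form.
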
Polynomial data processing operation

Patent number: 8700688

Abstract: A data processing system 2 includes an instruction decoder 22 responsive to polynomial divide instructions DIVL.PN to generate control signals that control processing circuitry 26 to perform a polynomial division operation. The denominator polynomial is represented by a denominator value stored within a register with an assumption that the highest degree term of the polynomial always has a coefficient of “1” such that this coefficient need not be stored within the register storing the denominator value and accordingly the denominator polynomial may have a degree one higher than would be possible with the bit space within the register storing the denominator value alone. The polynomial divide instruction returns a quotient value and a remainder value respectively representing the quotient polynomial and the remainder polynomial.

Type: Grant

Filed: February 23, 2009

Date of Patent: April 15, 2014

Assignee: U-Blox AG

Inventors: Dominic H Symes, Daniel Kershaw, Martinus C Wezelenburg
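
Editor's sketch for the entry above (patent 8700688): polynomial division over GF(2) with an implicit leading denominator coefficient can be illustrated as below. The abstract does not say how the degree is conveyed, so passing it as an explicit `degree` argument is an assumption, as is the function name `poly_divmod_gf2`. Polynomials are packed into integers, with bit i holding the coefficient of x^i.

```python
def poly_divmod_gf2(numerator, denom_low, degree):
    """Divide numerator by x**degree + denom_low over GF(2).

    denom_low holds only the coefficients below the leading term; the
    leading coefficient is assumed to be 1 and is not stored.
    Returns (quotient, remainder).
    """
    denom = (1 << degree) | denom_low   # restore the implicit leading 1
    quotient = 0
    remainder = numerator
    while remainder.bit_length() - 1 >= degree:
        shift = remainder.bit_length() - 1 - degree
        quotient |= 1 << shift
        remainder ^= denom << shift     # subtraction over GF(2) is XOR
    return quotient, remainder
```

For example, dividing x^3 + x + 1 by x + 1 is written `poly_divmod_gf2(0b1011, 0b1, 1)`; only the low coefficient of the denominator is passed, and the x term is implied by `degree`.
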
ORTHOGONAL CODE MATRIX GENERATION METHOD AND RELATED CIRCUIT THEREOF

Publication number: 20140095569

Abstract: An orthogonal code matrix generation method includes: establishing an N×N orthogonal code matrix, wherein an inner product of every two rows of the orthogonal code matrix is 0, and each column of the orthogonal code matrix has a summation of elements equal to a same value, wherein N is a power of 4; and using the N×N orthogonal code matrix as a basic unit to establish a target orthogonal code matrix. An orthogonal code matrix generation circuit includes: an N×N orthogonal code matrix generator, arranged for establishing an N×N orthogonal code matrix, wherein an inner product of every two rows of the orthogonal code matrix is 0, each column of the orthogonal code matrix has a summation of elements equal to a same value; and a target orthogonal code matrix generator, arranged for using the N×N orthogonal code matrix as a basic unit to establish a target orthogonal code matrix.

Type: Application

Filed: January 3, 2013

Publication date: April 3, 2014

Applicant: Raydium Semiconductor Corporation

Inventors: Shih-Lun Huang, Kai-Ming Liu
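
Editor's sketch for the entry above (publication 20140095569): the abstract requires a base matrix whose rows are mutually orthogonal and whose columns all have the same sum, and then uses it as a building block. The example below picks one 4×4 matrix with those properties and enlarges it with a Kronecker product. Both the specific base matrix and the use of a Kronecker product are assumptions; the abstract does not state how the target matrix is assembled.

```python
# Hypothetical 4x4 base matrix: rows are mutually orthogonal and every
# column sums to 2.
BASE = [
    [ 1,  1,  1, -1],
    [ 1,  1, -1,  1],
    [ 1, -1,  1,  1],
    [-1,  1,  1,  1],
]

def kronecker(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def rows_orthogonal(m):
    return all(sum(x * y for x, y in zip(m[i], m[j])) == 0
               for i in range(len(m)) for j in range(i + 1, len(m)))

def column_sums(m):
    return [sum(col) for col in zip(*m)]

# A 16x16 matrix built from the 4x4 unit.
target = kronecker(BASE, BASE)
```

The Kronecker product preserves both properties: inner products of distinct rows factor into products that contain a zero, and each column sum is the product of two base column sums.
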
Latency tolerant system for executing video processing operations

Patent number: 8687008

Abstract: A latency tolerant system for executing video processing operations. The system includes a host interface for implementing communication between the video processor and a host CPU, a scalar execution unit coupled to the host interface and configured to execute scalar video processing operations, and a vector execution unit coupled to the host interface and configured to execute vector video processing operations. A command FIFO is included for enabling the vector execution unit to operate on a demand driven basis by accessing the memory command FIFO. A memory interface is included for implementing communication between the video processor and a frame buffer memory. A DMA engine is built into the memory interface for implementing DMA transfers between a plurality of different memory locations and for loading the command FIFO with data and instructions for the vector execution unit.

Type: Grant

Filed: November 4, 2005

Date of Patent: April 1, 2014

Assignee: NVIDIA Corporation

Inventors: Ashish Karandikar, Shirish Gadre, Stephen D. Lew
Data structure for tiling and packetizing a sparse matrix

Patent number: 8676874

Abstract: A computer system retrieves a slice of sparse matrix data, which includes multiple rows that each includes multiple elements. The computer system identifies one or more non-zero values stored in one or more of the rows. Each identified non-zero value corresponds to a different row, and also corresponds to an element location within the corresponding row. In turn, the computer system stores each of the identified non-zero values and corresponding element locations within a packet at predefined fields corresponding to the different rows.

Type: Grant

Filed: December 6, 2010

Date of Patent: March 18, 2014

Assignee: International Business Machines Corporation

Inventor: Gordon Clyde Fossum
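
Editor's sketch for the entry above (patent 8676874): the abstract describes packets in which each field belongs to a different row of a slice and holds one non-zero value together with its element location. A minimal illustration, assuming a packet is simply a list with one `(column, value)` slot per row and `None` for an empty slot (the real field layout is not given in the abstract):

```python
def packetize_slice(slice_rows):
    """Pack the non-zeros of a slice of rows into fixed-shape packets.

    Each packet has one field per row; field r holds (column, value) for
    one non-zero of row r, or None when that row has none left.
    """
    pending = [[(c, v) for c, v in enumerate(row) if v != 0]
               for row in slice_rows]
    depth = max((len(p) for p in pending), default=0)
    packets = []
    for k in range(depth):
        packets.append([p[k] if k < len(p) else None for p in pending])
    return packets
```

With this layout, the number of packets for a slice equals the largest non-zero count of any row in it, so slices with balanced rows pack most densely.
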
Acceleration of multidimensional scaling by vector extrapolation techniques

Patent number: 8645440

Abstract: A method for multidimensional scaling (MDS) of a data set comprising a plurality of data elements is provided, wherein each data element is identified by its coordinates, the method comprising the steps of: (i) applying an iterative optimization technique, such as SMACOF, a predetermined amount of times on a coordinates vector, said coordinates vector representing the coordinates of a plurality of said data elements, and obtaining a modified coordinates vector; (ii) applying a vector extrapolation technique, such as Minimal Polynomial Extrapolation (MPE) or reduced Rank Extrapolation (RRE) on said modified coordinates vector obtaining a further modified coordinates vector; and (iii) repeating steps (i) and (ii) until one or more predefined conditions are met.

Type: Grant

Filed: June 10, 2008

Date of Patent: February 4, 2014

Inventors: Guy Rosman, Alexander Bronstein, Michael Bronstein, Ron Kimmel
Configuring a programmable integrated circuit device to perform matrix multiplication

Patent number: 8626815

Abstract: In a matrix multiplication in which each element of the resultant matrix is the dot product of a row of a first matrix and a column of a second matrix, each row and column can be broken into manageable blocks, with each block loaded in turn to compute a smaller dot product, and then the results can be added together to obtain the desired row-column dot product. The earliest results for each dot product are saved for a number of clock cycles equal to the number of portions into which each row or column is divided. The results are then added to provide an element of the resultant matrix. To avoid repeated loading and unloading of the same data, all multiplications involving a particular row-block can be performed upon loading that row-block, with the results cached until other multiplications for the resultant elements that use the cached results are complete.

Type: Grant

Filed: March 3, 2009

Date of Patent: January 7, 2014

Assignee: Altera Corporation

Inventor: Martin Langhammer
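
Editor's sketch for the entry above (patent 8626815): splitting each row-column dot product into blocks, and reusing a loaded row-block for every column before moving on, can be written in software as follows. The function name `blocked_matmul` and the `block` parameter are assumptions; the clock-cycle timing and caching hardware described in the abstract are not modelled.

```python
def blocked_matmul(a, b, block):
    """Multiply a (m x k) by b (k x n), accumulating partial dot products."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i in range(m):
        for k0 in range(0, k, block):
            k1 = min(k0 + block, k)
            row_block = a[i][k0:k1]          # loaded once per (row, block)
            for j in range(n):
                # Smaller dot product over this block only; the running
                # sum in c[i][j] plays the role of the cached partial result.
                c[i][j] += sum(row_block[t] * b[k0 + t][j]
                               for t in range(k1 - k0))
    return c
```

Each element of the result is complete only after all blocks of its row have been visited, which mirrors the abstract's point that early partial results must be held until the remaining blocks arrive.
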
Minimum mean square error processing

Patent number: 8620984

Abstract: A first systolic array receives an input set of time division multiplexed matrices from a plurality of channel matrices. In a first mode, the first systolic array performs triangularization on the input matrices, producing a first set of matrices, and in a second mode performs back-substitution on the first set, producing a second set of matrices. In a first mode, a second systolic array performs left multiplication on the second set of matrices with the input set of matrices, producing a third set of matrices. In a second mode, the second systolic array performs cross diagonal transposition on the third set of matrices, producing a fourth set of matrices, and performs right multiplication on the second set of matrices with the fourth set of matrices. The first systolic array switches from the first mode to the second mode after the triangularization, and the second systolic array switches from the first mode to the second mode after the left multiplication.

Type: Grant

Filed: November 23, 2009

Date of Patent: December 31, 2013

Assignee: Xilinx, Inc.

Inventors: Raied N. Mazahreh, Hai-Jo Tarn, Raghavendar M. Rao
Precision measurement of waveforms

Patent number: 8620976

Abstract: A machine-implemented method for computerized digital signal processing including obtaining a digital signal from data storage or from conversion of an analog signal, and determining, from the digital signal, one or more measuring matrices. Each measuring matrix has a plurality of cells, and each cell has an amplitude corresponding to the signal energy in a frequency bin for a time slice. Cells in each measuring matrix having maximum amplitudes along a time slice and/or frequency bin are identified as maximum cells. Maxima that coincide in time and frequency are identified and a correlated maxima matrix, called a “Precision Measuring Matrix” is constructed showing the coinciding maxima and the adjacent marked maxima are linked into partial chains.

Type: Grant

Filed: May 11, 2011

Date of Patent: December 31, 2013

Assignee: Paul Reed Smith Guitars Limited Partnership

Inventors: Paul Reed Smith, Frederick M. Slay, Ernestine M. Smith
Computing device, calculating method, and program product

Patent number: 8612507

Abstract: A computing device includes: a deciding unit which, in computation of values of nodes on a lattice in a direction where a value of m representing a horizontal axis coordinate of the lattice increases, decides dummy nodes to be added to m=n−1, so as to enable values of nodes on m=n to be calculated by adding the dummy nodes to m=n−1 and executing a vector operation through the use of the SIMD function by using values of nodes on m=n−1 and values of the added dummy nodes; an adding unit adding the dummy nodes decided by the deciding unit to m=n−1; and a calculating unit calculating the values of the nodes present on m=n by executing the vector operation through the use of the SIMD function by using the values of the nodes on m=n−1 and the values of the dummy nodes added by the adding unit.

Type: Grant

Filed: April 16, 2010

Date of Patent: December 17, 2013

Assignee: NS Solutions Corporation

Inventor: Hiroki Takeshita
Efficient matrix multiplication on a parallel processing device

Patent number: 8589468

Abstract: The present invention enables efficient matrix multiplication operations on parallel processing devices. One embodiment is a method for mapping CTAs to result matrix tiles for matrix multiplication operations. Another embodiment is a second method for mapping CTAs to result tiles. Yet other embodiments are methods for mapping the individual threads of a CTA to the elements of a tile for result tile computations, source tile copy operations, and source tile copy and transpose operations. The present invention advantageously enables result matrix elements to be computed on a tile-by-tile basis using multiple CTAs executing concurrently on different streaming multiprocessors, enables source tiles to be copied to local memory to reduce the number accesses from the global memory when computing a result tile, and enables coalesced read operations from the global memory as well as write operations to the local memory without bank conflicts.

Type: Grant

Filed: September 3, 2010

Date of Patent: November 19, 2013

Assignee: NVIDIA Corporation

Inventors: Norbert Juffa, Radoslav Danilak
Method and apparatus for arithmetic operation by simultaneous linear equations of sparse symmetric positive definite matrix

Patent number: 8583719

Abstract: An arithmetic operation apparatus includes: a branch node set detection unit to detect a set of branch nodes for each parallel level; a subtree memory storage area allocation unit to allocate an arithmetic result of a column vector to a memory storage area selected on a basis of a predetermined selection rule from a plurality of memory storage areas; and a node memory storage area allocation unit to allocate an arithmetic result of a column vector to a memory storage area selected on a basis of a predetermined selecting rule from a plurality of memory storage areas.

Type: Grant

Filed: February 1, 2010

Date of Patent: November 12, 2013

Assignee: Fujitsu Limited

Inventor: Makoto Nakanishi
System for conjugate gradient linear iterative solvers

Patent number: 8577949

Abstract: A system for a conjugate gradient iterative linear solver that calculates the solution to a matrix equation comprises a plurality of gamma processing elements, a plurality of direction vector processing elements, a plurality of x-vector processing elements, an alpha processing element, and a beta processing element. The gamma processing elements may receive an A-matrix and a direction vector, and may calculate a q-vector and a gamma scalar. The direction vector processing elements may receive a beta scalar and a residual vector, and may calculate the direction vector. The x-vector processing elements may receive an alpha scalar, the direction vector, and the q-vector, and may calculate an x-vector and the residual vector. The alpha processing element may receive the gamma scalar and a delta scalar, and may calculate the alpha scalar. The beta processing element may receive the residual vector, and may calculate the delta scalar and the beta scalar.

Type: Grant

Filed: July 7, 2009

Date of Patent: November 5, 2013

Assignee: L-3 Communications Integrated Systems, L.P.

Inventors: Matthew P. DeLaquil, Deepak Prasanna, Antone L. Kusmanoff
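
Editor's sketch for the entry above (patent 8577949): the scalars and vectors named in the abstract (q-vector, gamma, delta, alpha, beta, direction vector, residual, x-vector) correspond to the quantities of the standard conjugate gradient iteration. The mapping below is an assumption based on the usual textbook form, with gamma = d·q, delta = r·r, alpha = delta / gamma, and beta = delta_new / delta_old; the function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(a, v):
    return [dot(row, v) for row in a]

def conjugate_gradient(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b for a symmetric positive definite matrix a."""
    x = [0.0] * len(b)
    r = list(b)                 # residual b - a x, with x = 0 initially
    d = list(r)                 # direction vector
    delta = dot(r, r)
    for _ in range(max_iter):
        if delta <= tol * tol:
            break
        q = matvec(a, d)        # gamma element: q-vector
        gamma = dot(d, q)       # gamma element: gamma scalar
        alpha = delta / gamma   # alpha element
        x = [xi + alpha * di for xi, di in zip(x, d)]   # x-vector element
        r = [ri - alpha * qi for ri, qi in zip(r, q)]   # new residual
        delta_new = dot(r, r)   # beta element: delta scalar
        beta = delta_new / delta
        d = [ri + beta * di for ri, di in zip(r, d)]    # direction element
        delta = delta_new
    return x
```

In exact arithmetic the loop terminates in at most n iterations for an n×n system; the `tol` and `max_iter` arguments are practical safeguards added for this example.
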
Optimized corner turns for local storage and bandwidth reduction

Patent number: 8554820

Abstract: A block matrix multiplication mechanism is provided for reversing the visitation order of blocks at corner turns when performing a block matrix multiplication operation in a data processing system. By reversing the visitation order, the mechanism eliminates a block load at the corner turns. In accordance with the illustrative embodiment, a corner return is referred to as a “bounce” corner turn and results in a serpentine patterned processing order of the matrix blocks. The mechanism allows the data processing system to perform a block matrix multiplication operation with a maximum of three block transfers per time step. Therefore, the mechanism reduces maximum throughput and increases performance. In addition, the mechanism also reduces the number of multi-buffered local store buffers.

Type: Grant

Filed: April 20, 2012

Date of Patent: October 8, 2013

Assignee: International Business Machines Corporation

Inventors: Daniel A. Brokenshire, John A. Gunnels, Michael D. Kistler
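
Editor's sketch for the entry above (patent 8554820): the benefit of reversing the visitation order at a corner turn can be seen in a deliberately simplified model. Assume that visiting result block (i, j) needs operand block-row i and operand block-column j, and that an operand is reloaded whenever its index changes between consecutive visits. This cost model and the function names are assumptions; the patent's actual buffer scheme is not reproduced here.

```python
def raster_order(rows, cols):
    """Visit every row of blocks left to right."""
    return [(i, j) for i in range(rows) for j in range(cols)]

def serpentine_order(rows, cols):
    """Reverse direction on every other row, so corner turns keep j fixed."""
    order = []
    for i in range(rows):
        js = range(cols) if i % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((i, j) for j in js)
    return order

def operand_loads(order):
    """Count operand block loads under the simplified cost model."""
    if not order:
        return 0
    loads = 2                                  # both operands for the first block
    for (i0, j0), (i1, j1) in zip(order, order[1:]):
        loads += (i0 != i1) + (j0 != j1)       # one load per index that changed
    return loads
```

Under this model a raster corner turn changes both indices and costs two loads, while the serpentine "bounce" changes only the row index and costs one.
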
MATRIX CALCULATION UNIT

Publication number: 20130262548

Abstract: A matrix calculation unit may include a matrix operation unit and a converting unit. The matrix operation unit may include functions to perform a matrix operation of a first size with respect to data stored in a memory, and to perform a matrix operation of a second size with respect to the data stored in the memory, where the second size is enlarged from the first size. The converting unit may convert in at least one direction in the memory between a data array suited for the matrix operation of the first size and a data array suited for the matrix operation of the second size.

Type: Application

Filed: February 27, 2013

Publication date: October 3, 2013

Applicants: FUJITSU SEMICONDUCTOR LIMITED, FUJITSU LIMITED

Inventors: Yi GE, Hiroshi HATANO, Kazuo HORIO
Modified Gram-Schmidt core implemented in a single field programmable gate array architecture

Patent number: 8543633

Abstract: A modified Gram-Schmidt QR decomposition core implemented in a single field programmable gate array (FPGA) comprises a converter configured to convert a complex fixed point input to a complex floating point input, dual port memory to hold complex entries of an input matrix, normalizer programmable logic module (PLM) to compute a normalization of a column vector. A second PLM performs complex, floating point multiplication on two input matrix columns. A scheduler diverts control of the QRD processing to the normalizer PLM or the second PLM. A top level state machine communicates with scheduler and monitors processing in normalizer PLM and second PLM and communicates the completion of operations to scheduler. A complex divider computes final column for output matrix Q using floating point arithmetic. Multiplexer outputs computed values as elements of output matrix Q or R. Complex floating point operations are performed in a parallel pipelined implementation reducing latencies.

Type: Grant

Filed: September 24, 2010

Date of Patent: September 24, 2013

Assignee: Lockheed Martin Corporation

Inventor: Luke A. Miller
QR decomposition in an integrated circuit device

Patent number: 8539016

Abstract: Circuitry speeds up the QR decomposition of a matrix. The circuitry can be provided in a fixed logic device, or can be configured into a programmable integrated circuit device such as a programmable logic device. This implementation performs Gram-Schmidt orthogonalization with no dependencies between iterations. QR decomposition of a matrix can be performed by processing entire columns at once as a vector operation. Data dependencies within and between matrix columns are removed, as later functions dependent on an earlier result may be generated from partial results somewhere in the datapath, rather than from an earlier completed result. Different passes through the matrix are timed so that different computations requiring the same functional units arrive at different time slots. After the Q matrix has been calculated, the R matrix may be calculated from the Q matrix by taking its transpose and multiplying the transpose by the original input matrix.

Type: Grant

Filed: February 9, 2010

Date of Patent: September 17, 2013

Assignee: Altera Corporation

Inventor: Martin Langhammer
Optimized corner turns for local storage and bandwidth reduction

Patent number: 8533251

Abstract: A block matrix multiplication mechanism is provided for reversing the visitation order of blocks at corner turns when performing a block matrix multiplication operation in a data processing system. By reversing the visitation order, the mechanism eliminates a block load at the corner turns. In accordance with the illustrative embodiment, a corner return is referred to as a “bounce” corner turn and results in a serpentine patterned processing order of the matrix blocks. The mechanism allows the data processing system to perform a block matrix multiplication operation with a maximum of three block transfers per time step. Therefore, the mechanism reduces maximum throughput and increases performance. In addition, the mechanism also reduces the number of multi-buffered local store buffers.

Type: Grant

Filed: May 23, 2008

Date of Patent: September 10, 2013

Assignee: International Business Machines Corporation

Inventors: Daniel A. Brokenshire, John A. Gunnels, Michael D. Kistler
Method and structure for producing high performance linear algebra routines using composite blocking based on L1 cache size

Patent number: 8527571

Abstract: A method (and structure) for performing a matrix subroutine, includes storing data for a matrix subroutine call in a computer memory in an increment block size that is based on a cache size.

Type: Grant

Filed: December 22, 2008

Date of Patent: September 3, 2013

Assignee: International Business Machines Corporation

Inventors: Fred Gehrung Gustavson, John A. Gunnels
Row-vector norm comparison method and row-vector norm comparison apparatus for inverse matrix

Patent number: 8521799

Abstract: Disclosed are a row-vector norm comparison method and a row-vector norm comparison apparatus for an inverse matrix. A row-vector norm comparison apparatus includes: an input matrix processing module that receives and combines constituent elements of a matrix; a cofactor operation module that multiplexes the combination result of the constituent elements to calculate factors constituting an adjoint matrix; a square calculation module that squares the calculated factors; a summation module that selects a predetermined number of factors among the squared factors and sums the selected factors to calculate the norms of row vectors in an inverse matrix; and a norm comparison module that outputs a comparison result of the calculated norms of the row vectors.

Type: Grant

Filed: June 30, 2008

Date of Patent: August 27, 2013

Assignees: Samsung Electronics Co., Ltd., Electronics and Telecommunications Research Institute

Inventors: Young Ha Lee, Seung Jae Bahng, Youn-Ok Park
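
Editor's sketch for the entry above (patent 8521799): because the inverse equals the adjoint (adjugate) matrix divided by the determinant, every row of the inverse is scaled by the same factor, so row norms of the inverse can be compared using squared cofactors alone. The example below follows the cofactor, square, sum, and compare stages named in the abstract; the recursive determinant and all function names are assumptions chosen for brevity, and the input is assumed to be at least 2×2.

```python
def minor(m, r, c):
    """Matrix m with row r and column c removed."""
    return [row[:c] + row[c + 1:] for k, row in enumerate(m) if k != r]

def det(m):
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det(minor(m, 0, c))
               for c in range(len(m)))

def adjugate_row_sq_norms(m):
    """Squared norms of the rows of the adjugate of m (n >= 2).

    Row i of the adjugate holds the cofactors C[j][i] for j = 0..n-1.
    """
    n = len(m)
    norms = []
    for i in range(n):
        total = 0
        for j in range(n):
            cof = (-1) ** (i + j) * det(minor(m, j, i))   # cofactor stage
            total += cof * cof                            # square and sum
        norms.append(total)
    return norms

def largest_inverse_row(m):
    """Index of the inverse-matrix row with the largest norm."""
    norms = adjugate_row_sq_norms(m)
    return max(range(len(norms)), key=norms.__getitem__)
```

No division or square root is needed for the comparison, since dividing every squared norm by the same squared determinant does not change their order.
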
Systolic array for matrix triangularization and back-substitution

Patent number: 8510364

Abstract: Methods for matrix processing and devices therefor are described. A systolic array in an integrated circuit is coupled to receive a first matrix as input; and is capable of operating in two modes, namely a triangularization mode and a back-substitution mode. The systolic array, when in a triangularization mode, is coupled to triangularize the first matrix to provide a second matrix. When in a back-substitution mode, the systolic array is coupled to invert the second matrix.

Type: Grant

Filed: September 1, 2009

Date of Patent: August 13, 2013

Assignee: Xilinx, Inc.

Inventors: Raghavendar M. Rao, Christopher H. Dick
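
Editor's sketch for the entry above (patent 8510364): the back-substitution mode inverts the triangular matrix produced by the triangularization mode. A sequential software version of that second step, assuming an upper-triangular input with non-zero diagonal and a hypothetical function name, is shown below; the systolic data flow itself is not modelled.

```python
def invert_upper_triangular(r):
    """Invert an upper-triangular matrix by back-substitution.

    Column j of the inverse solves r * x = e_j; entries below the
    diagonal of the inverse are zero, so only rows 0..j are computed.
    """
    n = len(r)
    inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(j, -1, -1):
            s = 1.0 if i == j else 0.0
            for k in range(i + 1, j + 1):
                s -= r[i][k] * inv[k][j]   # use entries already solved
            inv[i][j] = s / r[i][i]
    return inv
```

Each column is solved from the diagonal upward, so every entry depends only on entries computed earlier in the same column.
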
Combining multiple clusterings by soft correspondence

Patent number: 8499022

Abstract: Combining multiple clusterings arises in various important data mining scenarios. However, finding a consensus clustering from multiple clusterings is a challenging task because there is no explicit correspondence between the classes from different clusterings. Provided is a framework based on soft correspondence to directly address the correspondence problem in combining multiple clusterings. Under this framework, an algorithm iteratively computes the consensus clustering and correspondence matrices using multiplicative updating rules. This algorithm provides a final consensus clustering as well as correspondence matrices that gives intuitive interpretation of the relations between the consensus clustering and each clustering from clustering ensembles. Extensive experimental evaluations demonstrate the effectiveness and potential of this framework as well as the algorithm for discovering a consensus clustering from multiple clusterings.

Type: Grant

Filed: May 21, 2012

Date of Patent: July 30, 2013

Assignee: The Research Foundation of State University of New York

Inventors: Bo Long, Zhongfei Mark Zhang
Supervised nonnegative matrix factorization

Patent number: 8498949

Abstract: Supervised nonnegative matrix factorization (SNMF) generates a descriptive part-based representation of data, based on the concept of nonnegative matrix factorization (NMF) aided by the discriminative concept of graph embedding. An iterative procedure that optimizes suggested formulation based on Pareto optimization is presented. The present formulation removes any dependence on combined optimization schemes. Analytical and empirical evidence is presented to show that SNMF has advantages over popular subspace learning techniques as well as current state-of-the-art techniques.

Type: Grant

Filed: August 11, 2010

Date of Patent: July 30, 2013

Assignee: Seiko Epson Corporation

Inventors: Seung-il Huh, Mithun Das Gupta, Jing Xiao
Reordering discrete fourier transform outputs

Patent number: 8484275

Abstract: There is provided a method for generating a table for reordering the output of a Fourier transform, the Fourier transform being performed on a predefined number of input samples, the method comprising performing one or more decomposition stages on a sequence corresponding in number to the predefined number of input samples to form a representation of the output of the Fourier transform; wherein at least one of the decomposition stages comprises a composite operation that is equivalent to two or more operations; and rearranging the representation of the output of the Fourier transform to generate a reordering table.

Type: Grant

Filed: December 7, 2007

Date of Patent: July 9, 2013

Assignee: Altera Corporation

Inventors: Martin Langhammer, Neil Kenneth Thorne
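
As a rough illustration of what a reordering table for Fourier transform outputs can look like, the sketch below builds the bit-reversal table for a plain radix-2 transform and uses it to put outputs back in natural order. This is only the simplest special case and is not the method claimed in patent 8484275, which also covers composite decomposition stages. The function names `bit_reversal_table` and `reorder` are hypothetical.

```python
def bit_reversal_table(n: int) -> list[int]:
    """Return a table t where t[k] is k with its log2(n) bits reversed.

    Assumes n is a power of two (radix-2 case only).
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1
    table = []
    for k in range(n):
        rev = 0
        for b in range(bits):
            if k & (1 << b):
                # Bit b of k becomes bit (bits - 1 - b) of the reversed index.
                rev |= 1 << (bits - 1 - b)
        table.append(rev)
    return table


def reorder(values, table):
    """Place values in natural order: result[k] = values[table[k]]."""
    return [values[table[k]] for k in range(len(table))]
```

Because bit reversal is its own inverse, applying `reorder` twice with the same table returns the original sequence.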
Decoder and process therefor

Patent number: 8473540

Abstract: A decoder, such as for example an MMSE MIMO decoder, and a method for decoding are described. An input channel matrix is obtained, and an extended channel matrix of the input channel matrix is generated. The extended channel matrix is triangularized to provide a triangularized matrix, and the triangularized matrix is inverted to provide an inverted triangular matrix. A left matrix multiplication result matrix associated with multiplication of the input channel matrix and the inverted triangular matrix is generated, and a weight matrix from the left matrix multiplication result matrix and the inverted triangular matrix is generated. A received symbols matrix is obtained, and a weighted estimation is generated and output using the weight matrix and the received symbols matrix to provide an estimate of a transmit symbols matrix for output of estimated data symbols.

Type: Grant

Filed: September 1, 2009

Date of Patent: June 25, 2013

Assignee: Xilinx, Inc.

Inventors: Raghavendar M. Rao, Christopher H. Dick
Modified givens rotation for matrices with complex numbers

Patent number: 8473539

Abstract: Nulling a cell of a complex matrix is described. A complex matrix and a modified Givens rotation matrix are obtained for multiplication by a processing unit, such as a systolic array or a CPU, for example, for the nulling of the cell to provide a modified form of the complex matrix. The modified Givens rotation matrix includes complex numbers c*, c, −s, and s*, wherein the complex number s* is the complex conjugate of the complex number s, and wherein the complex number c* is the complex conjugate of the complex number c. The complex numbers c and s are associated with complex numbers of the complex matrix including the cell to be nulled. The modified form is then output by the processing unit. The modified Givens rotation matrix may be implemented as a systolic array or otherwise used for processing complex numbers or matrices.

Type: Grant

Filed: September 1, 2009

Date of Patent: June 25, 2013

Assignee: Xilinx, Inc.

Inventors: Raghavendar M. Rao, Christopher H. Dick
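
The nulling step in the abstract of patent 8473539 can be sketched numerically as follows. The sketch assumes the rotation matrix is laid out as [[c*, s*], [−s, c]] with c = a/r, s = b/r and r = sqrt(|a|² + |b|²), where a is the pivot entry and b is the cell to be nulled. The abstract names the four entries but not their arrangement, so the layout and the function names are assumptions.

```python
import math


def modified_givens(a: complex, b: complex):
    """Return (c, s) so that [[c*, s*], [-s, c]] maps (a, b) to (r, 0)."""
    r = math.hypot(abs(a), abs(b))
    if r == 0.0:
        # Both entries are already zero; fall back to the identity rotation.
        return 1 + 0j, 0j
    return a / r, b / r


def apply_rotation(c: complex, s: complex, x: complex, y: complex):
    """Multiply the assumed matrix [[c*, s*], [-s, c]] by the column (x, y)."""
    top = c.conjugate() * x + s.conjugate() * y
    bottom = -s * x + c * y
    return top, bottom
```

With this layout the surviving entry is the real number r, and the second entry is zero up to rounding. In a full triangularization the same (c, s) pair would be applied to every column of the two affected rows.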
MATRIX-BASED DYNAMIC PROGRAMMING

Publication number: 20130159372

Abstract: Embodiments relate to dynamic programming. An aspect includes representing a dynamic programming problem as a matrix of cells, each cell representing an intermediate score to be calculated. Another aspect includes providing a mapping assigning cells of the matrix to elements of a result container data structure, and storing cells of the matrix to elements of the result container data structure in accordance with the mapping. Another aspect includes calculating intermediate scores of all cells of the matrix, whereby intermediate scores of some of the cells of the matrix are stored to a respectively assigned element of the result container data structure in accordance with the mapping. Another aspect includes during the calculation of the intermediate scores, dynamically updating the assignment of cells and elements in the mapping and assembling a final result of the dynamic programming problem from the intermediate scores stored in the result container data structure.

Type: Application

Filed: November 30, 2012

Publication date: June 20, 2013

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: International Business Machines Corporation
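
To make the idea of mapping matrix cells to a smaller result container concrete, here is a minimal sketch using edit distance as the dynamic programming problem. The container holds only two rows, and the hypothetical mapping `slot` reassigns a container row to a new matrix row as the calculation advances. The choice of edit distance, the two-row container and all names are assumptions for illustration, not details taken from publication 20130159372.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a two-row result container."""
    cols = len(b) + 1
    container = [[0] * cols, [0] * cols]  # result container: two reusable rows

    def slot(i: int) -> int:
        # Mapping from matrix row i to a container row; rows are reused
        # once the row two steps back is no longer needed.
        return i % 2

    for j in range(cols):
        container[slot(0)][j] = j  # first matrix row: j insertions
    for i in range(1, len(a) + 1):
        cur, prev = container[slot(i)], container[slot(i - 1)]
        cur[0] = i  # first matrix column: i deletions
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # deletion
                         cur[j - 1] + 1,      # insertion
                         prev[j - 1] + cost)  # match or substitution
    # The final result is assembled from the last stored cell.
    return container[slot(len(a))][cols - 1]
```

Only the intermediate scores still needed by later cells are kept, so memory grows with the length of one string rather than with the full matrix.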
MATRIX STORAGE FOR SYSTEM IDENTIFICATION

Publication number: 20130159373

Abstract: A sparse matrix used in the least-squares method is divided into small matrices in accordance with the number of elements of observation. An observation ID is assigned to each element of observation, a parameter ID is assigned to each parameter, and the IDs are associated with parameters of elements as ID mapping. A system determines positions of nonzero elements in accordance with whether or not ID mapping exists, the correspondence between observation IDs and parameter IDs, and the positions of the small matrices, and selects a storage scheme for each small matrix based thereon. The system selects a storage scheme in accordance with conditions, such as whether or not a target element is a diagonal element, whether or not a term decided without ID mapping exists, and whether or not the same ID mapping is referred to.

Type: Application

Filed: December 18, 2012

Publication date: June 20, 2013

Applicant: International Business Machines Corporation

Inventor: International Business Machines Corporation
Optical processor including windowed optical calculations architecture

Patent number: 8463838

Abstract: A windowed optical calculation architecture and process that efficiently performs high speed multi-element multiply and accumulates on a digital data stream. A data point from a digital data stream is impressed onto an optical source to create an optical value. The optical value is split into a number of branches equaling the number of elements used in the calculation. In each branch, the optical value is modulated to reflect the coefficients in the calculation. Then, depending upon the branch, the optical value is delayed depending on its position in the calculation, with optical values at the beginning of the calculation being delayed longer than optical values at the end of the calculation. The outputs from the branches are coupled together to perform an optical sum, and passed to detection/analog-digital conversion circuitry to convert the optical result to a digital result.

Type: Grant

Filed: October 28, 2009

Date of Patent: June 11, 2013

Assignee: Lockheed Martin Corporation

Inventor: Brian L Ulhorn
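
A purely digital analogue of the windowed multiply-and-accumulate in patent 8463838 might look like the sketch below. Each incoming data point is "split" across one branch per coefficient, scaled, and added into the output window it belongs to; the offset `t - k` plays the role of the per-branch delay. The function name and the list-based representation are assumptions, and nothing here models the optical hardware.

```python
def windowed_mac(stream, coeffs):
    """Sliding multiply-accumulate: out[n] = sum_k coeffs[k] * stream[n + k]."""
    taps = len(coeffs)
    out = [0.0] * (len(stream) - taps + 1)  # empty if the stream is too short
    for t, x in enumerate(stream):           # one data point at a time
        for k, c in enumerate(coeffs):       # one branch per window element
            n = t - k                        # window that uses x as element k
            if 0 <= n < len(out):
                out[n] += c * x              # summing the branch outputs
    return out
```

The first element of a window (k = 0) arrives earliest and so waits longest for the rest of the window, which matches the abstract's remark that values at the beginning of the calculation are delayed longer than those at the end.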
MINIMUM MEAN SQUARE ERROR PROCESSING

Publication number: 20130138712

Abstract: A first systolic array receives an input set of time division multiplexed matrices from a plurality of channel matrices. In a first mode, the first systolic array performs triangularization on the input matrices, producing a first set of matrices, and in a second mode performs back-substitution on the first set, producing a second set of matrices. In a first mode, a second systolic array performs left multiplication on the second set of matrices with the input set of matrices, producing a third set of matrices. In a second mode, the second systolic array performs cross diagonal transposition on the third set of matrices, producing a fourth set of matrices, and performs right multiplication on the second set of matrices with the fourth set of matrices. The first systolic array switches from the first mode to the second mode after the triangularization, and the second systolic array switches from the first mode to the second mode after the left multiplication.

Type: Application

Filed: January 28, 2013

Publication date: May 30, 2013

Applicant: XILINX, INC.

Inventor: XILINX, INC.
QUANTIFYING METHOD FOR INTRINSIC DATA TRANSFER RATE OF ALGORITHMS

Publication number: 20130124593

Abstract: The quantifying method for intrinsic data transfer rate of algorithms is provided. The provided quantifying method for an intrinsic data transfer rate includes steps of: detecting whether or not a datum is used; providing a dataflow graph G including n vertices and m edges, and a Laplacian matrix L having i×j elements L(i,j) when the datum is not reused, wherein each of the vertices represents one of an operation and a datum, each of the edges represents a data transfer, and vi is the ith vertex; and using the Laplacian matrix L to estimate a maximum quantity of the intrinsic data transfer rate.

Type: Application

Filed: July 20, 2011

Publication date: May 16, 2013

Applicant: NATIONAL CHENG KUNG UNIVERSITY

Inventors: Gwo Giun Lee, He-Yuan Lin
Systolic array for cholesky decomposition

Patent number: 8443031

Abstract: A systolic array for Cholesky decomposition of an N×N matrix is described. A plurality of processing cells, including a corner cell, N−1 boundary cells, and (N²−3N+2)/2 internal cells, are arranged into N−1 rows and N columns of processing cells. Each row of processing cells is configured to calculate elements of a respective column of a lower triangular output matrix. Each processing cell of each row is configured to determine a value of a respective element of the lower triangular output matrix using a value of an element calculated in a previous processing cell of the row.

Type: Grant

Filed: July 19, 2010

Date of Patent: May 14, 2013

Assignee: Xilinx, Inc.

Inventor: Raghavendar M. Rao
Multiplication circuit and de/encryption circuit utilizing the same

Patent number: 8443032

Abstract: A multiplication circuit generates a product of a matrix and a first scalar when in matrix mode and a product of a second scalar and a third scalar when in scalar mode. The multiplication circuit comprises a sub-product generator, an accumulator and an adder. The adder is configured to sum outputs of the accumulator to generate the product of the second scalar and the third scalar when in scalar mode. The sub-product generator generates sub-products of the matrix and the first scalar when in matrix mode and sub-products of the second scalar and the third scalar when in scalar mode. The accumulator is configured to generate the product of the matrix and the first scalar by providing save of the multiplication operation of the outputs from the sub-product generator.

Type: Grant

Filed: March 27, 2008

Date of Patent: May 14, 2013

Assignee: National Tsing Hua University

Inventors: Chen Hsing Wang, Chieh Lin Chuang, Cheng Wen Wu
Methods and apparatus for signature prediction and feature level fusion

Patent number: 8433741

Abstract: A system for signature prediction and feature-level fusion of a target according to various aspects of the present invention includes a first sensing modality for providing a measured data set. The system further includes a processor receiving the measured data set and generating a first k-orthogonal spanning tree constructed from k orthogonal minimal spanning trees having no edge shared between the k minimal spanning trees to define a first data manifold. A method for signature prediction and feature-level fusion of a target according to various aspects of the present invention includes generating a first manifold by developing a connected graph of data from a first sensing modality using a first k-orthogonal spanning tree, generating a second manifold by developing a second connected graph of data from a second sensing modality using a second k-orthogonal spanning tree, and aligning the first manifold and the second manifold to generate a joint-signature manifold in a common embedding space.

Type: Grant

Filed: June 5, 2008

Date of Patent: April 30, 2013

Assignee: Raytheon Company

Inventors: Donald E. Waagen, Samantha S. Livingston, Nitesh N. Shah
Matrix decomposition in an integrated circuit device

Patent number: 8396914

Abstract: Circuitry speeds up the Cholesky decomposition of a matrix. The circuitry can be provided in a fixed logic device, or can be configured into a programmable integrated circuit device such as a programmable logic device. The circuitry implements the following equation: $l_{ij} = \dfrac{a_{ij} - \langle L_i, L_j \rangle}{\sqrt{a_{jj} - \langle L_j, L_j \rangle}}$ When any $l_{ij}$ term is calculated this way, the latency in calculating the $l_{jj}$ term in the denominator has little or no effect on the $l_{ij}$ term calculation. And if the calculations are properly pipelined, once the pipeline is filled, a new term can be output on each clock cycle or every few clock cycles.

Type: Grant

Filed: September 11, 2009

Date of Patent: March 12, 2013

Assignee: Altera Corporation

Inventor: Martin Langhammer
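
For readers who want to see the Cholesky recurrence behind patent 8396914 in software form, the sketch below computes each entry as (a_ij − ⟨L_i, L_j⟩) divided by the square root of (a_jj − ⟨L_j, L_j⟩). It assumes ⟨L_i, L_j⟩ means the inner product of the first j entries of rows i and j of the lower triangular factor. That reading, the real-valued input and the function name are assumptions; the code is sequential and says nothing about the pipelining the abstract describes.

```python
import math


def cholesky(a):
    """Return lower triangular l with l * l^T == a for a symmetric
    positive definite matrix given as a list of rows."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Denominator term: a_jj - <L_j, L_j>, taken over columns 0..j-1.
        diag = a[j][j] - sum(l[j][k] * l[j][k] for k in range(j))
        if diag <= 0.0:
            raise ValueError("matrix is not positive definite")
        root = math.sqrt(diag)
        for i in range(j, n):
            # Numerator term: a_ij - <L_i, L_j>, taken over columns 0..j-1.
            dot = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - dot) / root
    return l
```

Note that the numerator for each row i in column j does not need the finished diagonal entry, only the division does, which is the property the abstract relies on to hide the square-root latency.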
Programmable matrix processor

Patent number: 8392487

Abstract: A matrix processor and processing method, the processor including a data encoder for receiving an input data stream; a data controller coupled to the data encoder for arranging the input data in an operand matrix, at least one processing unit for processing the data in matrix form by Boolean matrix-matrix multiplication with a selected operator matrix, and an output control module coupled to the processing unit for outputting desired results therefrom.

Type: Grant

Filed: March 31, 2008

Date of Patent: March 5, 2013

Assignee: Compass Electro-Optical Systems Ltd

Inventors: Michael Mesh, Michael Laor, Alexander Zeltser
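
Boolean matrix-matrix multiplication, the core operation named in the abstract of patent 8392487, replaces multiply with AND and add with OR. A minimal software sketch follows; the function name and the list-of-rows representation with 0/1 entries are assumptions.

```python
def boolean_matmul(a, b):
    """Boolean product: out[i][j] = OR over k of (a[i][k] AND b[k][j])."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0]) if b else 0
    return [
        [int(any(a[i][k] and b[k][j] for k in range(inner)))
         for j in range(cols)]
        for i in range(len(a))
    ]
```

Choosing the operator matrix selects the transformation: a permutation matrix reorders the bits of each operand row, while a row of ones ORs a whole row together.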
Determining index values for bits of binary vector by processing masked sub-vector index values

Patent number: 8392692

Abstract: In one embodiment, the present invention determines index values corresponding to bits of a binary vector that have a value of 1. During each clock cycle, a masking technique is applied to M sub-vector index values, where each sub-vector index value corresponds to a different bit of a sub-vector of the binary vector. The masking technique is applied such that (i) the sub-vector index values that correspond to bits having a value of 0 are zeroed out and (ii) the sub-vector index values that correspond to the bits having a value of 1 are left unchanged. The masked sub-vector index values are sorted, and index values are calculated based on the masked sub-vector index values. The index values generated are then distributed uniformly to a number M of index memories such that the M index memories store substantially the same number of index values.

Type: Grant

Filed: December 12, 2008

Date of Patent: March 5, 2013

Assignee: LSI Corporation

Inventor: Kiran Gunnam
System, method, and computer program product for assigning elements of a matrix to processing threads with increased contiguousness

Patent number: 8380778

Abstract: A system, method, and computer program product are provided for assigning elements of a matrix to processing threads. In use, a matrix is received to be processed by a parallel processing architecture. Such parallel processing architecture includes a plurality of processors each capable of processing a plurality of threads. Elements of the matrix are assigned to each of the threads for processing, utilizing an algorithm that increases a contiguousness of the elements being processed by each thread.

Type: Grant

Filed: October 25, 2007

Date of Patent: February 19, 2013

Assignee: NVIDIA Corporation

Inventors: William N. Bell, Michael J. Garland
Sparse matrix-vector multiplication on graphics processor units

Patent number: 8364739

Abstract: Techniques for optimizing sparse matrix-vector multiplication (SpMV) on a graphics processing unit (GPU) are provided. The techniques include receiving a sparse matrix-vector multiplication, analyzing the sparse matrix-vector multiplication to identify one or more optimizations, wherein analyzing the sparse matrix-vector multiplication to identify one or more optimizations comprises analyzing a non-zero pattern for one or more optimizations and determining whether the sparse matrix-vector multiplication is to be reused across computation, optimizing the sparse matrix-vector multiplication, wherein optimizing the sparse matrix-vector multiplication comprises optimizing global memory access, optimizing shared memory access and exploiting reuse and parallelism, and outputting an optimized sparse matrix-vector multiplication.

Type: Grant

Filed: September 30, 2009

Date of Patent: January 29, 2013

Assignee: International Business Machines Corporation

Inventors: Muthu M. Baskaran, Rajesh J. Bordawekar
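
As a baseline for the operation that patent 8364739 optimizes, the sketch below is a plain sequential sparse matrix-vector product. It assumes the compressed sparse row (CSR) layout, which the abstract does not name, and it includes none of the GPU memory-access optimizations the patent describes. The function and parameter names are hypothetical.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : non-zero entries, row by row
    col_idx : column index of each entry in values
    row_ptr : row r occupies values[row_ptr[r]:row_ptr[r + 1]]
    """
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for p in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[p] * x[col_idx[p]]  # indirect read of x
        y.append(acc)
    return y
```

The indirect read `x[col_idx[p]]` is where the non-zero pattern determines memory behavior, which is why analyzing that pattern is a natural first step before choosing optimizations.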
Fast singular value decomposition for expediting computer analysis system and application thereof

Patent number: 8341205

Abstract: The present invention uses a computer analysis system of a fast singular value decomposition to overcome the bottleneck of a traditional singular value decomposition that takes much computing time for decomposing a huge number of objects, and the invention can also process a matrix in any form without being limited to symmetric matrixes only. The decomposition and subgroup concept of the fast singular value decomposition works together with the decomposition of a variance matrix and the adjustment of an average vector of a column vector are used for optimizing the singular value decomposition to improve the overall computing speed of the computer analysis system.

Type: Grant

Filed: July 2, 2008

Date of Patent: December 25, 2012

Assignee: Everspeed Technology Limited

Inventor: Jengnan Tzeng
