Matrix Array Patents (Class 708/520)
  • Patent number: 10395381
    Abstract: Disclosed techniques relate to forming a block sum of picture elements by employing a vector dot product instruction on packed picture elements and a mask, producing a vector of masked horizontal picture element sums. The block sum is formed from plural horizontal sums via vector single instruction multiple data (SIMD) addition.
    Type: Grant
    Filed: March 4, 2019
    Date of Patent: August 27, 2019
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Jayasree Sankaranarayanan, Dipan Kumar Mandal
  • Patent number: 10346507
    Abstract: Embodiments of the present invention are directed to methods and systems for performing block sparse matrix-vector multiplications with improved efficiency through the use of a specific re-ordering of the matrix data such that matrix symmetry can be exploited while simultaneously avoiding atomic memory operations or the need for inefficient memory operations in general. One disclosed method includes reordering the matrix data such that, for any column of non-transpose data, and for any row of transpose data simultaneously processed within a single thread-block on a GPU, all matrix elements update independent elements of the output vector. Using the method, the amount of data required to represent the sparse matrix can be reduced by as much as 50%, thereby doubling the effective performance on the GPU, and doubling the size of the matrix that can be accelerated by the GPU.
    Type: Grant
    Filed: October 26, 2017
    Date of Patent: July 9, 2019
    Assignee: Nvidia Corporation
    Inventor: Steve Rennich
  • Patent number: 10310812
    Abstract: Mechanisms are provided for performing a matrix operation. A processor of a data processing system is configured to perform cluster-based matrix reordering of an input matrix. An input matrix, which comprises nodes associated with elements of the matrix, is received. The nodes are clustered into clusters based on numbers of connections with other nodes within and between the clusters, and the clusters are ordered by minimizing a total length of cross cluster connections between nodes of the clusters, to thereby generate a reordered matrix. A lookup table is generated identifying new locations of nodes of the input matrix, in the reordered matrix. A matrix operation is then performed based on the reordered matrix and the lookup table.
    Type: Grant
    Filed: February 6, 2017
    Date of Patent: June 4, 2019
    Assignee: International Business Machines Corporation
    Inventors: Emrah Acar, Rajesh R. Bordawekar, Michele M. Franceschini, Luis A. Lastras-Montano, Ruchir Puri, Haifeng Qian, Livio B. Soares
  • Patent number: 10304008
    Abstract: Systems and methods are disclosed for operating a machine, by receiving training data from one or more sensors; training a machine learning module with the training data by: partitioning a data matrix into smaller submatrices to process in parallel and optimized for each processing node; for each submatrix, performing a greedy search for rank-one solutions; using alternating direction method of multipliers (ADMM) to ensure consistency over different data blocks; and controlling one or more actuators using live data and the learned module during operation.
    Type: Grant
    Filed: March 7, 2016
    Date of Patent: May 28, 2019
    Assignee: NEC Corporation
    Inventors: Renqiang Min, Dongjin Song
  • Patent number: 10275392
    Abstract: A data processing device includes a two-dimensional structure including a plurality of stages in a vertical direction, the stages each including basic units in a horizontal direction such that the number of the basic units is equal to the number of ways. The basic units each include a memory block having a plurality of ports, an address generator for the ports of the memory block, and a calculation unit.
    Type: Grant
    Filed: April 6, 2016
    Date of Patent: April 30, 2019
    Assignee: NATIONAL UNIVERSITY CORPORATION NARA INSTITUTE OF SCIENCE AND TECHNOLOGY
    Inventors: Yasuhiko Nakashima, Shinya Takamaeda
  • Patent number: 10248426
    Abstract: Techniques are disclosed for restoring register data in a processor. In one embodiment, a method includes receiving an instruction to flush one or more general purpose registers (GPRs) in a processor. The method also includes determining history buffer entries of a history buffer to be restored to the one or more GPRs. The method includes creating a mask vector that indicates which history buffer entries will be restored to the one or more GPRs. The method further includes restoring the indicated history buffer entries to the one or more GPRs. As each indicated history buffer entry is restored, the method includes updating the mask vector to indicate which history buffer entries have been restored.
    Type: Grant
    Filed: May 24, 2016
    Date of Patent: April 2, 2019
    Assignee: International Business Machines Corporation
    Inventors: Brian D. Barrick, Steven J. Battle, Joshua W. Bowman, Christopher M. Mueller, Dung Q. Nguyen, David R. Terry, Eula Faye Tolentino, Jing Zhang
  • Patent number: 10191749
    Abstract: Single Instruction, Multiple Data (SIMD) technologies are described. A processing device can include a processor core and a memory. The processor core can receive, from a software application, a request to perform an operation on a first set of variables that includes a first input value and a register value and perform the operation on a second set of variables that includes a second input value and the first register value. The processor core can vectorize the operation on the first set of variables and the second set of variables. The processor core can perform the operation on the first set of variables and the second set of variables in parallel to obtain a first operation value and a second operation value. The processor core can perform a horizontal add operation on the first operation value and the second operation value and write the result to memory.
    Type: Grant
    Filed: December 24, 2015
    Date of Patent: January 29, 2019
    Assignee: Intel Corporation
    Inventors: Jun Jin, Elmoustapha Ould-Ahmed-Vall
  • Patent number: 10191744
    Abstract: Systems, methods, and apparatuses relating to element sorting of vectors are described. In one embodiment, a processor includes a decoder to decode an instruction into a decoded instruction; and an execution unit to execute the decoded instruction to: provide storage for a comparison matrix to store a comparison value for each element of an input vector compared against the other elements of the input vector, perform a comparison operation on elements of the input vector corresponding to storage of comparison values above a main diagonal of the comparison matrix, perform a different operation on elements of the input vector corresponding to storage of comparison values below the main diagonal of the comparison matrix, and store results of the comparison operation and the different operation in the comparison matrix.
    Type: Grant
    Filed: July 1, 2016
    Date of Patent: January 29, 2019
    Assignee: Intel Corporation
    Inventors: Mikhail Plotnikov, Igor Ermolaev
  • Patent number: 10169239
    Abstract: A prefetch request having a priority assigned thereto is obtained, based on executing a prefetch instruction included within a program. Based on obtaining the prefetch request, a determination is made as to whether the prefetch request may be placed on a prefetch queue. This determination includes determining whether the prefetch queue is full; checking, based on determining the prefetch queue is full, whether the priority of the prefetch request is considered a high priority; determining, based on the checking indicating the priority of the prefetch request is considered a high priority, whether another prefetch request on the prefetch queue may be removed; removing the other prefetch request from the prefetch queue, based on determining the other prefetch request may be removed; and adding the prefetch request to the prefetch queue, based on removing the other prefetch request.
    Type: Grant
    Filed: July 20, 2016
    Date of Patent: January 1, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Michael K. Gschwind, Christian Jacobi, Anthony Saporito, Chung-Lung K. Shum, Timothy J. Slegel
  • Patent number: 10162752
    Abstract: A method for storing data at contiguous memory addresses includes, at a single-instruction-multiple-data (SIMD) processor, executing a parallel-prefix valid count instruction to determine a first offset of a first data vector and to determine a second offset of a second data vector that includes valid data and invalid data. The second offset is based on the first offset and a number of positions in the first data vector that are associated with valid data. The method also includes storing first valid data from the first data vector at a first memory address of a memory and storing second valid data from the second data vector at a particular memory address of the memory. The first memory address is based on the first offset and the particular memory address is based on the second offset.
    Type: Grant
    Filed: September 22, 2016
    Date of Patent: December 25, 2018
    Assignee: QUALCOMM Incorporated
    Inventors: Eric Mahurin, David Hoyle
  • Patent number: 10146740
    Abstract: A computer implemented method is provided for processing sparse data. A sparse data set is received. A modified sparse data set is calculated by replacing all nonzero values in the sparse data set with a common positive integer. The modified sparse data set is transposed to create a transposed data set. A covariance matrix is calculated by multiplying the transposed data set by the modified sparse data set. A tree of a predefined depth is generated by assigning columns of the sparse data set to right and left nodes based on co-occurrence with a first anchor column and a second anchor column. The first anchor column and the second anchor column are determined based on the covariance matrix.
    Type: Grant
    Filed: March 8, 2017
    Date of Patent: December 4, 2018
    Assignee: Symantec Corporation
    Inventors: Nikolaos Vasiloglou, Andrew B. Gardner
  • Patent number: 10097834
    Abstract: A method of encoding image data, including: frequency-transforming input image data to generate an array of frequency-transformed input image coefficients by a matrix-multiplication process, according to a maximum dynamic range of the transformed data and using transform matrices having a data precision; and selecting the maximum dynamic range and/or the data precision of the transform matrices according to the bit depth of the input image data.
    Type: Grant
    Filed: April 4, 2014
    Date of Patent: October 9, 2018
    Assignee: Sony Corporation
    Inventors: David Berry, James Alexander Gamei, Nicholas Ian Saunders, Karl James Sharman
  • Patent number: 10042814
    Abstract: A device, system and method for assigning values to elements in a first register, where each data field in a first register corresponds to a data element to be written into a second register, and where for each data field in the first register, a first value may indicate that the corresponding data element has not been written into the second register and a second value indicates that the corresponding data element has been written into the second register, reading the values of each of the data fields in the first register, and for each data field in the first register having the first value, gathering the corresponding data element and writing the corresponding data element into the second register, and changing the value of the data field in the first register from the first value to the second value. Other embodiments are described and claimed.
    Type: Grant
    Filed: November 14, 2014
    Date of Patent: August 7, 2018
    Assignee: Intel Corporation
    Inventors: Eric Sprangle, Anwar Rohillah, Robert Cavin, Andrew T. Forsyth, Michael Abrash
  • Patent number: 10038918
    Abstract: A video encoding method, a video encoding apparatus, a video decoding method, and a video decoding apparatus are provided. The video encoding method includes producing a fast transform matrix based on a transform matrix which is used for frequency transformation on a block which has a predetermined size; producing a transformed block by transforming the block having the predetermined size by using the fast transform matrix; and performing scaling with respect to the transformed block in order to correct a difference between the transform matrix used for the frequency transformation and the fast transform matrix.
    Type: Grant
    Filed: September 11, 2017
    Date of Patent: July 31, 2018
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Yoon-Mi Hong, Woo-Jin Han, Min-Su Cheon, Jianle Chen
  • Patent number: 9984041
    Abstract: A batched Cholesky decomposition method, system, and non-transitory computer readable medium for a Graphics Processing Unit (GPU) including at least a first problem and a second problem, include mirroring a second problem matrix of the second problem to a first problem matrix of the first problem, combining the first problem matrix and the mirrored second problem matrix into a single problem matrix, and allocating data read to a thread and to the first problem and the second problem, respectively.
    Type: Grant
    Filed: June 30, 2016
    Date of Patent: May 29, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Minsik Cho, David Shing-ki Kung, Ruchir Puri
  • Patent number: 9934195
    Abstract: A multicore processor is achieved by a processor assembly, comprising a first processor having a first core and at least a first and a second unit, each being selected from the group of vector execution units, memory units and accelerators, said first core and first and second units being interconnected by a first network, and a second processor having a second core wherein the first core is arranged to enable the second core to control at least one of the units in the first processor. Each processor generally comprises a combination of execution units, memory units and accelerators, which may be controlled and/or accessed by units in the other processor.
    Type: Grant
    Filed: November 28, 2012
    Date of Patent: April 3, 2018
    Assignee: Mediatek Sweden AB
    Inventors: Anders Nilsson, Eric Tell
  • Patent number: 9870338
    Abstract: Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed compression and repeat in response to a single vector packed compression and repeat instruction that includes a first and second source vector register operand, a destination vector register operand, and an opcode are described.
    Type: Grant
    Filed: December 23, 2011
    Date of Patent: January 16, 2018
    Assignee: Intel Corporation
    Inventors: Elmoustapha Ould-Ahmed-Vall, Thomas Willhalm
  • Patent number: 9858079
    Abstract: A method and system are described for generating reference tables in object code which specify the addresses of branches, routines called, and data references used by routines in the code. In a suitably equipped processing system, the reference tables can be passed to a memory management processor which can open the appropriate memory pages to expedite the retrieval of data referenced in the execution pipeline. The disclosed method and system create such reference tables at the beginning of each routine so that the table can be passed to the memory management processor in a suitably equipped processor. Resulting object code also allows processors lacking a suitable memory management processor to skip the reference table, preserving upward compatibility.
    Type: Grant
    Filed: October 19, 2015
    Date of Patent: January 2, 2018
    Assignee: Micron Technology, Inc.
    Inventor: Dean A. Klein
  • Patent number: 9846581
    Abstract: A clock-less asynchronous processor comprising a plurality of parallel asynchronous processing logic circuits, each processing logic circuit configured to generate an instruction execution result. The processor comprises an asynchronous instruction dispatch unit coupled to each processing logic circuit, the instruction dispatch unit configured to receive multiple instructions from memory and dispatch individual instructions to each of the processing logic circuits. The processor comprises a crossbar coupled to an output of each processing logic circuit and to the dispatch unit, the crossbar configured to store the instruction execution results.
    Type: Grant
    Filed: September 8, 2014
    Date of Patent: December 19, 2017
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Tao Huang, Yiqun Ge, Qifan Zhang, Wuxian Shi, Wen Tong
  • Patent number: 9830302
    Abstract: Systems and methods for multiplying a sparse matrix by a vector using a single instruction multiple data (SIMD) architecture are provided. An example method includes sorting rows of the sparse matrix by a number of non-zero elements in the rows to generate sorted rows. The sorted rows are split to generate groups of the sorted rows. The number of rows in each group of the sorted rows is equal to the number of rows updated in parallel. The method allows for packing the sorted rows in each of the groups to generate packed rows. Each of the packed rows within the same group has the same length. Per clock cycle, C elements of the packed rows and data for selecting elements of the vector are provided to computational units in the SIMD architecture, where C is the number of computational units.
    Type: Grant
    Filed: April 13, 2015
    Date of Patent: November 28, 2017
    Assignee: Knowles Electronics, LLC
    Inventor: Leonardo Rub
  • Patent number: 9805001
    Abstract: Methods, systems, and apparatus, including a system for transforming sparse elements to a dense matrix. The system is configured to receive a request for an output matrix based on sparse elements including sparse elements associated with a first dense matrix and sparse elements associated with a second dense matrix; obtain the sparse elements associated with the first dense matrix fetched by a first group of sparse element access units; obtain the sparse elements associated with the second dense matrix fetched by a second group of sparse element access units; and transform the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix to generate the output dense matrix that includes the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix.
    Type: Grant
    Filed: February 5, 2016
    Date of Patent: October 31, 2017
    Assignee: Google Inc.
    Inventors: Ravi Narayanaswami, Rahul Nagarajan, Dong Hyuk Woo, Christopher Daniel Leary
  • Patent number: 9798701
    Abstract: Methods, systems, and apparatus, including a system for transforming sparse elements to a dense matrix. The system is configured to receive a request for an output matrix based on sparse elements including sparse elements associated with a first dense matrix and sparse elements associated with a second dense matrix; obtain the sparse elements associated with the first dense matrix fetched by a first group of sparse element access units; obtain the sparse elements associated with the second dense matrix fetched by a second group of sparse element access units; and transform the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix to generate the output dense matrix that includes the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: October 24, 2017
    Assignee: Google Inc.
    Inventors: Ravi Narayanaswami, Rahul Nagarajan, Dong Hyuk Woo, Christopher Daniel Leary
  • Patent number: 9798594
    Abstract: Disclosed herein are shared memory systems that use a combination of SBR and MRRR techniques to calculate eigenpairs for dense matrices having very large numbers of rows and columns. The disclosed system allows for the use of a highly scalable tridiagonal eigensolver. The disclosed system likewise allows for allocating a different number of threads to each of the different computational stages of the eigensolver.
    Type: Grant
    Filed: January 17, 2017
    Date of Patent: October 24, 2017
    Assignee: Hewlett Packard Enterprise Development LP
    Inventor: Cheng Liao
  • Patent number: 9792125
    Abstract: A TRANSACTION BEGIN instruction begins execution of a transaction and includes a general register save mask having bits, that when set, indicate registers to be saved in the event the transaction is aborted. At the beginning of the transaction, contents of the registers are saved in memory not accessible to the program, and if the transaction is aborted, the saved contents are copied to the registers.
    Type: Grant
    Filed: May 20, 2016
    Date of Patent: October 17, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Dan F. Greiner, Christian Jacobi, Timothy J. Slegel
  • Patent number: 9788013
    Abstract: A video encoding method, a video encoding apparatus, a video decoding method, and a video decoding apparatus are provided. The video encoding method includes producing a fast transform matrix based on a transform matrix which is used for frequency transformation on a block which has a predetermined size; producing a transformed block by transforming the block having the predetermined size by using the fast transform matrix; and performing scaling with respect to the transformed block in order to correct a difference between the transform matrix used for the frequency transformation and the fast transform matrix.
    Type: Grant
    Filed: May 9, 2016
    Date of Patent: October 10, 2017
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Yoon-Mi Hong, Woo-Jin Han, Min-Su Cheon, Jianle Chen
  • Patent number: 9778907
    Abstract: A microprocessor performs a fused multiply-accumulate operation of a form ±A*B±C using first and second execution units. An input operand analyzer circuit determines whether values of A, B and/or C meet a sufficient condition to perform a joint accumulation of C with partial products of A and B. The first instruction execution unit multiplies A and B and jointly accumulates C to partial products of A and B when the values of A, B and/or C meet a sufficient condition to perform a joint accumulation of C with the partial products of A and B. The second instruction execution unit separately accumulates C to the products of A and B when the values of A, B and/or C do not meet a sufficient condition to perform a joint accumulation of C with the partial products of A and B.
    Type: Grant
    Filed: June 24, 2015
    Date of Patent: October 3, 2017
    Assignee: VIA ALLIANCE SEMICONDUCTOR CO., LTD.
    Inventor: Thomas Elmer
  • Patent number: 9772973
    Abstract: Methods, systems, and apparatus, including a system for transforming sparse elements to a dense matrix. The system is configured to receive a request for an output matrix based on sparse elements including sparse elements associated with a first dense matrix and sparse elements associated with a second dense matrix; obtain the sparse elements associated with the first dense matrix fetched by a first group of sparse element access units; obtain the sparse elements associated with the second dense matrix fetched by a second group of sparse element access units; and transform the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix to generate the output dense matrix that includes the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix.
    Type: Grant
    Filed: February 5, 2016
    Date of Patent: September 26, 2017
    Assignee: Google Inc.
    Inventors: Ravi Narayanaswami, Rahul Nagarajan, Dong Hyuk Woo, Christopher Daniel Leary
  • Patent number: 9772974
    Abstract: Methods, systems, and apparatus, including a system for transforming sparse elements to a dense matrix. The system is configured to receive a request for an output matrix based on sparse elements including sparse elements associated with a first dense matrix and sparse elements associated with a second dense matrix; obtain the sparse elements associated with the first dense matrix fetched by a first group of sparse element access units; obtain the sparse elements associated with the second dense matrix fetched by a second group of sparse element access units; and transform the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix to generate the output dense matrix that includes the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix.
    Type: Grant
    Filed: December 22, 2016
    Date of Patent: September 26, 2017
    Assignee: Google Inc.
    Inventors: Ravi Narayanaswami, Rahul Nagarajan, Dong Hyuk Woo, Christopher Daniel Leary
  • Patent number: 9679247
    Abstract: A method of building a soft linkage between a plurality of graphs includes initializing a correspondence between type-1 and type-2 objects in the plurality of graphs, and reducing a cost function by alternately updating the type-1 correspondence and updating the type-2 correspondence.
    Type: Grant
    Filed: September 19, 2013
    Date of Patent: June 13, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Danai Koutra, David M. Lubensky, Hanghang Tong
  • Patent number: 9652374
    Abstract: Embodiments of the invention relate to sparsity-driven matrix representation. In one embodiment, a sparsity of a matrix is determined and the sparsity is compared to a threshold. Computer memory is allocated to store the matrix in a first data structure format based on the sparsity being greater than the threshold. Computer memory is allocated to store the matrix in a second data structure format based on the sparsity not being greater than the threshold.
    Type: Grant
    Filed: August 31, 2016
    Date of Patent: May 16, 2017
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Berthold Reinwald, Shirish Tatikonda, Yuanyuan Tian
  • Patent number: 9525934
    Abstract: A method of estimating a steering vector of a sensor array of M sensors according to one embodiment of the present disclosure includes estimating a steering vector of a noise source located at an angle θ degrees from a look direction of the array using a least squares estimate of the gains of the sensors in the array, defining a steering vector of a desired sound source in the look direction of the array, and estimating the steering vector by performing element-by-element multiplication of the estimated noise vector and the complex conjugate of the steering vector of the desired sound source. The sensors may be microphones.
    Type: Grant
    Filed: December 31, 2014
    Date of Patent: December 20, 2016
    Assignee: STMICROELECTRONICS ASIA PACIFIC PTE LTD.
    Inventors: Samuel Samsudin Ng, Sapna George, Karthik Muralidhar
  • Patent number: 9477477
    Abstract: A system, method, and computer program product are provided for executing casting-arithmetic instructions. The method comprises receiving a casting-arithmetic instruction that specifies an arithmetic operation to be performed on input data and at least one casting operation of an input casting operation and an output casting operation. Upon determining that the casting-arithmetic instruction specifies the input casting operation, the input casting operation is performed on identified terms comprising the input data. Then the arithmetic operation is performed on the input data to generate an arithmetic result. Upon determining that the casting-arithmetic instruction specifies the output casting operation, the output casting operation is performed on the arithmetic result.
    Type: Grant
    Filed: January 22, 2014
    Date of Patent: October 25, 2016
    Assignee: NVIDIA Corporation
    Inventor: William J. Dally
  • Patent number: 9460456
    Abstract: Briefly, embodiments of methods and/or systems of computation via array decomposition are disclosed. For one embodiment, as an example, a system may be capable of implementing an advertising audience overlap analysis dashboard for an audience exceeding 100 million users and 10,000 user groups. Such a system embodiment, for example, may be capable of computing an exact count of user overlap among the user groups in less than two hours.
    Type: Grant
    Filed: March 21, 2014
    Date of Patent: October 4, 2016
    Assignee: Yahoo! Inc.
    Inventor: Kevin J. Lang
  • Patent number: 9454472
    Abstract: Embodiments of the invention relate to sparsity-driven matrix representation. In one embodiment, a sparsity of a matrix is determined and the sparsity is compared to a threshold. Computer memory is allocated to store the matrix in a first data structure format based on the sparsity being greater than the threshold. Computer memory is allocated to store the matrix in a second data structure format based on the sparsity not being greater than the threshold.
    Type: Grant
    Filed: April 15, 2016
    Date of Patent: September 27, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Berthold Reinwald, Shirish Tatikonda, Yuanyuan Tian
  • Patent number: 9436469
    Abstract: According to one embodiment, a code optimizer is configured to receive first code having a program loop implemented with scalar instructions to store values of a first array to a second array based on values of a third array. The code optimizer is configured to generate second code representing the program loop with vector instructions including a shuffle instruction and a store instruction, the shuffle instruction to shuffle, using a shuffle table, elements of the first array based on the third array in a vector manner, the store instruction to store, using a mask store table, the shuffled elements in the second array in a vector manner.
    Type: Grant
    Filed: December 15, 2011
    Date of Patent: September 6, 2016
    Assignee: Intel Corporation
    Inventors: Tal Uliel, Elmoustapha Ould-Ahmedvall, Bret T. Toll
  • Patent number: 9396164
    Abstract: Embodiments of the invention relate to sparsity-driven matrix representation. In one embodiment, a sparsity of a matrix is determined and the sparsity is compared to a threshold. Computer memory is allocated to store the matrix in a first data structure format based on the sparsity being greater than the threshold. Computer memory is allocated to store the matrix in a second data structure format based on the sparsity not being greater than the threshold.
    Type: Grant
    Filed: October 21, 2013
    Date of Patent: July 19, 2016
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Berthold Reinwald, Shirish Tatikonda, Yuanyuan Tian
  • Patent number: 9367519
    Abstract: Various embodiments relating to encoding a sparse matrix into a data structure format that may be efficiently processed via parallel processing of a computing system are provided. In one embodiment, a sparse matrix may be received. A set of designated rows of the sparse matrix may be traversed until all non-zero elements in the sparse matrix have been placed in a first array. Each time a row in the set is traversed, a next non-zero element in that row may be placed in the first array. If all non-zero elements for a given row of the set of designated rows have been placed in the first array, the given row may be replaced in the set of designated rows with a next unprocessed row of the sparse matrix. The data structure in which the sparse matrix is encoded may be outputted. The data structure may include the first array.
    Type: Grant
    Filed: August 30, 2013
    Date of Patent: June 14, 2016
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Karin Strauss, Jeremy Fowers, Kalin Ovtcharov
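    Illustrative sketch (not from the patent): one way to read the traversal in the abstract is a round-robin pass over a fixed number of row slots. The slot count, the choice to skip all-zero rows, and the extra column and row index arrays are assumptions added here; the abstract itself only names the first array.

```python
def encode(matrix, num_slots=2):
    """Interleave nonzeros from a small set of designated rows into one array."""
    # Nonzero entries of each row as (column, value) pairs.
    nz = [[(j, v) for j, v in enumerate(row) if v != 0] for row in matrix]
    next_row = 0
    values, cols, rows = [], [], []

    def next_unprocessed():
        """Return a fresh [row, position] slot, skipping rows with no nonzeros."""
        nonlocal next_row
        while next_row < len(nz):
            r = next_row
            next_row += 1
            if nz[r]:
                return [r, 0]
        return None

    slots = []
    for _ in range(num_slots):
        slot = next_unprocessed()
        if slot is not None:
            slots.append(slot)

    while slots:
        remaining = []
        for r, pos in slots:
            if pos == len(nz[r]):
                # Row finished: replace it with the next unprocessed row.
                slot = next_unprocessed()
                if slot is None:
                    continue
                r, pos = slot
            j, v = nz[r][pos]
            values.append(v)
            cols.append(j)
            rows.append(r)
            remaining.append([r, pos + 1])
        slots = remaining
    return values, cols, rows


m = [[1, 0, 2],
     [0, 3, 0],
     [0, 0, 0],
     [4, 5, 6]]
print(encode(m))
# ([1, 3, 2, 4, 5, 6], [0, 1, 2, 0, 1, 2], [0, 1, 0, 3, 3, 3])
```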
  • Patent number: 9215012
    Abstract: Systems and methods are provided for mitigating natural and man-made interference through the use of one or more orthogonal, or nearly-orthogonal, projections of the received signal, which is assumed to be contaminated with interference, into one or more orthogonal projection spaces based on properties of the signal of interest. Once separated into orthogonal projection space(s), the system and method use information contained in the orthogonal projection space(s) to separate the signal of interest, or target signal, from the interference and to mitigate the interference.
    Type: Grant
    Filed: April 26, 2013
    Date of Patent: December 15, 2015
    Assignee: Propagation Research Associates, Inc.
    Inventors: Ernest Jefferson Holder, George Martin Hall
  • Patent number: 9201849
    Abstract: System and method for computing QR matrix decomposition and inverse matrix R⁻¹. A circuit is configured to implement a QR decomposition of a matrix A into two matrices Q and R using a Modified Gram Schmidt (MGS) process. The circuit includes a specified portion dedicated to computing matrix Q. Matrix Q is computed via the specified portion based on first inputs using the MGS process, where the first inputs include the matrix A and possibly a scaling factor. The identity matrix may be scaled by the scaling factor, thereby generating a scaled identity matrix. Scaled R⁻¹ (or unscaled R⁻¹) may be computed via the specified portion based on second inputs provided to the portion using the MGS process, where the second inputs include the (possibly scaled) identity matrix. If scaled, the scaled R⁻¹ may be unscaled, thereby computing matrix R⁻¹. Matrix R⁻¹ is stored and/or output.
    Type: Grant
    Filed: April 18, 2013
    Date of Patent: December 1, 2015
    Assignee: National Instruments Corporation
    Inventor: Yong Rao
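    Illustrative sketch (not from the patent): because A = QR implies Q = A·R⁻¹, the column operations that Modified Gram Schmidt applies to A to produce Q also turn an identity matrix into R⁻¹. The software function below, its name, and its `scale` parameter are assumptions for illustration and say nothing about the circuit layout; A is assumed to have full column rank.

```python
import math


def mgs_qr_with_inverse(A, scale=1.0):
    """Return Q, R and the inverse of R for a full-column-rank matrix A (list of rows)."""
    m, n = len(A), len(A[0])
    q = [[float(A[i][j]) for i in range(m)] for j in range(n)]  # columns of A
    # Columns of the (possibly scaled) identity, updated alongside A.
    w = [[scale if i == j else 0.0 for i in range(n)] for j in range(n)]
    R = [[0.0] * n for _ in range(n)]
    for k in range(n):
        R[k][k] = math.sqrt(sum(x * x for x in q[k]))
        q[k] = [x / R[k][k] for x in q[k]]
        w[k] = [x / R[k][k] for x in w[k]]
        for j in range(k + 1, n):
            R[k][j] = sum(a * b for a, b in zip(q[k], q[j]))
            q[j] = [b - R[k][j] * a for a, b in zip(q[k], q[j])]
            w[j] = [b - R[k][j] * a for a, b in zip(w[k], w[j])]
    Q = [[q[j][i] for j in range(n)] for i in range(m)]
    # Undo the scaling to recover the unscaled inverse of R.
    R_inv = [[w[j][i] / scale for j in range(n)] for i in range(n)]
    return Q, R, R_inv


Q, R, R_inv = mgs_qr_with_inverse([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]], scale=8.0)
```

    Multiplying R by the returned R_inv should give the identity matrix up to rounding error.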
  • Patent number: 9176931
    Abstract: System and method for developing a circuit for QR decomposition with auxiliary functionality. A first function is included in a first program. The first function is configurable to specify an auxiliary function to be performed by a modified QR decomposition circuit in addition to QR decomposition of a matrix A into two matrices Q and R using a Modified Gram Schmidt process. A second program is automatically generated based on configuration of the QR decomposition and the first function. The second program includes program code implementing the QR decomposition and the auxiliary function for the first function in the first program. A hardware configuration program (HCP) may be automatically generated based on the first program, including the second program, where the HCP is deployable to hardware, e.g., a programmable hardware element, thereby implementing the modified QR decomposition circuit, including the QR decomposition of the matrix A and the auxiliary function.
    Type: Grant
    Filed: July 12, 2013
    Date of Patent: November 3, 2015
    Assignee: National Instruments Corporation
    Inventor: Yong Rao
  • Patent number: 9170789
    Abstract: Embodiments of computer-implemented methods, systems, computing devices, and computer-readable media (transitory and non-transitory) are described herein for analyzing execution of a plurality of executable instructions and, based on the analysis, providing an indication of a benefit to be obtained by vectorization of at least a subset of the plurality of executable instructions. In various embodiments, the analysis may include identification of the subset of the plurality of executable instructions suitable for conversion to one or more single-instruction multiple-data (“SIMD”) instructions.
    Type: Grant
    Filed: March 5, 2013
    Date of Patent: October 27, 2015
    Assignee: Intel Corporation
    Inventors: Ruchira Sasanka, Jeffrey J. Cook, Abhinav Das, Jayaram Bobba, Michael R. Greenfield, Suresh Srinivas
  • Patent number: 9170836
    Abstract: A system and method for re-factorizing a square input matrix on a parallel processor. In one embodiment, the system includes: (1) a matrix generator operable to generate an intermediate matrix by embedding a permuted form of the input matrix in a zeroed-out sparsity pattern of a combination of lower and upper triangular matrices resulting from a prior LU factorization of a previous matrix having a same sparsity pattern, reordering to minimize fill-in and pivoting strategy as the input matrix and (2) a re-factorizer associated with the matrix generator and operable to use parallel threads to apply an incomplete-LU factorization with zero fill-in on the intermediate matrix.
    Type: Grant
    Filed: January 9, 2013
    Date of Patent: October 27, 2015
    Assignee: NVIDIA CORPORATION
    Inventors: Maxim Naumov, Sharanyan Chetlur, Lung Sheng Chien, Robert Strzodka, Philippe Vandermersch
  • Patent number: 9160341
    Abstract: Systems and methods are provided for generating at least one high fidelity resource state. A classical code is punctured to provide a first set of generators and a second set of generators. The first set of generators is mapped to a set of stabilizer generators, and the second set of generators is mapped to a set of logical operators. A set of resource states are prepared in physical qubits. A decoding process is performed on the resource states according to a quantum code represented by the set of stabilizer generators and the set of logical operators, and qubits corresponding to the stabilizers are measured.
    Type: Grant
    Filed: December 29, 2014
    Date of Patent: October 13, 2015
    Assignee: NORTHROP GRUMMAN SYSTEMS CORPORATION
    Inventor: Bryan K. Eastin
  • Patent number: 9118898
    Abstract: In general, techniques are described for implementing an 8-point inverse discrete cosine transform (IDCT). An apparatus comprising an 8-point inverse discrete cosine transform (IDCT) hardware unit may implement these techniques to transform media data from a frequency domain to a spatial domain. The 8-point IDCT hardware unit includes an even portion comprising first and second internal factors (A, B) that are related to a first scaled factor in accordance with a first relationship. The 8-point IDCT hardware unit also includes an odd portion comprising third, fourth, fifth and sixth internal factors (G, D, E, Z) that are related to a second scaled factor in accordance with a second relationship. The first relationship relates the first scaled factor to the first and second internal factors. The second relationship relates the second scaled factor to the third, fourth, fifth and sixth internal factors.
    Type: Grant
    Filed: June 22, 2010
    Date of Patent: August 25, 2015
    Assignee: Qualcomm Incorporated
    Inventors: Yuriy Reznik, Rajan L. Joshi, Marta Karczewicz
  • Patent number: 9110855
    Abstract: Embodiments relate to dynamic programming. An aspect includes representing a dynamic programming problem as a matrix of cells, each cell representing an intermediate score to be calculated. Another aspect includes providing a mapping assigning cells of the matrix to elements of a result container data structure, and storing cells of the matrix to elements of the result container data structure in accordance with the mapping. Another aspect includes calculating intermediate scores of all cells of the matrix, whereby intermediate scores of some of the cells of the matrix are stored to a respectively assigned element of the result container data structure in accordance with the mapping. Another aspect includes during the calculation of the intermediate scores, dynamically updating the assignment of cells and elements in the mapping and assembling a final result of the dynamic programming problem from the intermediate scores stored in the result container data structure.
    Type: Grant
    Filed: November 30, 2012
    Date of Patent: August 18, 2015
    Assignee: International Business Machines Corporation
    Inventors: Tomasz Dziedzicki, Marek Janusz Kiszkis, Grzegorz Kokosinski, Krzysztof Zarzycki
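    Illustrative sketch (not from the patent): a familiar instance of storing only part of a dynamic programming matrix is a two-row edit distance. The choice of problem, the two-slot container, and the `slot_of_row` mapping are assumptions for illustration; the abstract does not specify any of them.

```python
def edit_distance(s, t):
    """Levenshtein distance, keeping only two rows of the score matrix."""
    width = len(t) + 1
    # Result container: two slots, each holding one row of the matrix.
    container = [list(range(width)), [0] * width]
    slot_of_row = {0: 0}  # mapping: matrix row -> container slot
    for i in range(1, len(s) + 1):
        prev = slot_of_row[i - 1]
        cur = 1 - prev
        slot_of_row[i] = cur         # update the mapping as rows are computed
        slot_of_row.pop(i - 2, None)  # row i-2 is overwritten and no longer stored
        container[cur][0] = i
        for j in range(1, width):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            container[cur][j] = min(container[prev][j] + 1,        # deletion
                                    container[cur][j - 1] + 1,     # insertion
                                    container[prev][j - 1] + cost)  # substitution
    # The final result is read from the slot that holds the last row.
    return container[slot_of_row[len(s)]][len(t)]


print(edit_distance("kitten", "sitting"))  # 3
```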
  • Patent number: 9088793
    Abstract: A video decoder includes an entropy decoding device that generates entropy decoded (EDC) data from an encoded video signal. A multi-format video decoding device includes a plurality of vector processor units that generate a decoded video signal from the EDC data. The plurality of vector processor units includes at least one filter vector processor that operates in conjunction with a plurality of programmable filter parameters.
    Type: Grant
    Filed: March 31, 2011
    Date of Patent: July 21, 2015
    Assignee: VIXS Systems, INC.
    Inventors: Edward Hong, Dong Liu, Hongri Wang, Kai Yang, Indra Laksono, Eric Young, Xu Gang Zhao
  • Patent number: 9066658
    Abstract: In a surgical system, a system controller executes a video signature identification and image control routine to maintain quality of a video image taken by a video camera located at a surgical site and provided on a video display. The system includes a video camera/light source handpiece for insertion into a patient body. A tool is inserted separately into the surgical site. Fluid input into the surgical site is provided by a liquid pump or by an insufflator. Video signals are analyzed and fluid input/output, fluid pressure, and/or tool operation is automatically controlled to maintain image quality of the surgical site without manual adjustments.
    Type: Grant
    Filed: March 7, 2011
    Date of Patent: June 30, 2015
    Assignee: STRYKER CORPORATION
    Inventors: Andrew Hamel, Brannon P. Wells, Ruzbeh Shariff
  • Patent number: 9032006
    Abstract: Apparatus and method for processing linear systems of equations and finding an n×1 vector x satisfying Ax=b where A is a symmetric, positive-definite n×n matrix corresponding to n×n predefined high-precision elements and b is an n×1 vector corresponding to n predefined high-precision elements. A first iterative process generates n low-precision elements corresponding to an n×1 vector x_l satisfying A_l x_l = b_l where A_l, b_l are elements in low precision. The elements are converted to high-precision data elements to obtain a current solution vector x. A second iterative process generates n low-precision data elements corresponding to an n×1 correction vector dependent on the difference between the vector b and the vector product Ax. Then there is produced from the n low-precision data elements of the correction vector respective high-precision data elements of an n×1 update vector u. The data elements of the current solution vector x are updated such that x=x+u.
    Type: Grant
    Filed: March 3, 2010
    Date of Patent: May 12, 2015
    Assignee: International Business Machines Corporation
    Inventors: Konstantinos Bekas, Alessandro Curioni
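    Illustrative sketch (not from the patent): the two-stage scheme in the abstract resembles mixed-precision iterative refinement. In the sketch below, low precision is simulated by rounding to 32-bit floats, and Gauss-Seidel stands in for the iterative solver; both choices, the sweep and round counts, and all function names are assumptions.

```python
import struct


def f32(v):
    """Round a value to the nearest 32-bit float (simulated low precision)."""
    return struct.unpack("f", struct.pack("f", float(v)))[0]


def low_precision_solve(A_l, b_l, sweeps=50):
    """Gauss-Seidel sweeps with every intermediate result rounded to 32 bits."""
    n = len(b_l)
    x = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            s = b_l[i]
            for j in range(n):
                if j != i:
                    s = f32(s - f32(A_l[i][j] * x[j]))
            x[i] = f32(s / A_l[i][i])
    return x


def refine(A, b, rounds=5):
    """Solve Ax = b for symmetric positive-definite A by low-precision solves plus correction."""
    A_l = [[f32(v) for v in row] for row in A]
    # First iterative process: low-precision solution, promoted to high precision.
    x = [float(v) for v in low_precision_solve(A_l, [f32(v) for v in b])]
    for _ in range(rounds):
        # Residual b - Ax, computed in high (64-bit) precision.
        r = [bi - sum(aij * xj for aij, xj in zip(row, x)) for row, bi in zip(A, b)]
        # Second iterative process: low-precision correction, promoted to an update u.
        u = [float(v) for v in low_precision_solve(A_l, [f32(v) for v in r])]
        x = [xi + ui for xi, ui in zip(x, u)]
    return x


x = refine([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(x)  # close to [1/11, 7/11]
```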
  • Patent number: 9020154
    Abstract: An acoustic apparatus including circuitry to correct for acoustic cross-coupling of acoustic drivers mounted in a common acoustic enclosure. A plurality of acoustic drivers are mounted in the acoustic enclosure so that motion of each of the acoustic drivers causes motion in each of the other acoustic drivers. A canceller cancels the motion of each of the acoustic drivers caused by motion of each of the other acoustic drivers. A cancellation adjuster cancels the motion of each of the acoustic drivers that may result from the operation of the canceller.
    Type: Grant
    Filed: April 30, 2010
    Date of Patent: April 28, 2015
    Assignee: Bose Corporation
    Inventors: Klaus Hartung, Roman Katzer
  • Publication number: 20150113032
    Abstract: The present invention provides an innovative method and system for quantifying the symmetry of binary words. Information of all kinds is necessarily interpreted by binary words. Quantifying the symmetry of these binary words, regardless of their size, is a new approach that makes available a new measure that can better appreciate the complexity, the information, the redundancy or the physical structure contained in each binary word and hence, in its source. Binary number processing can, thanks to this measure, gain new tools for new approaches in many areas such as Information Theory and the Theory of Symmetry, which plays a significant role in Mathematics, Chemistry, Biology, Crystallography, etc. This method is based on a computational system that generates the ‘Symmetric Value’ of any binary number as well as its two ‘Symmetric Value Matrixes’, which do not require storage to be known, regardless of their size.
    Type: Application
    Filed: June 18, 2012
    Publication date: April 23, 2015
    Inventor: Lahcen Abellaoui