Matrix Array Patents (Class 708/520)

Single function to perform combined matrix multiplication and bias add operations

Patent number: 12236338

Abstract: A combined function specified by an instruction is performed. The combined function includes a plurality of operations performed as part of one invocation of the combined function. The performing the combined function includes performing a matrix multiplication of a first tensor and a second tensor to obtain one or more intermediate results. The second tensor includes an adjusted weight tensor created using a multiplier. Values of a bias tensor are added to the one or more intermediate results to obtain one or more results for the combined function. The one or more results are at least a part of an output tensor.
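
The combined operation in this abstract can be sketched in a few lines of plain Python. This is only an illustration, not the patented instruction: the function name `matmul_bias_add` is hypothetical, and the way the "adjusted weight tensor" is derived (here, scaling every weight by `multiplier`) is an assumption, since the abstract does not say how the multiplier is applied.

```python
def matmul_bias_add(a, w, bias, multiplier=1.0):
    """One call that multiplies a (m x k) by w (k x n) and adds bias (length n)."""
    # Assumption: the adjusted weight tensor is the weight tensor scaled by a multiplier.
    adjusted = [[multiplier * v for v in row] for row in w]
    k, n = len(adjusted), len(adjusted[0])
    out = []
    for a_row in a:
        out_row = []
        for j in range(n):
            # Intermediate result: one dot product of the matrix multiplication.
            acc = sum(a_row[p] * adjusted[p][j] for p in range(k))
            # The bias add happens in the same invocation, not as a second pass.
            out_row.append(acc + bias[j])
        out.append(out_row)
    return out
```

For example, `matmul_bias_add([[1, 2], [3, 4]], [[1, 0], [0, 1]], [10, 20], multiplier=2)` multiplies by the doubled identity matrix and then adds the bias to each row.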

Type: Grant

Filed: June 17, 2021

Date of Patent: February 25, 2025

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Cedric Lichtenau, Kailash Gopalakrishnan, Vijayalakshmi Srinivasan, Sunil K. Shukla, Swagath Venkataramani
Restructuring matrix processing on a computing device

Patent number: 12204606

Abstract: In some examples, a system can store a first array, which is a one-dimensional array of values (e.g., matrix values), in memory. The system can also store a second array in the memory, where the second array is a one-dimensional array of pointers that point to positions of a subset of the values in the first array. The subset of values can be a first entry of each row or column of a matrix. The system can then provide the second array as input to a program routine, which can perform a matrix operation. To do so, the program routine can access the first array and the second array in memory, select a set of values for the matrix from the first array by using the pointers, execute the matrix operation using the selected set of values, and output the result.
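
A minimal sketch of the two-array layout, assuming the pointers are plain integer offsets into the value array and that each one marks the first entry of a row. The function name `matvec_from_pointers` and the choice of a matrix-vector product as the "matrix operation" are illustrative assumptions.

```python
def matvec_from_pointers(values, row_starts, n_cols, x):
    """Multiply the matrix described by (values, row_starts) with vector x."""
    result = []
    for start in row_starts:
        # Each pointer selects one row of n_cols consecutive values.
        row = values[start:start + n_cols]
        result.append(sum(v * xi for v, xi in zip(row, x)))
    return result


# A 3x3 matrix stored row by row in one flat array.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
full = matvec_from_pointers(values, [0, 3, 6], 3, [1, 1, 1])
# Passing fewer pointers selects a sub-matrix without copying any values.
lower_two_rows = matvec_from_pointers(values, [3, 6], 3, [1, 1, 1])
```

Because the routine receives only the pointer array, the caller can restructure which rows take part in the operation by changing pointers rather than moving matrix data.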

Type: Grant

Filed: August 2, 2024

Date of Patent: January 21, 2025

Assignee: SAS INSTITUTE INC.

Inventor: Alexander Vladimirovich Andrianov
Distributed gaussian process classification computing system

Patent number: 12175374

Abstract: A computing system trains a classification model using distributed training data. A first worker index and a second worker index are received from a controller device and together uniquely identify a segment of a lower triangular matrix. The first and second worker indices have values from one to a predefined block size value. In response to receipt of a first computation request from the controller device, a first kernel matrix block is computed at each computing device based on the first worker index and the second worker index. In response to receipt of a second computation request from the controller device, an objective function value is computed for each observation vector included in an accessed training data subset. The computed objective function value is sent to the controller device. Model parameters for a trained classification model are output.

Type: Grant

Filed: April 15, 2024

Date of Patent: December 24, 2024

Assignee: SAS Institute Inc.

Inventors: Yingjian Wang, Xinmin Wu
Systems and processes for organizing and controlling multiple matrix processor circuits

Patent number: 12141226

Abstract: Artificial intelligence is an increasingly important sector of the computer industry. However, artificial intelligence is an extremely computationally intensive field such that it can be expensive, time consuming, and energy consuming. Fortunately, many of the calculations required for artificial intelligence can be performed in parallel such that specialized processors can greatly increase computational performance. Specifically, artificial intelligence generally requires large numbers of matrix operations to implement neural networks such that specialized Matrix Processor circuits can improve performance. But a neural network is more than a collection of matrix operations; it is a set of specifically coordinated matrix operations with complex data dependencies. Without proper coordination, Matrix Processor circuits may end up idle or spending large amounts of time loading in different weight matrix data.

Type: Grant

Filed: April 5, 2019

Date of Patent: November 12, 2024

Assignee: Expedera, Inc.

Inventors: Siyad Chih-Hua Ma, Shang-Tse Chuang, Sharad Vasantrao Chole
Random sparsity handling in a systolic array

Patent number: 12086205

Abstract: Matrix multiply units can take advantage of input sparsity by zero gating ALUs, which saves power consumption, but compute throughput does not increase. To improve compute throughput from sparsity, processing resources in a matrix accelerator can skip computation with zero involved in input or output. If zeros in input can be skipped, the processing units can focus calculations on generating meaningful non-zero output.

Type: Grant

Filed: March 24, 2021

Date of Patent: September 10, 2024

Assignee: Intel Corporation

Inventors: Chunhui Mei, Hong Jiang, Jiasheng Chen, Yongsheng Liu, Yan Li
Storage organization for transposing a matrix using a streaming engine

Patent number: 12045616

Abstract: In some examples, a circuit includes an interface configured to couple to a memory that includes a set of outputs to provide a set of data from the memory. The circuit further includes a rotator coupled to the interface that includes a first set of multiplexors that each include a set of inputs coupled to the set of outputs of the interface and an output. The circuit further includes a storage circuit coupled to the rotator that includes a register file coupled to the outputs of the first set of multiplexors and an alignment network. The alignment network includes a second set of multiplexors that each include a set of inputs coupled to the register file and an output.

Type: Grant

Filed: March 8, 2021

Date of Patent: July 23, 2024

Assignee: Texas Instruments Incorporated

Inventors: Jonathan (Son) Hung Tran, Joseph Raymond Michael Zbiciak
Low latency multi-constraint ranking of content items

Patent number: 12001484

Abstract: Methods and systems for low-latency multi-constraint ranking of content items. One of the methods includes receiving a request to rank a plurality of content items for presentation to a user to maximize a primary objective subject to a plurality of constraints; initializing a dual variable vector; updating the dual variable vector, comprising: determining an overall objective score for the dual variable vector; identifying a plurality of candidate dual variable vectors that includes one or more neighboring node dual variable vectors; determining respective overall objective scores for each of the one or more candidate dual variable vectors; identifying the candidate with the best overall objective score; and determining whether to update the dual variable vector based on whether the identified candidate has a better overall objective score than the dual variable vector; and determining a final ranking for the content items based on the dual variable vector.

Type: Grant

Filed: February 16, 2021

Date of Patent: June 4, 2024

Assignee: DeepMind Technologies Limited

Inventors: Timothy Arthur Mann, Ivan Lobov, Anton Zhernov, Krishnamurthy Dvijotham, Xiaohong Gong, Dan-Andrei Calian
Sparse matrix-vector multiplication

Patent number: 11995149

Abstract: A processing system includes a first set and a second set of general-purpose registers (GPRs) and memory access circuitry that fetches nonzero values of a sparse matrix into consecutive slots in the first set. The memory access circuitry also fetches values of an expanded matrix into consecutive slots in the second set of GPRs. The expanded matrix is formed based on values of a vector and locations of the nonzero values in the sparse matrix. The processing system also includes a set of multipliers that concurrently perform multiplication of the nonzero values in slots of the first set of GPRs with the values of the vector in corresponding slots of the second set. Reduced sum circuitry accumulates results from the set of multipliers for rows of the sparse matrix.
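
The data flow in this abstract (packed nonzeros, a matching "expanded" operand, lane-wise multiplies, per-row reduction) can be modelled in software. The sketch below uses Python lists in place of the two register sets; the name `spmv_expanded` and the dense input format are assumptions made for illustration.

```python
def spmv_expanded(matrix, x):
    """Sparse matrix-vector product using packed nonzeros and an expanded operand."""
    nonzeros, expanded, row_ids = [], [], []
    for i, row in enumerate(matrix):
        for j, v in enumerate(row):
            if v != 0:
                nonzeros.append(v)     # consecutive slots of the first register set
                expanded.append(x[j])  # vector value matching the nonzero's column
                row_ids.append(i)      # remembers which row each slot belongs to
    # All slot pairs can be multiplied independently (concurrently in hardware).
    products = [a * b for a, b in zip(nonzeros, expanded)]
    # Reduced-sum step: accumulate products per matrix row.
    y = [0] * len(matrix)
    for i, p in zip(row_ids, products):
        y[i] += p
    return y
```

Only the nonzero entries ever reach the multiply step, so the work scales with the number of nonzeros rather than with the full matrix size.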

Type: Grant

Filed: December 17, 2020

Date of Patent: May 28, 2024

Assignee: Advanced Micro Devices, Inc.

Inventors: Sateesh Lagudu, Allen H. Rush, Michael Mantor
Method and device for matrix multiplication optimization using vector registers

Patent number: 11921814

Abstract: Methods and devices, the method including receiving a matrix of a neural network model; classifying at least a portion of the matrix as a first section based on a first distribution pattern of non-zero elements of the portion of the matrix; and identifying memory addresses of the non-zero elements in the first section of the matrix for loading, according to a first order determined based on the first distribution pattern, the non-zero elements in the first section into one or more vector registers.

Type: Grant

Filed: June 14, 2022

Date of Patent: March 5, 2024

Assignee: Alibaba Group Holding Limited

Inventors: Guoyang Chen, Yu Pu, Yongzhi Zhang, Weifeng Zhang, Yuan Xie
Methods and systems for product quantization-based compression of a matrix

Patent number: 11914670

Abstract: Methods and systems for compressing a matrix are described. The matrix, having a plurality of rows formed by a respective plurality of vectors, is partitioned into a plurality of submatrices, each submatrix containing sub-vectors from a respective group of one or more contiguous columns of the matrix. For each given submatrix, the sub-vectors are clustered into a plurality of clusters. For each given cluster, a centroid and a variance are computed and stored, based on the sub-vectors belonging to the given cluster. A mapping relating each vector to a respective cluster in each submatrix is stored. The stored centroids, stored variances and stored mapping form a set of compressed data for reconstruction of the matrix.

Type: Grant

Filed: September 8, 2020

Date of Patent: February 27, 2024

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Krtin Kumar, Mehdi Rezagholizadeh, Peyman Passban
FPGA specialist processing block for machine learning

Patent number: 11907719

Abstract: The present disclosure describes a digital signal processing (DSP) block that includes a plurality of columns of weight registers and a plurality of inputs configured to receive a first plurality of values and a second plurality of values. The first plurality of values is stored in the plurality of columns of weight registers after being received. Additionally, the DSP block includes a plurality of multipliers configured to simultaneously multiply each value of the first plurality of values by each value of the second plurality of values.

Type: Grant

Filed: June 26, 2020

Date of Patent: February 20, 2024

Assignee: Intel Corporation

Inventors: Martin Langhammer, Dongdong Chen, Jason R. Bergendahl
Apparatuses, methods, and systems for fused operations using sign modification in a processing element of a configurable spatial accelerator

Patent number: 11907713

Abstract: Systems, methods, and apparatuses relating to a sign modification field for fused operations in a configurable spatial accelerator are described.

Type: Grant

Filed: December 28, 2019

Date of Patent: February 20, 2024

Assignee: Intel Corporation

Inventors: Kermin E. Chofleming, Chuanjun Zhang, Daniel Towner, Simon C. Steely, Jr., Benjamin Keen
Apparatus and method of performing matrix multiplication operation of neural network

Patent number: 11899744

Abstract: A neural network apparatus for performing a matrix multiplication operation includes a memory having at least one program stored therein and a processor to perform one or more operations by executing the at least one program. The processor can determine whether to divide an initial weight in one of a column direction and a row direction according to whether a reshape operation and a transpose operation are performed before or after a matrix multiplication operation and generate division weights by dividing the initial weight by a head count in the determined direction. Also, the processor can generate intermediate feature maps by performing a matrix multiplication operation between the input feature map and the division weights and generate a final feature map based on the intermediate feature maps.
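
As a rough illustration of dividing a weight by a head count, the sketch below splits a weight matrix in the column direction, multiplies the input by each piece, and joins the intermediate results. The helper names are hypothetical, and joining the intermediate feature maps by concatenation is an assumption; the abstract only says the final feature map is generated "based on" them.

```python
def matmul(a, b):
    """Plain matrix product of a (m x k) and b (k x n)."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def split_columns(w, head_count):
    """Divide w into head_count equal-width column blocks."""
    width = len(w[0]) // head_count
    return [[row[h * width:(h + 1) * width] for row in w]
            for h in range(head_count)]


def multi_head_matmul(x, w, head_count):
    # One intermediate feature map per division weight.
    intermediates = [matmul(x, part) for part in split_columns(w, head_count)]
    # Assumed combination step: concatenate the pieces row by row.
    return [[v for part in intermediates for v in part[i]]
            for i in range(len(x))]
```

With a column-direction split and concatenation, the result matches a single multiplication by the undivided weight, which is a convenient way to check the sketch.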

Type: Grant

Filed: April 17, 2020

Date of Patent: February 13, 2024

Assignee: Samsung Electronics Co., Ltd.

Inventors: Songyi Han, Hyunsun Park
Method, system, and computer program product for determining causality

Patent number: 11893079

Abstract: Implementations of the present disclosure relate to a method, system and program product for determining a causality between a plurality of variables.

Type: Grant

Filed: September 29, 2021

Date of Patent: February 6, 2024

Assignee: NEC CORPORATION

Inventors: Lu Feng, Chunchen Liu, Wenjuan Wei
Method, electronic device and storage medium for data projection

Patent number: 11853387

Abstract: A data sparse projection method, includes: randomly initializing a high-dimensional sparse two-dimensional matrix (S1); fixing the high-dimensional sparse two-dimensional matrix, and calculating an optimal output variable by using the high-dimensional sparse two-dimensional matrix (S2); fixing the optimal output variable, and calculating an optimal high-dimensional sparse two-dimensional matrix by using the optimal output variable (S3); and cyclically fixing the high-dimensional sparse two-dimensional matrix and the output variable until the optimal output variable is no longer increased when the high-dimensional sparse two-dimensional matrix is fixed (S4).

Type: Grant

Filed: April 12, 2023

Date of Patent: December 26, 2023

Assignee: THE CHINESE UNIVERSITY OF HONG KONG, SHENZHEN

Inventors: Chonglin Gu, Changyi Ma, Wenye Li, Shuguang Cui
High performance data mirroring in a multi-controller memory subsystem

Patent number: 11836371

Abstract: A storage system memory or memory domain with N memory controllers is organized into N-1 same-size partitions per memory controller or N partitions per memory controller with one partition reserved as spare capacity. The unreserved partitions are assigned to mirror pairs of members such that a first triangular submatrix of a representative matrix of indexed memory controllers and indexed partitions is a transpose of a second triangular submatrix of the representative matrix. The resulting distribution of members is balanced such that additional loading on remaining memory controllers when one of the memory controllers becomes inaccessible is evenly distributed.
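
One assignment that has the transpose property described here can be written down directly. The sketch below assumes N controllers with N-1 partitions each and pairs slot (c, p) in the upper triangle with slot (p + 1, c) in the lower triangle; this specific index mapping and the name `mirror_pairs` are assumptions for illustration, not necessarily the patented layout.

```python
def mirror_pairs(n):
    """Return mirror pairs as ((controller, partition), (controller, partition))."""
    pairs = []
    for c in range(n):
        for p in range(c, n - 1):
            # Upper-triangle slot (c, p) is mirrored at the transposed
            # lower-triangle slot (p + 1, c) on a different controller.
            pairs.append(((c, p), (p + 1, c)))
    return pairs
```

Under this mapping every controller shares exactly one mirror pair with each other controller, so when one controller becomes inaccessible, each remaining controller takes over exactly one extra partition.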

Type: Grant

Filed: July 8, 2022

Date of Patent: December 5, 2023

Assignee: Dell Products L.P.

Inventors: Kuolin Hua, Adnan Sahin
Measuring relatedness between prediction tasks in artificial intelligence and continual learning systems

Patent number: 11836751

Abstract: A method for measuring relatedness between prediction tasks includes receiving data for a first prediction task. The method further includes measuring the relatedness of the first prediction task to at least one previous prediction task as a difference between divergence of conditional probabilities of the tasks. The method can be advantageously applied in artificial intelligence or continual learning systems.

Type: Grant

Filed: March 3, 2020

Date of Patent: December 5, 2023

Assignee: NEC CORPORATION

Inventors: Shujian Yu, Ammar Shaker
Data processing method and apparatus

Patent number: 11823303

Abstract: A data processing method and apparatus are disclosed. In various embodiments, R groups of proposal region sequences are obtained. Each group of proposal region sequence includes a plurality of proposal regions. In those embodiments, a VRPAC instruction is invoked to calculate an area of each proposal region in each group of proposal region sequence. For a jth group of proposal region sequence in the R groups of proposal region sequences, a VIOU instruction and a VAADD instruction are invoked to determine j suppression matrices of the jth group of proposal region sequence and determine a suppression vector of the jth group of proposal region sequence based on the j suppression matrices. In those embodiments, an unsuppressed proposal region is determined based on a suppression vector of each group of proposal region sequence.

Type: Grant

Filed: July 19, 2020

Date of Patent: November 21, 2023

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Luping Cui, Jiajin Tu, Hu Liu, Honghui Yuan, Heng Liao, Hou Fun Lam, Bing Li
Apparatus and method for matrix multiplication using processing-in-memory

Patent number: 11797643

Abstract: Embodiments of apparatus and method for matrix multiplication using processing-in-memory (PIM) are disclosed. In an example, an apparatus for matrix multiplication includes an array of tiles that each include one or more PIM blocks. A PIM block may include a hybrid-mode PIM block that may be configured into a digital mode or an analog mode. The PIM block configured into digital mode may perform operations associated with depth-wise (DW) convolution. On the other hand, a PIM block configured into analog mode may perform operations associated with point-wise (PW) convolution. A controller may be used to configure the PIM block into either digital mode or analog mode, depending on the computations.

Type: Grant

Filed: November 9, 2020

Date of Patent: October 24, 2023

Assignee: NEONEXUS PTE. LTD.

Inventor: Qilin Zheng
Vector and matrix computing device

Patent number: 11734383

Abstract: A computing device and related products are provided. The computing device is configured to perform machine learning calculations. The computing device includes an operation unit, a controller unit, and a storage unit. The storage unit includes a data input/output (I/O) unit, a register, and a cache. Technical solution provided by the present disclosure has advantages of fast calculation speed and energy saving.

Type: Grant

Filed: July 29, 2020

Date of Patent: August 22, 2023

Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED

Inventors: Tianshi Chen, Xiao Zhang, Shaoli Liu, Yunji Chen
Matrix processing method and apparatus, and logic circuit

Patent number: 11734386

Abstract: A matrix processing method performed by a graphics processing unit (GPU) includes: determining a plurality of non-zero elements in a to-be-processed matrix at a processor in the GPU; generating a distribution matrix of the to-be-processed matrix at the processor, where the distribution matrix comprises identities for indicating positions of the plurality of non-zero elements in the to-be-processed matrix; obtaining a target matrix from another matrix by using the distribution matrix at a logic circuit in the processor, where the target matrix comprises a plurality of target elements from the another matrix; and performing matrix processing on the plurality of non-zero elements and the target matrix to obtain an operation result at the processor.

Type: Grant

Filed: December 23, 2021

Date of Patent: August 22, 2023

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Zhenjiang Dong, Chio In Ieong, Hu Liu, Hai Chen
Iterative energy-scaled variational quantum eigensolver

Patent number: 11734387

Abstract: Techniques regarding an iterative energy-scaled variational quantum eigensolver process are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a read-out component that determines a ground state energy value of a quantum Hamiltonian by employing a variational quantum eigensolver (VQE) algorithm, wherein the VQE algorithm utilizes a symmetry that emerges at an energy scale of the quantum Hamiltonian.

Type: Grant

Filed: March 3, 2022

Date of Patent: August 22, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Antonio Mezzacapo, Richard Chen, Marco Pistoia
Arithmetic processing apparatus, non-transitory computer-readable storage medium, and arithmetic processing method

Patent number: 11687616

Abstract: An arithmetic processing apparatus includes a memory and a processor. The processor is coupled to the memory and configured to determine an individual not to be evolved to an individual of a second generation from among a plurality of individuals in a first generation based on a predetermined reference for calculation completion of fitness calculation for each of the plurality of individuals, the second generation being a generation next to the first generation, and determine to cause the determined individual to evolve to an individual of a generation next or subsequent to the second generation.

Type: Grant

Filed: November 6, 2020

Date of Patent: June 27, 2023

Assignee: FUJITSU LIMITED

Inventors: Yukito Tsunoda, Teruo Ishihara
Systems and methods to load a tile register pair

Patent number: 11609762

Abstract: Embodiments detailed herein relate to systems and methods to load a tile register pair. In one example, a processor includes: decode circuitry to decode a load matrix pair instruction having fields for an opcode and source and destination identifiers to identify source and destination matrices, respectively, each matrix having a PAIR parameter equal to TRUE; and execution circuitry to execute the decoded load matrix pair instruction to load every element of left and right tiles of the identified destination matrix from corresponding element positions of left and right tiles of the identified source matrix, respectively, wherein the executing operates on one row of the identified destination matrix at a time, starting with the first row.

Type: Grant

Filed: August 10, 2021

Date of Patent: March 21, 2023

Assignee: Intel Corporation

Inventors: Raanan Sade, Simon Rubanovich, Amit Gradstein, Zeev Sperber, Alexander Heinecke, Robert Valentine, Mark J. Charney, Bret Toll, Jesus Corbal, Elmoustapha Ould-Ahmed-Vall, Menachem Adelman
Efficient ground truth annotation

Patent number: 11556852

Abstract: A computer-implemented method for determining a set of target items to be annotated for training a machine learning application. The method comprises providing a training data set with a set of data samples and an auto-encoder with a classifier. The auto-encoder comprises an embedding model that maps the set of data samples to a set of compressed feature vectors. The set of compressed feature vectors define a compressed feature matrix. Further provided are: a definition of a graph associated to the compressed feature matrix, applying a clustering-algorithm to identify node clusters of the graph and applying a centrality algorithm to identify central nodes of the node clusters, retrieving from an annotator node labels for the central nodes, propagating the annotated node labels to other nodes of the graph and performing a training of the embedding model and the classifier with the annotated and the propagated node labels.

Type: Grant

Filed: March 6, 2020

Date of Patent: January 17, 2023

Assignee: International Business Machines Corporation

Inventors: Peter Willem Jan Staar, Michele Dolfi, Christoph Auer, Leonidas Georgopoulos, Ralf Kaestner, Alexander Velizhev, Dal Noguer Hidalgo, Rita Kuznetsova, Konstantinos Bekas
Systems and methods for quantum tomography using an ancilla

Patent number: 11550872

Abstract: Quantum computing systems and methods are provided. In one example, a quantum computing system includes a quantum system having one or more quantum system qubits and one or more ancilla qubits. The quantum computing system includes one or more quantum gates implemented by the quantum computing system. The quantum gate(s) are operable to configure the one or more ancilla qubits into a known state. The quantum computing system includes a quantum measurement circuit operable to perform a plurality of measurements on the one or more quantum system qubits using the one or more ancilla qubits. The quantum computing system includes one or more processors operable to determine a reduced density matrix for a subset of the quantum system based on a set of the plurality of measurements that include a number of repeated measurements performed using the quantum measurement circuit.

Type: Grant

Filed: October 15, 2020

Date of Patent: January 10, 2023

Assignee: GOOGLE LLC

Inventor: Zhang Jiang
Support for different matrix multiplications by selecting adder tree intermediate results

Patent number: 11520854

Abstract: A first group of elements is element-wise multiplied with a second group of elements using a plurality of multipliers belonging to a matrix multiplication hardware unit. Results of the plurality of multipliers are added together using a hierarchical tree of adders belonging to the matrix multiplication hardware unit and a final result of the hierarchical tree of adders or any of a plurality of intermediate results of the hierarchical tree of adders is selectively provided for use in determining an output result matrix.
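
The idea of tapping an adder tree at different depths can be sketched in software. In the hypothetical helpers below, level 0 holds the raw products, each higher level holds pairwise sums of the level beneath it, and the last level holds the full dot product; the power-of-two width is an assumption that keeps the sketch short.

```python
def adder_tree_levels(products):
    """Return every level of a pairwise adder tree, from the inputs to the root."""
    n = len(products)
    assert n > 0 and (n & (n - 1)) == 0, "sketch assumes a power-of-two width"
    levels = [list(products)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each adder sums two neighbouring values from the level below.
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels


def multiply_and_select(a, b, level):
    """Element-wise multiply, then return the chosen level of the adder tree."""
    products = [x * y for x, y in zip(a, b)]
    return adder_tree_levels(products)[level]
```

With four multipliers, selecting the root gives one length-4 dot product, while selecting level 1 gives two independent length-2 dot products from the same hardware pass, which is how one unit could serve differently shaped multiplications.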

Type: Grant

Filed: October 29, 2019

Date of Patent: December 6, 2022

Assignee: Meta Platforms, Inc.

Inventors: Yuchen Hao, Krishnakumar Narayanan Nair, Ehsan Khish Ardestani Zadeh, Rakesh Komuravelli, Abdulkadir Utku Diril, Thomas Mark Ulrich
Matrix sketching using analog crossbar architectures

Patent number: 11520855

Abstract: A computer-implemented method is presented for performing matrix sketching by employing an analog crossbar architecture. The method includes low rank updating a first matrix for a first period of time, copying the first matrix into a dynamic correction computing device, switching to a second matrix to low rank update the second matrix for a second period of time, as the second matrix is low rank updated, feeding the first matrix with first stochastic pulses to reset the first matrix back to a first matrix symmetry point, copying the second matrix into the dynamic correction computing device, switching back to the first matrix to low rank update the first matrix for a third period of time, and as the first matrix is low rank updated, feeding the second matrix with second stochastic pulses to reset the second matrix back to a second matrix symmetry point.

Type: Grant

Filed: May 15, 2020

Date of Patent: December 6, 2022

Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, RAMOT AT TEL-AVIV UNIVERSITY, LTD.

Inventors: Lior Horesh, Oguzhan Murat Onen, Haim Avron, Tayfun Gokmen, Vasileios Kalantzis, Shashanka Ubaru
Nested loop control

Patent number: 11442709

Abstract: A method for compiling and executing a nested loop includes initializing a nested loop controller with an outer loop count value and an inner loop count value. The nested loop controller includes a predicate FIFO. The method also includes coalescing the nested loop and, during execution of the coalesced nested loop, causing the nested loop controller to populate the predicate FIFO and executing a get predicate instruction having an offset value, where the get predicate returns a value from the predicate FIFO specified by the offset value. The method further includes predicating an outer loop instruction on the returned value from the predicate FIFO.
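
Loop coalescing with a predicate queue can be modelled roughly as follows. This is a software analogy under stated assumptions: the controller is simulated inline, the predicate marks the last inner iteration, and the "get predicate" step always reads offset 0. The function name `run_coalesced` and the callback parameters are hypothetical.

```python
from collections import deque


def run_coalesced(outer_count, inner_count, inner_body, outer_body):
    """Run a two-level loop nest as one flat loop with predicated outer work."""
    fifo = deque()
    for step in range(outer_count * inner_count):
        outer, inner = divmod(step, inner_count)
        # Controller side: queue a predicate that is True when this step
        # is the final iteration of the current inner loop.
        fifo.append(inner == inner_count - 1)
        inner_body(outer, inner)
        # "Get predicate" with offset 0: read the oldest entry, then retire it.
        predicate = fifo[0]
        fifo.popleft()
        # Outer-loop work runs only when the predicate allows it.
        if predicate:
            outer_body(outer)
```

The flat loop has a single trip count, and the outer-loop instructions become ordinary predicated instructions inside it rather than a separate loop level.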

Type: Grant

Filed: August 3, 2020

Date of Patent: September 13, 2022

Assignee: Texas Instruments Incorporated

Inventors: Kai Chirca, Timothy D. Anderson, Todd T. Hahn, Alan L. Davis
Matrix transpose hardware acceleration

Patent number: 11435941

Abstract: In one example, an apparatus comprises: a memory array having an array of memory elements arranged in rows and columns, each memory element being configured to store a data element; and a memory access circuit configured to: perform a row write operation to store a first group of data elements at a first row of the array of memory elements; perform a column read operation at a first column of the array of memory elements to obtain a second group of data elements; and perform a column write operation to store a third group of data elements at the first column of the array of memory elements to replace the second group of data elements.
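
The core trick, writing by rows and reading by columns, is easy to show in software. The sketch below models the memory array as a list of lists and uses the hypothetical name `transpose_via_array`; it omits the column write that the abstract uses to refill a column once it has been read.

```python
def transpose_via_array(matrix):
    """Transpose by storing rows into an array and reading the array by columns."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    array = [[None] * n_cols for _ in range(n_rows)]
    # Row write operations: each incoming group of elements fills one row.
    for r, row in enumerate(matrix):
        array[r] = list(row)
    # Column read operations: each column read yields one row of the transpose.
    return [[array[r][c] for r in range(n_rows)] for c in range(n_cols)]
```

For instance, `transpose_via_array([[1, 2, 3], [4, 5, 6]])` reads three columns of two elements each from the filled array.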

Type: Grant

Filed: June 24, 2020

Date of Patent: September 6, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Kun Xu, Paul Gilbert Meyer, Ron Diamant
Syndrome data compression for quantum computing devices

Patent number: 11410070

Abstract: A quantum computing device comprises at least one quantum register including a plurality of logical qubits. A compression engine is coupled to each logical qubit of the plurality of logical qubits. Each compression engine is configured to compress syndrome data. A decompression engine is coupled to each compression engine. Each decompression engine is configured to receive compressed syndrome data, decompress the received compressed syndrome data, and route the decompressed syndrome data to a decoder block.

Type: Grant

Filed: November 18, 2019

Date of Patent: August 9, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Poulami Das, Nicolas Guillaume Delfosse, Christopher Anand Pattison, Srilatha Manne, Douglas Carmean, Krysta Marie Svore, Helmut Gottfried Katzgraber
Dynamically adaptable arrays for vector and matrix operations

Patent number: 11409840

Abstract: An array processor includes processor element arrays distributed in rows and columns. The processor element arrays perform operations on parameter values. The array processor also includes memory interfaces that are dynamically mapped to mutually exclusive subsets of the rows and columns of the processor element arrays based on dimensions of matrices that provide the parameter values to the processor element arrays. In some cases, the processor element arrays are vector arithmetic logic unit (ALU) processors and the memory interfaces are direct memory access (DMA) engines. The rows of the processor element arrays in the subsets are mutually exclusive to the rows in the other subsets and the columns of the processor element arrays in the subsets are mutually exclusive to the columns in the other subsets. The matrices can be symmetric or asymmetric, e.g., one of the matrices can be a vector having a single column.

Type: Grant

Filed: September 25, 2020

Date of Patent: August 9, 2022

Assignee: Advanced Micro Devices, Inc.

Inventors: Sateesh Lagudu, Allen H. Rush, Michael Mantor, Arun Vaidyanathan Ananthanarayan, Prasad Nagabhushanamgari
Instructions for vector multiplication of signed words with rounding

Patent number: 11392379

Abstract: Disclosed embodiments relate to executing a vector multiplication instruction. In one example, a processor includes fetch circuitry to fetch the vector multiplication instruction having fields for an opcode, first and second source identifiers, and a destination identifier, decode circuitry to decode the fetched instruction, execution circuitry to, on each of a plurality of corresponding pairs of fixed-sized elements of the identified first and second sources, execute the decoded instruction to generate a double-sized product of each pair of fixed-sized elements, the double-sized product being represented by at least twice a number of bits of the fixed size, and generate a signed fixed-sized result by rounding the most significant fixed-sized portion of the double-sized product to fit into the identified destination.
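
A small numeric sketch of this kind of operation, assuming 16-bit signed elements, a 32-bit product, and round-half-up applied when keeping the upper 16 bits. The element width, the rounding mode, and the name `mul_high_round` are all assumptions; the abstract does not fix them.

```python
def mul_high_round(a, b, bits=16):
    """Per-element signed multiply, keeping the rounded upper half of each product."""
    half = 1 << (bits - 1)  # rounding constant: half of one output unit
    result = []
    for x, y in zip(a, b):
        product = x * y  # double-sized product (fits in 2 * bits)
        # Add half a unit, then drop the low bits with an arithmetic shift.
        result.append((product + half) >> bits)
    return result
```

Python's `>>` on negative integers is an arithmetic shift, so the same expression handles both signs; with 16-bit inputs the result always fits back into a signed 16-bit element.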

Type: Grant

Filed: September 27, 2017

Date of Patent: July 19, 2022

Assignee: Intel Corporation

Inventors: Venkateswara R. Madduri, Carl Murray, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Robert Valentine, Jesus Corbal
System-based extension of qEOM algorithm for quantum computation of excited-state properties

Patent number: 11392849

Abstract: Systems and methods that facilitate motion formalism utilizing quantum computing, to compute matrix operators in terms of commutators between qubit operators and measurements on the quantum hardware, wherein the commutators are computed utilizing symbolic calculus. Embodiments reduce computational cost of generalized eigenvalue synthesis relying on symbolic calculus and parallelization. Embodiments disclosed herein can also develop estimators of excited-states properties, considering constants of motion (e.g. spin) and non-constants of motions (e.g. dipoles, density matrices).

Type: Grant

Filed: September 18, 2020

Date of Patent: July 19, 2022

Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, JSR CORPORATION

Inventors: Mario Motta, Pauline Ollitrault, Stephen Wood, Panagiotis Barkoutsos, Joseph Latone, Ivano Tavernelli, Gavin Jones, Edward Pyzer-Knapp, Yuya Onishi
Matrix multiplication device and operation method thereof

Patent number: 11379185

Abstract: A matrix multiplication device and an operation method thereof are provided. The matrix multiplication device includes a plurality of unit circuits. Each of the unit circuits includes a multiplying-adding circuit, a first register, and a second register. A first input terminal and a second input terminal of the multiplying-adding circuit are respectively coupled to a corresponding first input line and a corresponding second input line. An input terminal and an output terminal of the first register are respectively coupled to an output terminal and a third input terminal of the multiplying-adding circuit. The second register is coupled to the first register to receive and temporarily store a multiplication accumulation result. The second registers of the unit circuits output the multiplication accumulation results in a column direction in a first output mode, and output the multiplication accumulation results in a row direction in a second output mode.

Type: Grant

Filed: September 21, 2020

Date of Patent: July 5, 2022

Assignee: NEUCHIPS CORPORATION

Inventors: Jian-Wen Chen, Chiung-Liang Lin
Main processor prefetching operands for coprocessor operations

Patent number: 11334355

Abstract: Technology for providing data to a processing unit is disclosed. A computer processor may be divided into a master processing unit and consumer processing units. The master processing unit at least partially decodes a machine instruction and determines whether data is needed to execute the machine instruction. The master processing unit sends a request to memory for the data. The request may indicate that the data is to be sent from the memory to a consumer processing unit. The data sent by the memory in response to the request may be stored in local read storage that is close to the consumer processing unit for fast access. The master processing unit may also provide the machine instruction to the consumer processing unit. The consumer processing unit may access the data from the local read storage and execute the machine instruction based on the accessed data.

Type: Grant

Filed: May 4, 2017

Date of Patent: May 17, 2022

Assignee: Futurewei Technologies, Inc.

Inventors: Alan Gatherer, Sushma Wokhlu, Peter Yan, Ywhpyng Harn, Ashish Rai Shrivastava, Tong Sun, Lee Dobson McFearin
Iterative energy-scaled variational quantum eigensolver

Patent number: 11294986

Abstract: Techniques regarding an iterative energy-scaled variational quantum eigensolver process are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a read-out component that determines a ground state energy value of a quantum Hamiltonian by employing a variational quantum eigensolver (VQE) algorithm, wherein the VQE algorithm utilizes a symmetry that emerges at an energy scale of the quantum Hamiltonian.

Type: Grant

Filed: November 22, 2019

Date of Patent: April 5, 2022

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Antonio Mezzacapo, Richard Chen, Marco Pistoia
Crossbar allocation for matrix-vector multiplications

Patent number: 11269973

Abstract: Repeating patterns are identified in a matrix. Based on the identification of the repeating patterns, instructions are generated, which are executable by processing cores of a dot product engine to allocate analog multiplication crossbars of the dot product engine to perform multiplication of the matrix with a vector.

Type: Grant

Filed: April 28, 2020

Date of Patent: March 8, 2022

Assignee: Hewlett Packard Enterprise Development LP

Inventors: Mashood Abdulla Kodavanji, Soumitra Chatterjee, Chinmay Ghosh, Mohan Parthasarathy
Vector processor and control method therefor

Patent number: 11263018

Abstract: A vector processor is disclosed. The vector processor includes a plurality of register files provided to each of a plurality of single instruction multiple data (SIMD) lanes, storing each of a plurality of pieces of data, and respectively outputting input data to be used in a current cycle among the plurality of pieces of data, a shuffle unit for receiving a plurality of pieces of input data outputted from the plurality of register files, and performing shuffling such that the received plurality of pieces of input data respectively correspond to the plurality of SIMD lanes and outputting the same; and a command execution unit for performing a parallel operation by receiving input data outputted from the shuffle unit.

Type: Grant

Filed: October 23, 2017

Date of Patent: March 1, 2022

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Ki-seok Kwon, Jae-un Park, Dong-kwan Suh, Kang-jin Yoon
Neural network processor incorporating separate control and data fabric

Patent number: 11263512

Abstract: A novel and useful neural network (NN) processing core adapted to implement artificial neural networks (ANNs) and incorporating strictly separate control and data planes. The NN processor is constructed from self-contained computational units organized in a hierarchical architecture. The homogeneity enables simpler management and control of similar computational units, aggregated in multiple levels of hierarchy. Computational units are designed with as little overhead as possible, where additional features and capabilities are aggregated at higher levels in the hierarchy. On-chip memory provides storage for content inherently required for basic operation at a particular hierarchy and is coupled with the computational resources in an optimal ratio. Lean control provides just enough signaling to manage only the operations required at a particular hierarchical level. Dynamic resource assignment agility is provided which can be adjusted as required depending on resource availability and capacity of the device.

Type: Grant

Filed: April 3, 2018

Date of Patent: March 1, 2022

Inventors: Avi Baum, Or Danon, Hadar Zeitlin, Daniel Ciubotariu, Rami Feig
Inserting null vectors into a stream of vectors

Patent number: 11256508

Abstract: Software instructions are executed on a processor within a computer system to configure a streaming engine with stream parameters to define a multidimensional array. The stream parameters define a size for each dimension of the multidimensional array, a null vector count (N), and a selected dimension. Data is fetched from a memory coupled to the streaming engine responsive to the stream parameters. A stream of vectors is formed for the multidimensional array responsive to the stream parameters from the data fetched from memory. N null stream vectors are inserted into the stream of vectors for the selected dimension without fetching respective null data from the memory.

Type: Grant

Filed: May 23, 2019

Date of Patent: February 22, 2022

Assignee: Texas Instruments Incorporated

Inventors: Asheesh Bhardwaj, William Franklin Leven, Son Hung Tran, Timothy David Anderson
Two-dimensional zero padding in a stream of matrix elements

Patent number: 11249759

Abstract: Software instructions are executed on a processor within a computer system to configure a streaming engine with stream parameters to define a multidimensional array. The stream parameters define a size for each dimension of the multidimensional array and a specified width for two selected dimensions of the array. Data is fetched from a memory coupled to the streaming engine responsive to the stream parameters. A stream of vectors is formed for the multidimensional array responsive to the stream parameters from the data fetched from memory. When either selected dimension in the stream of vectors exceeds a respective specified width, the streaming engine inserts null elements into each portion of a respective vector for the selected dimension that exceeds the specified width in the stream of vectors. Stream vectors that are completely null are formed by the streaming engine without accessing the system memory for respective data.

Type: Grant

Filed: May 23, 2019

Date of Patent: February 15, 2022

Assignee: Texas Instruments Incorporated

Inventors: William Franklin Leven, Asheesh Bhardwaj, Son Hung Tran, Timothy David Anderson
Systems and methods for performing matrix compress and decompress instructions

Patent number: 11249761

Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instructions, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.

Type: Grant

Filed: July 20, 2020

Date of Patent: February 15, 2022

Assignee: Intel Corporation

Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
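
The first compression option named in the abstract above (packing non-zero elements together and recording each element's matrix position in a header) can be illustrated with a short sketch. The names `compress` and `decompress` and the tuple layout of the result are hypothetical; the abstract does not define an encoding, so this shows only the general idea.

```python
def compress(matrix):
    """Pack non-zero elements and record their (row, col) positions in a header."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    header, values = [], []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value != 0:
                header.append((r, c))   # position of the packed element
                values.append(value)    # the packed non-zero element itself
    return (rows, cols), header, values


def decompress(shape, header, values):
    """Rebuild the dense matrix from the shape, header, and packed values."""
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    for (r, c), value in zip(header, values):
        dense[r][c] = value
    return dense
```

A round trip such as `decompress(*compress(m))` returns a matrix equal to `m`, and the packed form is smaller than the dense form only when the matrix is sufficiently sparse.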
One-dimensional zero padding in a stream of matrix elements

Patent number: 11231929

Abstract: Software instructions are executed on a processor within a computer system to configure a streaming engine with stream parameters to define a multidimensional array. The stream parameters define a size for each dimension of the multidimensional array and a specified width for a selected dimension of the array. Data is fetched from a memory coupled to the streaming engine responsive to the stream parameters. A stream of vectors is formed for the multidimensional array responsive to the stream parameters from the data fetched from memory. When the selected dimension in the stream of vectors exceeds the specified width, the streaming engine inserts null elements into each portion of a respective vector for the selected dimension that exceeds the specified width in the stream of vectors. Stream vectors that are completely null are formed by the streaming engine without accessing the system memory for respective data.

Type: Grant

Filed: May 23, 2019

Date of Patent: January 25, 2022

Assignee: Texas Instruments Incorporated

Inventors: Son Hung Tran, Shyam Jagannathan, Timothy David Anderson
Method, system, and computer program product for determining causality

Patent number: 11232175

Abstract: Implementations of the present disclosure relate to a method, system and program product for determining a causality between a plurality of variables.

Type: Grant

Filed: March 28, 2019

Date of Patent: January 25, 2022

Assignee: NEC CORPORATION

Inventors: Lu Feng, Chunchen Liu, Wenjuan Wei
Inverse-image sampling device, inverse-image sampling method, and inverse-image sampling program

Patent number: 11216533

Abstract: A grouping means 11 that extracts basis vectors from a set of basis vectors for a lattice having a predetermined relationship with a matrix used to generate a public key, and that groups the basis vectors such that a predetermined condition is satisfied. A sampling means 12 that samples, for at least one group, the same number of arbitrary values as the number of a plurality of basis vectors included in that group, in parallel for the individual basis vectors, onto a lattice constituted by the plurality of basis vectors, the arbitrary values serving as random numbers following a discrete Gaussian distribution. The predetermined condition is that each of the basis vectors included in a group is orthogonal to the other basis vectors included in the same group and is also orthogonal to Gram-Schmidt basis vectors, which are vectors obtained by orthogonalizing the other basis vectors by Gram-Schmidt orthogonalization.

Type: Grant

Filed: May 12, 2017

Date of Patent: January 4, 2022

Assignee: NEC CORPORATION

Inventors: Yuki Tanaka, Kazuhiko Minematsu
Compute array of a processor with mixed-precision numerical linear algebra support

Patent number: 11188328

Abstract: Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A number of rank updates of a result matrix to store in an accumulator register having a predetermined size are determined, where the number of rank updates is based on the first precision and the first shape of the first input matrix, the second precision and the second shape of the second input matrix, and the predetermined size of the accumulator register. A plurality of linear algebra operations is repeated in parallel within the compute array to update the result matrix in the accumulator register based on the first input matrix, the second input matrix, and the number of rank updates.

Type: Grant

Filed: December 12, 2019

Date of Patent: November 30, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jose E. Moreira, Brett Olsson, Brian W. Thompto, Silvia Melitta Mueller, Andreas Wagner
Three-dimensional lane predication for matrix operations

Patent number: 11182458

Abstract: Embodiments of the present invention are directed to a new instruction set extension and a method for providing 3D lane predication for matrix operations. In a non-limiting embodiment of the invention, a first input matrix having m rows and k columns and a second input matrix having k rows and n columns are received by a compute array of a processor. A three-dimensional predicate mask having an M-bit row mask, an N-bit column mask, and a K-bit rank mask is generated. A result matrix of up to m rows, up to n columns, and up to k rank updates is determined based on the first input matrix, the second input matrix, and the predicate mask.

Type: Grant

Filed: December 12, 2019

Date of Patent: November 23, 2021

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Brett Olsson, Brian W. Thompto, Jose E. Moreira, Silvia Melitta Mueller, Andreas Wagner
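
A software model of the three-mask predication described in the abstract above might look like the following. It is a sketch under stated assumptions: the function name `predicated_matmul` is hypothetical, the masks are given as lists of 0/1 flags, and masked-out result elements are assumed to stay zero, which the abstract does not specify.

```python
def predicated_matmul(a, b, row_mask, col_mask, rank_mask):
    """Multiply an m x k matrix by a k x n matrix under row, column, and rank masks."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of the inputs do not match")
    if (len(row_mask), len(col_mask), len(rank_mask)) != (m, n, k):
        raise ValueError("mask lengths must be m, n, and k")
    result = [[0] * n for _ in range(m)]   # assumption: disabled lanes stay zero
    for i in range(m):
        if not row_mask[i]:
            continue                       # whole result row disabled
        for j in range(n):
            if not col_mask[j]:
                continue                   # whole result column disabled
            # Accumulate only the rank updates enabled by the rank mask.
            result[i][j] = sum(a[i][p] * b[p][j] for p in range(k) if rank_mask[p])
    return result
```

With all masks set to ones the function reduces to an ordinary matrix product; clearing a bit in the rank mask drops the corresponding outer-product term from every enabled result element.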
Computationally efficient mixed precision floating point waveform generation

Patent number: 11182126

Abstract: Computationally efficient mixed precision floating point waveform generation takes advantage of the high-speed generation of waveforms with single-precision floating point numbers while reducing the generally unacceptable loss of precision of pure single-precision floating point to generate any waveform that repeats in 2π. This approach computes a reference phase in double precision as the modulus of the phase with 2π and then computes offsets to that value in single precision. The double precision reference phase is recomputed as needed depending on how quickly the phase grows and how large a machine epsilon is desired.

Type: Grant

Filed: June 25, 2019

Date of Patent: November 23, 2021

Assignee: Raytheon Company

Inventors: Ender Barillas, Brian Filarsky
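
The scheme in the abstract above, a double-precision reference phase reduced modulo 2π with single-precision offsets from it, can be sketched as below. This is an illustration, not the patented method: `f32` emulates single-precision rounding with the standard `struct` module, and the fixed `block` length that triggers recomputation of the reference phase is a hypothetical stand-in for the "as needed" policy the abstract mentions.

```python
import math
import struct


def f32(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sine_wave(freq_hz: float, sample_rate_hz: float, n_samples: int, block: int = 1024):
    """Generate sine samples with a double reference phase and single offsets."""
    step = 2.0 * math.pi * freq_hz / sample_rate_hz   # phase step per sample (double)
    step32 = f32(step)                                # same step, single precision
    samples = []
    for start in range(0, n_samples, block):
        # Reference phase for this block, computed in double and reduced mod 2*pi.
        ref32 = f32(math.fmod(start * step, 2.0 * math.pi))
        for i in range(min(block, n_samples - start)):
            offset32 = f32(i * step32)                # small offset, single precision
            phase32 = f32(ref32 + offset32)
            samples.append(f32(math.sin(phase32)))
    return samples
```

Because each block restarts from a reduced reference phase, the single-precision phase never grows beyond `block` steps, so its rounding error stays bounded regardless of how many samples have been generated.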
Apparatus and method for complex multiplication

Patent number: 11169800

Abstract: An embodiment of the invention is a processor including execution circuitry to calculate, in response to a decoded instruction, a result of a complex multiplication of a first complex number and a second complex number. The calculation includes a first operation to calculate a first term of a real component of the result and a first term of the imaginary component of the result. The calculation also includes a second operation to calculate a second term of the real component of the result and a second term of the imaginary component of the result. The processor also includes a decoder, a first source register, and a second source register. The decoder is to decode an instruction to generate the decoded instruction. The first source register is to provide the first complex number and the second source register is to provide the second complex number.

Type: Grant

Filed: October 18, 2019

Date of Patent: November 9, 2021

Assignee: Intel Corporation

Inventors: Robert Valentine, Mark Charney, Raanan Sade, Elmoustapha Ould-Ahmed-Vall, Jesus Corbal, Roman S. Dubtsov
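
The two-operation split described in the abstract above follows from (a + bi)(c + di) = (ac - bd) + (ad + bc)i, where each operation contributes one term to the real part and one to the imaginary part. The sketch below is a software illustration; the function name `complex_mul_two_step` and the particular assignment of terms to each operation are assumptions, since the abstract does not say which terms belong to which operation.

```python
def complex_mul_two_step(x, y):
    """Multiply complex numbers given as (real, imag) pairs in two operations."""
    a, b = x
    c, d = y
    # First operation: first term of the real part and of the imaginary part.
    real = a * c
    imag = a * d
    # Second operation: second term of each part, accumulated onto the first.
    real -= b * d
    imag += b * c
    return real, imag
```

For example, `complex_mul_two_step((1, 2), (3, 4))` returns `(-5, 10)`, matching `complex(1, 2) * complex(3, 4)`.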
