Multiplication Of Matrices Patents (Class 708/607)
-
Patent number: 12632220
Abstract: A multiply-accumulate (MAC) computation circuit includes: a source bit cell block configured to determine a MAC operation result of an input signal based on a plurality of source bit cells; a replica bit cell block comprising a plurality of replica bit cells corresponding to the plurality of source bit cells; and a readout circuit configured to read out a digital value of the MAC operation result using the replica bit cell block.
Type: Grant
Filed: December 6, 2021
Date of Patent: May 19, 2026
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hyungwoo Lee, Seungchul Jung, Sang Joon Kim, Sungmeen Myung
-
Patent number: 12615299
Abstract: Certain aspects of the disclosure provide techniques for access control policy management. A method generally includes factorizing a user access co-occurrence data element to generate two data sub-elements, wherein: the user access co-occurrence data element represents co-occurrences between users of a system and resources of the system, a product of the two data sub-elements approximates the user access co-occurrence data element, and each of the two data sub-elements has reduced dimensionality compared to the user access co-occurrence data element; generating an approximated user access co-occurrence data element based on the product of the two data sub-elements; comparing the user access co-occurrence data element and the approximated user access co-occurrence data element to determine one or more anomalies, wherein each of the one or more anomalies relates to access for a user to a resource of the system; and taking one or more actions to rectify the one or more anomalies.
Type: Grant
Filed: October 18, 2023
Date of Patent: April 28, 2026
Assignee: Intuit Inc.
Inventors: Yair Horesh, Yaron Sheffer, Boaz Sapir, Margarita Vald, Mike Rooz
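The factorize, approximate, and compare steps in this abstract can be illustrated with a minimal Python sketch. It assumes the simplest possible case, a rank-1 factorization found by power iteration, and a fixed residual threshold. The function names, the threshold, and the toy access matrix are illustrative assumptions and do not come from the patent.

```python
def rank1_factorize(A, iters=50):
    """Approximate A (a list of rows) as the outer product of u and v (assumed rank-1 case)."""
    m, n = len(A), len(A[0])
    v = [1.0] * n
    for _ in range(iters):
        # u <- A v, normalized to unit length
        u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
        norm = sum(x * x for x in u) ** 0.5 or 1.0
        u = [x / norm for x in u]
        # v <- A^T u (carries the scale of the approximation)
        v = [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]
    return u, v


def access_anomalies(A, threshold=0.5):
    """Return (user, resource) pairs where A differs strongly from its low-rank approximation."""
    u, v = rank1_factorize(A)
    return [(i, j)
            for i in range(len(A))
            for j in range(len(A[0]))
            if abs(A[i][j] - u[i] * v[j]) > threshold]


# Hypothetical data: rows are users, columns are resources, 1 means "has access".
access = [[1, 1, 0],
          [1, 1, 0],
          [1, 1, 0],
          [1, 1, 1]]
print(access_anomalies(access))
```

In this toy matrix only the last user holds access to the third resource, so that single entry is poorly explained by the low-rank product and is the one reported.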
-
Patent number: 12585727
Abstract: Detailed are embodiments related to bit matrix multiplication in a processor. For example, in some embodiments a processor comprising: decode circuitry to decode an instruction having fields for an opcode, an identifier of a first source bit matrix, an identifier of a second source bit matrix, an identifier of a destination bit matrix, and an immediate; and execution circuitry to execute the decoded instruction to perform a multiplication of a matrix of S-bit elements of the identified first source bit matrix with S-bit elements of the identified second source bit matrix, wherein the multiplication and accumulation operations are selected by the operation selector and store a result of the matrix multiplication into the identified destination bit matrix, wherein S indicates a plural bit size is described.
Type: Grant
Filed: June 26, 2024
Date of Patent: March 24, 2026
Assignee: Intel Corporation
Inventors: Dmitry Y. Babokin, Kshitij A. Doshi, Vadim Sukhomlinov
-
Patent number: 12523473
Abstract: The invention relates to a method, navigation device and computer program product for assisting with the navigation of a vehicle provided with a navigation device. The method comprises the following steps: acquiring a priori values of variables of a navigation device of the vehicle; determining current values of the variables and a current uncertainty matrix from previous values of the variables and a previous uncertainty matrix; and determining a correction from the current values of the variables, the current uncertainty matrix and a measurement.
Type: Grant
Filed: June 3, 2022
Date of Patent: January 13, 2026
Assignee: SAFRAN
Inventor: Axel Barrau
-
Patent number: 12489604
Abstract: A method for encrypting data, comprising: the transformation of a base message into an intermediate message by means of successive matrix rearrangement operations; the definition of a numerical set, which is transformed into a new numerical order, also by means of matrix rearrangement operations; the definition of a substitution alphabet; the establishment of a replacement operation, comprising the replacement of one character of the intermediate message by one character of the substitution alphabet, pursuant to a command defined by the new numerical order, starting from an initial magnitude of displacement, and progressively increasing the magnitude of displacement.
Type: Grant
Filed: October 10, 2020
Date of Patent: December 2, 2025
Inventors: Jaime Ricardo Gutiérrez Salazar, Jaime Andrés López Lisboa
-
Patent number: 12450308
Abstract: Performing set operations using sparse matrix operations offered by a multi-core processing unit (such as a graphics processing unit). The set operation is converted into operand matrices, and sparse matrix operations, foregoing the use of hash tables. The input set is converted into a matrix, a matrix operation corresponding to the set operation is identified, and one or more operands of the set operation are also represented within a matrix. The matrix operation is then performed on these matrices to obtain an output matrix, which is then converted to an output set.
Type: Grant
Filed: February 22, 2024
Date of Patent: October 21, 2025
Assignee: Microsoft Technology Licensing, LLC
Inventor: Ritwik Das
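As a rough illustration of the set-to-matrix-to-set round trip described in this abstract, the following Python sketch encodes each set as a 0/1 indicator row over a shared universe, so that intersection becomes an elementwise product and union becomes a clipped elementwise sum. The encoding, the function names, and the dense list representation are assumptions made for readability; the abstract targets sparse matrices on a GPU and does not specify this particular mapping.

```python
def to_row(items, universe):
    """Encode a set as a 1 x len(universe) indicator row (assumed encoding)."""
    return [1 if x in items else 0 for x in universe]


def intersect(a, b):
    """Set intersection as an elementwise (Hadamard) product of indicator rows."""
    return [x * y for x, y in zip(a, b)]


def union(a, b):
    """Set union as an elementwise sum clipped to 1."""
    return [min(1, x + y) for x, y in zip(a, b)]


def to_set(row, universe):
    """Convert an indicator row back into a set."""
    return {u for u, bit in zip(universe, row) if bit}


universe = ["a", "b", "c", "d"]
left = to_row({"a", "b"}, universe)
right = to_row({"b", "c"}, universe)
print(to_set(intersect(left, right), universe))
print(to_set(union(left, right), universe))
```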
-
Patent number: 12437026
Abstract: According to one example of the present disclosure, a system may be provided. The system may comprise at least one processing core configured to perform computation operations of at least one neural network model associated with tensors, at least one memory circuit configured to store the tensors, a plurality of bus circuits operably coupled to the at least one processing core and the at least one memory circuit, the plurality of bus circuits configured to send the tensors from the at least one memory circuit to the at least one processing core responsive to receiving requests for read operations or write operations, and a controller operably coupled to the plurality of bus circuits, the controller configured to determine a priority of each of the tensors for each bus circuit for the read operations or the write operations.
Type: Grant
Filed: April 11, 2025
Date of Patent: October 7, 2025
Assignee: DEEPX CO., LTD.
Inventor: Je Ik Choi
-
Patent number: 12423379
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for loading a matrix into a circuit having an array having M×N cells. One of the methods includes: receiving a plurality of non-zero input values from a first input matrix; receiving index metadata that indicates, for each non-zero input value in the plurality of input values, which cell of the M×N cells in the array the non-zero input value should be loaded into; sending the non-zero input values and the index metadata to the M×N cells; and at a particular cell of the M×N cells in the array: receiving a particular non-zero input value and corresponding index metadata; and determining from the corresponding index metadata for the particular non-zero input value whether to store the particular non-zero input value at the cell or to shift the particular non-zero input value to another cell.
Type: Grant
Filed: July 6, 2021
Date of Patent: September 23, 2025
Assignee: Google LLC
Inventors: Reginald Clifford Young, Trevor John Gale
-
Patent number: 12417017
Abstract: An electronic device includes a host processor configured to: convert a sparse matrix compressed and expressed in a first compressed format into a second compressed storage format, based on a feature of the sparse matrix; preprocess a vector based on the second compressed storage format; and transmit the sparse matrix converted into the second compressed storage format and the preprocessed vector to a computing device; and the computing device configured to multiply the sparse matrix converted into the second compressed storage format by the preprocessed vector.
Type: Grant
Filed: November 30, 2023
Date of Patent: September 16, 2025
Assignee: Samsung Electronics Co., Ltd.
Inventors: Hyesun Hong, Jinpil Lee, Dongjin Lee
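The convert-then-multiply flow in this abstract can be sketched in Python using two well-known sparse formats. The sketch assumes coordinate (COO) triples as the first format and compressed sparse row (CSR) as the second; the abstract names neither format, and the feature-based format selection and vector preprocessing steps are omitted. All function names are illustrative.

```python
def coo_to_csr(n_rows, entries):
    """Convert COO triples (row, col, value) into CSR arrays (indptr, indices, data)."""
    entries = sorted(entries)              # order by row, then column
    indptr = [0] * (n_rows + 1)
    for r, _, _ in entries:
        indptr[r + 1] += 1                 # count non-zeros per row
    for r in range(n_rows):
        indptr[r + 1] += indptr[r]         # prefix sum gives row start offsets
    indices = [c for _, c, _ in entries]
    data = [v for _, _, v in entries]
    return indptr, indices, data


def csr_matvec(indptr, indices, data, x):
    """Multiply a CSR matrix by a dense vector x."""
    return [sum(data[k] * x[indices[k]] for k in range(indptr[r], indptr[r + 1]))
            for r in range(len(indptr) - 1)]


coo = [(0, 0, 2.0), (1, 2, 3.0), (2, 1, 4.0), (0, 2, 1.0)]
indptr, indices, data = coo_to_csr(3, coo)
print(csr_matvec(indptr, indices, data, [1.0, 2.0, 3.0]))
```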
-
Patent number: 12399879
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing a kNN computation using a hardware accelerator. One of the methods includes obtaining a set of one or more query vectors; obtaining a set of database vectors; and performing, on a hardware accelerator and for each query vector in the set, a search for the k most similar database vectors to the query vector, comprising: computing, by circuitry of the hardware accelerator and for each query vector, a respective similarity value between the query vector and each database vector; and for each query vector, identifying, by the hardware accelerator and for each bin, (i) an index of the most similar database vector within the bin and (ii) the respective similarity value for the most similar database vector within the bin.
Type: Grant
Filed: June 26, 2023
Date of Patent: August 26, 2025
Assignee: Google LLC
Inventors: Felix Ren-Chyan Chern, Blake Alan Hechtman, Andrew Thomas Davis, Ruiqi Guo, Sanjiv Kumar, David Alexander Majnemer
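The per-bin reduction in this abstract can be sketched in a few lines of Python. The sketch assumes dot-product similarity, contiguous equal-size bins, and a final step that picks the k best bin winners; only the per-bin "index and similarity of the best vector" step is stated in the abstract, and the rest is an assumption. Note that this makes the result approximate: two true neighbors in the same bin cannot both be returned.

```python
def binned_top_k(query, database, k, num_bins):
    """Approximate top-k search: keep one winner per bin, then take the k best winners."""
    # Similarity of the query to every database vector (dot product, assumed).
    sims = [sum(q * d for q, d in zip(query, vec)) for vec in database]
    bin_size = -(-len(database) // num_bins)   # ceiling division
    winners = []
    for start in range(0, len(database), bin_size):
        stop = min(start + bin_size, len(database))
        best = max(range(start, stop), key=sims.__getitem__)
        winners.append((sims[best], best))     # (similarity, index) per bin
    winners.sort(reverse=True)
    return [idx for _, idx in winners[:k]]


db = [[0.9, 0.0], [0.1, 1.0], [0.5, 0.0], [0.8, 0.3], [0.2, 0.0], [0.7, 0.1]]
print(binned_top_k([1.0, 0.0], db, k=2, num_bins=3))
```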
-
Patent number: 12387103
Abstract: A computing system, including a processor configured to train a machine learning model in a plurality of backpropagation iterations. Each backpropagation iteration may include generating a coordinate pair sequence. Each coordinate pair may be unique within the coordinate pair sequence and may include non-matching coordinates. The backpropagation iteration may further include receiving parametrizing angles respectively associated with the coordinate pairs. The backpropagation iteration may further include computing a unitary matrix parametrized by the parametrizing angles, computing a loss gradient matrix, and computing a Jacobian-vector product (JVP). Computing the JVP may include computing a rotated unitary matrix and a rotated loss gradient matrix for each coordinate pair. The JVP may be computed from the rotated unitary matrix and the rotated loss gradient matrix. The backpropagation iteration may further include updating the parametrizing angles based at least in part on the JVP.
Type: Grant
Filed: May 12, 2021
Date of Patent: August 12, 2025
Assignee: Microsoft Technology Licensing, LLC
Inventor: Firas Hamze
-
Patent number: 12380946
Abstract: A method of performing an in-memory computation includes storing a first subset of data in a first segment of a first memory array and a second subset of the data in a second segment of the first memory array, latching a first data bit from a first column of memory cells in the first segment of the first memory array, sequentially reading a plurality of second data bits from a second column of memory cells in the second segment of the first memory array, and performing a logic operation on each combination of the latched first data bit and each second data bit.
Type: Grant
Filed: August 10, 2023
Date of Patent: August 5, 2025
Assignee: TAIWAN SEMICONDUCTOR MANUFACTURING COMPANY, LTD.
Inventors: Yen-Huei Chen, Hidehiro Fujiwara, Hung-Jen Liao, Jonathan Tsung-Yung Chang
-
Patent number: 12367254
Abstract: Performing set operations using sparse matrix operations offered by a multi-core processing unit (such as a graphics processing unit). The set operation is converted into operand matrices, and sparse matrix operations, foregoing the use of hash tables. The input set is converted into a matrix, a matrix operation corresponding to the set operation is identified, and one or more operands of the set operation are also represented within a matrix. The matrix operation is then performed on these matrices to obtain an output matrix, which is then converted to an output set.
Type: Grant
Filed: February 22, 2024
Date of Patent: July 22, 2025
Assignee: Microsoft Technology Licensing, LLC
Inventor: Ritwik Das
-
Patent number: 12353505
Abstract: Methods and apparatus for performing diversity matrix operations within a memory fabric. Various embodiments of the present disclosure are directed to converting a memory array into a matrix fabric for spatial diversity-related matrix transformations and performing matrix operations therein. Exemplary embodiments described herein perform MIMO-related matrix transformations (e.g., precoding, beamforming, or data recovery matrix operations) within a memory device that includes a matrix fabric and matrix multiplication unit (MMU). In one variant, the matrix fabric uses a “crossbar” construction of resistive elements. Each resistive element stores a level of impedance that represents the corresponding matrix coefficient value. The crossbar connectivity can be driven with an electrical signal representing the input vector as an analog voltage. The resulting signals can be converted from analog voltages to digital values by an MMU to yield a matrix-vector product.
Type: Grant
Filed: November 6, 2023
Date of Patent: July 8, 2025
Assignee: Micron Technology, Inc.
Inventor: Fa-Long Luo
-
Patent number: 12346403
Abstract: Based on a predetermined number of available processor sockets, a plurality of candidate matrix decompositions are identified, which correspond to a multiplication of matrices. Based on a first comparative relationship of a variation of first sizes of the plurality of candidate matrix decompositions along a first dimension and a second comparative relationship of a variation of second sizes of the plurality of candidate matrix decomposition sizes along a second dimension, a given candidate matrix decomposition is selected. Processing of the multiplication among the processor sockets is distributed based on the given candidate matrix decomposition.
Type: Grant
Filed: June 5, 2024
Date of Patent: July 1, 2025
Assignee: Hewlett Packard Enterprise Development LP
Inventor: Aaron M. Collier
-
Patent number: 12340300
Abstract: Improved placement of memory and functional modules, ‘tiles’, within a tiled processor architecture are disclosed for linear algebra calculations involving vectors and matrices comprising large amounts of data. The improved placement places the data in close proximity to the functional modules performing calculations using the data. These modules enable these calculations to be performed more quickly while using less energy. These modules, in particular, improve the efficiency of the training and application of deep learning and artificial neural network systems. This Abstract and the independent Claims are concise signifiers of embodiments of the claimed inventions. The Abstract does not limit the scope of the claimed inventions.
Type: Grant
Filed: March 16, 2021
Date of Patent: June 24, 2025
Assignee: Groq, Inc.
Inventors: Dennis Charles Abts, Jonathan Alexander Ross
-
Patent number: 12339924
Abstract: A data operation device is disclosed. The data operation device comprises at least one memory configured to store a first data set represented as a first sparse matrix and a second data set represented as a second matrix, a vector unit configured to perform a row-wise product-based matrix multiplication operation based on the first sparse matrix and the second matrix and output a third data set represented as a third matrix, and a memory load unit configured to load into the vector unit first vector data associated with a row of the first sparse matrix from the first data set, and second vector data associated with a row of the second matrix that corresponds to an order of non-zero vector elements included in the first vector data from the second data set.
Type: Grant
Filed: November 12, 2024
Date of Patent: June 24, 2025
Assignee: REBELLIONS INC.
Inventor: Minhoo Kang
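The row-wise product formulation named in this abstract (often attributed to Gustavson) builds each output row as a weighted sum of rows of the second matrix, visiting only the non-zeros of the sparse operand. A minimal Python sketch follows. The data layout (a list of (column, value) pairs per sparse row, and a dense second matrix) and the function name are assumptions for illustration; the hardware load unit described in the abstract is not modeled.

```python
def rowwise_spmm(a_rows, b, n_cols):
    """Compute C = A @ B where a_rows[i] lists the (k, value) non-zeros of row i of A."""
    c = []
    for nonzeros in a_rows:
        out = [0.0] * n_cols
        # Row i of C is the sum over non-zero A[i][k] of A[i][k] * (row k of B).
        for k, a_ik in nonzeros:
            row_b = b[k]
            for j in range(n_cols):
                out[j] += a_ik * row_b[j]
        c.append(out)
    return c


# A = [[2, 0, 0], [0, 0, 3]] stored sparsely; B is a dense 3 x 2 matrix.
a_sparse = [[(0, 2.0)], [(2, 3.0)]]
b_dense = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(rowwise_spmm(a_sparse, b_dense, n_cols=2))
```

Only rows 0 and 2 of B are ever read here, which is the property that makes the row-wise order attractive when A is sparse.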
-
Patent number: 12260214
Abstract: A compute channel can have multiple computational circuit blocks coupled in series to form a pipeline. The compute channel can perform a computation on an input tensor to generate an output tensor based on an instruction. When the computation does not require all of the computational circuit blocks, the throughput of the compute channel can be increased by splitting the data elements of the input tensor into multiple input data streams. The multiple input data streams are provided to respective subsets of one or more computational circuit blocks in the pipeline using bypass circuitry of the computational circuit blocks, and the computation can be performed on multiple input data streams in the respective subsets of one or more computational circuit blocks to generate multiple output data streams corresponding to the output tensor.
Type: Grant
Filed: September 30, 2022
Date of Patent: March 25, 2025
Assignee: Amazon Technologies, Inc.
Inventors: Paul Gilbert Meyer, Ron Diamant, Sundeep Amirineni, Sunil Kumar Bathula
-
Patent number: 12229215
Abstract: The present disclosure relates to methods and apparatus for compute processing. For example, disclosed techniques facilitate improving performance of matrix multiplication in a streaming processor. Aspects of the present disclosure can execute, with a load control unit, a first load instruction to load a set of input data of an input matrix from a first memory to a second memory. Aspects of the present disclosure can also execute, with the load control unit, a second load instruction to load a set of weight data of a weight matrix from the first memory to the second memory. Additionally, aspects of the present disclosure can perform, with an ALU component, a matrix multiplication operation using the set of input data and the set of weight data to generate an output matrix. Further, aspects of the present disclosure can store the output matrix at a general purpose register accessible to the ALU component.
Type: Grant
Filed: October 16, 2023
Date of Patent: February 18, 2025
Assignee: QUALCOMM Incorporated
Inventors: Yun Du, Gang Zhong, Fei Wei, Yibin Zhang, Jing Han, Hongjiang Shang, Elina Kamenetskaya, Minjie Huang, Alexei Vladimirovich Bourd, Chun Yu, Andrew Evan Gruber, Eric Demers
-
Patent number: 12223011
Abstract: Techniques for data manipulation using integer matrix multiplication using pipelining are disclosed. A first integer matrix with dimensions m×k and a second integer matrix with dimensions k×n are obtained for matrix multiplication within a processor. The first and second integer matrices employ a two's complement variable radix point data representation. The first and second integer matrices are distilled into (j×j) submatrices. A first variable radix point format and an initial value for an accumulator register are configured dynamically. A first variable radix point format is configured dynamically for the first integer matrix and a second variable radix point format is configured dynamically for the second integer matrix. Multiply-accumulate operations are executed in a pipelined fashion on the (j×j) submatrices of the first integer matrix and the second integer matrix, where a third variable radix point format is configured for the result.
Type: Grant
Filed: November 27, 2023
Date of Patent: February 11, 2025
Assignee: MIPS Holding, Inc.
Inventor: David John Simpson
-
Patent number: 12175246
Abstract: Disclosed embodiments relate to matrix compress/decompress instructions. In one example, a processor includes fetch circuitry to fetch a compress instruction having a format with fields to specify an opcode and locations of decompressed source and compressed destination matrices, decode circuitry to decode the fetched compress instruction, and execution circuitry, responsive to the decoded compress instruction, to: generate a compressed result according to a compress algorithm by compressing the specified decompressed source matrix by either packing non-zero-valued elements together and storing the matrix position of each non-zero-valued element in a header, or using fewer bits to represent one or more elements and using the header to identify matrix elements being represented by fewer bits; and store the compressed result to the specified compressed destination matrix.
Type: Grant
Filed: September 1, 2023
Date of Patent: December 24, 2024
Assignee: Intel Corporation
Inventors: Dan Baum, Michael Espig, James Guilford, Wajdi K. Feghali, Raanan Sade, Christopher J. Hughes, Robert Valentine, Bret Toll, Elmoustapha Ould-Ahmed-Vall, Mark J. Charney, Vinodh Gopal, Ronen Zohar, Alexander F. Heinecke
-
Patent number: 12164882
Abstract: A memory circuit includes a selection circuit, a column of memory cells, and an adder tree. The selection circuit is configured to receive input data elements, each input data element including a number of bits equal to H, and output a selected set of kth bits of the H bits of the input data elements. Each memory cell of the column of memory cells includes a first storage unit configured to store a first weight data element and a first multiplier configured to generate a first product data element based on the first weight data element and a first kth bit of the selected set of kth bits. The adder tree is configured to generate a summation data element based on each of the first product data elements.
Type: Grant
Filed: March 16, 2021
Date of Patent: December 10, 2024
Assignee: TAIWAN SEMICONDUCTOR MANUFACTURING COMPANY, LTD.
Inventors: Yu-Der Chih, Hidehiro Fujiwara, Yi-Chun Shih, Po-Hao Lee, Yen-Huei Chen, Chia-Fu Lee, Jonathan Tsung-Yung Chang
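The bit-selection, per-cell multiply, and adder-tree steps in this abstract correspond to what is commonly called bit-serial multiply-accumulate. The Python sketch below models it in software under stated assumptions: inputs are unsigned H-bit integers, weights are non-negative integers, and the per-bit sums are combined by a shift-and-add step. The abstract describes only the selection, product, and summation for one bit position, so the shift-and-add combination and the function name are illustrative assumptions.

```python
def bit_serial_mac(inputs, weights, h_bits):
    """Compute sum(inputs[i] * weights[i]) one input bit position at a time."""
    total = 0
    for k in range(h_bits):
        # Selection circuit: take the k-th bit of every input element.
        kth_bits = [(x >> k) & 1 for x in inputs]
        # Per-cell multiplier (weight times one bit) followed by the adder tree.
        partial = sum(w * b for w, b in zip(weights, kth_bits))
        # Assumed combination step: weight the partial sum by 2**k.
        total += partial << k
    return total


print(bit_serial_mac([3, 5, 2], [1, 2, 3], h_bits=3))  # same value as 3*1 + 5*2 + 2*3
```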
-
Patent number: 12153974
Abstract: An arithmetic apparatus includes input line pairs and a multiply-accumulate device. A signal pair is input to the input line pairs within an input period. The multiply-accumulate device includes multiplication units, an accumulation unit, a charging unit, and an output unit. The multiplication units generate a positive weight charge and a negative weight charge. The accumulation unit accumulates the positive weight charge and the negative weight charge. The charging unit charges the accumulation unit after the input period. The output unit performs, after charging starts, threshold determination using a predetermined threshold value on a voltage of the accumulation unit, to thereby output a positive multiply-accumulate signal representing a sum of positive weight product values and a negative multiply-accumulate signal representing a sum of negative weight product values.
Type: Grant
Filed: March 12, 2020
Date of Patent: November 26, 2024
Assignee: Sony Group Corporation
Inventor: Hiroshi Yoshida
-
Patent number: 12153899
Abstract: An apparatus and method for complex matrix transpose and multiply.
Type: Grant
Filed: December 23, 2020
Date of Patent: November 26, 2024
Assignee: Intel Corporation
Inventors: Menachem Adelman, Robert Valentine, Daniel Towner, Amit Gradstein, Mark Jay Charney
-
Patent number: 12141229
Abstract: One embodiment sets forth a technique for performing one or more matrix multiplication operations based on a first matrix and a second matrix. The technique includes receiving data associated with the first matrix from a first traversal engine that accesses nonzero elements included in the first matrix via a first tree structure. The technique also includes performing one or more computations on the data associated with the first matrix and the data associated with the second matrix to produce a plurality of partial results. The technique further includes combining the plurality of partial results into one or more intermediate results and storing the one or more intermediate results in a first buffer memory.
Type: Grant
Filed: May 19, 2021
Date of Patent: November 12, 2024
Assignee: NVIDIA Corporation
Inventors: Hanrui Wang, James Michael O'Connor, Donghyuk Lee
-
Patent number: 12130886
Abstract: Methods and systems are disclosed to reduce the time and memory complexities associated with automatic differentiation of tensor models. The disclosed embodiment consists of a tensor contraction gradient calculator (TCGC) method, a tensor automatic differentiation (TAD) method and a TAD system. The disclosed embodiment eliminates the need to compute partial derivatives or Jacobians for computing tensor gradients of tensor contractions and tensor models. The disclosed embodiment computes tensor gradients of any arbitrary tensor model automatically with both memory and time complexities asymptotically equal to those of the evaluation of tensor models that are theoretically the lowest achievable complexities.
Type: Grant
Filed: January 24, 2024
Date of Patent: October 29, 2024
Inventor: Mohammad Solgi
-
Patent number: 12093342
Abstract: A dynamic bias analog vector-matrix multiplication operation circuit comprises: positive value weight columns (101-10N), constant columns (201-20M) and subtractors (301-30N), wherein the number of the subtractors is equal to the number of the positive value weight columns, the subtractors are correspondingly connected to the positive value weight columns on a one-to-one basis, and the number of the constant columns is less than the number of the positive value weight columns; minuend input ends of the subtractors are correspondingly connected to output ends of the positive value weight columns, subtrahend input ends of a plurality of subtractors are connected to the same constant column, and output ends thereof output operation results. Before a weight is written in a programmable semiconductor device, a constant positive value is added to each element in a weight array, the weight array is written in a positive value weight column, and the constant positive value is written in a constant column.
Type: Grant
Filed: April 3, 2019
Date of Patent: September 17, 2024
Assignee: BEIJING ZHICUN (WITIN) TECHNOLOGY CORPORATION LIMITED
Inventor: Shaodi Wang
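The bias-then-subtract scheme in this abstract relies on a simple identity: if a constant c is added to every weight, each column output grows by c times the sum of the inputs, and a single shared constant column produces exactly that amount for subtraction. The Python sketch below checks the arithmetic under the assumption of one constant column shared by all weight columns; the function name and the integer example values are illustrative, and no analog behavior is modeled.

```python
def biased_vmm(x, weights, bias):
    """Vector-matrix product computed with only non-negative stored weights."""
    n_rows, n_cols = len(weights), len(weights[0])
    # Add the constant to every weight so all stored values are non-negative.
    positive = [[w + bias for w in row] for row in weights]
    assert all(w >= 0 for row in positive for w in row), "bias too small"
    # One constant column holding the bias value in every row (assumed shared).
    constant_col = [bias] * n_rows
    const_out = sum(xi * c for xi, c in zip(x, constant_col))
    col_out = [sum(x[i] * positive[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # Subtractors: weight-column output minus the constant-column output.
    return [o - const_out for o in col_out]


w = [[1, -2],
     [-3, 4]]
print(biased_vmm([1, 2], w, bias=3))  # matches the product with the signed weights
```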
-
Patent number: 12072953
Abstract: Techniques are described herein for performing efficient matrix multiplication in architectures with scratchpad memories or associative caches using asymmetric allocation of space for the different matrices. The system receives a left matrix and a right matrix. In an embodiment, the system allocates, in a scratchpad memory, asymmetric memory space for tiles for each of the two matrices as well as a dot product matrix. The system proceeds with then performing dot product matrix multiplication involving the tiles of the left and the right matrices, storing resulting dot product values in corresponding allocated dot product matrix tiles. The system then proceeds to write the stored dot product values from the scratchpad memory into main memory.
Type: Grant
Filed: June 16, 2021
Date of Patent: August 27, 2024
Assignee: ORACLE INTERNATIONAL CORPORATION
Inventors: Gaurav Chadha, Sam Idicula, Sandeep Agrawal, Nipun Agarwal
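A software analogue of tiling with different tile shapes for the two operands is sketched below in Python. The three tile parameters are independent, so the left tile (tile_m by tile_k) and the right tile (tile_k by tile_n) can occupy very different amounts of space, which is one reading of "asymmetric allocation". The parameter names and loop order are assumptions for illustration; scratchpad allocation and the write-back to main memory are not modeled.

```python
def tiled_matmul(a, b, tile_m, tile_n, tile_k):
    """Multiply a (m x k) by b (k x n) tile by tile, with independent tile sizes."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile_m):              # rows of the left tile
        for j0 in range(0, n, tile_n):          # columns of the right tile
            for k0 in range(0, k, tile_k):      # shared inner dimension
                for i in range(i0, min(i0 + tile_m, m)):
                    for j in range(j0, min(j0 + tile_n, n)):
                        acc = 0
                        for kk in range(k0, min(k0 + tile_k, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] += acc          # accumulate into the output tile
    return c


left = [[1, 2, 3], [4, 5, 6]]
right = [[1, 0], [0, 1], [1, 1]]
# Left tiles are 2 x 2 while right tiles are 2 x 1: deliberately unequal footprints.
print(tiled_matmul(left, right, tile_m=2, tile_n=1, tile_k=2))
```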
-
Patent number: 12067401
Abstract: Systems, apparatuses, and methods for implementing a low power parallel matrix multiply pipeline are disclosed. In one embodiment, a system includes at least first and second vector register files coupled to a matrix multiply pipeline. The matrix multiply pipeline comprises a plurality of dot product units. The dot product units are configured to calculate dot or outer products for first and second sets of operands retrieved from the first vector register file. The results of the dot or outer product operations are written back to the second vector register file. The second vector register file provides the results from the previous dot or outer product operations as inputs to subsequent dot or outer product operations. The dot product units receive the results from previous phases of the matrix multiply operation and accumulate these previous dot or outer product results with the current dot or outer product results.
Type: Grant
Filed: December 27, 2017
Date of Patent: August 20, 2024
Assignee: Advanced Micro Devices, Inc.
Inventors: Jiasheng Chen, Yunxiao Zou, Michael J. Mantor, Allen Rush
-
Patent number: 12056489
Abstract: Systems, methods, and apparatuses relating to 8-bit floating-point matrix dot product instructions are described.
Type: Grant
Filed: May 5, 2023
Date of Patent: August 6, 2024
Assignee: Intel Corporation
Inventors: Naveen Mellempudi, Alexander F. Heinecke, Robert Valentine, Mark J. Charney, Christopher J. Hughes, Evangelos Georganas, Zeev Sperber, Amit Gradstein, Simon Rubanovich
-
Patent number: 12045308
Abstract: Detailed are embodiments related to bit matrix multiplication in a processor. For example, in some embodiments a processor comprising: decode circuitry to decode an instruction having fields for an opcode, an identifier of a first source bit matrix, an identifier of a second source bit matrix, an identifier of a destination bit matrix, and an immediate; and execution circuitry to execute the decoded instruction to perform a multiplication of a matrix of S-bit elements of the identified first source bit matrix with S-bit elements of the identified second source bit matrix, wherein the multiplication and accumulation operations are selected by the operation selector and store a result of the matrix multiplication into the identified destination bit matrix, wherein S indicates a plural bit size is described.
Type: Grant
Filed: December 16, 2022
Date of Patent: July 23, 2024
Assignee: Intel Corporation
Inventors: Dmitry Y. Babokin, Kshitij A. Doshi, Vadim Sukhomlinov
-
Patent number: 12038326
Abstract: Aspects of the present disclosure include methods for spectrally resolving light from fluorophores having overlapping fluorescence spectra in a sample. Methods according to certain embodiments include detecting light with a light detection system from a sample having a plurality of fluorophores having overlapping fluorescence spectra and spectrally resolving light from each fluorophore in the sample. In some embodiments, methods include estimating the abundance of one or more of the fluorophores in the sample, such as on a particle. In certain instances, methods include identifying the particle in the sample based on the abundance of each fluorophore and sorting the particle. Methods according to some embodiments include spectrally resolving the light from each fluorophore by calculating a spectral unmixing matrix for the fluorescence spectra of each fluorophore. Systems and integrated circuit devices (e.g., a field programmable gate array) for practicing the subject methods are also provided.
Type: Grant
Filed: October 27, 2022
Date of Patent: July 16, 2024
Assignee: BECTON, DICKINSON AND COMPANY
Inventors: Peter Mage, Keegan Owsley
-
Patent number: 12001508
Abstract: A plurality of chiplets may be used to multiply two matrices A and B. Matrix A may be decomposed into horizontal stripes and matrix B may be decomposed into vertical stripes. Each of the horizontal stripes may be multiplied by each of the vertical stripes to form the output matrix C. Specifically, horizontal stripes may be stored in a stationary, distributed manner across the chiplets, while the vertical stripes (or sub-vertical stripes) may be passed between respective pairs of the chiplets until each of the vertical stripes (or sub-vertical stripes) of matrix B has been received and processed by each of the chiplets. The vertical stripes may be passed along one or more paths that interconnect the chiplets. Similar techniques can be applied to an arrangement in which the vertical stripes are stationary and the horizontal stripes (or sub-horizontal stripes) are passed between respective pairs of the chiplets.
Type: Grant
Filed: October 23, 2023
Date of Patent: June 4, 2024
Assignee: Persimmons, Inc.
Inventor: James Michael Bodwin
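The stationary-stripe, moving-stripe schedule in this abstract can be simulated in Python. The sketch assumes one horizontal stripe of A per chiplet, the same number of vertical stripes of B, and a single ring along which the B stripes advance one hop per step; the abstract allows other paths and sub-stripes, which are not modeled. The function names are illustrative.

```python
def matmul(x, y):
    """Plain dense matrix product of two lists of rows."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def ring_stripe_matmul(a_stripes, b_stripes):
    """Return blocks[p][q] = a_stripes[p] @ b_stripes[q], computed by rotating B around a ring."""
    p = len(a_stripes)
    assert len(b_stripes) == p, "sketch assumes one B stripe per chiplet"
    held = list(b_stripes)      # held[c]: vertical stripe currently at chiplet c
    owner = list(range(p))      # owner[c]: which stripe index that is
    blocks = [[None] * p for _ in range(p)]
    for _ in range(p):
        for c in range(p):      # every chiplet works on its resident A stripe
            blocks[c][owner[c]] = matmul(a_stripes[c], held[c])
        # Pass each B stripe to the neighboring chiplet on the ring.
        held = held[1:] + held[:1]
        owner = owner[1:] + owner[:1]
    return blocks


a_parts = [[[1, 2]], [[3, 4]]]              # two horizontal stripes of A
b_parts = [[[5], [7]], [[6], [8]]]          # two vertical stripes of B
print(ring_stripe_matmul(a_parts, b_parts))
```

After as many steps as there are chiplets, every chiplet has seen every vertical stripe, so all blocks of C are filled.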
-
Patent number: 11989257
Abstract: An apparatus includes a processor and a memory to store instructions. The instructions, when executed by the processor, cause the processor to perform threading of a first matrix along a first dimension of the first matrix and a second dimension of the first matrix. The threading represents block sizes of the first matrix to assign to process threads of a multiplication algorithm to determine a third matrix that represents a product of the first matrix and a second matrix. The block sizes include a first block size along the first dimension and a second block size along the second dimension. The second matrix shares the second dimension with the first matrix. The instructions, when executed by the processor, cause the processor to provide data to the multiplication algorithm, which represents the first block size and the second block size.
Type: Grant
Filed: October 29, 2020
Date of Patent: May 21, 2024
Assignee: Hewlett Packard Enterprise Development LP
Inventor: Aaron M. Collier
-
Patent number: 11941078Abstract: Performing set operations using sparse matrix operations offered by a multi-core processing unit (such as a graphics processing unit). The set operation is converted into operand matrices and sparse matrix operations, forgoing the use of hash tables. The input set is converted into a matrix, a matrix operation corresponding to the set operation is identified, and one or more operands of the set operation are also represented within a matrix. The matrix operation is then performed on these matrices to obtain an output matrix, which is then converted to an output set.Type: GrantFiled: September 30, 2022Date of Patent: March 26, 2024Assignee: Microsoft Technology Licensing, LLCInventor: Ritwik Das
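One way to picture the idea is to encode each set as a sparse 0/1 row vector and map set operations onto elementwise sparse operations. The abstract does not specify the encoding or which matrix operations are used, so everything here is an assumption made for illustration: sets hold integer IDs, a sparse row is stored as a sorted list of its non-zero column indices, and the function names are hypothetical.

```python
def to_sparse_row(items):
    """Encode integer IDs as the sorted non-zero column indices of a 0/1 row vector."""
    cols = sorted(items)
    return [c for i, c in enumerate(cols) if i == 0 or c != cols[i - 1]]


def sparse_multiply(a, b):
    """Elementwise product of two 0/1 sparse rows (set intersection)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def sparse_add_clamped(a, b):
    """Elementwise sum of two 0/1 sparse rows, clamped to 1 (set union)."""
    out, i, j = [], 0, 0
    while i < len(a) or j < len(b):
        if j == len(b) or (i < len(a) and a[i] < b[j]):
            out.append(a[i])
            i += 1
        elif i == len(a) or b[j] < a[i]:
            out.append(b[j])
            j += 1
        else:  # same column present in both rows
            out.append(a[i])
            i += 1
            j += 1
    return out
```

Both operations are merges over sorted index lists, so no hash table is involved; the resulting index list converts back to an output set directly.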
-
Patent number: 11934481Abstract: Embodiments of the present invention disclose a matrix multiplier, and relate to the field of data computing technologies, so as to divide two matrices into blocks for computation. The matrix multiplier includes: a first memory, a second memory, an operation circuit, and a controller, where the operation circuit, the first memory, and the second memory may perform data communication by using a bus; and the controller is configured to control, according to a preset program or instruction, a first matrix and a second matrix to be divided into blocks, and control the operation circuit to perform a multiplication operation on corresponding blocks in the first memory and the second memory based on block division results of the controller. The matrix multiplier may be configured to perform a multiplication operation on two matrices.Type: GrantFiled: April 20, 2022Date of Patent: March 19, 2024Assignee: HUAWEI TECHNOLOGIES CO., LTD.Inventors: Hu Liu, Heng Liao, Jiajin Tu, Honghui Yuan, Hou Fun Lam, Fan Zhu
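Block division of the two operand matrices can be sketched in software as a tiled multiplication. This is a generic textbook tiling, not the controller logic of the patent; the function name and the default block size are assumptions.

```python
def blocked_matmul(A, B, block=2):
    """Multiply A (m x k) by B (k x n) one block x block tile at a time."""
    m, k, n = len(A), len(B), len(B[0])
    C = [[0] * n for _ in range(m)]
    for i0 in range(0, m, block):          # row blocks of A and C
        for j0 in range(0, n, block):      # column blocks of B and C
            for k0 in range(0, k, block):  # blocks along the shared dimension
                for i in range(i0, min(i0 + block, m)):
                    for j in range(j0, min(j0 + block, n)):
                        acc = 0
                        for t in range(k0, min(k0 + block, k)):
                            acc += A[i][t] * B[t][j]
                        # Accumulate this tile's contribution to C[i][j].
                        C[i][j] += acc
    return C
```

The `min(...)` bounds let the same loop handle matrices whose dimensions are not multiples of the block size.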
-
Patent number: 11928177Abstract: Methods and apparatus for performing video processing matrix operations within a memory fabric. Various embodiments of the present disclosure are directed to converting a memory array into a matrix fabric for discrete cosine transform (DCT) matrix transformations and performing DCT matrix operations therein. Exemplary embodiments described herein perform DCT matrix-matrix multiplication operations within a memory device that includes a matrix fabric and matrix multiplication unit (MMU). In one embodiment, matrix-matrix multiplication operations are obtained using separate matrix-vector products. In one exemplary embodiment, the matrix fabric uses a “crossbar” construction of resistive elements. Each resistive element stores a level of impedance that represents the corresponding matrix coefficient value. The crossbar connectivity can be driven with an electrical signal representing the input vector as an analog voltage.Type: GrantFiled: September 19, 2022Date of Patent: March 12, 2024Assignee: Micron Technology, Inc.Inventor: Fa-Long Luo
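The statement that matrix-matrix multiplication is obtained from separate matrix-vector products can be shown with a short numerical sketch. It assumes the orthonormal DCT-II matrix as the coefficient matrix and models the matrix fabric as an ordinary function call; the function names are hypothetical and nothing here reflects the analog hardware.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II coefficient matrix of size n x n."""
    return [
        [
            math.sqrt((1 if u == 0 else 2) / n)
            * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
            for x in range(n)
        ]
        for u in range(n)
    ]


def matvec(M, v):
    """One matrix-vector product, the primitive a crossbar would evaluate."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul_by_columns(M, X):
    """Compute M x X by feeding each column of X through matvec in turn."""
    result_cols = [matvec(M, list(col)) for col in zip(*X)]
    # Transpose the list of result columns back into row-major form.
    return [list(row) for row in zip(*result_cols)]
```

For a 2-D DCT of a block X, the same routine would be applied twice (once for `D x X`, once for the product with the transpose of D).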
-
Patent number: 11922021Abstract: Data employed in computations is processed so that during computations more of the data can be fit into or maintained in a smaller but higher speed memory than an original source of the data. More specifically, a sensitivity value is determined for various items of the data which reflect the number of bits in the data items that are not garbage bits, and only information in the data items that are indicated by the sensitivity value to not be garbage bits are necessarily effectively retained. At least the information that is not garbage bits and the corresponding associated sensitivity are packed together. The results of computations that are performed using the data items as at least one of the operands for the computation are associated with a sensitivity that is derived from the individual sensitivities of the operands used in the computation.Type: GrantFiled: December 19, 2022Date of Patent: March 5, 2024Assignee: INTELLECTUAL PROPERTY SYSTEMS, LLCInventors: Juan Guillermo Gonzalez, Santiago Andres Fonseca, Rafael Camilo Nunez
-
Patent number: 11922131Abstract: A method for performing vector-matrix multiplication may include converting a digital input vector comprising a plurality of binary-encoded values into a plurality of analog signals using a plurality of one-bit digital to analog converters (DACs); sequentially performing, using an analog vector matrix multiplier and based on bit-order, vector-matrix multiplication operations using a weighting matrix for the plurality of analog signals to generate analog outputs of the analog vector matrix multiplier; sequentially performing an analog-to-digital (ADC) operation on the analog outputs of the analog vector matrix multiplier to generate binary partial output vectors; and combining the binary partial output vectors to generate a result of the vector-matrix multiplication.Type: GrantFiled: November 7, 2020Date of Patent: March 5, 2024Assignee: Applied Materials, Inc.Inventors: Xiaofeng Zhang, She-Hwa Yen
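The bit-ordered procedure described here amounts to bit-serial arithmetic, which can be sketched digitally. This sketch assumes unsigned inputs and treats the analog multiplier and ADC as ideal (exact integer sums); the function name and parameters are hypothetical.

```python
def bit_serial_vmm(x, W, nbits):
    """Compute x x W one input bit plane at a time.

    x: list of unsigned integers, each < 2**nbits
    W: weight matrix with len(x) rows
    """
    n = len(W[0])
    result = [0] * n
    for b in range(nbits):
        # One-bit DAC stage: each input contributes only its b-th bit (0 or 1).
        bits = [(v >> b) & 1 for v in x]
        # Idealized analog vector-matrix multiply followed by an ideal ADC.
        partial = [sum(bit * W[i][j] for i, bit in enumerate(bits)) for j in range(n)]
        # Combine: weight the partial output vector by its bit significance.
        result = [r + p * (1 << b) for r, p in zip(result, partial)]
    return result
```

Because each pass uses only 0/1 inputs, the converter on the input side needs a single bit of resolution, and the full-precision result is recovered by the shift-and-add combination at the end.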
-
Patent number: 11899745Abstract: Disclosed herein includes a system, a method, and a device for processing and converting data using matrix operations. Circuitry can partition an input of a first data format across a plurality of lookup tables each residing in a respective memory. The circuitry can access weight information from a load store memory, and the partitioned input on a per column basis from the plurality of lookup tables. The circuitry can perform a number of multiply-accumulate (MAC) operations per cycle between the weight information from the load store memory and the partitioned input read on a per column basis from the plurality of lookup tables. The number of MAC operations performed per cycle can correspond to a total number of columns of the plurality of lookup tables. The circuitry can generate, responsive to the MAC operations on the partitioned input, a plurality of outputs in a second data format.Type: GrantFiled: August 19, 2020Date of Patent: February 13, 2024Assignee: Meta Platforms Technologies, LLCInventors: Alagappan Valliappan, Ganesh Venkatesh, Pierce I-Jen Chuang
-
Patent number: 11900577Abstract: There is provided with a processing apparatus. A data holder holds at least some of data of a plurality of channels in a target layer among a plurality of layers. Each of a plurality of processors performs, in parallel, a product-sum operation using the data of one channel of the target layer and a coefficient corresponding to the target layer. A selector selects whether to perform first processing or second processing on the basis of information specifying processing in the target layer. The first processing includes inputting the data of one channel of the target layer into one of the plurality of processors. The second processing includes inputting the data of one channel of the target layer to the plurality of processors in parallel.Type: GrantFiled: June 22, 2021Date of Patent: February 13, 2024Assignee: CANON KABUSHIKI KAISHAInventors: Tsewei Chen, Masami Kato, Shiori Wakino
-
Patent number: 11886378Abstract: A processor includes an array of resistive processing units connected between row and column lines with a resistive element. A first single instruction, multiple data processing unit (SIMD) is connected to the row lines. A second SIMD is connected to the column lines. A first instruction issuer is connected to the first SIMD to issue instructions to the first SIMD, and a second instruction issuer is connected to the second SIMD to issue instructions to the second SIMD such that the processor is programmable and configurable for specific operations depending on an issued instruction set.Type: GrantFiled: December 28, 2020Date of Patent: January 30, 2024Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATIONInventor: Tayfun Gokmen
-
Patent number: 11880426Abstract: Techniques for data manipulation using integer matrix multiplication using pipelining are disclosed. A first integer matrix with dimensions m×k and a second integer matrix with dimensions k×n are obtained for matrix multiplication within a processor. The first and second integer matrices employ a two's complement variable radix point data representation. The first and second integer matrices are distilled into (j×j) submatrices. A first variable radix point format and an initial value for an accumulator register are configured dynamically. A first variable radix point format is configured dynamically for the first integer matrix and a second variable radix point format is configured dynamically for the second integer matrix. Multiply-accumulate operations are executed in a pipelined fashion on the (j×j) submatrices of the first integer matrix and the second integer matrix, where a third variable radix point format is configured for the result.Type: GrantFiled: July 31, 2022Date of Patent: January 23, 2024Inventor: David John Simpson
-
Patent number: 11860970Abstract: A method for performing a matrix multiplication operation is provided. The method includes: obtaining a matrix B1, a matrix A2, and an index matrix, wherein the index matrix comprises indexes, in a matrix A1, of elements in the matrix A2; generating m matrices B2 based on the index matrix and the matrix B1, wherein the m matrices B2 are all matrices with t rows and n columns, and each row of each matrix B2 is a row indicated in the matrix B1 by a corresponding element in the index matrix; and generating a matrix C based on the matrix A2 and the m matrices B2, wherein the matrix C is a product of the matrix A1 and the matrix B1.Type: GrantFiled: June 15, 2022Date of Patent: January 2, 2024Assignee: HUAWEI TECHNOLOGIES CO., LTD.Inventors: Leijun He, Bin Xu, Kaixing Wang
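The method in this abstract can be sketched as follows, under the reading that A2 is an m x t matrix holding the retained elements of each row of A1 and the index matrix (also m x t) holds the column of A1 each retained element came from. That reading, and the function name, are assumptions.

```python
def indexed_matmul(A2, index, B1):
    """Compute C = A1 x B1 from the compressed matrix A2 and its index matrix."""
    m, t, n = len(A2), len(A2[0]), len(B1[0])
    C = []
    for i in range(m):
        # Build the i-th matrix B2 (t rows, n columns) by gathering rows of B1.
        B2 = [B1[index[i][r]] for r in range(t)]
        # Row i of C is row i of A2 multiplied by that gathered matrix.
        C.append([sum(A2[i][r] * B2[r][j] for r in range(t)) for j in range(n)])
    return C
```

For example, if row 0 of A1 is `[0, 2, 0, 3]`, then row 0 of A2 is `[2, 3]` and row 0 of the index matrix is `[1, 3]`, so only rows 1 and 3 of B1 are read for that output row.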
-
Patent number: 11853717Abstract: Embodiments of the present disclosure include systems and methods for accelerating processing based on sparsity for neural network hardware processors. An input manager determines a pair of non-zero values from a pair of data streams in a plurality of pairs of data streams and retrieves the pair of non-zero values from the pair of data streams. A multiplier performs a multiplication operation on the pair of non-zero values and generates a product of the pair of non-zero values. An accumulator manager receives the product of the pair of non-zero values from the multiplier and sends the product of the pair of non-zero values to a corresponding accumulator in a plurality of accumulators.Type: GrantFiled: January 14, 2021Date of Patent: December 26, 2023Assignee: Microsoft Technology Licensing, LLCInventors: Karthikeyan Avudaiyappan, Jeffrey Andrews
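A minimal software analogue of this data flow is sketched below. It assumes each pair of data streams is a pair of equal-length lists and that one accumulator is assigned per pair; the function name and the returned multiply count are additions for illustration only.

```python
def sparse_mac(stream_pairs):
    """Multiply-accumulate over stream pairs, skipping any position with a zero operand."""
    accumulators = [0] * len(stream_pairs)
    multiplies = 0
    for idx, (xs, ws) in enumerate(stream_pairs):
        for x, w in zip(xs, ws):
            # Input-manager role: forward only pairs in which both values are non-zero.
            if x != 0 and w != 0:
                # Multiplier and accumulator-manager roles.
                accumulators[idx] += x * w
                multiplies += 1
    return accumulators, multiplies
```

The multiply count makes the saving visible: positions with a zero on either side never reach the multiplier.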
-
Patent number: 11853386Abstract: The invention relates to a method for rapidly calculating a three-dimensional polarimetric dimension, including: determining that an incident light field is a coherence matrix of a partially coherent Schell-model beam, and decomposing the coherence matrix into a form of multiplying an incident electric field by a coherence structure matrix of the incident light field; obtaining an electric field near a focal field after the incident electric field passes through a tight focusing system according to the vector diffraction theory, and describing a second-order correlation characteristic of a partially coherent vector beam near a tightly focused field by using a coherence matrix; obtaining a tightly focused polarization matrix based on the tightly focused coherence matrix; and rotating the tightly focused polarization matrix into an intrinsic coordinate frame of the tightly focused polarization matrix, and calculating a three-dimensional polarimetric dimension of the partially coherent Schell-model beam in the tType: GrantFiled: February 11, 2022Date of Patent: December 26, 2023Assignee: SOOCHOW UNIVERSITYInventors: Yahong Chen, Chencheng Yan, Fei Wang, Yangjian Cai
-
Patent number: 11853385Abstract: Methods and apparatus for performing diversity matrix operations within a memory fabric. Various embodiments of the present disclosure are directed to converting a memory array into a matrix fabric for spatial diversity-related matrix transformations and performing matrix operations therein. Exemplary embodiments described herein perform MIMO-related matrix transformations (e.g., precoding, beamforming, or data recovery matrix operations) within a memory device that includes a matrix fabric and matrix multiplication unit (MMU). In one variant, the matrix fabric uses a “crossbar” construction of resistive elements. Each resistive element stores a level of impedance that represents the corresponding matrix coefficient value. The crossbar connectivity can be driven with an electrical signal representing the input vector as an analog voltage. The resulting signals can be converted from analog voltages to digital values by an MMU to yield a matrix-vector product.Type: GrantFiled: December 5, 2019Date of Patent: December 26, 2023Assignee: Micron Technology, Inc.Inventor: Fa-Long Luo
-
Patent number: 11847106Abstract: The disclosure is directed to various ways of improving the functioning of computer systems, information networks, data stores, search engine systems and methods, and other advantages. Among other things, provided herein are methods, systems, components, processes, modules, blocks, circuits, sub-systems, articles, and other elements (collectively referred to in some cases as the “platform” or the “system”) that collectively enable, in one or more datastores (e.g., where each datastore may include one or more databases) and systems, the creation, development, maintenance, and use of a set of custom objects for use in a wide range of activities, including sales activities, marketing activities, service activities, content development activities, and others, as well as improved methods and systems for sales, marketing and services that make use of such entity resolution systems and methods as well as custom objects.Type: GrantFiled: May 12, 2021Date of Patent: December 19, 2023Assignee: HUBSPOT, INC.Inventors: Hector Urdiales, Marco Lagi, Stephen J. Purcell, Stuart P. Layton, Bryan Ash, Jared Williams, Sophie Higgs, Robert McEneaney, Dylan Sellberg, Anna Perko
-
Patent number: 11847185Abstract: Disclosed embodiments relate to accelerating multiplication of sparse matrices. In one example, a processor is to fetch and decode an instruction having fields to specify locations of first, second, and third matrices, and an opcode indicating the processor is to multiply and accumulate matching non-zero (NZ) elements of the first and second matrices with corresponding elements of the third matrix, and to execute the decoded instruction as per the opcode to generate NZ bitmasks for the first and second matrices, broadcast up to two NZ elements at a time from each row of the first matrix and each column of the second matrix to a processing engine (PE) grid, each PE to multiply and accumulate matching NZ elements of the first and second matrices with corresponding elements of the third matrix. Each PE is further to store an NZ element for use in subsequent multiplications.Type: GrantFiled: September 24, 2021Date of Patent: December 19, 2023Assignee: Intel CorporationInventors: Dan Baum, Chen Koren, Elmoustapha Ould-Ahmed-Vall, Michael Espig, Christopher J. Hughes, Raanan Sade, Robert Valentine, Mark J. Charney, Alexander F. Heinecke
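The role of the NZ bitmasks can be sketched for a single output element. This is a scalar illustration only: it does not model the instruction encoding, the broadcast of two elements at a time, or the PE grid, and both function names are hypothetical.

```python
def nz_bitmask(values):
    """Return an integer whose bit `pos` is set when values[pos] is non-zero."""
    mask = 0
    for pos, v in enumerate(values):
        if v != 0:
            mask |= 1 << pos
    return mask


def sparse_mac_element(a_row, b_col, c):
    """Accumulate into c the products of matching non-zero elements of a_row and b_col."""
    # A position matters only if both operands are non-zero there.
    match = nz_bitmask(a_row) & nz_bitmask(b_col)
    pos = 0
    while match:
        if match & 1:
            c += a_row[pos] * b_col[pos]
        match >>= 1
        pos += 1
    return c
```

ANDing the two masks identifies the matching positions up front, so the loop ends as soon as no matching bits remain instead of scanning every element.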
-
Patent number: 11830543Abstract: A memory circuit includes a first memory array including first memory cells wherein a plurality of first word lines is coupled with a plurality of rows of first memory cells in a first segment of the first memory array, and a plurality of second word lines is coupled with the plurality of rows of first memory cells in a second segment of the first memory array. The memory circuit also includes a read circuit configured to retrieve data from the first memory cells of the first memory array and a computation circuit configured to perform a matrix computation by combining first data retrieved from the first memory cells of the first segment with second data retrieved from the first memory cells of the second segment.Type: GrantFiled: June 23, 2022Date of Patent: November 28, 2023Assignee: TAIWAN SEMICONDUCTOR MANUFACTURING COMPANY, LTD.Inventors: Yen-Huei Chen, Hidehiro Fujiwara, Hung-Jen Liao, Jonathan Tsung-Yung Chang