Patents Examined by Andrew Caldwell

Method and apparatus with neural network performing deconvolution

Patent number: 10885433

Abstract: A neural network apparatus configured to perform a deconvolution operation includes a memory configured to store a first kernel; and a processor configured to: obtain, from the memory, the first kernel; calculate a second kernel by adjusting an arrangement of matrix elements comprised in the first kernel; generate sub-kernels by dividing the second kernel; perform a convolution operation between an input feature map and the sub-kernels using a convolution operator; and generate an output feature map, as a deconvolution of the input feature map, by merging results of the convolution operation.

Type: Grant

Filed: August 21, 2018

Date of Patent: January 5, 2021

Assignee: Samsung Electronics Co., Ltd.

Inventors: Joonho Song, Sehwan Lee, Junwoo Jang
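The abstract for patent 10885433 can be illustrated in one dimension. The sketch below is not the patented implementation: it assumes a 1-D strided deconvolution (transposed convolution), takes "adjusting an arrangement" to mean reversing the kernel, and uses hypothetical function names. It rearranges the kernel, divides it into one sub-kernel per stride phase, runs an ordinary sliding-window operation with each sub-kernel, and interleaves the results.

```python
def correlate_valid(signal, kernel):
    """Plain sliding-window multiply-accumulate (the 'convolution operator')."""
    n_out = len(signal) - len(kernel) + 1
    return [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
            for i in range(n_out)]


def deconv1d_reference(x, w, stride):
    """Direct transposed convolution: scatter each input across the output."""
    out = [0] * ((len(x) - 1) * stride + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            out[i * stride + k] += xi * wk
    return out


def deconv1d_via_subkernels(x, w, stride):
    """Same result, computed with ordinary convolutions on sub-kernels."""
    assert len(w) >= stride, "sketch assumes at least one tap per phase"
    rearranged = w[::-1]                      # the 'second kernel' (assumed: reversed)
    out = [0] * ((len(x) - 1) * stride + len(w))
    for phase in range(stride):
        # Divide the rearranged kernel: every stride-th tap belongs to one phase.
        start = (len(w) - 1 - phase) % stride
        sub = rearranged[start::stride]
        pad = [0] * (len(sub) - 1)
        partial = correlate_valid(pad + x + pad, sub)
        # Merge: results of this sub-kernel fill output positions of this phase.
        for q, value in enumerate(partial):
            out[q * stride + phase] = value
    return out
```

For example, `deconv1d_via_subkernels([1, 2, 3], [1, 10, 100], 2)` matches `deconv1d_reference` on the same arguments.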
Accelerated convolution in convolutional neural networks

Patent number: 10878310

Abstract: Described embodiments include a system that includes one or more buffers and circuitry. The circuitry is configured to process a plurality of input values, by identifying each of the input values that is not zero-valued, and, for each value of the identified input values, computing respective products of coefficients of a kernel with the value and storing at least some of the respective products in the buffers. The circuitry is further configured to compute a plurality of output values, by retrieving respective sets of stored values from the buffers, at least some of the retrieved sets including one or more of the products, and summing the retrieved sets. The circuitry is further configured to output the computed output values. Other embodiments are also described.

Type: Grant

Filed: November 15, 2017

Date of Patent: December 29, 2020

Assignee: MELLANOX TECHNOLOGIES, LTD.

Inventors: Dotan Levi, Tal Anker, Ohad Markus
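The zero-skipping idea in the abstract for patent 10878310 can be sketched in software. This is a simplified 1-D model, not the claimed circuitry: the per-output lists stand in for the buffers, and the function name is hypothetical.

```python
def zero_skipping_convolve(inputs, kernel):
    """Full 1-D convolution that only multiplies the non-zero inputs."""
    # One buffer per output position, holding the products that belong to it.
    buffers = [[] for _ in range(len(inputs) + len(kernel) - 1)]
    for i, value in enumerate(inputs):
        if value == 0:
            continue                      # zero inputs contribute nothing
        for k, coeff in enumerate(kernel):
            buffers[i + k].append(coeff * value)
    # Each output value is the sum of the set of products stored for it.
    return [sum(products) for products in buffers]
```

With a sparse input such as `[0, 3, 0, 0, 2]` and kernel `[1, 2, 1]`, only two of the five inputs trigger any multiplications.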
Performing constant modulo arithmetic

Patent number: 10877732

Abstract: A binary logic circuit for determining y = x mod (2^m − 1), where x is an n-bit integer, y is an m-bit integer, and n > m, includes reduction logic configured to reduce x to a sum of a first m-bit integer β and a second m-bit integer γ; and addition logic configured to calculate an addition output represented by the m least significant bits of the following sum right-shifted by m: a first binary value of length 2m, the m most significant bits and the m least significant bits each being the string of bit values represented by β; a second binary value of length 2m, the m most significant bits and the m least significant bits each being the string of bit values represented by γ; and the binary value 1.

Type: Grant

Filed: May 13, 2020

Date of Patent: December 29, 2020

Assignee: Imagination Technologies Limited

Inventor: Thomas Rose
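The addition step in the abstract for patent 10877732 can be checked with a small software model. In this sketch the reduction to two m-bit integers is done with carry-save adders that wrap the carry-out back to bit 0 (an assumption; the abstract does not say how the reduction logic is built), and the names `beta` and `gamma` are labels chosen here for the two integers.

```python
def csa_end_around(a, b, c, m):
    """Carry-save add three m-bit values; the carry out of bit m-1 wraps to bit 0.

    The wrap is valid because 2**m is congruent to 1 modulo 2**m - 1.
    """
    mask = (1 << m) - 1
    partial_sum = a ^ b ^ c
    majority = (a & b) | (a & c) | (b & c)
    carry = ((majority << 1) | (majority >> (m - 1))) & mask
    return partial_sum, carry


def mod_mersenne(x, m):
    """Return x mod (2**m - 1) for a non-negative integer x."""
    mask = (1 << m) - 1
    beta, gamma = 0, 0
    while x:                              # reduce x, one m-bit chunk at a time
        beta, gamma = csa_end_around(beta, gamma, x & mask, m)
        x >>= m
    doubled_beta = (beta << m) | beta     # 2m-bit value: beta repeated twice
    doubled_gamma = (gamma << m) | gamma  # 2m-bit value: gamma repeated twice
    total = doubled_beta + doubled_gamma + 1
    return (total >> m) & mask            # m LSBs of the sum shifted right by m
```

The extra `+ 1` makes the lower half of the sum generate a carry exactly when `beta + gamma` is at least `2**m - 1`, which is the case in which the upper half needs the end-around correction.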
Systems and method for a low power correlator architecture using distributed arithmetic

Patent number: 10879877

Abstract: Provided herein is an implementation of a finite impulse response (FIR) filter that uses a distributed arithmetic architecture. In one or more examples, a data sample with multiple bits is processed through a plurality of bit-level multiply and accumulate circuits, wherein each bit-level multiply and accumulate circuit corresponds to a bit of the data sample. The output of each bit-level multiply and accumulate circuit can then be shifted by an appropriate amount based on the bit placement of the bit of the data sample that corresponds to the bit-level multiply and accumulate circuit. After each output is shifted by the appropriate amount, the outputs can be aggregated to form a final FIR filter result.

Type: Grant

Filed: September 28, 2018

Date of Patent: December 29, 2020

Assignee: The MITRE Corporation

Inventor: Rishi Yadav
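A software model of the bit-level scheme described in the abstract for patent 10879877 might look as follows. It is a sketch under stated assumptions: samples are unsigned integers of a known width, taps are integers, and the function names are hypothetical.

```python
def fir_bit_level(samples, taps, bits):
    """FIR filter built from one single-bit multiply-accumulate per bit position."""
    out = []
    for n in range(len(samples)):
        total = 0
        for b in range(bits):
            # Bit-level MAC: each tap is multiplied by a single bit (0 or 1).
            acc = 0
            for k, tap in enumerate(taps):
                if n - k >= 0 and (samples[n - k] >> b) & 1:
                    acc += tap
            total += acc << b             # shift by the bit's position, then aggregate
        out.append(total)
    return out


def fir_direct(samples, taps):
    """Conventional FIR for comparison: y[n] = sum_k taps[k] * samples[n - k]."""
    return [sum(tap * samples[n - k] for k, tap in enumerate(taps) if n - k >= 0)
            for n in range(len(samples))]
```

Because each bit-level unit only ever adds a tap or skips it, no full multiplier is needed; the shifts restore the weight of each bit.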
Hierarchical Jacobi methods and systems implementing a dense symmetric eigenvalue solver

Patent number: 10867008

Abstract: Embodiments of the present invention provide a hierarchical, multi-layer Jacobi method for implementing a dense symmetric eigenvalue solver using multiple processors. Each layer of the hierarchical method is configured to process problems of different sizes, and the division between the layers is defined according to the configuration of the underlying computer system, such as memory capacity and processing power, as well as the communication overhead between device and host. In general, the higher-level Jacobi kernel methods call the lower-level Jacobi kernel methods, and the results are passed up the hierarchy. This process is iteratively performed until a convergence condition is reached. Embodiments of the hierarchical Jacobi method disclosed herein offer controllability of Schur decomposition, robust tolerance for passing data throughout the hierarchy, and significant cost reduction on row update compared to existing methods.

Type: Grant

Filed: September 7, 2018

Date of Patent: December 15, 2020

Assignee: NVIDIA Corporation

Inventor: Lung-Sheng Chien
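For orientation, the classical single-level cyclic Jacobi iteration that such solvers build on can be sketched as below. This is the textbook method only, not the hierarchical multi-processor scheme of patent 10867008; the function name, tolerance, and sweep limit are assumptions made for the example.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-12, max_sweeps=50):
    """Eigenvalues of a small dense symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]        # work on a copy
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:                     # convergence condition
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes a[p][q], kept within +/- pi/4.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):        # column update: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):        # row update: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))
```

Each sweep rotates away every off-diagonal entry in turn; the loop stops once the off-diagonal norm falls below `tol`.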
Accelerator for sparse-dense matrix multiplication

Patent number: 10867009

Abstract: Disclosed embodiments relate to an accelerator for sparse-dense matrix instructions. In one example, a processor to execute a sparse-dense matrix multiplication instruction, includes fetch circuitry to fetch the sparse-dense matrix multiplication instruction having fields to specify an opcode, a dense output matrix, a dense source matrix, and a sparse source matrix having a sparsity of non-zero elements, the sparsity being less than one, decode circuitry to decode the fetched sparse-dense matrix multiplication instruction, execution circuitry to execute the decoded sparse-dense matrix multiplication instruction to, for each non-zero element at row M and column K of the specified sparse source matrix generate a product of the non-zero element and each corresponding dense element at row K and column N of the specified dense source matrix, and generate an accumulated sum of each generated product and a previous value of a corresponding output element at row M and column N of the specified dense output matrix.

Type: Grant

Filed: July 6, 2020

Date of Patent: December 15, 2020

Assignee: Intel Corporation

Inventors: Srinivasan Narayanamoorthy, Nadathur Rajagopalan Satish, Alexey Suprun, Kenneth J. Janik
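The arithmetic performed by the instruction in the abstract for patent 10867009 can be written out in software. The sketch assumes the sparse source is given as a dictionary keyed by (row, column); that storage format and the function name are illustrative choices, not taken from the patent.

```python
def sparse_dense_multiply_accumulate(sparse, dense, output):
    """Accumulate sparse @ dense into output, visiting only non-zero elements.

    sparse: dict mapping (m, k) -> value
    dense:  list of rows, indexed [k][n]
    output: list of rows, indexed [m][n]; updated in place and returned
    """
    for (m, k), value in sparse.items():
        if value == 0:
            continue
        # Multiply the non-zero element by every dense element in row k and
        # add each product to the previous value at row m of the output.
        for n, dense_value in enumerate(dense[k]):
            output[m][n] += value * dense_value
    return output
```

The work done is proportional to the number of non-zero elements times the dense row width, rather than to the full matrix dimensions.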
Apparatuses and methods for determining population count

Patent number: 10861563

Abstract: The present disclosure includes apparatuses and methods related to determining population count. An example apparatus comprises an array of memory cells coupled to sensing circuitry. The apparatus can include a controller configured to cause: summing, in parallel, of data values corresponding to respective ones of a plurality of first vectors stored in memory cells of the array as a data value sum representing a population count thereof, wherein a second vector is stored as the plurality of first vectors, and wherein each first vector of the plurality of first vectors is stored in respective memory cells of the array that are coupled to a respective sense line of a plurality of sense lines; and iteratively summing, in parallel, of data value sums corresponding to the plurality of first vectors to provide a single data value sum corresponding to the second vector.

Type: Grant

Filed: February 10, 2020

Date of Patent: December 8, 2020

Assignee: Micron Technology, Inc.

Inventors: Timothy P. Finkbeiner, Glen E. Hush, Richard C. Murphy
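The two summing phases in the abstract for patent 10861563 can be modeled roughly as follows. The sketch assumes the long vector is dealt out across a fixed number of columns (one per sense line); the interleaved layout and the function name are assumptions for illustration.

```python
def population_count(bits, num_columns):
    """Count the set bits of a vector stored as several column vectors."""
    # The long (second) vector is stored as num_columns shorter (first) vectors.
    columns = [bits[c::num_columns] for c in range(num_columns)]
    # Phase 1: sum every column; in hardware all columns are summed in parallel.
    sums = [sum(column) for column in columns]
    # Phase 2: iteratively add neighbouring sums until a single sum remains.
    while len(sums) > 1:
        paired = [sums[i] + sums[i + 1] for i in range(0, len(sums) - 1, 2)]
        if len(sums) % 2:
            paired.append(sums[-1])
        sums = paired
    return sums[0]
```

With this structure the second phase takes a number of steps logarithmic in the column count.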
Calculating device, calculation program, recording medium, and calculation method

Patent number: 10860679

Abstract: According to one embodiment, a calculating device includes a processor repeating a processing procedure. The processing procedure includes a first variable update and a second variable update. The first variable update includes updating an i-th entry of a first variable x_i by adding a first function to the i-th entry of the first variable x_i before the first variable update. The second variable update includes updating the i-th entry of the second variable y_i by adding a second function and a third function to the i-th entry of the second variable y_i before the second variable update. The processor performs at least an output of at least one of the i-th entry of the first variable x_i obtained after the repeating of the processing procedure or a function of the i-th entry of the first variable x_i obtained after the repeating of the processing procedure.

Type: Grant

Filed: September 6, 2018

Date of Patent: December 8, 2020

Assignee: Kabushiki Kaisha Toshiba

Inventors: Hayato Goto, Kosuke Tatsumura
Apparatus and methods for matrix addition and subtraction

Patent number: 10860681

Abstract: Aspects for matrix addition in neural network are described herein. The aspects may include a controller unit configured to receive a matrix-add-scalar instruction that includes an address of the first matrix and a scalar value. The aspects may further include a computation module configured to receive the first matrix from a storage device based on the address of the first matrix. The first matrix may include one or more first elements. The one or more first elements are arranged in accordance with a two-dimensional data structure. The computation module may be further configured to respectively add the scalar value to each of the one or more first elements of the first matrix in accordance with the matrix-add-scalar instruction to generate one or more second elements for a second matrix.

Type: Grant

Filed: October 26, 2018

Date of Patent: December 8, 2020

Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED

Inventors: Xiao Zhang, Shaoli Liu, Tianshi Chen, Yunji Chen
Apparatus and methods for generating dot product

Patent number: 10860316

Abstract: Aspects for generating a dot product for two vectors in neural network are described herein. The aspects may include a controller unit configured to receive a vector load instruction that includes a first address of a first vector and a length of the first vector. The aspects may further include a direct memory access unit configured to retrieve the first vector from a storage device based on the first address of the first vector. Further still, the aspects may include a caching unit configured to store the first vector.

Type: Grant

Filed: October 26, 2018

Date of Patent: December 8, 2020

Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED

Inventors: Tian Zhi, Qi Guo, Shaoli Liu, Tianshi Chen, Yunji Chen
Multiplier accumulator, network unit, and network apparatus

Patent number: 10853721

Abstract: According to an embodiment, a multiplier accumulator includes a controller, a high-order multiplier, a high-order accumulator, a low-order multiplier, and an output unit. The controller is configured to designate each digit within a range of the most significant digit in a coefficient for an input value to a stop digit as a target digit. The high-order multiplier is configured to calculate a high-order multiplication value by multiplying the input value, and a value and a weight of the target digit. The high-order accumulator is configured to calculate a high-order accumulation value by accumulatively adding the high-order multiplication values for input values. The low-order multiplier is configured to calculate a low-order multiplication value by multiplying an input value and a value of a digit smaller than the stop digit. The output unit is configured to output a value determined based on whether the high-order accumulation value exceeds a boundary value.

Type: Grant

Filed: August 24, 2017

Date of Patent: December 1, 2020

Assignee: Kabushiki Kaisha Toshiba

Inventors: Masafumi Mori, Takao Marukame, Tetsufumi Tanamoto, Satoshi Takaya
Method for operating a digital computer to reduce the computational complexity associated with dot products between large vectors

Patent number: 10853068

Abstract: The present invention includes a method for operating a data processing system to compute an approximation to a scalar product between first and second vectors in which each vector is characterized by N components. The method includes replacing the first vector by a third vector that is a pyramid integer vector characterized by N components and an integer K equal to the sum of the absolute values of the N components, and computing a scalar product of the third vector with the second vector to provide the approximation to the scalar product between the first and second vectors. Computing the scalar product of the second and third vectors can be carried out by K additions followed by one floating point multiply.

Type: Grant

Filed: September 28, 2018

Date of Patent: December 1, 2020

Assignee: Ocean Logic Pty Ltd

Inventor: Vincenzo Liguori
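The claim in the abstract for patent 10853068 that the product needs only K additions and one multiply can be made concrete. In this sketch the pyramid integer vector and its scale factor are taken as given (how the first vector is quantized is not shown), and the names `pyramid_dot` and `scale` are hypothetical.

```python
def pyramid_dot(pyramid, scale, other):
    """Approximate dot product using an integer vector and one final multiply.

    pyramid: integer vector standing in for the first vector
    scale:   factor such that scale * pyramid approximates the first vector
    other:   the second vector
    """
    acc = 0.0
    for p, y in zip(pyramid, other):
        step = y if p > 0 else -y
        for _ in range(abs(p)):           # |p| additions (or subtractions) of y
            acc += step
    # Total additions performed: K = sum of |p| over all components.
    return scale * acc                    # the single floating-point multiply
```

For `pyramid = [2, -1, 0, 1]`, K is 4, so four additions and one multiply replace four floating-point multiplications.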
Computer processor for higher precision computations using a mixed-precision decomposition of operations

Patent number: 10853067

Abstract: Embodiments detailed herein relate to arithmetic operations of floating-point values. An exemplary processor includes decoding circuitry to decode an instruction, where the instruction specifies locations of a plurality of operands, values of which being in a floating-point format. The exemplary processor further includes execution circuitry to execute the decoded instruction, where the execution includes to: convert the values for each operand, each value being converted into a plurality of lower precision values, where an exponent is to be stored for each operand; perform arithmetic operations among lower precision values converted from values for the plurality of the operands; and generate a floating-point value by converting a resulting value from the arithmetic operations into the floating-point format and store the floating-point value.

Type: Grant

Filed: September 27, 2018

Date of Patent: December 1, 2020

Assignee: Intel Corporation

Inventors: Gregory Henry, Alexander Heinecke
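The general idea of rebuilding a higher-precision result from lower-precision pieces, as in the abstract for patent 10853067, can be sketched as follows. This is only an analogy: it splits a 64-bit float into two 32-bit floats and does not model the shared per-operand exponent or the actual formats used in the patent. All names are hypothetical.

```python
import struct


def to_float32(x):
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def split(x):
    """Decompose x into a high part and a low part, each a 32-bit float."""
    high = to_float32(x)
    low = to_float32(x - high)            # what the high part failed to capture
    return high, low


def product_from_parts(a, b):
    """Multiply a and b using only products of their lower-precision parts."""
    a_high, a_low = split(a)
    b_high, b_low = split(b)
    # Partial products are accumulated in the wider format, smallest terms last.
    return a_high * b_high + (a_high * b_low + a_low * b_high) + a_low * b_low
```

Using only the high parts would limit the result to roughly single precision; adding the cross terms recovers most of the accuracy of the wider format.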
Common factor mass multiplication circuitry

Patent number: 10853034

Abstract: An integrated circuit that includes common factor mass multiplier (CFMM) circuitry is provided that multiplies a common factor operand by a large number of multiplier operands. The CFMM circuitry may be implemented as an instance specific version or a non-instance specific version. The instance specific version might also be fully enumerated so that the hardware doesn't have to be redesigned assuming all possible unique multiplier values are implemented. Either version can be formed on a programmable integrated circuit or an application-specific integrated circuit. CFMM circuitry configured in this way can be used to support convolution neural networks or any operation that requires a straight common factor multiply. Any adder component with the CFMM circuitry may be implemented using bit-serial adders. The bit-serial adders may be further connected in a tree in CNN applications to sum together many input streams.

Type: Grant

Filed: September 28, 2018

Date of Patent: December 1, 2020

Assignee: Intel Corporation

Inventors: Thiam Khean Hah, Jason Gee Hock Ong, Yeong Tat Liew, Carl Ebeling, Vamsi Nalluri
Digital filter, filter processing method, and recording medium

Patent number: 10855255

Abstract: The present invention addresses the problem of increasing the likelihood of making it possible to reduce the consumption of power necessary for filter processing and the amount of heat generated during filter processing. In order to overcome this problem, a second complex signal and a third complex signal are generated from a first complex signal in a frequency domain, the third complex signal being a complex conjugate of the second complex signal. Signal selection is performed from the plurality of types of complex signals having different amounts of change in signal amplitude. Processing is performed on the complex signal selected as the signal using a first filter coefficient and a second filter coefficient. The complex signals after filter processing are synthesized to generate a complex signal, which is then outputted.

Type: Grant

Filed: December 1, 2016

Date of Patent: December 1, 2020

Assignee: NEC CORPORATION

Inventor: Atsufumi Shibayama
Digital filter device, digital filtering method, and program recording medium

Patent number: 10853445

Abstract: In order to reduce a circuit scale and power consumption while maintaining filter performance, a digital filter device includes a first transform circuit for executing a first transform process on data in a predetermined frequency range; a filtering circuit for executing a filtering process by setting an operation bit width of data of a preset first frequency component among the data, on which the first transform process was executed by the first transform circuit, to a different bit width from bit widths of other frequency components; and a second transform circuit for executing a second transform process on the data on which the filtering process was executed by the filtering circuit.

Type: Grant

Filed: April 14, 2017

Date of Patent: December 1, 2020

Assignee: NEC CORPORATION

Inventor: Atsufumi Shibayama
Underflow/overflow detection prior to normalization

Patent number: 10846053

Abstract: Methods and apparatuses for generating a condition code for a floating point number operation prior to normalization. A processor receives an intermediate result for an operation, wherein the intermediate result comprises an intermediate significand and an intermediate exponent. A processor determines a mask based on the value of the intermediate exponent. A processor generates a masked significand by applying the mask to the intermediate significand. A processor generates a condition code based on the masked significand having a predetermined value.

Type: Grant

Filed: June 27, 2014

Date of Patent: November 24, 2020

Assignee: International Business Machines Corporation

Inventors: Son T. Dao, Silvia Melitta Mueller
Unified logic for aliased processor instructions

Patent number: 10846089

Abstract: A binary logic circuit for manipulating an input binary string includes a first stage of a first group of multiplexers arranged to select respective portions of an input binary string and configured to receive a respective first control signal. A second stage is included in which a plurality of a second group of multiplexers is arranged to select respective portions of the input binary string and configured to receive a respective second control signal. The control signals are provided such that each multiplexer of a second group is configured to select a respective second portion of the first binary string. Control circuitry is configured to generate the first and second control signals such that two or more of the first groups and/or two or more of the second groups of multiplexers are independently controllable.

Type: Grant

Filed: August 31, 2018

Date of Patent: November 24, 2020

Assignee: MIPS Tech, LLC

Inventors: James Hippisley Robinson, Morgyn Taylor
Underflow/overflow detection prior to normalization

Patent number: 10846054

Abstract: Methods and apparatuses for generating a condition code for a floating point number operation prior to normalization. A processor receives an intermediate result for an operation, wherein the intermediate result comprises an intermediate significand and an intermediate exponent. A processor determines a mask based on the value of the intermediate exponent. A processor generates a masked significand by applying the mask to the intermediate significand. A processor generates a condition code based on the masked significand having a predetermined value.

Type: Grant

Filed: December 17, 2014

Date of Patent: November 24, 2020

Assignee: International Business Machines Corporation

Inventors: Son T. Dao, Silvia Melitta Mueller
Sparse matrix multiplication in associative memory device

Patent number: 10846365

Abstract: A method for use in an associative memory device when multiplying by a sparse matrix includes storing only non-zero elements of the sparse matrix in the associative memory device as multiplicands. The storing includes locating the non-zero elements in computation columns of the associative memory device according to linear algebra rules along with their associated multiplicands such that a multiplicand and a multiplier of each multiplication operation to be performed are stored in a same computation column. The locating locates one of the non-zero elements in more than one computation column if one of the non-zero elements is utilized in more than one multiplication operation.

Type: Grant

Filed: November 25, 2019

Date of Patent: November 24, 2020

Assignee: GSI Technology Inc.

Inventor: Avidan Akerib
