Patents Examined by Michael D Yaary
  • Patent number: 11327719
    Abstract: A generation means 11 generates a uniform random number between 0 and a first probability, which is a probability of a stochastic variable becoming a value within a predetermined interval in a positive range in the first discrete distribution. When a uniform random number less than or equal to a second probability is generated, the second probability being a probability of the stochastic variable becoming a value within a predetermined interval in a second discrete distribution, which is a discrete Gaussian distribution on a one-dimensional lattice the center of which is the origin, the selection means 12 selects, as a random number generation method, an accumulation method in which a functional value defining the second discrete distribution is used. When a uniform random number greater than the second probability is generated, the selection means 12 selects a rejection sampling method as the random number generation method.
    Type: Grant
    Filed: August 7, 2017
    Date of Patent: May 10, 2022
    Assignee: NEC CORPORATION
    Inventors: Yuki Tanaka, Kazuhiko Minematsu
  • Patent number: 11327715
    Abstract: Low precision computers can be efficient at finding possible answers to search problems. However, sometimes the task demands finding better answers than a single low precision search. A computer system augments low precision computing with a small amount of high precision computing, to improve search quality with little additional computing.
    Type: Grant
    Filed: July 2, 2021
    Date of Patent: May 10, 2022
    Assignee: Singular Computing LLC
    Inventor: Joseph Bates
  • Patent number: 11327713
    Abstract: A computation unit comprises a floating point input having X bits including a sign bit, an E bit exponent and an M bit mantissa. A first circuit is operatively coupled to receive X-N bits of the input, including e1 bits of the exponent and ml bits of the mantissa, where e1?E, and m1?M, to output values over a first domain of the input. A second circuit is operatively coupled to receive X-K bits of the input, including e2 bits of the exponent, e2<e1, and m2 bits of the mantissa, m2>m1, to output values, over a second domain of the input. A range detector is operatively coupled to the input, to indicate a range in response to a value of the input. A selector can select the output of the first circuit or of the second circuit in response to the range detector.
    Type: Grant
    Filed: October 1, 2019
    Date of Patent: May 10, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng
  • Patent number: 11327714
    Abstract: Low precision computers can be efficient at finding possible answers to search problems. However, sometimes the task demands finding better answers than a single low precision search. A computer system augments low precision computing with a small amount of high precision computing, to improve search quality with little additional computing.
    Type: Grant
    Filed: July 2, 2021
    Date of Patent: May 10, 2022
    Assignee: Singular Computing LLC
    Inventor: Joseph Bates
  • Patent number: 11314484
    Abstract: A semiconductor device having a novel structure is provided. The semiconductor device includes a plurality of operation circuits that can switch different kinds of operation processing; a plurality of switch circuits that can switch a connection state between the operation circuits; and a controller. The operation circuit includes a first memory that stores data corresponding to a weight parameter used in the plurality of kinds of operation processing. The operation circuit executes a product-sum operation by switching weight data in accordance with a context. The switch circuit includes a second memory that stores data for switching a plurality of connection states in response to switching of a second context signal. The controller generates a second context signal on the basis of a first context signal. The amount of data stored in the second memory can be smaller than the amount of data stored in the first memory in the operation circuit.
    Type: Grant
    Filed: May 7, 2018
    Date of Patent: April 26, 2022
    Assignee: Semiconductor Energy Laboratory Co., Ltd.
    Inventors: Munehiro Kozuma, Takeshi Aoki, Seiichi Yoneda, Yoshiyuki Kurokawa
  • Patent number: 11314845
    Abstract: Interpolation logic described herein provides a good approximation to a bicubic interpolation, which is generally smoother than bilinear interpolation, without performing all the calculations normally needed for a bicubic interpolation. This allows an approximation of smooth bicubic interpolation to be performed on devices (e.g. mobile devices) which have limited processing resources. At each of a set of predetermined interpolation positions within an array of data points, a set of predetermined weights represent a bicubic interpolation which can be applied to the data points. For a plurality of the predetermined interpolation positions which surround the sampling position, the corresponding sets of predetermined weights and the data points are used to determine a plurality of surrounding interpolated values which represent results of performing the bicubic interpolation at the surrounding predetermined interpolation positions.
    Type: Grant
    Filed: June 18, 2020
    Date of Patent: April 26, 2022
    Assignee: Imagination Technologies Limited
    Inventor: Simon Fenney
  • Patent number: 11314504
    Abstract: An integrated circuit including a plurality of processing components, including first and second processing components, wherein each processing component includes first memory to store image data and a plurality of multiplier-accumulator execution pipelines, wherein each multiplier-accumulator execution pipeline includes a plurality of multiplier-accumulator circuits to, in operation, perform multiply and accumulate operations using data from the first memory and filter weights. The first processing component is configured to process all of the data associated with all of stages of a first image frame via the plurality of multiplier-accumulator execution pipelines of the first processing component. The second processing component is configured to process all of the data associated with all of stages of a second image frame via the plurality of multiplier-accumulator execution pipelines of the second processing component, wherein the first image frame and the second image frame are successive image frames.
    Type: Grant
    Filed: March 11, 2020
    Date of Patent: April 26, 2022
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman
  • Patent number: 11308389
    Abstract: Provided are an integrated circuit chip apparatus and a related product, the integrated circuit chip apparatus being used for executing a multiplication operation, a convolution operation or a training operation of a neural network. The present technical solution has the advantages of a small amount of calculation and low power consumption.
    Type: Grant
    Filed: December 19, 2019
    Date of Patent: April 19, 2022
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Shaoli Liu, Xinkai Song, Bingrui Wang, Yao Zhang, Shuai Hu
  • Patent number: 11307853
    Abstract: A matrix multiplication device and an operation method thereof are provided. The matrix multiplication device includes calculation circuits, a control circuit, a multiplication circuit, and a routing circuit. The calculation circuits produce multiply-accumulate values. The control circuit receives a plurality of first element values of a first matrix. The control circuit classifies the first element values into at least one classification value. The multiplication circuit multiplies the classification value by a second element value of a second matrix in a low power mode to obtain at least one product value. The routing circuit transmits each of the product values to at least one corresponding calculation circuit in the calculation circuits in the low power mode.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: April 19, 2022
    Assignee: NEUCHIPS CORPORATION
    Inventors: Chiung-Liang Lin, Chao-Yang Kao, Youn-Long Lin, Huang-Chih Kuo, Jian-Wen Chen
  • Patent number: 11301213
    Abstract: An integrated circuit with a large multiplier is provided. The multiplier may be configured to receive large input operands with thousands of bits. The multiplier may be implemented using a multiplier decomposition scheme that is recursively flattened into multiple decomposition levels to expose a tree of adders. The adders may be collapsed into a merged pipelined structure, where partial sums are forwarded from one level to the next while bypassing intervening prefix networks. The final correct sum is not calculated until later. In accordance with the decomposition technique, the partial sums are successively halved, which allows the prefix networks to be smaller from one level to the next. This allows all sums to be calculated at approximately the same pipeline depth, which significantly reduces latency with no or limited pipeline balancing.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: April 12, 2022
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Bogdan Pasca
  • Patent number: 11301546
    Abstract: A method comprises receiving one or more sizes for each of the dimensions of a kernel that is convolved with an input tensor to generate an output activation, generating a control pattern used to compute output values for the convolution of the input tensor, with the control pattern being a square matrix with each dimension being a size equal to the product of the width and the height of the kernel. The control pattern is generated by generating a value for each position of the control pattern that is based on a location of the position in the control pattern and the one or more sizes of each of the dimensions of the kernel, the value indicating a location from which to access values from a flattened input tensor for the convolution with the kernel.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: April 12, 2022
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Thomas Hawkins, Gregory Michael Thorson, Matt Boyd
  • Patent number: 11294638
    Abstract: A system for generating random numbers comprises a light source for emitting photons, an optical diffuser element, and a plurality of light detector elements, each being for converting received light into electrical charge. The system further comprises means for converting the electrical charge of each of the plurality of light detector elements into an output value. The light source is for illuminating the plurality of light detectors with the photons, whereby the photons are incident on random ones of the plurality of light detectors. The diffuser is located in a light path between the light source and the plurality of light detector elements, and is for making the degree of illumination of each of the plurality of light detector elements more uniform, whereby the output values of the plurality of light detector elements comprise a set of random numbers each comprising quantum noise.
    Type: Grant
    Filed: December 19, 2017
    Date of Patent: April 5, 2022
    Assignee: CRYPTA LABS LIMITED
    Inventors: Oliver Maynard, Joe Hq Luong
  • Patent number: 11294626
    Abstract: The present disclosure relates generally to techniques for adjusting the number representation (e.g., format) of a variable before and/or after performing one or more arithmetic operations on the variable. In particular, the present disclosure relates to scaling the range of a variable to a suitable representation based on available hardware (e.g., hard logic) in an integrated circuit device. For example, an input in a first number format (e.g., bfloat16) may be scaled to a second number format (e.g., half-precision floating-point) so that circuitry implemented to receive inputs in the second number format may perform one or more arithmetic operations on the input. Further, the output produced by the circuitry may be scaled back to the first number format. Accordingly, arithmetic operations, such as a dot-product, performed in a first format may be emulated by scaling the inputs to and/or the outputs from arithmetic operations performed in another format.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: April 5, 2022
    Assignee: Intel Corporation
    Inventors: Bogdan Mihai Pasca, Martin Langhammer
  • Patent number: 11281463
    Abstract: Methods and apparatus relating to conversion of an unsigned normalized (unorm) integer values to floating-point (float) values in low power are described. In an embodiment, conversion logic converts a unorm integer value to a floating-point value based on detection of whether the unorm integer matches one of three cases, wherein the unorm integer value comprises n bits. Memory stores a count value corresponding to n?1 bits of the unorm integer value after detection of a leading 1 in the unorm integer value. The three cases include: a first case with all zeros, a second case with all ones, and a third case with a combination of one or more zeros and one or more ones. Other embodiments are also disclosed and claimed.
    Type: Grant
    Filed: March 25, 2018
    Date of Patent: March 22, 2022
    Assignee: INTEL CORPORATION
    Inventors: Benjamin Pletcher, Rahul Kumar
  • Patent number: 11281745
    Abstract: Methods and systems of matrix multiplication are described. In an example, a processor can multiply a first entry of a first vector of a first data array with a second vector of a second data array to generate a third vector of a third data array. The processor can store the third vector of the third data array in the second register file. The processor can multiply a second entry of the first vector with the second vector to generate a fourth vector of the third data array. The processor can store the fourth vector of the third data array in the second register file. The processor can combine vectors of the third data array that are stored in the second register file to produce the third data array.
    Type: Grant
    Filed: August 16, 2019
    Date of Patent: March 22, 2022
    Assignee: International Business Machines Corporation
    Inventors: Bruce Fleischer, Jose E. Moreira, Joel A. Silberman
  • Patent number: 11275561
    Abstract: An example computer-implemented method includes receiving a first value, a second value, a third value, and a fourth value, wherein the first value, the second value, the third value, and the fourth value are 16-bit or smaller precision floating-point numbers. The method further includes multiplying the first value and the second value to generate a first product, wherein the first product is a 32-bit floating-point number. The method further includes multiplying the third value and the fourth value to generate a second product, wherein the second product is a 32-bit floating-point number. The method further includes summing the first product and the second product to generate a summed value, wherein the summed value is a 32-bit floating-point number. The method further includes adding the summed value to an addend value to generate a result value, wherein the addend value and the result value are 32-bit floating-point numbers.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: March 15, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Silvia Melitta Mueller, Andreas Wagner, Brian W. Thompto
  • Patent number: 11269629
    Abstract: Many signal processing, machine learning and scientific computing applications require a large number of multiply-accumulate (MAC) operations. This type of operation is demanding in both computation and memory. Process in memory has been proposed as a new technique that computes directly on a large array of data in place, to eliminate expensive data movement overhead. To enable parallel multi-bit MAC operations, both width- and level-modulating memory word lines are applied. To improve performance and provide tolerance against process-voltage-temperature variations, a delay-locked loop is used to generate fine unit pulses for driving memory word lines and a dual-ramp Single-slope ADC is used to convert bit line outputs. The concept is prototyped in a 180 nm CMOS test chip made of four 320×64 compute-SRAMs, each supporting 128× parallel 5 b×5 b MACs with 32 5 b output ADCs and consuming 16.6 mW at 200 MHz.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: March 8, 2022
    Assignee: The Regents of the University of Michigan
    Inventors: Zhengya Zhang, Thomas Chen, Jacob Christopher Botimer, Shiming Song
  • Patent number: 11262982
    Abstract: A computation circuit includes a plurality of processing elements and a common accumulator. The plurality of processing elements are sequentially coupled in series, and performs a multiply and accumulate (MAC) operation on a weight signal and at least one of two or more input signals received in each unit cycle. The common accumulator is sequentially and cyclically coupled to first to Kth processing elements among the plurality of processing elements, and configured to receive a computation value outputted from a processing element coupled thereto among the first to Kth processing elements, and store computation information. The K is decided based on values of the two or more input signals and the number of guard bits included in one processing element.
    Type: Grant
    Filed: July 22, 2019
    Date of Patent: March 1, 2022
    Assignees: SK hynix Inc., SK Telecom Co., Ltd.
    Inventors: Yong Sang Park, Seok Joong Hwang
  • Patent number: 11250103
    Abstract: A system for determining the frequency coefficients of a one or multi-dimensional signal that is sparse in the frequency domain includes determining the locations of the non-zero frequency coefficients, and then determining values of the coefficients using the determined locations. If N is total number of frequency coefficients across the one or more dimension of the signal, and if R is an upper bound of the number of non-zero ones of these frequency coefficients, the systems requires up to (O (R log(R) (N))) samples and has a computation complexity of up to O (R log2(R) log (N). The system and the processing technique are stable to low-level noise and can exhibit only a small probability of failure. The frequency coefficients can be real and positive or they can be complex numbers.
    Type: Grant
    Filed: January 25, 2017
    Date of Patent: February 15, 2022
    Assignee: Reservoir Labs, Inc.
    Inventor: Pierre-David Letourneau
  • Patent number: 11243744
    Abstract: A method (40) is provided for performing a trustworthiness test on a random number generator, RNG, (20) comprising a physical unclonable function, PUF-module (21). The trustworthiness test is implemented as a known answer test, KAT, and the method (40) comprises: receiving (41), in the PUF-module (21), an input based on test data, T, received from a verifier (11) provided with at least one test data-test result pair, (T, R), providing (42) an output from the PUF-module (21), determining (43) a test result, R?, based on the output from the PUF-module (21), and providing (44) the test result, R?, to the verifier (11). A random number generator (20), computer program and computer program products and a method performed by or in a verifier are also provided.
    Type: Grant
    Filed: November 15, 2016
    Date of Patent: February 8, 2022
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Mats Näslund, Elena Dubrova, Karl Norrman