Patents Examined by Michael D Yaary
  • Patent number: 11366640
    Abstract: Random number generator (GL) comprising adjustable speed ring oscillators (GPRS, GPRS?), which have outputs (o-GPRS, o-GPRS?) connected to inputs (i1-UM, i2-UM) of a metastability circuit (UM) and inputs (i1-DF, i2-DF) of a phase detector (DF), which outputs (o-UM, o-DF) are connected to inputs (r-US?, i-US?) of a control circuit (US?), having output (o-US?) connected to control inputs (s-GPRS, s-GPRS?) of the adjustable speed ring oscillators (GPRS, GPRS?). The outputs (o-UM, o-DF) of the metastability circuit (UM) and the phase detector (DF) are being outputs (o-GL, o2-GL) of the random number generator (GL).
    Type: Grant
    Filed: August 7, 2018
    Date of Patent: June 21, 2022
    Assignee: POLITECHNIKA WARSZAWSKA
    Inventors: Krzysztof Golofit, Piotr Wieczorek
  • Patent number: 11366638
    Abstract: Floating point Multiply-Add, Accumulate Unit, supporting BF16 format for Multiply-Accumulate operations, and FP32 Single-Precision Addition complying with the IEEE 754 Standard. The Multiply-Accumulate unit uses higher radix and longer internal 2's complement significand representation to facilitate precision as well as comparison and operation with negative numbers. The addition can be performed using Carry-Save format to avoid long carry propagation and speed up the operation. The circuit uses early exponent comparison to shorten the accumulate pipeline stage. Operations including overflow detection, zero detection and sign extension are adopted for 2s complement and Carry-Save format.
    Type: Grant
    Filed: September 2, 2021
    Date of Patent: June 21, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Vojin G. Oklobdzija, Matthew M. Kim
  • Patent number: 11361052
    Abstract: A method for formatting a weight matrix including a plurality of sub matrices each being multiplied with an input vector may include sequentially adding weight information included in respective first columns of the plurality of sub matrices to formatted data; and sequentially adding weight information included in respective second columns of the plurality of sub matrices to the formatted data after the weight information from the first columns of the plurality of sub matrices. The weight information may be non-zero weight information.
    Type: Grant
    Filed: January 21, 2020
    Date of Patent: June 14, 2022
    Assignees: SK hynix Inc., POSTECH ACADEMY-INDUSTRY FOUNDATION
    Inventors: Junki Park, Jae-Joon Kim, Youngjae Jin, Euiseok Kim
  • Patent number: 11354096
    Abstract: Low precision computers can be efficient at finding possible answers to search problems. However, sometimes the task demands finding better answers than a single low precision search. A computer system augments low precision computing with a small amount of high precision computing, to improve search quality with little additional computing.
    Type: Grant
    Filed: July 2, 2021
    Date of Patent: June 7, 2022
    Assignee: Singular Computing LLC
    Inventor: Joseph Bates
  • Patent number: 11341397
    Abstract: Some embodiments provide a method for a neural network inference circuit (NNIC) that implements a neural network including multiple computation nodes at multiple layers. Each computation node includes a dot product of input values and weight values and a set of post-processing operations. The method retrieves a set of weight values and a set of input values for a computation node from a set of memories of the NNIC. The method computes a dot product of the retrieved sets of weight values and input values. The method performs the post-processing operations for the computation node on a result of the dot product computation to compute an output value for the computation node. The method stores the output value in the set of memories. No intermediate results of the dot product or the set of post-processing operations are stored in any RAM of the NNIC during the computation.
    Type: Grant
    Filed: December 6, 2018
    Date of Patent: May 24, 2022
    Assignee: PERCEIVE CORPORATION
    Inventors: Kenneth Duong, Jung Ko, Steven L. Teig
  • Patent number: 11341210
    Abstract: This application relates to a multi-layer convolution operation. The multi-layer convolution operation is optimized for a vector processing unit having a number of data paths configured to operate on vector operands containing a number of elements processed in parallel by the data paths. The convolution operation specifies a convolution kernel utilized to filter a multi-channel input and generate a multi-channel output of the convolution operation. A number of threads are generated to process blocks of the multi-channel output, each block comprising a set of windows of a number of channels of the multi-channel output. Each window is a portion of the array of elements in a single layer of the multi-channel output. Each thread processes a block in accordance with an arbitrary width of the block, processing a set of instructions for each sub-block of the block having a well-defined width, the instructions optimized for the vector processing unit.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: May 24, 2022
    Inventors: Asaf Hargil, Ali Sazegari
  • Patent number: 11327713
    Abstract: A computation unit comprises a floating point input having X bits including a sign bit, an E bit exponent and an M bit mantissa. A first circuit is operatively coupled to receive X-N bits of the input, including e1 bits of the exponent and ml bits of the mantissa, where e1?E, and m1?M, to output values over a first domain of the input. A second circuit is operatively coupled to receive X-K bits of the input, including e2 bits of the exponent, e2<e1, and m2 bits of the mantissa, m2>m1, to output values, over a second domain of the input. A range detector is operatively coupled to the input, to indicate a range in response to a value of the input. A selector can select the output of the first circuit or of the second circuit in response to the range detector.
    Type: Grant
    Filed: October 1, 2019
    Date of Patent: May 10, 2022
    Assignee: SambaNova Systems, Inc.
    Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng
  • Patent number: 11327714
    Abstract: Low precision computers can be efficient at finding possible answers to search problems. However, sometimes the task demands finding better answers than a single low precision search. A computer system augments low precision computing with a small amount of high precision computing, to improve search quality with little additional computing.
    Type: Grant
    Filed: July 2, 2021
    Date of Patent: May 10, 2022
    Assignee: Singular Computing LLC
    Inventor: Joseph Bates
  • Patent number: 11327719
    Abstract: A generation means 11 generates a uniform random number between 0 and a first probability, which is a probability of a stochastic variable becoming a value within a predetermined interval in a positive range in the first discrete distribution. When a uniform random number less than or equal to a second probability is generated, the second probability being a probability of the stochastic variable becoming a value within a predetermined interval in a second discrete distribution, which is a discrete Gaussian distribution on a one-dimensional lattice the center of which is the origin, the selection means 12 selects, as a random number generation method, an accumulation method in which a functional value defining the second discrete distribution is used. When a uniform random number greater than the second probability is generated, the selection means 12 selects a rejection sampling method as the random number generation method.
    Type: Grant
    Filed: August 7, 2017
    Date of Patent: May 10, 2022
    Assignee: NEC CORPORATION
    Inventors: Yuki Tanaka, Kazuhiko Minematsu
  • Patent number: 11327715
    Abstract: Low precision computers can be efficient at finding possible answers to search problems. However, sometimes the task demands finding better answers than a single low precision search. A computer system augments low precision computing with a small amount of high precision computing, to improve search quality with little additional computing.
    Type: Grant
    Filed: July 2, 2021
    Date of Patent: May 10, 2022
    Assignee: Singular Computing LLC
    Inventor: Joseph Bates
  • Patent number: 11314845
    Abstract: Interpolation logic described herein provides a good approximation to a bicubic interpolation, which is generally smoother than bilinear interpolation, without performing all the calculations normally needed for a bicubic interpolation. This allows an approximation of smooth bicubic interpolation to be performed on devices (e.g. mobile devices) which have limited processing resources. At each of a set of predetermined interpolation positions within an array of data points, a set of predetermined weights represent a bicubic interpolation which can be applied to the data points. For a plurality of the predetermined interpolation positions which surround the sampling position, the corresponding sets of predetermined weights and the data points are used to determine a plurality of surrounding interpolated values which represent results of performing the bicubic interpolation at the surrounding predetermined interpolation positions.
    Type: Grant
    Filed: June 18, 2020
    Date of Patent: April 26, 2022
    Assignee: Imagination Technologies Limited
    Inventor: Simon Fenney
  • Patent number: 11314504
    Abstract: An integrated circuit including a plurality of processing components, including first and second processing components, wherein each processing component includes first memory to store image data and a plurality of multiplier-accumulator execution pipelines, wherein each multiplier-accumulator execution pipeline includes a plurality of multiplier-accumulator circuits to, in operation, perform multiply and accumulate operations using data from the first memory and filter weights. The first processing component is configured to process all of the data associated with all of stages of a first image frame via the plurality of multiplier-accumulator execution pipelines of the first processing component. The second processing component is configured to process all of the data associated with all of stages of a second image frame via the plurality of multiplier-accumulator execution pipelines of the second processing component, wherein the first image frame and the second image frame are successive image frames.
    Type: Grant
    Filed: March 11, 2020
    Date of Patent: April 26, 2022
    Assignee: Flex Logix Technologies, Inc.
    Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman
  • Patent number: 11314484
    Abstract: A semiconductor device having a novel structure is provided. The semiconductor device includes a plurality of operation circuits that can switch different kinds of operation processing; a plurality of switch circuits that can switch a connection state between the operation circuits; and a controller. The operation circuit includes a first memory that stores data corresponding to a weight parameter used in the plurality of kinds of operation processing. The operation circuit executes a product-sum operation by switching weight data in accordance with a context. The switch circuit includes a second memory that stores data for switching a plurality of connection states in response to switching of a second context signal. The controller generates a second context signal on the basis of a first context signal. The amount of data stored in the second memory can be smaller than the amount of data stored in the first memory in the operation circuit.
    Type: Grant
    Filed: May 7, 2018
    Date of Patent: April 26, 2022
    Assignee: Semiconductor Energy Laboratory Co., Ltd.
    Inventors: Munehiro Kozuma, Takeshi Aoki, Seiichi Yoneda, Yoshiyuki Kurokawa
  • Patent number: 11308389
    Abstract: Provided are an integrated circuit chip apparatus and a related product, the integrated circuit chip apparatus being used for executing a multiplication operation, a convolution operation or a training operation of a neural network. The present technical solution has the advantages of a small amount of calculation and low power consumption.
    Type: Grant
    Filed: December 19, 2019
    Date of Patent: April 19, 2022
    Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
    Inventors: Shaoli Liu, Xinkai Song, Bingrui Wang, Yao Zhang, Shuai Hu
  • Patent number: 11307853
    Abstract: A matrix multiplication device and an operation method thereof are provided. The matrix multiplication device includes calculation circuits, a control circuit, a multiplication circuit, and a routing circuit. The calculation circuits produce multiply-accumulate values. The control circuit receives a plurality of first element values of a first matrix. The control circuit classifies the first element values into at least one classification value. The multiplication circuit multiplies the classification value by a second element value of a second matrix in a low power mode to obtain at least one product value. The routing circuit transmits each of the product values to at least one corresponding calculation circuit in the calculation circuits in the low power mode.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: April 19, 2022
    Assignee: NEUCHIPS CORPORATION
    Inventors: Chiung-Liang Lin, Chao-Yang Kao, Youn-Long Lin, Huang-Chih Kuo, Jian-Wen Chen
  • Patent number: 11301213
    Abstract: An integrated circuit with a large multiplier is provided. The multiplier may be configured to receive large input operands with thousands of bits. The multiplier may be implemented using a multiplier decomposition scheme that is recursively flattened into multiple decomposition levels to expose a tree of adders. The adders may be collapsed into a merged pipelined structure, where partial sums are forwarded from one level to the next while bypassing intervening prefix networks. The final correct sum is not calculated until later. In accordance with the decomposition technique, the partial sums are successively halved, which allows the prefix networks to be smaller from one level to the next. This allows all sums to be calculated at approximately the same pipeline depth, which significantly reduces latency with no or limited pipeline balancing.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: April 12, 2022
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Bogdan Pasca
  • Patent number: 11301546
    Abstract: A method comprises receiving one or more sizes for each of the dimensions of a kernel that is convolved with an input tensor to generate an output activation, generating a control pattern used to compute output values for the convolution of the input tensor, with the control pattern being a square matrix with each dimension being a size equal to the product of the width and the height of the kernel. The control pattern is generated by generating a value for each position of the control pattern that is based on a location of the position in the control pattern and the one or more sizes of each of the dimensions of the kernel, the value indicating a location from which to access values from a flattened input tensor for the convolution with the kernel.
    Type: Grant
    Filed: November 18, 2019
    Date of Patent: April 12, 2022
    Assignee: Groq, Inc.
    Inventors: Jonathan Alexander Ross, Thomas Hawkins, Gregory Michael Thorson, Matt Boyd
  • Patent number: 11294638
    Abstract: A system for generating random numbers comprises a light source for emitting photons, an optical diffuser element, and a plurality of light detector elements, each being for converting received light into electrical charge. The system further comprises means for converting the electrical charge of each of the plurality of light detector elements into an output value. The light source is for illuminating the plurality of light detectors with the photons, whereby the photons are incident on random ones of the plurality of light detectors. The diffuser is located in a light path between the light source and the plurality of light detector elements, and is for making the degree of illumination of each of the plurality of light detector elements more uniform, whereby the output values of the plurality of light detector elements comprise a set of random numbers each comprising quantum noise.
    Type: Grant
    Filed: December 19, 2017
    Date of Patent: April 5, 2022
    Assignee: CRYPTA LABS LIMITED
    Inventors: Oliver Maynard, Joe Hq Luong
  • Patent number: 11294626
    Abstract: The present disclosure relates generally to techniques for adjusting the number representation (e.g., format) of a variable before and/or after performing one or more arithmetic operations on the variable. In particular, the present disclosure relates to scaling the range of a variable to a suitable representation based on available hardware (e.g., hard logic) in an integrated circuit device. For example, an input in a first number format (e.g., bfloat16) may be scaled to a second number format (e.g., half-precision floating-point) so that circuitry implemented to receive inputs in the second number format may perform one or more arithmetic operations on the input. Further, the output produced by the circuitry may be scaled back to the first number format. Accordingly, arithmetic operations, such as a dot-product, performed in a first format may be emulated by scaling the inputs to and/or the outputs from arithmetic operations performed in another format.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: April 5, 2022
    Assignee: Intel Corporation
    Inventors: Bogdan Mihai Pasca, Martin Langhammer
  • Patent number: 11281745
    Abstract: Methods and systems of matrix multiplication are described. In an example, a processor can multiply a first entry of a first vector of a first data array with a second vector of a second data array to generate a third vector of a third data array. The processor can store the third vector of the third data array in the second register file. The processor can multiply a second entry of the first vector with the second vector to generate a fourth vector of the third data array. The processor can store the fourth vector of the third data array in the second register file. The processor can combine vectors of the third data array that are stored in the second register file to produce the third data array.
    Type: Grant
    Filed: August 16, 2019
    Date of Patent: March 22, 2022
    Assignee: International Business Machines Corporation
    Inventors: Bruce Fleischer, Jose E. Moreira, Joel A. Silberman