Patents Examined by Michael D Yaary

Patent number: 11327719
Abstract: A generation means 11 generates a uniform random number between 0 and a first probability, which is a probability of a stochastic variable becoming a value within a predetermined interval in a positive range in the first discrete distribution. When a uniform random number less than or equal to a second probability is generated, the second probability being a probability of the stochastic variable becoming a value within a predetermined interval in a second discrete distribution, which is a discrete Gaussian distribution on a one-dimensional lattice the center of which is the origin, the selection means 12 selects, as a random number generation method, an accumulation method in which a functional value defining the second discrete distribution is used. When a uniform random number greater than the second probability is generated, the selection means 12 selects a rejection sampling method as the random number generation method.
Type: Grant
Filed: August 7, 2017
Date of Patent: May 10, 2022
Assignee: NEC CORPORATION
Inventors: Yuki Tanaka, Kazuhiko Minematsu

Patent number: 11327715
Abstract: Low precision computers can be efficient at finding possible answers to search problems. However, sometimes the task demands finding better answers than a single low precision search. A computer system augments low precision computing with a small amount of high precision computing, to improve search quality with little additional computing.
Type: Grant
Filed: July 2, 2021
Date of Patent: May 10, 2022
Assignee: Singular Computing LLC
Inventor: Joseph Bates
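
The two-stage idea in this abstract can be illustrated with a small nearest-neighbor sketch. Everything here is an assumption made for illustration, not taken from the patent: the choice of nearest-neighbor search as the problem, the coarse rounding in `quantize` as a stand-in for low precision hardware, and the names `search`, `k`, and `step`.

```python
def quantize(v, step=0.5):
    """Hypothetical stand-in for low precision arithmetic: snap each coordinate to a coarse grid."""
    return [round(x / step) * step for x in v]

def dist2(a, b):
    """Squared Euclidean distance at full (double) precision."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def search(query, points, k=3, step=0.5):
    """Return the index of the best match using a cheap coarse pass plus a small exact pass."""
    # Stage 1 (low precision): rank every point by its coarse distance to the query.
    q_lo = quantize(query, step)
    coarse_order = sorted(
        range(len(points)),
        key=lambda i: dist2(q_lo, quantize(points[i], step)),
    )
    candidates = coarse_order[:k]
    # Stage 2 (high precision): rescore only the k surviving candidates exactly.
    return min(candidates, key=lambda i: dist2(query, points[i]))

points = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.2], [5.0, 5.0]]
best = search([1.0, 1.1], points)
```

In this sketch the coarse pass cannot separate the second and third points, and the exact pass over three candidates resolves the tie. The high precision work scales with `k` rather than with the number of points.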

Patent number: 11327713
Abstract: A computation unit comprises a floating point input having X bits including a sign bit, an E bit exponent and an M bit mantissa. A first circuit is operatively coupled to receive X-N bits of the input, including e1 bits of the exponent and m1 bits of the mantissa, where e1≤E, and m1≤M, to output values over a first domain of the input. A second circuit is operatively coupled to receive X-K bits of the input, including e2 bits of the exponent, e2<e1, and m2 bits of the mantissa, m2>m1, to output values, over a second domain of the input. A range detector is operatively coupled to the input, to indicate a range in response to a value of the input. A selector can select the output of the first circuit or of the second circuit in response to the range detector.
Type: Grant
Filed: October 1, 2019
Date of Patent: May 10, 2022
Assignee: SambaNova Systems, Inc.
Inventors: Mingran Wang, Xiaoyan Li, Yongning Sheng

Patent number: 11327714
Abstract: Low precision computers can be efficient at finding possible answers to search problems. However, sometimes the task demands finding better answers than a single low precision search. A computer system augments low precision computing with a small amount of high precision computing, to improve search quality with little additional computing.
Type: Grant
Filed: July 2, 2021
Date of Patent: May 10, 2022
Assignee: Singular Computing LLC
Inventor: Joseph Bates

Patent number: 11314484
Abstract: A semiconductor device having a novel structure is provided. The semiconductor device includes a plurality of operation circuits that can switch different kinds of operation processing; a plurality of switch circuits that can switch a connection state between the operation circuits; and a controller. The operation circuit includes a first memory that stores data corresponding to a weight parameter used in the plurality of kinds of operation processing. The operation circuit executes a product-sum operation by switching weight data in accordance with a context. The switch circuit includes a second memory that stores data for switching a plurality of connection states in response to switching of a second context signal. The controller generates a second context signal on the basis of a first context signal. The amount of data stored in the second memory can be smaller than the amount of data stored in the first memory in the operation circuit.
Type: Grant
Filed: May 7, 2018
Date of Patent: April 26, 2022
Assignee: Semiconductor Energy Laboratory Co., Ltd.
Inventors: Munehiro Kozuma, Takeshi Aoki, Seiichi Yoneda, Yoshiyuki Kurokawa

Patent number: 11314845
Abstract: Interpolation logic described herein provides a good approximation to a bicubic interpolation, which is generally smoother than bilinear interpolation, without performing all the calculations normally needed for a bicubic interpolation. This allows an approximation of smooth bicubic interpolation to be performed on devices (e.g. mobile devices) which have limited processing resources. At each of a set of predetermined interpolation positions within an array of data points, a set of predetermined weights represent a bicubic interpolation which can be applied to the data points. For a plurality of the predetermined interpolation positions which surround the sampling position, the corresponding sets of predetermined weights and the data points are used to determine a plurality of surrounding interpolated values which represent results of performing the bicubic interpolation at the surrounding predetermined interpolation positions.
Type: Grant
Filed: June 18, 2020
Date of Patent: April 26, 2022
Assignee: Imagination Technologies Limited
Inventor: Simon Fenney

Patent number: 11314504
Abstract: An integrated circuit including a plurality of processing components, including first and second processing components, wherein each processing component includes first memory to store image data and a plurality of multiplier-accumulator execution pipelines, wherein each multiplier-accumulator execution pipeline includes a plurality of multiplier-accumulator circuits to, in operation, perform multiply and accumulate operations using data from the first memory and filter weights. The first processing component is configured to process all of the data associated with all of stages of a first image frame via the plurality of multiplier-accumulator execution pipelines of the first processing component. The second processing component is configured to process all of the data associated with all of stages of a second image frame via the plurality of multiplier-accumulator execution pipelines of the second processing component, wherein the first image frame and the second image frame are successive image frames.
Type: Grant
Filed: March 11, 2020
Date of Patent: April 26, 2022
Assignee: Flex Logix Technologies, Inc.
Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman

Patent number: 11308389
Abstract: Provided are an integrated circuit chip apparatus and a related product, the integrated circuit chip apparatus being used for executing a multiplication operation, a convolution operation or a training operation of a neural network. The present technical solution has the advantages of a small amount of calculation and low power consumption.
Type: Grant
Filed: December 19, 2019
Date of Patent: April 19, 2022
Assignee: CAMBRICON TECHNOLOGIES CORPORATION LIMITED
Inventors: Shaoli Liu, Xinkai Song, Bingrui Wang, Yao Zhang, Shuai Hu

Patent number: 11307853
Abstract: A matrix multiplication device and an operation method thereof are provided. The matrix multiplication device includes calculation circuits, a control circuit, a multiplication circuit, and a routing circuit. The calculation circuits produce multiply-accumulate values. The control circuit receives a plurality of first element values of a first matrix. The control circuit classifies the first element values into at least one classification value. The multiplication circuit multiplies the classification value by a second element value of a second matrix in a low power mode to obtain at least one product value. The routing circuit transmits each of the product values to at least one corresponding calculation circuit in the calculation circuits in the low power mode.
Type: Grant
Filed: October 29, 2019
Date of Patent: April 19, 2022
Assignee: NEUCHIPS CORPORATION
Inventors: Chiung-Liang Lin, Chao-Yang Kao, Youn-Long Lin, Huang-Chih Kuo, Jian-Wen Chen
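
A software analogy may help picture the classify, multiply, and route steps in this abstract. The sketch below assumes that "classification" means grouping equal element values in a column of the first matrix so that each distinct value is multiplied only once. That reading, and the names `matmul_by_value_classes` and `multiplies`, are assumptions for illustration and may not match the patented circuit.

```python
def matmul_by_value_classes(A, B):
    """Multiply A (m x n) by B (n x p), sharing one product per distinct value in each column of A.

    Returns the product matrix and the number of scalar multiplications performed.
    """
    m, n, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(m)]  # one accumulator per output element
    multiplies = 0
    for k in range(n):
        # "Classify": group the rows of column k of A by their element value.
        classes = {}
        for i in range(m):
            classes.setdefault(A[i][k], []).append(i)
        for j in range(p):
            for value, rows in classes.items():
                product = value * B[k][j]  # one multiplication per class
                multiplies += 1
                for i in rows:  # "route" the shared product to every matching accumulator
                    C[i][j] += product
    return C, multiplies

A = [[1, 2], [1, 3], [1, 2]]
B = [[4, 5], [6, 7]]
C, multiplies = matmul_by_value_classes(A, B)
```

For this input the sketch performs 6 multiplications where a naive triple loop performs 12, because the first column of `A` holds a single distinct value and the second holds two.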

Patent number: 11301213
Abstract: An integrated circuit with a large multiplier is provided. The multiplier may be configured to receive large input operands with thousands of bits. The multiplier may be implemented using a multiplier decomposition scheme that is recursively flattened into multiple decomposition levels to expose a tree of adders. The adders may be collapsed into a merged pipelined structure, where partial sums are forwarded from one level to the next while bypassing intervening prefix networks. The final correct sum is not calculated until later. In accordance with the decomposition technique, the partial sums are successively halved, which allows the prefix networks to be smaller from one level to the next. This allows all sums to be calculated at approximately the same pipeline depth, which significantly reduces latency with no or limited pipeline balancing.
Type: Grant
Filed: June 24, 2019
Date of Patent: April 12, 2022
Assignee: Intel Corporation
Inventors: Martin Langhammer, Bogdan Pasca

Patent number: 11301546
Abstract: A method comprises receiving one or more sizes for each of the dimensions of a kernel that is convolved with an input tensor to generate an output activation, generating a control pattern used to compute output values for the convolution of the input tensor, with the control pattern being a square matrix with each dimension being a size equal to the product of the width and the height of the kernel. The control pattern is generated by generating a value for each position of the control pattern that is based on a location of the position in the control pattern and the one or more sizes of each of the dimensions of the kernel, the value indicating a location from which to access values from a flattened input tensor for the convolution with the kernel.
Type: Grant
Filed: November 18, 2019
Date of Patent: April 12, 2022
Assignee: Groq, Inc.
Inventors: Jonathan Alexander Ross, Thomas Hawkins, Gregory Michael Thorson, Matt Boyd

Patent number: 11294638
Abstract: A system for generating random numbers comprises a light source for emitting photons, an optical diffuser element, and a plurality of light detector elements, each being for converting received light into electrical charge. The system further comprises means for converting the electrical charge of each of the plurality of light detector elements into an output value. The light source is for illuminating the plurality of light detectors with the photons, whereby the photons are incident on random ones of the plurality of light detectors. The diffuser is located in a light path between the light source and the plurality of light detector elements, and is for making the degree of illumination of each of the plurality of light detector elements more uniform, whereby the output values of the plurality of light detector elements comprise a set of random numbers each comprising quantum noise.
Type: Grant
Filed: December 19, 2017
Date of Patent: April 5, 2022
Assignee: CRYPTA LABS LIMITED
Inventors: Oliver Maynard, Joe Hq Luong

Patent number: 11294626
Abstract: The present disclosure relates generally to techniques for adjusting the number representation (e.g., format) of a variable before and/or after performing one or more arithmetic operations on the variable. In particular, the present disclosure relates to scaling the range of a variable to a suitable representation based on available hardware (e.g., hard logic) in an integrated circuit device. For example, an input in a first number format (e.g., bfloat16) may be scaled to a second number format (e.g., half-precision floating-point) so that circuitry implemented to receive inputs in the second number format may perform one or more arithmetic operations on the input. Further, the output produced by the circuitry may be scaled back to the first number format. Accordingly, arithmetic operations, such as a dot-product, performed in a first format may be emulated by scaling the inputs to and/or the outputs from arithmetic operations performed in another format.
Type: Grant
Filed: September 27, 2018
Date of Patent: April 5, 2022
Assignee: Intel Corporation
Inventors: Bogdan Mihai Pasca, Martin Langhammer

Patent number: 11281463
Abstract: Methods and apparatus relating to conversion of an unsigned normalized (unorm) integer values to floating-point (float) values in low power are described. In an embodiment, conversion logic converts a unorm integer value to a floating-point value based on detection of whether the unorm integer matches one of three cases, wherein the unorm integer value comprises n bits. Memory stores a count value corresponding to n−1 bits of the unorm integer value after detection of a leading 1 in the unorm integer value. The three cases include: a first case with all zeros, a second case with all ones, and a third case with a combination of one or more zeros and one or more ones. Other embodiments are also disclosed and claimed.
Type: Grant
Filed: March 25, 2018
Date of Patent: March 22, 2022
Assignee: INTEL CORPORATION
Inventors: Benjamin Pletcher, Rahul Kumar
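
For orientation, the three cases named in this abstract line up with the usual definition of an n-bit unorm value u as the real number u / (2^n − 1). The sketch below shows only that arithmetic mapping and is an assumption-laden illustration: the function name `unorm_to_float` is hypothetical, and the division in the third case stands in for whatever leading-one detection and counting the patented low-power logic actually performs.

```python
def unorm_to_float(u, n):
    """Map an n-bit unsigned normalized integer to a float in [0.0, 1.0]."""
    if n <= 0 or not 0 <= u < (1 << n):
        raise ValueError("u must fit in n bits")
    all_ones = (1 << n) - 1
    if u == 0:          # case 1: all zeros
        return 0.0
    if u == all_ones:   # case 2: all ones
        return 1.0
    # Case 3: a mix of zeros and ones. Plain division is used here for clarity only.
    return u / all_ones

lo = unorm_to_float(0, 8)
hi = unorm_to_float(255, 8)
mid = unorm_to_float(128, 8)
```

Handling the all-zeros and all-ones patterns separately means the two endpoints are produced exactly without touching the general path.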

Patent number: 11281745
Abstract: Methods and systems of matrix multiplication are described. In an example, a processor can multiply a first entry of a first vector of a first data array with a second vector of a second data array to generate a third vector of a third data array. The processor can store the third vector of the third data array in the second register file. The processor can multiply a second entry of the first vector with the second vector to generate a fourth vector of the third data array. The processor can store the fourth vector of the third data array in the second register file. The processor can combine vectors of the third data array that are stored in the second register file to produce the third data array.
Type: Grant
Filed: August 16, 2019
Date of Patent: March 22, 2022
Assignee: International Business Machines Corporation
Inventors: Bruce Fleischer, Jose E. Moreira, Joel A. Silberman
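
Read as software, the steps in this abstract resemble building an outer product one row at a time: each entry of the first vector scales the whole second vector, and the resulting vectors are collected into the result array. The sketch below follows that reading. The function names, the plain Python list standing in for the second register file, and the extension to a full matrix product by summing outer products are all assumptions for illustration.

```python
def outer_by_rows(a, b):
    """Build the outer product of a and b, one scalar-times-vector step per entry of a."""
    rows = []  # stands in for the register file that holds the generated vectors
    for entry in a:
        rows.append([entry * x for x in b])
    return rows

def matmul(A, B):
    """Compute A @ B as a sum of outer products (column k of A with row k of B)."""
    m, n, p = len(A), len(B), len(B[0])
    C = [[0] * p for _ in range(m)]
    for k in range(n):
        column_k = [A[i][k] for i in range(m)]
        partial = outer_by_rows(column_k, B[k])
        for i in range(m):
            for j in range(p):
                C[i][j] += partial[i][j]
    return C

outer = outer_by_rows([1, 2], [3, 4, 5])
product = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```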

Patent number: 11275561
Abstract: An example computer-implemented method includes receiving a first value, a second value, a third value, and a fourth value, wherein the first value, the second value, the third value, and the fourth value are 16-bit or smaller precision floating-point numbers. The method further includes multiplying the first value and the second value to generate a first product, wherein the first product is a 32-bit floating-point number. The method further includes multiplying the third value and the fourth value to generate a second product, wherein the second product is a 32-bit floating-point number. The method further includes summing the first product and the second product to generate a summed value, wherein the summed value is a 32-bit floating-point number. The method further includes adding the summed value to an addend value to generate a result value, wherein the addend value and the result value are 32-bit floating-point numbers.
Type: Grant
Filed: December 12, 2019
Date of Patent: March 15, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Silvia Melitta Mueller, Andreas Wagner, Brian W. Thompto
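
The sequence of operations in this abstract can be emulated in software to see where each rounding happens. The sketch below is an illustration under stated assumptions: it uses IEEE 754 binary16 for the 16-bit inputs (the abstract only says "16-bit or smaller"), emulates both formats by round-tripping Python floats through the `struct` module, and uses hypothetical names `to_f16`, `to_f32`, and `dot2_add`.

```python
import struct

def to_f16(x):
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_f32(x):
    """Round a Python float to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot2_add(a, b, c, d, addend):
    """Compute (a*b + c*d) + addend with 16-bit inputs and 32-bit intermediate results."""
    a, b, c, d = (to_f16(v) for v in (a, b, c, d))  # inputs are 16-bit values
    p1 = to_f32(a * b)                # first product, held as a 32-bit float
    p2 = to_f32(c * d)                # second product, held as a 32-bit float
    summed = to_f32(p1 + p2)          # sum of the two products, 32-bit
    return to_f32(summed + to_f32(addend))  # 32-bit addend, 32-bit result

exact_case = dot2_add(1.5, 2.0, 0.5, 4.0, 1.0)
rounded_case = dot2_add(0.1, 1.0, 0.0, 0.0, 0.0)
```

The second call shows the input rounding: 0.1 is not representable in binary16, so the result equals the nearest binary16 value rather than the double-precision 0.1.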

Patent number: 11269629
Abstract: Many signal processing, machine learning and scientific computing applications require a large number of multiply-accumulate (MAC) operations. This type of operation is demanding in both computation and memory. Process in memory has been proposed as a new technique that computes directly on a large array of data in place, to eliminate expensive data movement overhead. To enable parallel multi-bit MAC operations, both width and level-modulating memory word lines are applied. To improve performance and provide tolerance against process-voltage-temperature variations, a delay-locked loop is used to generate fine unit pulses for driving memory word lines and a dual-ramp single-slope ADC is used to convert bit line outputs. The concept is prototyped in a 180 nm CMOS test chip made of four 320×64 compute-SRAMs, each supporting 128× parallel 5 b×5 b MACs with 32 5 b output ADCs and consuming 16.6 mW at 200 MHz.
Type: Grant
Filed: November 29, 2018
Date of Patent: March 8, 2022
Assignee: The Regents of the University of Michigan
Inventors: Zhengya Zhang, Thomas Chen, Jacob Christopher Botimer, Shiming Song

Patent number: 11262982
Abstract: A computation circuit includes a plurality of processing elements and a common accumulator. The plurality of processing elements are sequentially coupled in series, and performs a multiply and accumulate (MAC) operation on a weight signal and at least one of two or more input signals received in each unit cycle. The common accumulator is sequentially and cyclically coupled to first to Kth processing elements among the plurality of processing elements, and configured to receive a computation value outputted from a processing element coupled thereto among the first to Kth processing elements, and store computation information. The K is decided based on values of the two or more input signals and the number of guard bits included in one processing element.
Type: Grant
Filed: July 22, 2019
Date of Patent: March 1, 2022
Assignees: SK hynix Inc., SK Telecom Co., Ltd.
Inventors: Yong Sang Park, Seok Joong Hwang

Patent number: 11250103
Abstract: A system for determining the frequency coefficients of a one- or multi-dimensional signal that is sparse in the frequency domain includes determining the locations of the non-zero frequency coefficients, and then determining values of the coefficients using the determined locations. If N is the total number of frequency coefficients across the one or more dimensions of the signal, and if R is an upper bound of the number of non-zero ones of these frequency coefficients, the system requires up to (O (R log(R) (N))) samples and has a computation complexity of up to O (R log2(R) log (N)). The system and the processing technique are stable to low-level noise and can exhibit only a small probability of failure. The frequency coefficients can be real and positive or they can be complex numbers.
Type: Grant
Filed: January 25, 2017
Date of Patent: February 15, 2022
Assignee: Reservoir Labs, Inc.
Inventor: Pierre-David Letourneau

Patent number: 11243744
Abstract: A method (40) is provided for performing a trustworthiness test on a random number generator, RNG, (20) comprising a physical unclonable function, PUF-module (21). The trustworthiness test is implemented as a known answer test, KAT, and the method (40) comprises: receiving (41), in the PUF-module (21), an input based on test data, T, received from a verifier (11) provided with at least one test data-test result pair, (T, R), providing (42) an output from the PUF-module (21), determining (43) a test result, R′, based on the output from the PUF-module (21), and providing (44) the test result, R′, to the verifier (11). A random number generator (20), computer program and computer program products and a method performed by or in a verifier are also provided.
Type: Grant
Filed: November 15, 2016
Date of Patent: February 8, 2022
Assignee: Telefonaktiebolaget LM Ericsson (publ)
Inventors: Mats Näslund, Elena Dubrova, Karl Norrman