Patents Examined by Michael D Yaary
-
Patent number: 12045581
Abstract: The present disclosure relates generally to techniques for adjusting the number representation (e.g., format) of a variable before and/or after performing one or more arithmetic operations on the variable. In particular, the present disclosure relates to scaling the range of a variable to a suitable representation based on available hardware (e.g., hard logic) in an integrated circuit device. For example, an input in a first number format (e.g., bfloat16) may be scaled to a second number format (e.g., half-precision floating-point) so that circuitry implemented to receive inputs in the second number format may perform one or more arithmetic operations on the input. Further, the output produced by the circuitry may be scaled back to the first number format. Accordingly, arithmetic operations, such as a dot-product, performed in a first format may be emulated by scaling the inputs to and/or the outputs from arithmetic operations performed in another format.
Type: Grant
Filed: April 1, 2022
Date of Patent: July 23, 2024
Assignee: Intel Corporation
Inventors: Bogdan Mihai Pasca, Martin Langhammer
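The scale-compute-rescale flow described in this abstract can be modeled in a few lines. This is an illustrative sketch, not the patented circuit: it uses Python's `struct` half-precision ('e') format as the narrow target, and the function names and the fixed power-of-two scale are assumptions.

```python
import struct

def to_fp16(x):
    # Round a Python float through IEEE half precision (struct 'e' format);
    # values outside the fp16 range fail to pack, mirroring overflow on
    # narrow hardware.
    return struct.unpack('<e', struct.pack('<e', x))[0]

def scaled_fp16_dot(a, b, scale_exp=16):
    # Scale each input down by 2**scale_exp so it fits the fp16 range,
    # accumulate the products, then scale the result back by
    # 2**(2 * scale_exp) -- the dot product is bilinear, so the two input
    # scalings compose into one output scaling.
    s = 2.0 ** -scale_exp
    acc = 0.0
    for x, y in zip(a, b):
        acc += to_fp16(x * s) * to_fp16(y * s)
    return acc * 2.0 ** (2 * scale_exp)
```

With power-of-two inputs the round trip is exact: `scaled_fp16_dot([2.0**20, 2.0**18], [2.0**20, 2.0**18])` returns `2.0**40 + 2.0**36`, while packing `2.0**20` into fp16 directly overflows.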
-
Patent number: 12032652
Abstract: Example embodiments relate to a computer-implemented method for attributing different system outcomes to one or more of binary input characteristic(s), sizing input characteristic(s), and/or sub-characteristics thereof. For example, a system output may be based on a plurality of interrelated input characteristics. Systems and methods of the present disclosure can provide for disambiguation and tracing of causal chains from input to output, as well as other performance analyses. In some embodiments, performance of a first set of inputs can be compared relative to performance of a reference set of inputs based at least in part on generating a research set of inputs. The research set can include simulated inputs weighted to a neutral position with respect to the reference set.
Type: Grant
Filed: October 31, 2023
Date of Patent: July 9, 2024
Assignee: INALYTICS LTD
Inventors: Riccardo Di Mascio, David Goodman, Katharine Land, Alessandro Lunghi
-
Patent number: 12033060
Abstract: Neural networks, in many cases, include convolution layers that are configured to perform many convolution operations that require multiplication and addition operations. Compared with performing multiplication on integer, fixed-point, or floating-point format values, performing multiplication on logarithmic format values is straightforward and energy efficient as the exponents are simply added. However, performing addition on logarithmic format values is more complex. Conventionally, addition is performed by converting the logarithmic format values to integers, computing the sum, and then converting the sum back into the logarithmic format. Instead, logarithmic format values may be added by decomposing the exponents into separate quotient and remainder components, sorting the quotient components based on the remainder components, summing the sorted quotient components using an asynchronous accumulator to produce partial sums, and multiplying the partial sums by the remainder components to produce a sum.
Type: Grant
Filed: January 23, 2020
Date of Patent: July 9, 2024
Assignee: NVIDIA Corporation
Inventors: William James Dally, Rangharajan Venkatesan, Brucek Kurdo Khailany, Stephen G. Tell
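The quotient/remainder decomposition can be mimicked numerically. This is a sketch under stated assumptions, not the patented circuit: a log-format value is encoded as an integer exponent e representing 2**(e / 2**F) with F fractional log bits, and `log_sum` and the bucket structure are hypothetical names.

```python
from collections import defaultdict

F = 2  # fractional exponent bits: integer exponent e encodes the value 2**(e / 2**F)

def log_sum(exponents):
    # Split each exponent into quotient q and remainder r, then accumulate the
    # integer powers 2**q per remainder bucket -- a pure shift-and-add, with no
    # general addition needed in the log domain.
    partial = defaultdict(int)
    for e in exponents:
        q, r = e >> F, e & ((1 << F) - 1)
        partial[r] += 1 << q
    # Finally apply each bucket's remainder factor 2**(r / 2**F) once.
    return sum(p * 2.0 ** (r / (1 << F)) for r, p in partial.items())
```

The result matches directly decoding and summing the values, while the inner loop touches only integers.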
-
Patent number: 12014153
Abstract: A method for constructing a quantum random number generator is disclosed. The method employs an electrodeposited metal on a printed circuit board (PCB) or a glass container filled with tritium containing a scintillator or fluorophore, and an integrated circuit with a detector mounted on the PCB using a flip-chip methodology.
Type: Grant
Filed: May 23, 2023
Date of Patent: June 18, 2024
Assignee: RANDAEMON SP. Z O.O.
Inventors: Wiesław Bohdan Kuźmicz, Jan Jakub Tatarkiewicz
-
Patent number: 12014152
Abstract: A Unit Element (UE) has a digital X input and a digital W input, and comprises groups of NAND gates generating complementary outputs which are coupled to a differential charge transfer bus comprising a positive charge transfer line and a negative charge transfer line. The number of bits in the X input determines the number of NAND gates in a NAND-group, and the number of bits in the W input determines the number of NAND-groups. Each NAND-group receives one bit of the W input, applied to all of the NAND gates of that NAND-group, and the bits of the X input are applied to the associated NAND gate inputs. The NAND gate outputs are coupled through binary weighted charge transfer capacitors to the positive charge transfer line and the negative charge transfer line.
Type: Grant
Filed: May 31, 2021
Date of Patent: June 18, 2024
Assignee: Ceremorphic, Inc.
Inventors: Martin Kraemer, Ryan Boesch, Wei Xiong
-
Patent number: 12014151
Abstract: A plurality of unit elements share a charge transfer bus; each unit element accepts A and B digital inputs and generates a product P as an analog charge transferred to the charge transfer bus. Each unit element comprises groups of AND gates coupled to the charge transfer lines through a capacitor Cu. Each unit element receives one bit of the B input, applied to all of the AND gates of the unit element, and has the bits of the A input applied to the associated AND gate inputs. The charge transfer lines couple to binary weighted charge summing capacitors which sum and scale the charges contributed by all unit elements according to a bit weight, and the result is converted to a digital value output.
Type: Grant
Filed: December 31, 2020
Date of Patent: June 18, 2024
Assignee: Ceremorphic, Inc.
Inventors: Martin Kraemer, Ryan Boesch, Wei Xiong
-
Patent number: 12008067
Abstract: An apparatus to facilitate acceleration of matrix multiplication operations. The apparatus comprises a systolic array including matrix multiplication hardware to perform multiply-add operations on received matrix data comprising data from a plurality of input matrices, and sparse matrix acceleration hardware to detect zero values in the matrix data and perform one or more optimizations on the matrix data to reduce multiply-add operations to be performed by the matrix multiplication hardware.
Type: Grant
Filed: November 16, 2021
Date of Patent: June 11, 2024
Assignee: Intel Corporation
Inventors: Subramaniam Maiyuran, Mathew Nevin, Jorge Parra, Ashutosh Garg, Shubra Marwaha, Shubh Shah
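The zero-detection optimization saves work in proportion to sparsity. A minimal software caricature of the idea (in the patent the detection happens in hardware ahead of the systolic array; the function name and return convention here are assumptions):

```python
def sparse_macc(a_row, b_col):
    # Accumulate products, but skip any multiply-add whose operand is zero,
    # and report how many operations were elided.
    acc, skipped = 0, 0
    for a, b in zip(a_row, b_col):
        if a == 0 or b == 0:
            skipped += 1
        else:
            acc += a * b
    return acc, skipped
```

The accumulated value is unchanged by the skipping, since every elided term contributes zero.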
-
Patent number: 12001810
Abstract: A signal processing circuit has a plurality of first circuits, each including a first-time-length-signal output circuit that outputs a first time-length signal representing a time length between first timing at which a first input signal changes and second timing at which a second input signal changes, and a second-time-length-signal output circuit that outputs the first time-length signal as a second time-length signal at timing based on a control signal. The signal processing circuit includes a second circuit that outputs the second time-length signal having the longest time length among a plurality of the second time-length signals output respectively from the plurality of first circuits.
Type: Grant
Filed: July 10, 2019
Date of Patent: June 4, 2024
Assignee: SONY CORPORATION
Inventors: Tomohiro Matsumoto, Yusuke Oike, Akito Sekiya, Hiroyuki Yamagishi, Ryoji Ikegaya
-
Patent number: 11983510
Abstract: Apparatus for evaluating a mathematical function for a received input value includes a polynomial block configured to identify a domain interval containing the received input value over which the mathematical function can be evaluated, the mathematical function over the identified interval being approximated by a polynomial function, and to evaluate the polynomial function for the received input value using a set of one or more stored values representing the polynomial function over the identified interval to calculate a first evaluation of the mathematical function for the received input value; and a CORDIC block for performing a CORDIC algorithm, configured to initialise the CORDIC algorithm using the first evaluation of the mathematical function for the received input value calculated by the polynomial block, and to implement the CORDIC algorithm to calculate a refined evaluation of the mathematical function for the received input value.
Type: Grant
Filed: November 9, 2021
Date of Patent: May 14, 2024
Assignee: Imagination Technologies Limited
Inventor: Luca Gagliano
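The two-stage structure (coarse polynomial evaluation over an interval, then iterative refinement) can be sketched for the reciprocal function. Note that the refinement below is Newton-Raphson standing in for the patent's CORDIC stage, which this sketch does not model; the interval, seed coefficients, and iteration count are illustrative.

```python
def recip(x):
    # Stage 1: evaluate a stored degree-1 polynomial over the identified
    # interval [0.5, 1) to get a first evaluation of 1/x (max error ~1/17).
    assert 0.5 <= x < 1.0
    y = 48 / 17 - 32 / 17 * x
    # Stage 2: refine iteratively; each step roughly doubles the number of
    # correct bits in the evaluation.
    for _ in range(3):
        y = y * (2.0 - x * y)
    return y
```

Three refinement steps take the seed's ~1/17 relative error below 1e-9, so the coarse table stays tiny while the refined result is near full precision.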
-
Patent number: 11983509
Abstract: A floating-point accumulator circuit includes an addend input register having an addend exponent and an addend significand, and an accumulation register with a first portion to hold a representation of an accumulation exponent and a second portion to hold a representation of an accumulation significand. A control circuit is also included to generate an accumulator zero control signal and an addend zero control signal based on the addend exponent and the accumulation exponent. The circuit also includes an adder circuit whose output feeds an input of the accumulation register. A first zeroing circuit sends either a zero or a value based on the addend significand to a first input of the adder circuit, based on the addend zero control signal, and a second zeroing circuit sends either zeros or a value based on the accumulation significand to a second input of the adder circuit, based on the accumulator zero control signal.
Type: Grant
Filed: September 12, 2022
Date of Patent: May 14, 2024
Assignee: SambaNova Systems, Inc.
Inventors: Vojin G. Oklobdzija, Matthew M. Kim
-
Patent number: 11977600
Abstract: This disclosure relates to matrix operation acceleration for different matrix sparsity patterns. A matrix operation accelerator may be designed to perform matrix operations more efficiently for a first matrix sparsity pattern than for a second matrix sparsity pattern. A matrix with the second sparsity pattern may be converted to a matrix with the first sparsity pattern and provided to the matrix operation accelerator. By rearranging the rows and/or columns of the matrix, the sparsity pattern of the matrix may be converted to a sparsity pattern that is suitable for computation with the matrix operation accelerator.
Type: Grant
Filed: September 21, 2021
Date of Patent: May 7, 2024
Assignee: Intel Corporation
Inventor: Omid Azizi
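Rearranging rows permutes the product's rows in the same way, so the true result can be recovered exactly after the accelerator-friendly multiply. A small sketch with hypothetical helper names:

```python
def matmul(A, B):
    # Plain matrix multiply on lists of lists.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matmul_via_row_permutation(A, B, perm):
    # Feed a row-permuted A (whose sparsity pattern the accelerator handles
    # efficiently) to the multiply, then invert the permutation on the product.
    A_perm = [A[i] for i in perm]      # rearrange rows into the good pattern
    C_perm = matmul(A_perm, B)         # accelerator-friendly multiply
    C = [None] * len(A)
    for out_row, in_row in enumerate(perm):
        C[in_row] = C_perm[out_row]    # undo the row permutation
    return C
```

Column permutations of A work the same way but must be matched by permuting the rows of B instead of the output.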
-
Patent number: 11966715
Abstract: A three-dimensional processor (3D-processor) for parallel computing includes a plurality of computing elements. Each computing element comprises at least a three-dimensional memory (3D-M) array for storing at least a portion of a look-up table (LUT) for a mathematical function, and an arithmetic logic circuit (ALC) for performing arithmetic operations on the LUT data. Deficiency in latency is offset by a large scale of parallelism.
Type: Grant
Filed: July 26, 2020
Date of Patent: April 23, 2024
Assignees: HangZhou HaiCun Information Technology Co., Ltd.
Inventors: Guobiao Zhang, Chen Shen
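A LUT-plus-arithmetic evaluator can be sketched in software, with linear interpolation playing the role of the ALC refining stored samples. The table size, the example function (exp on [0, 1]), and the names are assumptions, not the patent's design.

```python
import math

N = 64
LUT = [math.exp(i / N) for i in range(N + 1)]   # stored samples of exp on [0, 1]

def lut_exp(x):
    # Look up the two nearest stored samples and interpolate linearly;
    # the arithmetic stage turns a coarse table into a smooth evaluation.
    t = x * N
    i = min(int(t), N - 1)
    frac = t - i
    return LUT[i] + frac * (LUT[i + 1] - LUT[i])
```

With 65 entries the worst-case interpolation error for exp on [0, 1] is about 1e-4; halving the step quarters the error, which is the usual table-size/accuracy trade-off.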
-
Patent number: 11960886
Abstract: An integrated circuit including a plurality of processing components to process image data of a plurality of image frames, wherein each image frame includes a plurality of stages. Each processing component includes a plurality of execution pipelines, wherein each pipeline includes a plurality of multiplier-accumulator circuits configurable to perform multiply and accumulate operations using image data and filter weights, wherein: (i) a first processing component is configurable to process all of the data associated with a first plurality of stages of each image frame, and (ii) a second processing component of the plurality of processing components is configurable to process all of the data associated with a second plurality of stages of each image frame. The first and second processing components process data associated with the first and second plurality of stages, respectively, of a first image frame concurrently.
Type: Grant
Filed: April 25, 2022
Date of Patent: April 16, 2024
Assignee: Flex Logix Technologies, Inc.
Inventors: Frederick A. Ware, Cheng C. Wang, Valentin Ossman
-
Patent number: 11960567
Abstract: A method for performing a fundamental computational primitive in a device is provided, where the device includes a processor and a matrix multiplication accelerator (MMA). The method includes configuring a streaming engine in the device to stream data for the fundamental computational primitive from memory, configuring the MMA to format the data, and executing the fundamental computational primitive by the device.
Type: Grant
Filed: July 4, 2021
Date of Patent: April 16, 2024
Assignee: Texas Instruments Incorporated
Inventors: Arthur John Redfern, Timothy David Anderson, Kai Chirca, Chenchi Luo, Zhenhua Yu
-
Patent number: 11960566
Abstract: Systems and methods are provided to eliminate multiplication operations with zero padding data for convolution computations. A multiplication matrix is generated from an input feature map matrix with padding by adjusting coordinates and dimensions of the input feature map matrix to exclude padding data. The multiplication matrix is used to perform matrix multiplications with respective weight values, which results in fewer computations as compared to matrix multiplications which include the zero padding data.
Type: Grant
Filed: April 13, 2021
Date of Patent: April 16, 2024
Assignee: Amazon Technologies, Inc.
Inventors: Dana Michelle Vantrease, Ron Diamant
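The coordinate-adjustment idea can be shown in one dimension: instead of materializing a zero-padded input, shift the window indices and simply never issue the multiplications that would land in the padding. Both functions below are illustrative sketches, not the patented implementation.

```python
def conv1d_padded(x, w, pad):
    # Reference: convolve after explicitly zero-padding the input.
    xp = [0] * pad + list(x) + [0] * pad
    return [sum(w[k] * xp[i + k] for k in range(len(w)))
            for i in range(len(xp) - len(w) + 1)]

def conv1d_skip_pad(x, w, pad):
    # Adjust indices so multiplications with padding are never issued:
    # positions falling outside the unpadded input are skipped entirely.
    out = []
    for i in range(len(x) + 2 * pad - len(w) + 1):
        acc = 0
        for k in range(len(w)):
            j = i + k - pad            # index into the unpadded input
            if 0 <= j < len(x):        # skip coordinates in the padding
                acc += w[k] * x[j]
        out.append(acc)
    return out
```

The outputs are identical because every skipped term is a multiply by zero; the skip-pad version just avoids computing them.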
-
Patent number: 11954457
Abstract: An arithmetic device includes a function storage circuit and an activation function (AF) circuit. The function storage circuit stores and outputs a function selection signal, a first function information signal, and a second function information signal. The AF circuit generates activation function result data by applying a slope value and a maximum value to multiplication/accumulation (MAC) result data in a function setting mode that is activated by the function selection signal. The slope value is set based on the first function information signal, and the maximum value is set based on the second function information signal.
Type: Grant
Filed: December 17, 2020
Date of Patent: April 9, 2024
Assignee: SK hynix Inc.
Inventor: Choung Ki Song
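In software terms, a configurable slope and maximum value applied to a MAC result resemble a clipped linear unit. The abstract does not pin down the exact function family, so this sketch assumes a ReLU-style lower bound of zero; the name and signature are illustrative.

```python
def activation(mac_result, slope, max_value):
    # Apply the configured slope to the MAC result, then saturate the
    # output into [0, max_value].
    return min(max(slope * mac_result, 0.0), max_value)
```

Stored function-information signals would then correspond to the `slope` and `max_value` parameters, letting one circuit realize a family of hard activations.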
-
Patent number: 11941407
Abstract: A unit for accumulating a plurality N of multiplied M-bit values includes a receiving unit, a bit-wise multiplier and a bit-wise accumulator. The receiving unit receives a pipeline of multiplicands A and B such that, at each cycle, a new set of multiplicands is received. The bit-wise multiplier bit-wise multiplies bits of a current multiplicand A with bits of a current multiplicand B and passes sums and carries between bit-wise multipliers. The bit-wise accumulator accumulates the output of the bit-wise multiplier, thereby accumulating the products during the pipelining process.
Type: Grant
Filed: April 5, 2020
Date of Patent: March 26, 2024
Assignee: GSI Technology Inc.
Inventor: Avidan Akerib
-
Patent number: 11934479
Abstract: A method for performing sparse quantum Fourier transform computation includes defining a set of quantum circuits, each quantum circuit comprising a Hadamard gate and a single frequency rotation operator, said set of quantum circuits being equivalent to a quantum Fourier transform circuit. The method includes constructing a subset of said quantum circuits in a quantum processor, said quantum processor having a quantum representation of a classical distribution loaded into a quantum state of said quantum processor. The method includes executing said subset of said quantum circuits on said quantum state, and performing a measurement in a frequency basis to obtain a frequency distribution corresponding to said quantum state.
Type: Grant
Filed: October 7, 2020
Date of Patent: March 19, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Tal Kachman, Mark S. Squillante, Lior Horesh, Kenneth Lee Clarkson, John A. Gunnels, Ismail Yunus Akhalwaya, Jayram Thathachar
-
Patent number: 11934824
Abstract: Methods, apparatuses, and systems for in- or near-memory processing are described. Strings of bits (e.g., vectors) may be fetched and processed in logic of a memory device without involving a separate processing unit. Operations (e.g., arithmetic operations) may be performed on numbers stored in a bit-parallel way during a single sequence of clock cycles. Arithmetic may thus be performed in a single pass as the bits of two or more strings of bits are fetched, without intermediate storage of the numbers. Vectors may be fetched (e.g., identified, transmitted, received) from one or more bit lines. Registers of a memory array may be used to write (e.g., store or temporarily store) results or ancillary bits (e.g., carry bits or carry flags) that facilitate arithmetic operations. Circuitry near, adjacent, or under the memory array may employ XOR or AND (or other) logic to fetch, organize, or operate on the data.
Type: Grant
Filed: April 6, 2020
Date of Patent: March 19, 2024
Assignee: Micron Technology, Inc.
Inventors: Dmitri Yudanov, Sean S. Eilert, Sivagnanam Parthasarathy, Shivasankar Gunasekaran, Ameen D. Akel
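The single-pass XOR/AND arithmetic can be modeled bit-serially: each cycle fetches one bit per operand line, XOR logic forms the sum bit, AND/OR logic forms the next carry, and a register holds the carry flag between cycles. A sketch with LSB-first bit lists; the names are illustrative:

```python
def bit_serial_add(a_bits, b_bits):
    # Ripple addition in a single pass over the fetched bits (LSB first):
    # sum bit = a XOR b XOR carry; next carry = majority(a, b, carry),
    # built from AND/OR logic, with the carry kept in a register.
    out, carry = [], 0
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (a & carry) | (b & carry)
    out.append(carry)                  # final carry-out bit
    return out
```

No intermediate integer is ever materialized: the operands are consumed bit by bit and only the one-bit carry is stored between cycles.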
-
Patent number: 11934798
Abstract: The present disclosure is directed to systems and methods for a memory device such as, for example, a Processing-In-Memory Device that is configured to perform multiplication operations in memory using a popcount operation. A multiplication operation may include a summation of multipliers being multiplied with corresponding multiplicands. The inputs may be arranged in particular configurations within a memory array. Sense amplifiers may be used to perform the popcount by counting active bits along bit lines. One or more registers may accumulate results for performing the multiplication operations.
Type: Grant
Filed: March 31, 2020
Date of Patent: March 19, 2024
Assignee: Micron Technology, Inc.
Inventor: Dmitri Yudanov
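The popcount formulation can be modeled directly: one shifted copy of the multiplicand per active multiplier bit stands in for the arranged rows of the memory array, and a per-column count of active bits stands in for the sense amplifiers' popcount along bit lines. This is a sketch with illustrative names, not the patented layout.

```python
def popcount_multiply(a, b, width=16):
    # Arrange one row (a << i) for each set bit i of the multiplier b.
    rows = [a << i for i in range(b.bit_length()) if (b >> i) & 1]
    # For each bit-line position j, popcount the active bits down the column
    # and accumulate the count with weight 2**j in a result register.
    result = 0
    for j in range(width):
        col_count = sum((r >> j) & 1 for r in rows)
        result += col_count << j
    return result
```

The weighted column counts reconstruct the sum of the shifted rows, which is exactly a * b whenever `width` covers all row bits.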