Patents Examined by Matthew D Sandifer
  • Patent number: 11568022
    Abstract: Detailed are embodiments related to bit matrix multiplication in a processor. For example, in some embodiments a processor comprising: decode circuitry to decode an instruction have fields for an opcode, an identifier of a first source bit matrix, an identifier of a second source bit matrix, an identifier of a destination bit matrix, and an immediate; and execution circuitry to execute the decoded instruction to perform a multiplication of a matrix of S-bit elements of the identified first source bit matrix with S-bit elements of the identified second source bit matrix, wherein the multiplication and accumulation operations are selected by the operation selector and store a result of the matrix multiplication into the identified destination bit matrix, wherein S indicates a plural bit size is described.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: January 31, 2023
    Assignee: Intel Corporation
    Inventors: Dmitry Y. Babokin, Kshitij A. Doshi, Vadim Sukhomlinov
  • Patent number: 11567730
    Abstract: A planar fabrication charge transfer capacitor for coupling charge from a Unit Element (UE) generates a positive charge first output V_PP and a positive charge second output V_NP, the first output coupled to a positive charge line comprising a continuous first planar conductor, a continuous second planar conductor parallel to the first planar conductor, and a continuous third planar conductor parallel to the first planar conductor and second planar conductor, the charge transfer capacitor comprising, in sequence: a first co-planar conductor segment, the first planar conductor, a second co-planar conductor segment, the second planar conductor, a third co-planar conductor segment, the third planar conductor, and a fourth coplanar conductor segment, the first and third coplanar conductor segments capacitively edge coupled to the UE first output V_PP, the second and fourth coplanar conductor segments capacitively edge coupled to the UE second output V_NP.
    Type: Grant
    Filed: January 31, 2021
    Date of Patent: January 31, 2023
    Assignee: Ceremorphic, Inc.
    Inventors: Martin Kraemer, Ryan Boesch, Wei Xiong
  • Patent number: 11561768
    Abstract: Provided are a method and system for using a non-linear feedback shift register (NLFSR) for generating a pseudo-random sequence. The method may include generating, for an n-stage NLFSR that requires more than two taps to generate a maximal length pseudo-random sequence, a pseudo-random sequence using a feedback logical operation of only a first logic gate and a second logic gate. Two non-end taps suitable for providing an at least near-maximal length pseudo-random sequence are inputs for the first logic gate, an output of the first logic gate and an end tap are inputs for the second logic gate, and an output of the second logic gate is used as feedback to a first stage of the n-stage NLFSR.
    Type: Grant
    Filed: April 27, 2021
    Date of Patent: January 24, 2023
    Assignee: International Business Machines Corporation
    Inventor: Andrew Johnson
  • Patent number: 11550545
    Abstract: Techniques related to a low-power, low-memory multiply and accumulate (MAC) unit are described. In an example, the MAC unit performs a MAC operation that represents a multiplication of numbers. At least the bit representation of one number is compressed based on a quantization and a clustering of quantization values, whereby index bits are used instead of the actual bit representation. The index bits are loaded in an index buffer and a bit representation of another number is loaded in an input buffer. The index bits are used in a lookup to determine whether the corresponding bit representation and shift operations are applied to the input buffer based on this bit representation, followed by accumulation operations.
    Type: Grant
    Filed: November 6, 2020
    Date of Patent: January 10, 2023
    Assignee: SK hynix Inc.
    Inventors: Fan Zhang, Aman Bhatia
  • Patent number: 11550543
    Abstract: A semiconductor memory device includes a plurality of memory bank groups configured to be accessed in parallel; an internal memory bus configured to receive external data from outside the plurality of memory bank groups; and a first computation circuit configured to receive internal data from a first memory bank group of the plurality of memory bank groups during each first period of a plurality of first periods, receive the external data through the internal memory bus during each second period of a plurality of second periods, the second period being shorter than the first period, and perform a processing in memory (PIM) arithmetic operation on the internal data and the external data during each second period.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: January 10, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Shinhaeng Kang, Seongil O
  • Patent number: 11550548
    Abstract: Briefly, example methods, apparatuses, and/or articles of manufacture are disclosed that may be implemented, in whole or in part, using one or more computing devices to facilitate and/or support one or more operations and/or techniques for an autonomous pseudo-random seed generator (APRSG) for embedded computing devices, which may include IoT-type devices, such as implemented in connection with one or more computing and/or communication networks and/or protocols.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: January 10, 2023
    Assignee: Arm Limited
    Inventors: Andrew Neil Sloss, Christopher Neal Hinds, Hannah Marie Peeler, Gary Dale Carpenter
  • Patent number: 11544037
    Abstract: An improved electronic mixed mode multiplier and accumulate circuit for artificial intelligence and computing system applications that perform vector-vector, vector-matrix and other multiply-accumulate computations. The circuit is provided is a high resolution, high linearity, low area, low power multiply—accumulate (MAC) unit to interface with a memory device for storing computation output results. The MAC unit uses a less number of current carrying elements resulting in much lower integrated circuit area, and provides a tight matching between the current elements thus preserving inherent linearity requirements due to current mode operation. Further the MAC performs current scaling using switches and current division where the current switches occupy minimum size transistors requiring a small area to implement that renders it compatible with MRAM such as a magnetic tunnel junction device.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: January 3, 2023
    Assignee: International Business Machines Corporation
    Inventors: Sudipto Chakraborty, Rajiv Joshi
  • Patent number: 11544349
    Abstract: A method for implementing a neural network system in an integrated circuit includes presenting digital pulses to word line inputs of a matrix vector multiplier including a plurality of word lines, the word lines forming intersections with a plurality of summing bit lines, a programmable Vt transistor at each intersection having a gate connected to the intersecting word line, a source connected to a fixed potential and a drain connected to the intersecting summing bit line, each digital pulse having a pulse width proportional to an analog quantity. During a charge collection time frame charge collected on each of the summing bit lines from current flowing in the programmable Vt transistor is summed. During a pulse generating time frame digital pulses are generated having pulse widths proportional to the amount of charge that was collected on each summing bit line during the charge collection time frame.
    Type: Grant
    Filed: April 15, 2021
    Date of Patent: January 3, 2023
    Assignee: Microsemi SoC Corp.
    Inventors: John L. McCollum, Jonathan W. Greene, Gregory William Bakker
  • Patent number: 11537361
    Abstract: A processing unit and a method for multiplying at least two multiplicands. The multiplicands are present in an exponential notation, that is, each multiplicand is assigned an exponent and a base. The processing unit is configured to carry out a multiplication of the multiplicands and includes at least one bitshift unit, the bitshift unit shifting a binary number a specified number of places, in particular, to the left; an arithmetic unit, which carries out an addition of two input variables and a subtraction of two input variables; and a storage device. A computer program, which is configured to execute the method, and a machine-readable storage element, in which the computer program is stored, are also described.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: December 27, 2022
    Assignee: Robert Bosch GmbH
    Inventor: Sebastian Vogel
  • Patent number: 11526581
    Abstract: A method of performing matrix computations includes receiving a compression-encoded matrix including a plurality of rows. Each row of the compression-encoded matrix has a plurality of defined element values and, for each such defined element value, a schedule tag indicating a schedule for using the defined element value in a scheduled matrix computation. The method further includes loading the plurality of rows of the compression-encoded matrix into a corresponding plurality of work memory banks, and providing decoded input data to a matrix computation module configured for performing the scheduled matrix computation. For each work memory bank, a next defined element value and a corresponding schedule tag are read. If the schedule tag meets a scheduling condition, the next defined element value is provided to the matrix computation module. Otherwise, a default element value is provided to the matrix computation module.
    Type: Grant
    Filed: October 30, 2020
    Date of Patent: December 13, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shuayb M. Zarar, Amol Ashok Ambardekar, Jun Zhang
  • Patent number: 11520853
    Abstract: A processor system comprises two groups of registers and a hardware channel convolution processor unit. The first group of registers is configured to store data elements of channels of a portion of a convolution data matrix. Each register stores at least one data element from each channel. The second group of registers is configured to store data elements of convolution weight matrices including a separate matrix for each channel. Each register stores at least one data element from each matrix. The hardware channel convolution processor unit is configured to multiply each data element in a first and second portion of the first group of registers with a corresponding data element in the second group of registers to determine corresponding multiplication results and sum together the multiplication results for each specific channel to determine two corresponding channel convolution result data elements in a corresponding channel convolution result matrix.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: December 6, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Krishnakumar Narayanan Nair, Rakesh Komuravelli, Abdulkadir Utku Diril, Ehsan Khish Ardestani Zadeh, Yuchen Hao, Martin Schatz, Thomas Mark Ulrich, Olivia Wu, Anup Ramesh Kadkol, Amin Firoozshahian
  • Patent number: 11520856
    Abstract: A system for performing tensor decomposition in a selective expansive and/or recursive manner, a tensor is decomposed into a specified number of components, and one or more tensor components are selected for further decomposition. For each selected component, the significant elements thereof are identified, and using the indices of the significant elements a sub-tensor is formed. In a subsequent iteration, each sub-tensor is decomposed into a respective specified number of components. Additional sub-tensors corresponding to the components generated in the subsequent iteration are formed, and these additional sub-tensors may be decomposed further in yet another iteration, until no additional components are selected. The mode of a sub-tensor can be decreased or increased prior to decomposition thereof. Components likely to reveal information about the data stored in the tensor can be selected for decomposition.
    Type: Grant
    Filed: November 2, 2020
    Date of Patent: December 6, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Muthu M. Baskaran, David Bruns-Smith, James Ezick, Richard A. Lethin
  • Patent number: 11520562
    Abstract: A method comprising storing a plurality of entries, each entry of the plurality of entries associated with a portion of a range of input values, each entry of the plurality of entries comprising a set of coefficients defining a power series approximation; selecting first entry of the plurality of entries based on a determination that a floating point input value is within a portion of the range of input values that is associated with the first entry; and calculating an output value by evaluating the power series approximation defined by the set of coefficients of the first entry at the floating point input value.
    Type: Grant
    Filed: August 30, 2019
    Date of Patent: December 6, 2022
    Assignee: Intel Corporation
    Inventors: Brian J. Hickmann, Nitin N. Garegrat, Maciej Urbanski, Michael Rotzin
  • Patent number: 11509291
    Abstract: A zero-insertion FIR filter architecture for filtering a signal with a target band and a secondary band. Digital filter circuitry includes an L-tap FIR (finite impulse response) filter, with a number L filter tap elements (L=0, 1, 2, . . . (L?1)), each with an assigned coefficient from a defined coefficient sequence. The L-tap FIR filter is configurable with a defined zero-insertion coefficient sequence of a repeating sub-sequence of a nonzero coefficient followed by one or more zero-inserted coefficients, with a number Nj of nonzero coefficients, and a number Nk of zero-inserted coefficients, so that L=Nj+Nk. The L-tap FIR filter is configurable as an M-tap FIR filter with a nonzero coefficient sequence in which each of the L filter tap elements is assigned a non-zero coefficient, the M-tap FIR filter having an effective length of M=(Nj+Nk) non-zero coefficients.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: November 22, 2022
    Assignee: TEXAS INSTRUMENTS INCORPORATED
    Inventors: Jawaharlal Tangudu, Jaiganesh Balakrishnan
  • Patent number: 11500631
    Abstract: A method is provided that includes performing, by a processor in response to a floating point multiply instruction, multiplication of floating point numbers, wherein determination of values of implied bits of leading bit encoded mantissas of the floating point numbers is performed in parallel with multiplication of the encoded mantissas, and storing, by the processor, a result of the floating point multiply instruction in a storage location indicated by the floating point multiply instruction.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: November 15, 2022
    Assignee: Texas Instruments Incorporated
    Inventors: Mujibur Rahman, Timothy David Anderson
  • Patent number: 11500964
    Abstract: A device for computing the inner product of vectors includes a vector data arranger, a vector data pre-accumulator, a number converter, and a post-accumulator. The vector data arranger stores a first vector and sequentially outputs a plurality of vector data based on the first vector. The vector data pre-accumulator stores a second vector, receives each of the vector data, and pre-accumulates the second vector, so as to generate a plurality accumulation results. The number converter and the post-accumulator receive and process all the accumulation results corresponding to each of the vector data to generate an inner product value. The present invention implements a lookup table with the vector data pre-accumulator and the number converter to increase calculation speed and reduce power consumption.
    Type: Grant
    Filed: October 21, 2020
    Date of Patent: November 15, 2022
    Assignee: NATIONAL CHUNG CHENG UNIVERSITY
    Inventor: Tay-Jyi Lin
  • Patent number: 11500963
    Abstract: A method of performing Principal Component Analysis is provided. The method includes receiving, by a computing device, evolving data for processing/visualization. The method further includes, by the computing device, a dimensionality for reducing of the evolving data using the PCA, wherein the PCA is performed on analog crossbar hardware. The method also includes using, by the computing device, the evolving data for visualization having the dimensionality thereof reduced by the principal component analysis for a further application.
    Type: Grant
    Filed: October 1, 2020
    Date of Patent: November 15, 2022
    Assignees: INTERNATIONAL BUSINESS MACHINES CORPORATION, RAMOT AT TEL AVIV UNIVERSITY LTD.
    Inventors: Shashanka Ubaru, Vasileios Kalantzis, Lior Horesh, Mark S. Squillante, Haim Avron
  • Patent number: 11494165
    Abstract: An arithmetic circuit includes a LUT generation circuit (1) that, when coefficients c[n] (n=1, . . . , N) are paired two by two, outputs a value calculated for each of the pairs, and distributed arithmetic circuits (2-m) that calculate values z[m] that are sums of products of data x[m, n] of a data set X[m] containing M pairs of data x[m, n] and the coefficients c[n], in parallel for each of the M pairs. The distributed arithmetic circuit (2-m) includes binomial distributed arithmetic circuits that, for each of the pairs, calculate sums of products of a value obtained by pairing N data x[m, n] corresponding to the circuit two by two and a value obtained by pairing the coefficients c[n] two by two, and a figure matching circuit that matches a number of decimal figures of the sums with a predetermined number of decimal figures.
    Type: Grant
    Filed: December 18, 2018
    Date of Patent: November 8, 2022
    Assignees: NTT ELECTRONICS CORPORATION, NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Kenji Kawai, Ryo Awata, Kazuhito Takei, Masaaki Iizuka
  • Patent number: 11487846
    Abstract: Embodiments relate to a neural processor circuit including a plurality of neural engine circuits, a data buffer, and a kernel fetcher circuit. At least one of the neural engine circuits is configured to receive matrix elements of a matrix as at least the portion of the input data from the data buffer over multiple processing cycles. The at least one neural engine circuit further receives vector elements of a vector from the kernel fetcher circuit, wherein each of the vector elements is extracted as a corresponding kernel to the at least one neural engine circuit in each of the processing cycles. The at least one neural engine circuit performs multiplication between the matrix and the vector as a convolution operation to produce at least one output channel of the output data.
    Type: Grant
    Filed: May 4, 2018
    Date of Patent: November 1, 2022
    Assignee: Apple Inc.
    Inventors: Christopher L. Mills, Erik K. Norden, Sung Hee Park
  • Patent number: 11487506
    Abstract: An aspect includes executing, by a binary based floating-point arithmetic unit of a processor, a calculation having two or more operands in hexadecimal format based on a hexadecimal floating-point (HFP) instruction and providing a condition code for a calculation result of the calculation. The floating-point arithmetic unit includes a condition code anticipator circuit that is configured to provide the condition code to the processor prior to availability of the calculation result.
    Type: Grant
    Filed: August 9, 2019
    Date of Patent: November 1, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Silvia Melitta Mueller, Petra Leber, Kerstin Claudia Schelm, Cedric Lichtenau