Patents Examined by Carlo Waje
  • Patent number: 11972229
    Abstract: Semiconductor devices and multiply-accumulate operation devices are disclosed. In one example, a semiconductor device includes synapses, each formed by connecting in series a nonvolatile variable resistance element, which takes either a first resistance value or a second resistance value lower than the first, and a fixed resistance element whose resistance value is higher than the second resistance value. An output line outputs the sum of the currents flowing through the plurality of synapses.
    Type: Grant
    Filed: March 15, 2019
    Date of Patent: April 30, 2024
    Assignees: Sony Group Corporation, Sony Semiconductor Solutions Corporation
    Inventors: Toshiyuki Kobayashi, Rui Morimoto, Jun Okuno, Masanori Tsukamoto, Yusuke Shuto
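A minimal behavioral sketch of the current-summing idea in the abstract above, not the patented circuit: each synapse is modeled as a variable resistor in series with a fixed resistor, and one output line sums the currents from all synapses. The resistor values, input voltages, and the simple Ohm's-law model are illustrative assumptions.

```python
# Behavioral sketch of a current-summing synapse column (illustrative only).
# Each synapse = variable resistance element (R_HIGH or R_LOW) in series with
# a fixed resistance R_FIXED; the output line carries the sum of the currents.

R_HIGH = 100e3   # first (higher) resistance value, ohms -- assumed
R_LOW = 10e3     # second (lower) resistance value, ohms -- assumed
R_FIXED = 50e3   # fixed resistance, chosen higher than R_LOW -- assumed

def synapse_current(input_voltage: float, weight_bit: int) -> float:
    """Ohm's-law current through one synapse; weight_bit selects R_LOW or R_HIGH."""
    r_variable = R_LOW if weight_bit else R_HIGH
    return input_voltage / (r_variable + R_FIXED)

def output_line_current(input_voltages, weight_bits) -> float:
    """Sum of currents from all synapses sharing one output line (the MAC result)."""
    return sum(synapse_current(v, w) for v, w in zip(input_voltages, weight_bits))

if __name__ == "__main__":
    volts = [0.5, 0.0, 0.5, 0.5]      # activations applied to the synapses
    weights = [1, 1, 0, 1]            # stored bits: 1 -> low resistance (strong)
    print(f"summed output current: {output_line_current(volts, weights):.3e} A")
```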
  • Patent number: 11966450
    Abstract: According to an embodiment, a calculation device includes a memory and one or more processors configured to update, for elements each associated with first and second variables, the first and second variables for each unit time, sequentially for the unit times and alternately between the first and second variables. In a calculation process for each unit time, the one or more processors are configured to: for each of the elements, update the first variable based on the second variable; update the second variable based on the first variables of the elements; when the first variable is smaller than a first value, change the first variable to a value of the first value or more and a threshold value or less; and when the first variable is greater than a second value, change the first variable to a value of the threshold value or more and the second value or less.
    Type: Grant
    Filed: February 25, 2021
    Date of Patent: April 23, 2024
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Hayato Goto
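A loose sketch in the spirit of the alternating two-variable update with clamping described in the abstract above, applied to a toy Ising-style coupling matrix. The step size, bounds, coupling term, and readout are illustrative assumptions and do not reproduce the patented algorithm.

```python
import numpy as np

# Toy sketch: N elements, each with a first variable x and a second variable y.
# Per unit time, x is updated from y, then y is updated from all x's; x is
# pushed back into range when it crosses the first or second value.
rng = np.random.default_rng(0)
N = 8
J = rng.standard_normal((N, N)); J = (J + J.T) / 2; np.fill_diagonal(J, 0.0)

FIRST_VALUE, SECOND_VALUE, THRESHOLD = -1.0, 1.0, 0.0   # assumed bounds
DT, COUPLING, STEPS = 0.05, 0.3, 400                    # assumed parameters

x = np.zeros(N)
y = rng.uniform(-0.1, 0.1, N)

for _ in range(STEPS):
    x += DT * y                      # update first variable from second variable
    y += DT * COUPLING * (J @ x)     # update second variable from all first variables
    # Clamp x per the abstract: below the first value -> a value in
    # [FIRST_VALUE, THRESHOLD]; above the second value -> [THRESHOLD, SECOND_VALUE].
    x = np.where(x < FIRST_VALUE, FIRST_VALUE, x)
    x = np.where(x > SECOND_VALUE, SECOND_VALUE, x)

spins = np.where(x >= THRESHOLD, 1, -1)   # read out a binary configuration
print("spins:", spins, " energy:", float(-0.5 * spins @ J @ spins))
```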
  • Patent number: 11960856
    Abstract: A system and/or an integrated circuit including a multiplier-accumulator execution pipeline which includes a plurality of MACs to implement a plurality of multiply and accumulate operations. A first memory stores filter weights having a Gaussian floating point (“GFP”) data format and a first bit length. Data format conversion circuitry converts the filter weights from the GFP data format and the first bit length to a data format and bit length that are different from the GFP data format and the first bit length. The converted filter weights are output to the MACs, which, in operation, are configured to perform the plurality of multiply operations using (a) the input data and (b) the converted filter weights.
    Type: Grant
    Filed: January 4, 2021
    Date of Patent: April 16, 2024
    Assignee: Flex Logix Technologies, Inc.
    Inventor: Frederick A. Ware
  • Patent number: 11941371
    Abstract: Systems, apparatuses, and methods related to bit string accumulation are described. A method for bit string accumulation can include performing an iteration of a recursive operation using a first bit string and a second bit string and modifying a quantity of bits of a result of the iteration of the recursive operation, wherein the modified quantity of bits is less than a threshold quantity of bits. The method can further include writing a first value comprising the modified bits indicative of the result of the iteration to a first register and writing a second value indicative of a factor corresponding to that result to a second register.
    Type: Grant
    Filed: January 31, 2022
    Date of Patent: March 26, 2024
    Assignee: Micron Technology, Inc.
    Inventors: Vijay S. Ramesh, Katie Blomster Park
  • Patent number: 11922240
    Abstract: A multiplier-accumulator accepts A and B digital inputs and generates a dot product P by applying the bits of the A input and the bits of the B input to unit elements, each composed of a group of AND gates coupled to charge transfer lines through a capacitor Cu. The number of bits in the B input equals the number of AND groups, and the number of bits in A equals the number of AND gates in each group. Each unit element receives one bit of the B input, applied to all of its AND gates, and the bits of A applied to the individual AND gate inputs of that unit element. The charge transfer lines couple to binary-weighted charge-summing capacitors, which sum and scale the charges from the charge transfer lines; the resulting charge is coupled to an analog-to-digital converter that forms the dot product output. The charge transfer lines may span multiple unit elements.
    Type: Grant
    Filed: December 31, 2020
    Date of Patent: March 5, 2024
    Assignee: Ceremorphic, Inc.
    Inventors: Ryan Boesch, Martin Kraemer, Wei Xiong
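A purely digital sketch of the arithmetic that the AND-gate/charge-summing array in the abstract above effectively computes: each unit element ANDs one bit of B against every bit of A, and binary-weighted summation of those partial products reproduces the product A*B. The bit widths and the dot-product loop are illustrative assumptions; no analog behavior is modeled.

```python
# Digital behavioral model (illustrative): the array's AND gates and
# binary-weighted charge summation compute, in effect, bitwise partial products.

A_BITS = 4  # number of AND gates per unit element -- assumed width
B_BITS = 4  # number of unit elements (one bit of B each) -- assumed width

def unit_element(a: int, b_bit: int) -> int:
    """One unit element: bit b_bit of B ANDed with every bit of A, summed with
    binary weights. Equals A * b_bit."""
    return sum(((a >> i) & 1 & b_bit) << i for i in range(A_BITS))

def multiply(a: int, b: int) -> int:
    """Binary-weighted sum over unit elements reproduces A * B."""
    return sum(unit_element(a, (b >> j) & 1) << j for j in range(B_BITS))

def dot_product(a_vec, b_vec) -> int:
    """Accumulating the per-pair products gives the dot product P."""
    return sum(multiply(a, b) for a, b in zip(a_vec, b_vec))

if __name__ == "__main__":
    # Exhaustive check of the bit-level multiplier against Python's '*'.
    assert all(multiply(a, b) == a * b
               for a in range(1 << A_BITS) for b in range(1 << B_BITS))
    print("dot product:", dot_product([3, 5, 7], [2, 4, 6]))  # 3*2 + 5*4 + 7*6 = 68
```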
  • Patent number: 11922133
    Abstract: A method includes processing masked data by an arithmetic and logic unit of a processor and keeping the masked data masked throughout their processing by that unit. A processor includes an arithmetic and logic unit configured to keep masked data masked throughout their processing in the arithmetic and logic unit.
    Type: Grant
    Filed: September 30, 2020
    Date of Patent: March 5, 2024
    Assignees: STMicroelectronics (Rousset) SAS, STMicroelectronics (Grenoble 2) SAS
    Inventors: Rene Peyrard, Fabrice Romain, Jean-Michel Derien, Christophe Eichwald
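A small sketch of the general principle named in the abstract above, keeping data masked through ALU operations, using ordinary Boolean (XOR) masking on 8-bit values. This is a generic illustration of masked computation, not the STMicroelectronics implementation.

```python
import secrets

MASK_WIDTH = 8  # operate on 8-bit values -- assumed width

def mask(value: int) -> tuple[int, int]:
    """Split a value into (masked_value, mask): value = masked_value XOR mask."""
    m = secrets.randbits(MASK_WIDTH)
    return value ^ m, m

def masked_xor(mx: int, my: int) -> int:
    """XOR of two masked operands; the result stays masked by (mask_x XOR mask_y)."""
    return mx ^ my

def masked_not(mx: int) -> int:
    """Bitwise NOT of a masked operand; the mask is unchanged."""
    return mx ^ ((1 << MASK_WIDTH) - 1)

def unmask(masked_value: int, m: int) -> int:
    return masked_value ^ m

if __name__ == "__main__":
    x, y = 0xA5, 0x3C
    mx, m1 = mask(x)
    my, m2 = mask(y)
    # The "ALU" only ever sees mx and my; the result stays masked by m1 ^ m2.
    masked_result = masked_xor(mx, my)
    assert unmask(masked_result, m1 ^ m2) == x ^ y
    assert unmask(masked_not(mx), m1) == (~x) & 0xFF
    print("masked XOR and NOT verified without unmasking intermediate data")
```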
  • Patent number: 11909422
    Abstract: A deep neural network (“DNN”) module compresses and decompresses neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit receives an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit receives a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion.
    Type: Grant
    Filed: November 11, 2022
    Date of Patent: February 20, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Joseph Leon Corkery, Benjamin Eliot Lundell, Larry Marvin Wall, Chad Balling McBride, Amol Ashok Ambardekar, George Petre, Kent D. Cedola, Boris Bobrov
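A compact sketch of a mask-plus-data compression scheme of the kind the abstract above describes: a bitmask records which bytes of the chunk are non-zero, and only the non-zero bytes are stored. The chunk size and layout are assumptions, and the "truncated" byte handling mentioned in the abstract is not modeled.

```python
def compress_chunk(chunk: bytes) -> tuple[bytes, bytes]:
    """Split a chunk into (mask, data): one mask bit per byte (1 = non-zero),
    plus the non-zero bytes in order."""
    mask = bytearray((len(chunk) + 7) // 8)
    data = bytearray()
    for i, b in enumerate(chunk):
        if b != 0:
            mask[i // 8] |= 1 << (i % 8)
            data.append(b)
    return bytes(mask), bytes(data)

def decompress_chunk(mask: bytes, data: bytes, length: int) -> bytes:
    """Rebuild the original chunk from the mask and the packed non-zero bytes."""
    out = bytearray(length)
    it = iter(data)
    for i in range(length):
        if mask[i // 8] & (1 << (i % 8)):
            out[i] = next(it)
    return bytes(out)

if __name__ == "__main__":
    activations = bytes([0, 0, 17, 0, 0, 0, 200, 3] + [0] * 56)  # sparse chunk
    mask, data = compress_chunk(activations)
    assert decompress_chunk(mask, data, len(activations)) == activations
    print(f"{len(activations)} bytes -> {len(mask) + len(data)} bytes "
          f"(mask {len(mask)} + data {len(data)})")
```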
  • Patent number: 11907681
    Abstract: A semiconductor device includes a dynamic reconfiguration processor that performs data processing on sequentially input data and sequentially outputs the processing results as output data, an accelerator including a parallel arithmetic part that performs arithmetic operations in parallel between the output data from the dynamic reconfiguration processor and each of a plurality of predetermined data, and a data transfer unit that selects the accelerator's plurality of arithmetic operation results in order and outputs them to the dynamic reconfiguration processor.
    Type: Grant
    Filed: January 5, 2022
    Date of Patent: February 20, 2024
    Assignee: RENESAS ELECTRONICS CORPORATION
    Inventors: Taro Fujii, Takao Toi, Teruhito Tanaka, Katsumi Togawa
  • Patent number: 11899742
    Abstract: A quantization parameter providing step of a quantization method is performed to provide a quantization parameter which includes a quantized input activation, a quantized weight and a splitting value. A parameter splitting step is performed to split the quantized weight and the quantized input activation into a plurality of grouped quantized weights and a plurality of grouped activations, respectively, according to the splitting value. A multiply-accumulate step is performed to execute a multiply-accumulate operation with one of the grouped quantized weights and one of the grouped activations, and then generate a convolution output. A convolution quantization step is performed to quantize the convolution output to a quantized convolution output according to a convolution target bit. A convolution merging step is performed to execute a partial-sum operation with the quantized convolution output according to the splitting value, and then generate an output activation.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: February 13, 2024
    Assignee: NATIONAL TSING HUA UNIVERSITY
    Inventors: Kea-Tiong Tang, Wei-Chen Wei
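A numpy sketch of the flow described in the abstract above: split a quantized weight vector and quantized activation vector into groups according to a splitting value, run a multiply-accumulate per group, re-quantize each group's output to a convolution target bit width, then merge the partial sums. The bit widths, the symmetric quantizer, and the vector shapes are assumptions.

```python
import numpy as np

def quantize(x: np.ndarray, bits: int) -> tuple[np.ndarray, float]:
    """Simple symmetric quantizer: returns integer codes and the scale."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax if np.any(x) else 1.0
    return np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32), scale

rng = np.random.default_rng(1)
weights = rng.standard_normal(64).astype(np.float32)
activations = rng.standard_normal(64).astype(np.float32)

W_BITS, A_BITS, CONV_BITS, SPLIT = 4, 4, 8, 4          # assumed parameters

q_w, s_w = quantize(weights, W_BITS)                   # quantized weight
q_a, s_a = quantize(activations, A_BITS)               # quantized input activation

# Parameter splitting step: SPLIT groups of weights and activations.
w_groups = np.array_split(q_w, SPLIT)
a_groups = np.array_split(q_a, SPLIT)

output = 0.0
for wg, ag in zip(w_groups, a_groups):
    conv_out = int(np.dot(wg, ag))                     # multiply-accumulate per group
    q_conv, s_conv = quantize(np.array([conv_out], dtype=np.float32), CONV_BITS)
    output += float(q_conv[0]) * s_conv * s_w * s_a    # partial-sum merge

print("grouped quantized dot product:", output)
print("float reference             :", float(np.dot(weights, activations)))
```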
  • Patent number: 11893078
    Abstract: A dot product multiplier for matrix operations for an A matrix of order 1×m with a coefficient B matrix of order m×m. Processing Elements (PEs) are arranged in an m×m array, the columns of the array summed to provide a dot product result. Each of the PEs contains a sign determiner and a plurality of analog multiplier cells, one multiplier cell for each value bit. The multipliers operate over four clock cycles, initializing a capacitor charge according to sign on a first clock phase, sharing charge on a second phase, canceling charge on a third phase, and outputting the resultant charge on a fourth phase, the resultant charge on each column representing the dot product for that column.
    Type: Grant
    Filed: August 29, 2020
    Date of Patent: February 6, 2024
    Assignee: Ceremorphic, Inc.
    Inventors: Aravinth Kumar Ayyappannair Radhadevi, Sesha Sairam Regulagadda
  • Patent number: 11894822
    Abstract: A filter device includes: delay units serially connected to delay an input signal and output a delayed signal; multiplication units multiplying the delayed signal by a filter coefficient based on a predetermined value and a multiplying factor adjustment value; a coefficient adjustment unit that, when a multiplication result obtained by multiplying the predetermined value by the multiplying factor adjustment value exceeds a maximum value of a filter-coefficient representation range, divides the multiplication result exceeding the maximum value by the maximum value, and outputs a quotient of division as a coefficient adjustment value; a signal conversion unit outputting a signal obtained by adding after-filter-coefficient-multiplication signals outputted by the multiplication units and an adjusted signal obtained by adjusting a corresponding delayed signal using the coefficient adjustment value; and a division unit generating an output signal by dividing the signal outputted by the signal conversion unit by the
    Type: Grant
    Filed: April 11, 2022
    Date of Patent: February 6, 2024
    Assignee: Mitsubishi Electric Corporation
    Inventors: Yasutaka Yamashita, Shigenori Tani, Kazuma Kaneko, Shigeru Uchida
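The abstract above is truncated in this listing, but its first half describes a standard delay-line/multiplier/adder filter structure whose coefficients are scaled by a multiplying-factor adjustment value. The sketch below models only that basic structure (serial delay units, coefficient multiplication with a factor, summation); the overflow-adjustment and final division paths from the truncated portion are not modeled, and all values are illustrative.

```python
# Minimal direct-form FIR sketch: serially connected delay units feed
# multiplication units, whose outputs are summed.  The coefficient-adjustment
# and division logic from the (truncated) abstract are intentionally omitted.

def fir_filter(samples, coefficients, factor=1.0):
    """Filter 'samples' with 'coefficients', each scaled by a multiplying factor."""
    taps = [c * factor for c in coefficients]
    delay_line = [0.0] * len(taps)           # the serially connected delay units
    out = []
    for x in samples:
        delay_line = [x] + delay_line[:-1]   # shift the delayed signals
        out.append(sum(d * t for d, t in zip(delay_line, taps)))
    return out

if __name__ == "__main__":
    signal = [1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
    lowpass = [0.25, 0.25, 0.25, 0.25]       # illustrative 4-tap moving average
    print(fir_filter(signal, lowpass))
```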
  • Patent number: 11886833
    Abstract: Embodiments of the present disclosure include systems and methods for providing hierarchical and shared exponent floating point data types. First and second shared exponent values are determined based on exponent values of a plurality of floating point values. A third shared exponent value is determined based on the first shared exponent value and the second shared exponent value. First and second difference values are determined based on the first shared exponent value, the second shared exponent value, and the third shared exponent value. Sign values and mantissa values are determined for the plurality of floating point values. The sign value and the mantissa value for each floating point value in the plurality of floating point values, the third shared exponent value, the first difference value, and the second difference value are stored in a data structure for a shared exponent floating point data type.
    Type: Grant
    Filed: June 28, 2021
    Date of Patent: January 30, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Bita Darvish Rouhani, Venmugil Elango, Rasoul Shafipour, Jeremy Fowers, Ming Gang Liu, Jinwen Xi, Douglas C. Burger, Eric S. Chung
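A numpy sketch of a two-level shared-exponent layout of the kind described in the abstract above: two sub-blocks each get a shared exponent, a third (outer) shared exponent is their maximum, and only the two small difference values plus per-element signs and mantissas are stored. The mantissa width, block sizes, and max-exponent choice are assumptions, not the format defined in the patent.

```python
import numpy as np

MANTISSA_BITS = 4   # per-element mantissa width -- assumed

def shared_exponent(block: np.ndarray) -> int:
    """Shared exponent of a block: exponent of its largest magnitude."""
    m = float(np.max(np.abs(block)))
    return int(np.floor(np.log2(m))) if m > 0 else 0

def encode(values: np.ndarray):
    """Hierarchical shared-exponent encoding of two equal sub-blocks."""
    blk1, blk2 = np.split(values, 2)
    e1, e2 = shared_exponent(blk1), shared_exponent(blk2)
    e3 = max(e1, e2)                       # third (outer) shared exponent
    d1, d2 = e3 - e1, e3 - e2              # small per-block difference values
    signs = np.where(values < 0, -1, 1).astype(np.int8)
    scale = np.repeat([2.0 ** (e1 + 1), 2.0 ** (e2 + 1)], len(blk1))
    q = np.round(np.abs(values) / scale * (1 << MANTISSA_BITS))
    mantissas = np.clip(q, 0, (1 << MANTISSA_BITS) - 1).astype(np.int32)
    return signs, mantissas, e3, d1, d2

def decode(signs, mantissas, e3, d1, d2):
    n = len(signs) // 2
    scale = np.repeat([2.0 ** (e3 - d1 + 1), 2.0 ** (e3 - d2 + 1)], n)
    return signs * mantissas / (1 << MANTISSA_BITS) * scale

if __name__ == "__main__":
    x = np.array([0.11, -0.53, 0.27, 0.08, 3.1, -1.7, 0.9, 2.4])
    approx = decode(*encode(x))
    print("original:", x)
    print("decoded :", np.round(approx, 3))
```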
  • Patent number: 11868741
    Abstract: The present disclosure discloses a processing element and a neural processing device including the processing element. The processing element includes a weight register configured to store a weight, an input activation register configured to store an input activation, a flexible multiplier configured to generate result data by performing a multiplication operation on the weight and the input activation using either a first multiplier of a first precision or both the first multiplier and a second multiplier of the first precision in response to a calculation mode signal, and a saturating adder configured to generate a partial sum by using the result data.
    Type: Grant
    Filed: June 15, 2022
    Date of Patent: January 9, 2024
    Assignee: Rebellions Inc.
    Inventors: Jaewan Bae, Jinwook Oh, Karim Charfi
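A sketch of one common way a "flexible" multiplier can use either one low-precision multiplier or two of them depending on a mode signal: a single 8x8 multiply in one mode, and a 16-bit-by-8-bit multiply built from two 8x8 partial products in the other, followed by a saturating accumulate. The specific widths and the saturation range are assumptions, not the design claimed above.

```python
LOW_BITS = 8                                    # first precision -- assumed
SAT_MIN, SAT_MAX = -(1 << 31), (1 << 31) - 1    # saturating adder range -- assumed

def mul8x8(a: int, b: int) -> int:
    """The basic first-precision (8x8) multiplier."""
    assert 0 <= a < (1 << LOW_BITS) and 0 <= b < (1 << LOW_BITS)
    return a * b

def flexible_multiply(weight: int, activation: int, wide_mode: bool) -> int:
    """Mode 0: one 8x8 multiply.  Mode 1: 16-bit weight x 8-bit activation
    using two 8x8 multiplies (high and low partial products)."""
    if not wide_mode:
        return mul8x8(weight, activation)
    w_hi, w_lo = weight >> LOW_BITS, weight & ((1 << LOW_BITS) - 1)
    return (mul8x8(w_hi, activation) << LOW_BITS) + mul8x8(w_lo, activation)

def saturating_add(partial_sum: int, result_data: int) -> int:
    """Accumulate with saturation instead of wrap-around."""
    return max(SAT_MIN, min(SAT_MAX, partial_sum + result_data))

if __name__ == "__main__":
    assert flexible_multiply(200, 37, wide_mode=False) == 200 * 37
    assert flexible_multiply(51234, 77, wide_mode=True) == 51234 * 77
    acc = 0
    for w, a in [(51234, 77), (40000, 200), (65535, 255)]:
        acc = saturating_add(acc, flexible_multiply(w, a, wide_mode=True))
    print("saturated partial sum:", acc)
```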
  • Patent number: 11868740
    Abstract: A circuit includes a first full adder, a second full adder, a first half adder, a third full adder configured to receive a sum output signal of the first full adder, a sum output signal of the second full adder, and a sum output signal of the first half adder, a fourth full adder configured to receive a carry output signal of the first full adder, a carry output signal of the second full adder, and a carry output signal of the first half adder, a second half adder configured to receive a carry output signal of the third full adder and a sum output signal of the fourth full adder, and a third half adder configured to receive a carry output signal of the second half adder and a carry output signal of the fourth full adder.
    Type: Grant
    Filed: September 29, 2021
    Date of Patent: January 9, 2024
    Assignee: Postech Research and Business Development Foundation
    Inventors: Seokhyeong Kang, Sunmean Kim, Sunghye Park, SungYun Lee
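The adder wiring in the abstract above is concrete enough to model directly. The sketch below wires three full adders and three half adders exactly as listed and checks, exhaustively, that the four output bits encode the population count of the eight input bits (i.e., the circuit behaves as an 8:4 compressor). The choice of external inputs and the population-count interpretation are inferences from the wiring, not statements from the patent text.

```python
from itertools import product

def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Returns (sum, carry) for three input bits."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)

def half_adder(a: int, b: int) -> tuple[int, int]:
    """Returns (sum, carry) for two input bits."""
    return a ^ b, a & b

def circuit(bits):
    """Wiring as described in the abstract (inputs to FA1, FA2, HA1 assumed)."""
    a, b, c, d, e, f, g, h = bits
    s1, c1 = full_adder(a, b, c)          # first full adder
    s2, c2 = full_adder(d, e, f)          # second full adder
    s3, c3 = half_adder(g, h)             # first half adder
    s4, c4 = full_adder(s1, s2, s3)       # third FA: sums of FA1, FA2, HA1
    s5, c5 = full_adder(c1, c2, c3)       # fourth FA: carries of FA1, FA2, HA1
    s6, c6 = half_adder(c4, s5)           # second HA: carry of FA3, sum of FA4
    s7, c7 = half_adder(c6, c5)           # third HA: carry of HA2, carry of FA4
    return s4, s6, s7, c7                 # output bits with weights 1, 2, 4, 8

if __name__ == "__main__":
    for bits in product((0, 1), repeat=8):
        w1, w2, w4, w8 = circuit(bits)
        assert w1 + 2 * w2 + 4 * w4 + 8 * w8 == sum(bits)
    print("verified: the wired adders count the ones in all 256 input patterns")
```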
  • Patent number: 11836459
    Abstract: Techniques are disclosed relating to circuitry for floating-point division. In some embodiments, the circuitry is configured to generate a subnormal result for a division operation that divides a numerator by a denominator. The circuitry may include floating-point circuitry configured to perform a reciprocal operation to determine a normalized mantissa value for the reciprocal of a floating-point representation of the denominator. The circuitry may further include fixed-point circuitry configured to multiply a fixed-point representation of the normalized mantissa value for the reciprocal by a mantissa of the numerator to generate an initial value. Control circuitry may determine error data for the initial value and generate a final subnormal mantissa result for the division operation based on the error data and the initial value. Embodiments with multiple modes with different accuracy guarantees are disclosed.
    Type: Grant
    Filed: March 30, 2021
    Date of Patent: December 5, 2023
    Assignee: Apple Inc.
    Inventors: Liang-Kai Wang, Ian R. Ollmann, Anthony Y. Tai
  • Patent number: 11829729
    Abstract: Systems, apparatuses, and methods of operating memory systems are described. Processing-in-memory capable memory devices are also described, as are methods of performing fused-multiply-add operations within them. Bit positions of bits stored at one or more portions of one or more memory arrays may be accessed via data lines by activating the same or different access lines. A sensing circuit operatively coupled to a data line may be temporarily formed and measured to determine a state (e.g., a count of the number of bits that are a logic “1”) of accessed bit positions of a data line, and state information may be used to determine a computational result.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: November 28, 2023
    Assignee: Micron Technology, Inc.
    Inventors: Sean S. Eilert, Shivasankar Gunasekaran, Ameen D. Akel, Dmitri Yudanov, Sivagnanam Parthasarathy
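A software sketch of the arithmetic identity behind counting set bits to build a multiply-accumulate, in the spirit of the sense-and-count description above: a dot product of unsigned integers can be rebuilt from popcounts of bit-position pairs, each weighted by a power of two. The bit widths and the pure-software model are assumptions; no memory-array or sensing behavior is modeled.

```python
# Software model: rebuild sum_i a[i] * w[i] from population counts.
# For each pair of bit positions (p, q), count how many elements have both
# bit p of a[i] and bit q of w[i] set, then weight the count by 2**(p+q).

A_BITS = 8
W_BITS = 8

def popcount_dot(a_vec, w_vec) -> int:
    total = 0
    for p in range(A_BITS):
        for q in range(W_BITS):
            # "State" of the accessed bit positions: count of 1s across elements.
            count = sum(((a >> p) & 1) & ((w >> q) & 1) for a, w in zip(a_vec, w_vec))
            total += count << (p + q)
    return total

if __name__ == "__main__":
    a = [12, 200, 7, 99]
    w = [3, 45, 128, 17]
    expected = sum(x * y for x, y in zip(a, w))
    assert popcount_dot(a, w) == expected
    print("popcount-based dot product:", popcount_dot(a, w), "==", expected)
```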
  • Patent number: 11829441
    Abstract: A device includes a matrix transpose component, a matrix processing component, a data alignment component, and a data reduction component. The matrix transpose component is configured to transpose an input matrix of elements to output an output matrix of the elements that have been transposed. The matrix processing component is configured to multiply a first multiplication input matrix with a second multiplication input matrix, wherein the output matrix of the matrix transpose component is utilized as the first multiplication input matrix and a mask vector is utilized as the second multiplication input matrix. The data alignment component is configured to modify at least a portion of elements of a result of the matrix processing component. The data reduction component is configured to sum at least the elements of the modified result of the matrix processing component to determine a sum of a group of values.
    Type: Grant
    Filed: June 7, 2022
    Date of Patent: November 28, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Krishnakumar Narayanan Nair, Thomas Mark Ulrich, Ehsan Khish Ardestani Zadeh
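A small numpy sketch of the data path the abstract above outlines: transpose an input matrix, multiply it by a mask vector so only the selected elements survive, then reduce by summation. The example values, the mask, and the identity stand-in for the alignment step are illustrative assumptions.

```python
import numpy as np

# Input matrix: each row holds a group of values laid out along the row.
values = np.array([[1.0, 2.0, 3.0, 4.0],
                   [5.0, 6.0, 7.0, 8.0]])

# 1) Matrix transpose component: transpose the input matrix.
transposed = values.T                       # shape (4, 2)

# 2) Matrix processing component: multiply by a mask vector so that only the
#    selected group contributes (here the second group/row of 'values').
mask = np.array([0.0, 1.0])                 # selects group index 1 -- assumed
selected = transposed @ mask                # shape (4,)

# 3) Data alignment component: modify elements of the result if needed
#    (identity here, a stand-in for alignment/shifting).
aligned = selected

# 4) Data reduction component: sum the elements to get the group's total.
group_sum = float(np.sum(aligned))
print("sum of the selected group:", group_sum)   # 5 + 6 + 7 + 8 = 26
```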
  • Patent number: 11816446
    Abstract: Systems and methods are provided to perform multiply-accumulate operations on multiple data types in a systolic array. One or more processing elements in the systolic array can include a shared multiplier and one or more adders. The shared multiplier can include separate and/or shared circuitry, where the shared circuitry can perform at least a part of integer multiplication and at least a part of non-integer multiplication. The one or more adders can include one or more shared adders or one or more separate adders. A shared adder can include separate and/or shared circuitry, where the shared circuitry can perform at least a part of integer addition and at least a part of non-integer addition.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: November 14, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Thomas Elmer, Thomas A. Volpe
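A sketch of the sharing idea named in the abstract above: one integer multiplier routine serves both an integer mode and a floating-point mode, where the float mode feeds it fixed-point mantissas and handles signs and exponents around it. The 24-bit mantissa width and the mode interface are assumptions, not the design claimed in the patent.

```python
import math

MANT_BITS = 24   # fixed-point mantissa width for the float path -- assumed

def shared_int_multiplier(a: int, b: int) -> int:
    """The shared circuitry: a plain integer multiply used by both modes."""
    return a * b

def multiply(a, b, mode: str):
    """Integer mode uses the shared multiplier directly; float mode reuses it
    for the mantissa product and handles exponent and sign separately."""
    if mode == "int":
        return shared_int_multiplier(a, b)
    # Float mode: a = sign * mant * 2**exp with mant in [0.5, 1).
    m_a, e_a = math.frexp(a)
    m_b, e_b = math.frexp(b)
    fix_a = int(round(abs(m_a) * (1 << MANT_BITS)))
    fix_b = int(round(abs(m_b) * (1 << MANT_BITS)))
    mant_product = shared_int_multiplier(fix_a, fix_b)      # reused circuitry
    sign = -1.0 if (a < 0) != (b < 0) else 1.0
    return sign * mant_product * 2.0 ** (e_a + e_b - 2 * MANT_BITS)

if __name__ == "__main__":
    assert multiply(123, -45, "int") == 123 * -45
    x, y = 3.14159, -2.71828
    approx = multiply(x, y, "float")
    assert math.isclose(approx, x * y, rel_tol=1e-6)
    print(f"shared-multiplier float product: {approx:.6f} (exact {x * y:.6f})")
```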
  • Patent number: 11768661
    Abstract: An integrated circuit includes a logic block configured to perform multiplication operations. The logic block includes a plurality of lookup tables configured to receive a plurality of inputs and generate a first plurality of outputs. Additionally, the logic block includes adding circuitry configured to receive the first plurality of outputs and generate a second plurality of outputs. Furthermore, the logic block includes circuitry configured to receive a portion of the plurality of inputs, determine one or more partial products, and generate a third plurality of outputs.
    Type: Grant
    Filed: December 27, 2019
    Date of Patent: September 26, 2023
    Assignee: Intel Corporation
    Inventors: Sadegh Yazdanshenas, Tim Vanderhoek
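A sketch of building a multiplier from lookup tables plus adding circuitry, in the spirit of the abstract above: all 4-bit-by-4-bit products are precomputed in a table, an 8x8 multiply is assembled from four such partial products, and shift-and-add logic combines them. The 4-bit LUT granularity is an assumption.

```python
# Precomputed lookup table of all 4-bit x 4-bit products (256 entries),
# standing in for the logic block's LUTs.
LUT_4X4 = [[x * y for y in range(16)] for x in range(16)]

def lut_multiply_8x8(a: int, b: int) -> int:
    """8x8 multiply assembled from four LUT partial products plus adders."""
    a_hi, a_lo = a >> 4, a & 0xF
    b_hi, b_lo = b >> 4, b & 0xF
    partial_ll = LUT_4X4[a_lo][b_lo]            # weight 2^0
    partial_lh = LUT_4X4[a_lo][b_hi]            # weight 2^4
    partial_hl = LUT_4X4[a_hi][b_lo]            # weight 2^4
    partial_hh = LUT_4X4[a_hi][b_hi]            # weight 2^8
    # Adding circuitry: shift and sum the partial products.
    return partial_ll + ((partial_lh + partial_hl) << 4) + (partial_hh << 8)

if __name__ == "__main__":
    assert all(lut_multiply_8x8(a, b) == a * b
               for a in range(256) for b in range(256))
    print("LUT-based 8x8 multiplier matches a*b for all 65,536 input pairs")
```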
  • Patent number: 11762633
    Abstract: The present disclosure relates to a circuit and method for determining a sign indicator bit of a binary datum, including a step of processing the binary datum masked by a masking operation and not including any step of processing the unmasked binary datum.
    Type: Grant
    Filed: September 30, 2020
    Date of Patent: September 19, 2023
    Assignees: STMicroelectronics (Grenoble 2) SAS, STMicroelectronics (Rousset) SAS
    Inventors: Rene Peyrard, Fabrice Romain
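A one-function sketch of how a sign indicator can be derived while processing only the masked datum and the mask, never the plain value, here using Boolean (XOR) masking on a fixed-width two's-complement word. This is a generic illustration of the idea, not the circuit claimed in the patent.

```python
import secrets

WIDTH = 16   # word width in bits -- assumed

def mask_value(value: int) -> tuple[int, int]:
    """Boolean-mask a two's-complement value: returns (masked, mask)."""
    m = secrets.randbits(WIDTH)
    return (value & ((1 << WIDTH) - 1)) ^ m, m

def sign_from_masked(masked: int, m: int) -> int:
    """Sign indicator bit computed from the masked datum and the mask only:
    the top bit of the value is the XOR of the top bits of masked and mask."""
    msb = WIDTH - 1
    return ((masked >> msb) ^ (m >> msb)) & 1

if __name__ == "__main__":
    for value in (1234, -1234, 0, -1, 32767, -32768):
        masked, m = mask_value(value)
        assert sign_from_masked(masked, m) == (1 if value < 0 else 0)
    print("sign indicator recovered from masked data for all test values")
```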