Patents Examined by Carlo Waje

Patent number: 12079592
Abstract: A deep neural network accelerator includes a feature loader that stores input features, a weight memory that stores a weight, and a processing element. The processing element applies 1-bit weight values to the input features to generate results according to the 1-bit weight values, receives a target weight corresponding to the input features from the weight memory, and selects a target result corresponding to the received target weight from among the results to generate output features.
Type: Grant
Filed: November 20, 2019
Date of Patent: September 3, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Hoi-Jun Yoo, Jin Mook Lee
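
The selection scheme described above lends itself to a small software illustration: with a 1-bit weight, only two products of each input feature are possible, so both candidate results can be generated before the weight arrives, and the stored weight bit merely selects one. A minimal sketch (function names are illustrative, not from the patent):

```python
def precompute_results(features):
    # With a 1-bit weight the only possible products are +x (weight bit 1)
    # and -x (weight bit 0), so both candidate results can be formed before
    # the target weight is read from weight memory.
    return {0: [-x for x in features], 1: [x for x in features]}

def select_output(results, weight_bit):
    # The target weight bit selects one precomputed result as the output features.
    return results[weight_bit]
```

Selection after the fact is what lets the multiply work overlap with the weight-memory read.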

Patent number: 12079591
Abstract: A neural network device includes a floating-point arithmetic circuit configured to perform a dot product operation and an accumulation operation; and a buffer configured to store first cumulative data generated by the floating-point arithmetic circuit, wherein the floating-point arithmetic circuit is further configured to perform the dot product operation and the accumulation operation by: identifying a maximum value from a plurality of exponent addition results, obtained by respectively adding exponents of a plurality of floating-point data pairs, and an exponent value of the first cumulative data; performing, based on the maximum value, an align shift of a plurality of fraction multiplication results, obtained by respectively multiplying fractions of the plurality of floating-point data pairs, and a fraction part of the first cumulative data; and performing a summation of the plurality of aligned fraction multiplication results and the aligned fraction part of the first cumulative data.
Type: Grant
Filed: March 30, 2021
Date of Patent: September 3, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Hyunpil Kim, Hyunwoo Sim, Seongwoo Ahn, Hasong Kim, Doyoung Lee
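
The alignment step in this abstract can be sketched numerically: the common exponent is the maximum over the per-pair exponent sums and the accumulator exponent, and each fraction product is right-shifted by its distance from that maximum before a single summation. A simplified sketch with fractions as plain integers (the representation and names are assumptions, not the patented circuit):

```python
def fp_dot_accumulate(pairs, acc_exp, acc_frac):
    # Each pair is (exp_a, frac_a, exp_b, frac_b) with fractions as integers.
    # Step 1: add exponents of each pair (exponent of each product).
    exp_sums = [ea + eb for ea, _, eb, _ in pairs]
    # Step 2: multiply fractions of each pair.
    frac_prods = [fa * fb for _, fa, _, fb in pairs]
    # Step 3: the maximum over the exponent sums and the accumulator
    # exponent is the common exponent everything is aligned to.
    max_exp = max(exp_sums + [acc_exp])
    # Step 4: align-shift each fraction product, and the accumulator
    # fraction, right by its distance from the maximum exponent.
    aligned = [f >> (max_exp - e) for f, e in zip(frac_prods, exp_sums)]
    aligned_acc = acc_frac >> (max_exp - acc_exp)
    # Step 5: a single summation of all aligned fractions.
    return max_exp, aligned_acc + sum(aligned)
```

Using one maximum avoids pairwise normalization between every partial product, which is the point of fusing the dot product with the accumulation.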

Patent number: 12079301
Abstract: A command queue is configured to receive a command from a software application. A configuration storage is configured to store a plurality of configurations. A matrix multiplication unit is configured to perform matrix multiplication operations. Memory is configured to store matrices. A control engine is configured to retrieve the command from the command queue; retrieve a configuration from the configuration storage based on the command; generate, based on the command and the configuration, instructions for the matrix multiplication unit to perform a set of matrix multiplication operations on first and second matrices stored in the memory; send the instructions to the matrix multiplication unit to configure the matrix multiplication unit to output results of the set of matrix multiplication operations; and store the results in a third matrix in the memory.
Type: Grant
Filed: January 8, 2021
Date of Patent: September 3, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Nitin Garegrat, Derek Gladding, Shankar Narayan, Sujatha Santhanaraman, Jayadev Velagandula

Patent number: 12073192
Abstract: The present application discloses a full adder circuit and a multi-bit full adder. In the full adder circuit, an in-memory computing field-effect transistor stores data and performs logic operations on the data stored in the transistor and the loaded data according to different input signals; and a low-area full adder circuit is realized with very few transistors through the characteristics and the read and write modes of the in-memory computing field-effect transistor. The full adder circuit has a simple structure, which greatly reduces the area and complexity of the full adder circuit, and saves 19 transistors compared with traditional CMOS full adder circuits.
Type: Grant
Filed: October 17, 2023
Date of Patent: August 27, 2024
Assignee: ZHEJIANG LAB
Inventors: Jiani Gu, Xiao Yu
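
Whatever the transistor count, the logic function a full adder realizes is fixed: sum = a XOR b XOR cin, carry-out = majority(a, b, cin). A software statement of that function, and of the multi-bit ripple chain a multi-bit full adder builds from it:

```python
def full_adder(a, b, cin):
    # Classic full-adder truth function: sum is the three-way XOR,
    # carry-out is the majority of the three inputs.
    s = a ^ b ^ cin
    cout = (a & b) | (a & cin) | (b & cin)
    return s, cout

def ripple_add(x, y, bits=4):
    # A multi-bit adder chains full adders, the carry rippling between stages.
    carry, result = 0, 0
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry
```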

Patent number: 12073269
Abstract: An example SC integrator can include first and second sampling capacitors, an amplifier, an integrating capacitor, coupled at least to an output of the amplifier, and a switching arrangement. The SC integrator can be configured for adding (i.e., integrating in the integrating capacitor) sign-inverted samples of a flicker noise of the amplifier at one or more cycles of a master clock, and can be configured for keeping the time distance/delay between those samples relatively small across a range of master clock frequencies.
Type: Grant
Filed: January 28, 2021
Date of Patent: August 27, 2024
Assignee: Analog Devices International Unlimited Company
Inventor: Roberto S. Maurino

Patent number: 12056463
Abstract: An optimization apparatus includes hardware circuits configured to function as a random number generator configured to operate either in a first operation mode in which to generate a random number sequence after performing an initialization or in a second operation mode in which to generate a random number sequence without performing the initialization, an annealing calculation unit configured to perform an annealing process by use of random numbers generated by the random number generator, and an operation instruct unit configured to cause the random number generator to start operating in the first operation mode when the annealing calculation unit starts the annealing process, to cause the random number generator to stop operating when the annealing calculation unit suspends the annealing process, and to cause the random number generator to restart operating in the second operation mode when the annealing calculation unit restarts the annealing process.
Type: Grant
Filed: March 4, 2021
Date of Patent: August 6, 2024
Assignee: FUJITSU LIMITED
Inventor: Masato Sasaki

Patent number: 12056465
Abstract: Verifying the correctness of a leading zero counter, including: generating, based on an input value comprising a plurality of digits, a first bit vector, wherein each entry of the first bit vector indicates whether a corresponding digit of the input value is equal to zero; calculating, based on the first bit vector, a leading zero count for the input value; generating a bit mask comprising a number of leading ones equal to the leading zero count; generating a second bit vector comprising a one at the same index as the first occurring zero in the bit mask; and verifying the leading zero count based on the first bit vector and one or more of the bit mask and the second bit vector.
Type: Grant
Filed: March 25, 2022
Date of Patent: August 6, 2024
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Michael Klein, Petra Leber, Cedric Lichtenau, Stefan Payer, Kerstin Claudia Schelm
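
The verification flow in this abstract maps cleanly to software: build the zero-indicator bit vector, derive the count, build the mask of leading ones and the second vector marking the mask's first zero, then cross-check. A sketch (helper names are illustrative):

```python
def leading_zero_check(digits):
    # First bit vector: 1 wherever the corresponding digit is zero.
    zero_vec = [1 if d == 0 else 0 for d in digits]
    # Leading zero count derived from the first bit vector.
    lzc = 0
    for z in zero_vec:
        if z == 0:
            break
        lzc += 1
    # Bit mask with 'lzc' leading ones.
    mask = [1] * lzc + [0] * (len(digits) - lzc)
    # Second bit vector: a single one at the index of the mask's first zero.
    second = [0] * len(digits)
    if lzc < len(digits):
        second[lzc] = 1
    # Verification: every position under a leading one must hold a zero
    # digit, and the position flagged by the second vector must be non-zero.
    leading_ok = all(z == 1 for z, m in zip(zero_vec, mask) if m == 1)
    boundary_ok = all(z == 0 for z, s in zip(zero_vec, second) if s == 1)
    return lzc, leading_ok and boundary_ok
```

The two checks together pin the count exactly: the mask proves no leading digit was missed, the second vector proves the count did not overshoot.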

Patent number: 12050885
Abstract: A method for binary division includes the steps of: having a current remainder provided as a sum bit vector and a carry bit vector; performing a carry-save add operation between the sum bit vector, the carry bit vector, and a two's complement representation of a denominator to produce a temporary sum and a temporary carry; predicting a sign bit of a full total of the temporary sum and the temporary carry; and, if the sign bit is 0, updating the remainder with the temporary sum and the temporary carry and incrementing a quotient.
Type: Grant
Filed: January 19, 2021
Date of Patent: July 30, 2024
Assignee: GSI Technology Inc.
Inventor: Dan Ilan
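
The division loop can be sketched in software: the remainder is kept as a (sum, carry) pair, each step carry-save adds the two's complement of the denominator, and the quotient is incremented while the predicted sign stays 0. In this sketch the sign is obtained by actually forming the full total, whereas the patent predicts it without a full carry-propagate add; the helper names and fixed bit width are assumptions:

```python
def carry_save_add(s, c, x, bits):
    # Carry-save addition of three operands: a sum vector and a carry
    # vector are produced without propagating carries.
    mask = (1 << bits) - 1
    new_sum = (s ^ c ^ x) & mask
    new_carry = (((s & c) | (s & x) | (c & x)) << 1) & mask
    return new_sum, new_carry

def divide(numerator, denominator, bits=8):
    # Repeated subtraction via carry-save adds of the denominator's
    # two's complement; stop when the result would go negative.
    neg_d = (-denominator) & ((1 << bits) - 1)
    s, c, q = numerator, 0, 0
    while True:
        ts, tc = carry_save_add(s, c, neg_d, bits)
        total = (ts + tc) & ((1 << bits) - 1)
        if total >> (bits - 1):  # sign bit is 1: stop, keep old remainder
            break
        s, c, q = ts, tc, q + 1  # sign bit is 0: accept and count
    return q, (s + c) & ((1 << bits) - 1)
```

Keeping the remainder in carry-save form is what keeps each iteration free of a slow full-width carry chain.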

Patent number: 12039288
Abstract: A processor-implemented data processing method includes: normalizing input data of an activation function comprising a division operation; determining dividend data corresponding to a dividend of the division operation by reading, from a memory, a value of a first lookup table addressed by the normalized input data; determining divisor data corresponding to a divisor of the division operation by accumulating the dividend data; and determining output data of the activation function corresponding to an output of the division operation obtained by reading, from the memory, a value of a second lookup table addressed by the dividend data and the divisor data.
Type: Grant
Filed: October 16, 2020
Date of Patent: July 16, 2024
Assignee: Samsung Electronics Co., Ltd.
Inventors: Ihor Vasyltsov, Wooseok Chang, Youngnam Hwang
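
One common activation matching this pattern is softmax, where the first table supplies exp() values (the dividends) and their accumulation forms the divisor. A sketch in that spirit; the table size, range, and the final direct division (standing in for the patent's second lookup table) are all assumptions:

```python
import math

def build_exp_lut(levels=16, lo=-8.0, hi=0.0):
    # First lookup table: exp() of each quantized input level.
    step = (hi - lo) / (levels - 1)
    return [math.exp(lo + i * step) for i in range(levels)], lo, step

def lut_softmax(xs, levels=16):
    lut, lo, step = build_exp_lut(levels)
    # Normalize inputs (subtract the max, as in standard softmax) and
    # quantize them into lookup-table addresses.
    m = max(xs)
    idx = [min(levels - 1, max(0, round((x - m - lo) / step))) for x in xs]
    # Dividend data: first-LUT value addressed by each normalized input.
    dividends = [lut[i] for i in idx]
    # Divisor data: accumulation of the dividend data.
    divisor = sum(dividends)
    # Output: the division (here computed directly; the patent reads it
    # from a second LUT addressed by dividend and divisor).
    return [d / divisor for d in dividends]
```

Replacing exp() and the division with table reads removes both transcendental and divider hardware from the datapath, at the cost of quantization error set by the table resolution.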

Patent number: 12026479
Abstract: A Unit Element (UE) has a digital X input and a digital W input, and comprises groups of NAND gates generating complementary outputs which are coupled to differential charge transfer lines through respective charge transfer capacitors Cu. The number of bits in the X input determines the number of NAND gates in a NAND group, and the number of bits in the W input determines the number of NAND groups. Each NAND group receives one bit of the W input applied to all of the NAND gates of the NAND group, and the bits of X are applied to the associated NAND gate inputs of each unit element. The NAND gate outputs are coupled through a charge transfer capacitor Cu to charge transfer lines. Multiple Unit Elements may be placed in parallel to sum and scale the charges from the charge transfer lines, the charges being coupled to an analog-to-digital converter which forms the dot product output.
Type: Grant
Filed: January 31, 2021
Date of Patent: July 2, 2024
Assignee: Ceremorphic, Inc.
Inventors: Martin Kraemer, Ryan Boesch, Wei Xiong

Patent number: 12019702
Abstract: A method and circuit for performing multi-layer vector-matrix multiplication operations may include, at a first multiplier-accumulator (MAC) layer: converting a digital input vector using one-bit digital-to-analog converters (DACs); sequentially performing vector-matrix multiplication operations on the analog DAC signals; and sequentially performing an analog-to-digital (ADC) operation on outputs of the vector-matrix multiplication operations to generate binary partial output vectors. At a second MAC layer, the method and circuit may sequentially receive the binary partial output vectors from the first MAC layer at multi-bit DACs, and sequentially perform vector-matrix multiplication operations to generate a summed binary output for the second MAC layer.
Type: Grant
Filed: November 7, 2020
Date of Patent: June 25, 2024
Assignee: Applied Materials, Inc.
Inventors: She-Hwa Yen, Xiaofeng Zhang

Patent number: 12020000
Abstract: Systems and methods include arithmetic circuitry that generates a floating-point mantissa and includes a propagation network that calculates the floating-point mantissa based on input bits. The systems and methods also include rounding circuitry that rounds the floating-point mantissa. The rounding circuitry includes a multiplexer at a rounding location for the floating-point mantissa that selectively inputs a first input bit of the input bits or a rounding bit. The rounding circuitry also includes an OR gate that ORs a second input bit of the input bits with the rounding bit. Moreover, the second input bit is a less significant bit than the first input bit.
Type: Grant
Filed: December 24, 2020
Date of Patent: June 25, 2024
Assignee: Intel Corporation
Inventors: Martin Langhammer, Alexander Heinecke
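
The mux/OR structure above folds a rounding decision into the mantissa datapath; what such circuitry ultimately computes is ordinary mantissa rounding. A round-to-nearest-even sketch of the arithmetic effect (the patent does not state the rounding mode, and this shows only the result, not the gate-level mechanism):

```python
def round_mantissa(mantissa, drop_bits):
    # Drop 'drop_bits' low bits of the mantissa, rounding to nearest with
    # ties to even: increment when the dropped part exceeds half an ULP,
    # or equals half an ULP while the kept part is odd.
    kept = mantissa >> drop_bits
    dropped = mantissa & ((1 << drop_bits) - 1)
    half = 1 << (drop_bits - 1)
    if dropped > half or (dropped == half and kept & 1):
        kept += 1
    return kept
```

Fusing the rounding bit into the propagation network, as the abstract describes, avoids a separate increment adder after the mantissa is formed.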

Patent number: 12001811
Abstract: A multiply-accumulate calculation device includes: multiple calculation units which generate output signals by multiplying an input signal, corresponding to an input value and having a rising part, a signal part, and a falling part, by a weight, and which output the output signals; an accumulate calculation unit configured to calculate a sum of the output signals output from the multiple calculation units; and a correction unit configured to execute correction processing for correcting the sum of the output signals on the basis of a correction value including at least one of a first value incorporated into the sum by a current flowing into variable resistors of the multiple calculation units due to the rising part of the input signal, and a second value incorporated into the sum by a current flowing into the variable resistors of the multiple calculation units due to the falling part of the input signal.
Type: Grant
Filed: October 11, 2018
Date of Patent: June 4, 2024
Assignee: TDK CORPORATION
Inventors: Kuniyasu Ito, Tatsuo Shibata

Patent number: 11972229
Abstract: Semiconductor devices and multiply-accumulate operation devices are disclosed. In one example, a semiconductor device includes synapses in which a nonvolatile variable resistance element, taking a first resistance value and a second resistance value lower than the first resistance value, and a fixed resistance element, having a resistance value higher than the second resistance value, are connected in series. An output line outputs a sum of currents flowing through the synapses.
Type: Grant
Filed: March 15, 2019
Date of Patent: April 30, 2024
Assignees: Sony Group Corporation, Sony Semiconductor Solutions Corporation
Inventors: Toshiyuki Kobayashi, Rui Morimoto, Jun Okuno, Masanori Tsukamoto, Yusuke Shuto

Patent number: 11966450
Abstract: According to an embodiment, a calculation device includes a memory and one or more processors configured to update, for elements each associated with first and second variables, the first and second variables for each unit time, sequentially for the unit times and alternately between the first and second variables. In a calculation process for each unit time, the one or more processors are configured to: for each of the elements, update the first variable based on the second variable; update the second variable based on the first variables of the elements; when the first variable is smaller than a first value, change the first variable to a value of the first value or more and a threshold value or less; and when the first variable is greater than a second value, change the first variable to a value of the threshold value or more and the second value or less.
Type: Grant
Filed: February 25, 2021
Date of Patent: April 23, 2024
Assignee: Kabushiki Kaisha Toshiba
Inventor: Hayato Goto

Patent number: 11960856
Abstract: A system and/or an integrated circuit including a multiplier-accumulator execution pipeline which includes a plurality of MACs to implement a plurality of multiply and accumulate operations. A first memory stores filter weights having a Gaussian floating point (“GFP”) data format and a first bit length. Data format conversion circuitry includes circuitry to convert the filter weights from the GFP data format and the first bit length to filter weights having a data format and bit length that are different from the GFP data format and the first bit length. The converted filter weights are output to the MACs, wherein, in operation, the MACs are configured to perform the plurality of multiply operations using (a) the input data and (b) the filter weights having the data format and bit length that are different from the GFP data format and the first bit length, respectively.
Type: Grant
Filed: January 4, 2021
Date of Patent: April 16, 2024
Assignee: Flex Logix Technologies, Inc.
Inventor: Frederick A. Ware

Patent number: 11941371
Abstract: Systems, apparatuses, and methods related to bit string accumulation are described. A method for bit string accumulation can include performing an iteration of a recursive operation using a first bit string and a second bit string and modifying a quantity of bits of a result of the iteration of the recursive operation, wherein the modified quantity of bits is less than a threshold quantity of bits. The method can further include writing a first value comprising the modified bits indicative of the result of the iteration of the recursive operation to a first register and writing a second value indicative of the factor corresponding to the result of the iteration of the recursive operation to a second register.
Type: Grant
Filed: January 31, 2022
Date of Patent: March 26, 2024
Assignee: Micron Technology, Inc.
Inventors: Vijay S. Ramesh, Katie Blomster Park

Patent number: 11922133
Abstract: A method includes processing, by an arithmetic and logic unit of a processor, masked data, and keeping, by the arithmetic and logic unit of the processor, the masked data masked throughout their processing by the arithmetic and logic unit. A processor includes an arithmetic and logic unit configured to keep masked data masked throughout processing of the masked data in the arithmetic and logic unit.
Type: Grant
Filed: September 30, 2020
Date of Patent: March 5, 2024
Assignees: STMicroelectronics (Rousset) SAS, STMicroelectronics (Grenoble 2) SAS
Inventors: Rene Peyrard, Fabrice Romain, Jean-Michel Derien, Christophe Eichwald

Patent number: 11922240
Abstract: A multiplier-accumulator accepts A and B digital inputs and generates a dot product P by applying the bits of the A input and the bits of the B input to unit elements comprised of groups of AND gates coupled to charge transfer lines through a capacitor Cu. The number of bits in the B input is the number of AND groups, and the number of bits in A is the number of AND gates in an AND group. Each unit element receives one bit of the B input applied to all of the AND gates of the unit element, and the bits of A are applied to the associated AND gate inputs of each unit element. The AND gates are coupled to charge transfer lines through a capacitor Cu, and the charge transfer lines couple to binary-weighted charge summing capacitors which sum and scale the charges from the charge transfer lines, the charge being coupled to an analog-to-digital converter which forms the dot product output. The charge transfer lines may span multiple unit elements.
Type: Grant
Filed: December 31, 2020
Date of Patent: March 5, 2024
Assignee: Ceremorphic, Inc.
Inventors: Ryan Boesch, Martin Kraemer, Wei Xiong

Patent number: 11909422
Abstract: A deep neural network (“DNN”) module compresses and decompresses neuron-generated activation data to reduce the utilization of memory bus bandwidth. The compression unit receives an uncompressed chunk of data generated by a neuron in the DNN module. The compression unit generates a mask portion and a data portion of a compressed output chunk. The mask portion encodes the presence and location of the zero and non-zero bytes in the uncompressed chunk of data. The data portion stores truncated non-zero bytes from the uncompressed chunk of data. A decompression unit receives a compressed chunk of data from memory in the DNN processor or memory of an application host. The decompression unit decompresses the compressed chunk of data using the mask portion and the data portion.
Type: Grant
Filed: November 11, 2022
Date of Patent: February 20, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Joseph Leon Corkery, Benjamin Eliot Lundell, Larry Marvin Wall, Chad Balling McBride, Amol Ashok Ambardekar, George Petre, Kent D. Cedola, Boris Bobrov
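
The mask/data scheme is straightforward to state in software: one mask bit per byte records whether that byte is non-zero, and the data portion keeps only the non-zero bytes. A round-trip sketch (the patent's byte truncation is omitted here):

```python
def compress(chunk):
    # Mask portion: one bit per byte, 1 where the byte is non-zero.
    mask = [1 if b != 0 else 0 for b in chunk]
    # Data portion: only the non-zero bytes, in order.
    data = [b for b in chunk if b != 0]
    return mask, data

def decompress(mask, data):
    # Walk the mask: emit a stored byte for each 1, a zero for each 0.
    it = iter(data)
    return [next(it) if m else 0 for m in mask]
```

Since ReLU-style activations are mostly zero, the data portion is typically much smaller than the chunk, which is where the memory-bandwidth saving comes from.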