Patents Examined by Jonathan David Warner
  • Patent number: 12572330
    Abstract: One example method includes receiving, from a caller, a call for a data stream of a specified size, initializing the data stream by specifying a first prime number and a second prime number, both of which may be 32-bit primes, and by specifying an available amount of data. The method further includes generating data of the data stream using the first prime number and the second prime number, and transmitting the data of the data stream to the caller until either the data stream has fulfilled the call, or until the available amount of data becomes zero. During the transmitting, the method includes maintaining a running counter that starts at the available amount of data, and decrementing the counter by the amount of data sent to the caller.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: March 10, 2026
    Assignee: Dell Products L.P.
    Inventors: Salil Dangi, Eugene Kim, Amol Powar
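The counter logic in this abstract can be illustrated with a minimal sketch. The abstract does not say how the two primes produce data, so the mixing step below is a hypothetical stand-in, and `stream_chunks`, `chunk`, `P1`, and `P2` are illustrative names rather than anything taken from the patent.

```python
from typing import Iterator

# Two 32-bit primes chosen only for illustration (2**32 - 5 and 2**32 - 17).
P1 = 4294967291
P2 = 4294967279


def stream_chunks(requested: int, available: int, chunk: int = 8,
                  p1: int = P1, p2: int = P2) -> Iterator[bytes]:
    """Yield chunks until the request is met or the available amount hits zero."""
    remaining = available    # running counter starts at the available amount
    outstanding = requested  # bytes the caller still expects
    state = p1
    while outstanding > 0 and remaining > 0:
        n = min(chunk, outstanding, remaining)
        out = bytearray()
        while len(out) < n:
            # Hypothetical mixing step: an assumption, not the patented generator.
            state = (state * p1 + p2) % 2**32
            out += state.to_bytes(4, "big")
        yield bytes(out[:n])
        outstanding -= n
        remaining -= n       # decrement the counter by the amount actually sent
```

Calling `b"".join(stream_chunks(20, 100))` returns 20 bytes because the call is fulfilled first, while `stream_chunks(50, 12)` stops after 12 bytes because the counter reaches zero.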
  • Patent number: 12561393
    Abstract: A graphics processing unit (GPU) schedules recurrent matrix multiplication operations at different subsets of CUs of the GPU. The GPU includes a scheduler that receives sets of recurrent matrix multiplication operations, such as multiplication operations associated with a recurrent neural network (RNN). The multiple operations associated with, for example, an RNN layer are fused into a single kernel, which is scheduled by the scheduler such that one work group is assigned per compute unit, thus assigning different ones of the recurrent matrix multiplication operations to different subsets of the CUs of the GPU. In addition, via software synchronization of the different workgroups, the GPU pipelines the assigned matrix multiplication operations so that each subset of CUs provides corresponding multiplication results to a different subset, and so that each subset of CUs executes at least a portion of the multiplication operations concurrently.
    Type: Grant
    Filed: October 12, 2021
    Date of Patent: February 24, 2026
    Assignee: Advanced Micro Devices, Inc.
    Inventor: Milind N. Nemlekar
  • Patent number: 12561115
    Abstract: The present disclosure relates to a computing device for processing a multi-bit width value, an integrated circuit board card, a method, and a computer readable storage medium. The computing device may be included in a combined processing apparatus, and the combined processing apparatus may further include a general interconnection interface, and an other processing device. The computing device interacts with the other processing device to jointly complete a computing operation specified by a user. The combined processing apparatus may further include a storage device connected to an apparatus and the other processing device and configured to store data of the apparatus and the other processing device. The solution of the present disclosure can split the multi-bit width value so that the processing capability of the processor is not influenced by the bit width.
    Type: Grant
    Filed: December 20, 2021
    Date of Patent: February 24, 2026
    Assignee: ANHUI CAMBRICON INFORMATION TECHNOLOGY CO., LTD
    Inventors: Shaoli Liu, Daofu Liu, Shiyi Zhou
  • Patent number: 12561551
    Abstract: A sampler for executing a graph neural network (GNN) model is disclosed. The sampler is configured to implement random sampling for neighbor nodes around a specified node of a GNN model, and performs: obtaining a quantity of neighbor nodes around the specified node and a target number of neighbor nodes to be sampled; dividing a range into a plurality of subranges based on the target number; generating random numbers; determining a plurality of integer values within the plurality of subranges based on the random numbers; determining index values of the target number of neighbor nodes to be sampled by matching index values of the neighbor nodes and the plurality of determined integer values; and writing the determined index values into an output buffer. The sampler provided in the present disclosure can uniformly sample the neighbor nodes around the specified node for the specified node.
    Type: Grant
    Filed: February 22, 2022
    Date of Patent: February 24, 2026
    Assignee: Alibaba (China) Co., Ltd.
    Inventors: Tianchan Guan, Yanhong Wang, Shuangchen Li, Heng Liu, Hongzhong Zheng
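The subrange-based sampling steps listed in this abstract resemble stratified sampling, which can be sketched as follows. This is a software illustration under assumptions: the range is taken to be the neighbor index range `[0, num_neighbors)`, it is split into `target` near-equal subranges, and the function name and the returned list (standing in for the output buffer) are hypothetical.

```python
import random
from typing import List, Optional


def sample_neighbor_indices(num_neighbors: int, target: int,
                            rng: Optional[random.Random] = None) -> List[int]:
    """Pick one neighbor index from each of `target` subranges of [0, num_neighbors)."""
    if rng is None:
        rng = random.Random()
    if target >= num_neighbors:
        # Fewer neighbors than requested: return all of them (an assumed policy).
        return list(range(num_neighbors))
    picks = []
    for i in range(target):
        lo = i * num_neighbors // target        # inclusive start of subrange i
        hi = (i + 1) * num_neighbors // target  # exclusive end of subrange i
        # Map a random number in [0, 1) to an integer inside the subrange.
        picks.append(lo + int(rng.random() * (hi - lo)))
    return picks
```

Because each subrange contributes exactly one index, the picks are distinct and spread across the whole neighbor list, with no rejection loop.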
  • Patent number: 12554462
    Abstract: Asynchronous full-adder circuit is described. The full-adder includes majority and/or minority gates some of which receive two first inputs (A.t, A.f), two second inputs (B.t, B.f), two carry inputs (Cin.t, Cin.f), third acknowledgement input (Cout.e), and fourth acknowledgement input (Sum.e), and generate controls to control gates of transistors, wherein the transistors are coupled to generate two carry outputs (Cout.t, Cout.e), two sum outputs (Sum.t, Sum.e), first acknowledgement output (A.e), second acknowledgement output (B.e), and third acknowledgement output (Cin.e). The majority and/or minority gates comprise CMOS gates or multi-input capacitive circuitries. The multi-input capacitive circuitries include capacitive structures that may comprise linear dielectric, paraelectric dielectric, or ferroelectric dielectric. The capacitors can be planar or non-planar. The capacitors may be stacked vertically to reduce footprint of the asynchronous full-adder circuit.
    Type: Grant
    Filed: February 8, 2022
    Date of Patent: February 17, 2026
    Assignee: Kepler Computing Inc.
    Inventors: Nabil Imam, Amrita Mathuriya, Ikenna Odinaka, Rafael Rios, Rajeev Kumar Dokania, Sasikanth Manipatruni
  • Patent number: 12554464
    Abstract: This disclosure is directed to a digital signal processing (DSP) block that includes multiple weight registers configurable to receive and store a first plurality of values having multiple precisions, and multiple multipliers that are each configurable to receive a respective value of the first plurality of values. The DSP block further includes one or more inputs configurable to receive a second plurality of values, and a multiplexer network configurable to receive the second plurality of values and route each respective value of the second plurality of values to a multiplier of the multipliers. The multipliers are configurable to simultaneously multiply each value of the first plurality of values by a respective value of the second plurality of values to generate a plurality of products. Additionally, the DSP block includes adder circuitry configurable to generate a first sum and a second sum based on the plurality of products.
    Type: Grant
    Filed: December 22, 2021
    Date of Patent: February 17, 2026
    Assignee: Altera Corporation
    Inventors: Martin Langhammer, Michael Wu, Nihat Engin Tunali
  • Patent number: 12524205
    Abstract: Asynchronous full-adder circuit is described. The full-adder includes majority and/or minority gates some of which receive two first inputs (A.t, A.f), two second inputs (B.t, B.f), two carry inputs (Cin.t, Cin.f), third acknowledgement input (Cout.e), and fourth acknowledgement input (Sum.e), and generate controls to control gates of transistors, wherein the transistors are coupled to generate two carry outputs (Cout.t, Cout.e), two sum outputs (Sum.t, Sum.e), first acknowledgement output (A.e), second acknowledgement output (B.e), and third acknowledgement output (Cin.e). The majority and/or minority gates comprise CMOS gates or multi-input capacitive circuitries. The multi-input capacitive circuitries include capacitive structures that may comprise linear dielectric, paraelectric dielectric, or ferroelectric dielectric. The capacitors can be planar or non-planar. The capacitors may be stacked vertically to reduce footprint of the asynchronous full-adder circuit.
    Type: Grant
    Filed: February 8, 2022
    Date of Patent: January 13, 2026
    Assignee: Kepler Computing Inc.
    Inventors: Nabil Imam, Amrita Mathuriya, Ikenna Odinaka, Rafael Rios, Rajeev Kumar Dokania, Sasikanth Manipatruni
  • Patent number: 12517701
    Abstract: A multiplier cell is derived from a 1-bit full adder and an AND gate. The 1-bit full adder is derived from majority and/or minority gates. The majority and/or minority gates include non-linear polar material (e.g., ferroelectric or paraelectric material). A reset mechanism is provided to reset the nodes across the non-linear polar material. The multiplier cell is a hybrid of majority and/or minority gates and complementary metal oxide semiconductor (CMOS) based inverters and/or buffers. The adder uses a non-linear polar capacitor to retain charge with fewer transistors than traditional CMOS sequential circuits. The non-linear polar capacitor includes ferroelectric material, paraelectric material, or non-linear dielectric. Input signals are received by respective terminals of capacitors having non-linear polar material. The other terminals of these capacitors are coupled to a node where the majority function takes place for the inputs.
    Type: Grant
    Filed: October 1, 2021
    Date of Patent: January 6, 2026
    Assignee: Kepler Computing Inc.
    Inventors: Amrita Mathuriya, Rafael Rios, Ikenna Odinaka, Rajeev Kumar Dokania, Sasikanth Manipatruni
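The logic of a multiplier cell built from an AND gate and a majority-gate full adder can be checked with a short behavioral model. The sketch below uses a well-known majority-gate construction (carry from a 3-input majority, sum from a 5-input majority that takes the inverted carry twice). It is an assumption that this matches the patented circuit, and it models logic values only, not the non-linear polar capacitors or the reset mechanism.

```python
def maj(*bits: int) -> int:
    """Majority gate: 1 when more than half of the inputs are 1."""
    return int(sum(bits) > len(bits) // 2)


def full_adder(a: int, b: int, cin: int):
    """1-bit full adder expressed with majority gates only."""
    cout = maj(a, b, cin)
    ncout = 1 - cout  # inverted carry, which a minority gate would give directly
    s = maj(a, b, cin, ncout, ncout)
    return s, cout


def multiplier_cell(x: int, y: int, sum_in: int, carry_in: int):
    """AND the two operand bits, then add the partial product to the incoming bits."""
    return full_adder(x & y, sum_in, carry_in)
```

For example, `multiplier_cell(1, 1, 1, 0)` yields a sum bit of 0 and a carry bit of 1, since the partial product 1 plus the incoming sum 1 equals binary 10.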
  • Patent number: 12511344
    Abstract: Embodiments of the present disclosure include systems and methods for fusing operators for neural network hardware accelerators. A plurality of vector multiplication operations in a data path of a mapping function included in a neural network are identified. The plurality of vector multiplication operations are combined into a single vector multiplication operation in the data path of the mapping function.
    Type: Grant
    Filed: December 22, 2021
    Date of Patent: December 30, 2025
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jinwen Xi, Eric S Chung
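One simple case of combining several vector multiplications into one is a chain of element-wise multiplications by constant vectors. The sketch below assumes that case only; the abstract does not specify which multiplications are fused, and `fuse_scales` and `apply_scale` are hypothetical helper names.

```python
from functools import reduce
from operator import mul
from typing import List, Sequence


def fuse_scales(*scales: Sequence[int]) -> List[int]:
    """Precompute the element-wise product of several constant scale vectors."""
    return [reduce(mul, column) for column in zip(*scales)]


def apply_scale(x: Sequence[int], scale: Sequence[int]) -> List[int]:
    """A single element-wise vector multiplication."""
    return [xi * si for xi, si in zip(x, scale)]


x = [1, 2, 3]
a = [2, 3, 4]
b = [5, 6, 7]
unfused = apply_scale(apply_scale(x, a), b)  # two multiplications in the data path
fused = apply_scale(x, fuse_scales(a, b))    # one multiplication in the data path
```

Both paths give `[10, 36, 84]`, but the fused path multiplies the input once, with the combined scale computed ahead of time. With floating-point values the two results may differ slightly because of rounding.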
  • Patent number: 12504951
    Abstract: The present disclosure provides a computing device for processing a multi-bit width value, an integrated circuit board card, a method, and a computer readable storage medium. The computing device is included in the combined processing apparatus, and the combined processing apparatus further includes a general interconnection interface, and other processing devices. The computing device interacts with the other processing device to jointly complete a computing operation specified by a user. The combined processing apparatus further includes a storage device connected to an apparatus and the other processing devices and configured to store data of the apparatus and the other processing device. The solution of the present disclosure can split the multi-bit width value so that the processing capability of the processor is not influenced by the bit width.
    Type: Grant
    Filed: December 21, 2021
    Date of Patent: December 23, 2025
    Assignee: ANHUI CAMBRICON INFORMATION TECHNOLOGY CO., LTD
    Inventors: Shaoli Liu, Shiyi Zhou, Daofu Liu
  • Patent number: 12481481
    Abstract: Asynchronous full-adder circuit is described. The full-adder includes majority and/or minority gates some of which receive two first inputs (A.t, A.f), two second inputs (B.t, B.f), two carry inputs (Cin.t, Cin.f), third acknowledgement input (Cout.e), and fourth acknowledgement input (Sum.e), and generate controls to control gates of transistors, wherein the transistors are coupled to generate two carry outputs (Cout.t, Cout.e), two sum outputs (Sum.t, Sum.e), first acknowledgement output (A.e), second acknowledgement output (B.e), and third acknowledgement output (Cin.e). The majority and/or minority gates comprise CMOS gates or multi-input capacitive circuitries. The multi-input capacitive circuitries include capacitive structures that may comprise linear dielectric, paraelectric dielectric, or ferroelectric dielectric. The capacitors can be planar or non-planar. The capacitors may be stacked vertically to reduce footprint of the asynchronous full-adder circuit.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: November 25, 2025
    Assignee: Kepler Computing Inc.
    Inventors: Amrita Mathuriya, Nabil Imam, Ikenna Odinaka, Rafael Rios, Rajeev Kumar Dokania, Sasikanth Manipatruni
  • Patent number: 12481480
    Abstract: A computing device for floating-point mathematical operations using a look-up table is provided. The computing device includes: a bit arrangement unit used for receiving floating-point input data and performing a bit arrangement or a format conversion on the floating-point input data to generate multiple index blocks; a first look-up table unit group used for receiving the index blocks and performing a look-up operation using the index blocks as indices to generate a plurality of look-up table results; and an operation unit used for performing an operation on the look-up table results of the first look-up table unit group to generate an operation output.
    Type: Grant
    Filed: October 21, 2021
    Date of Patent: November 25, 2025
    Assignee: Shenzhen Suanhai Technology Co. Ltd.
    Inventors: Yuan-Hsiang Kuo, Chia-Lin Lu, Wei-Chun Chang, Hao-Cing Jhou, Jen-Shi Wu, Tsung-Hsien Lin
  • Patent number: 12481718
    Abstract: Examples herein describe a hardware accelerator for affine transformations (matrix multiplications followed by additions) using an outer products process. In general, the hardware accelerator reduces memory bandwidth by computing matrix multiplications as a sum of outer products. Moreover, the sum of outer products benefits parallel hardware that accelerates matrix multiplication, and is compatible with both scalar and block affine transformations, and more generally, both scalar and block matrix multiplications.
    Type: Grant
    Filed: September 10, 2021
    Date of Patent: November 25, 2025
    Assignee: XILINX, INC.
    Inventor: Ephrem Wu
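The identity behind this abstract, that a matrix product equals a sum of outer products of the columns of the first matrix with the rows of the second, can be shown in a few lines. This is a plain software sketch of the arithmetic only, not of the accelerator; the function name and the list-of-lists layout are assumptions.

```python
from typing import List

Matrix = List[List[int]]


def affine_outer(A: Matrix, B: Matrix, C: Matrix) -> Matrix:
    """Return A @ B + C, accumulating one outer product per inner index k."""
    m, n, p = len(A), len(B), len(B[0])
    out = [row[:] for row in C]  # start from the additive term of the affine map
    for k in range(n):
        col = [A[i][k] for i in range(m)]  # k-th column of A, read once
        row = B[k]                         # k-th row of B, read once
        for i in range(m):
            for j in range(p):
                out[i][j] += col[i] * row[j]  # rank-1 (outer product) update
    return out


result = affine_outer([[1, 2], [3, 4]], [[5, 6], [7, 8]], [[1, 1], [1, 1]])
```

Each step `k` touches one column of `A` and one row of `B` and updates every output element, which is the access pattern that lets the inner two loops run in parallel.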
  • Patent number: 12436739
    Abstract: A multiplier cell is derived from a 1-bit full adder and an AND gate. The 1-bit full adder is derived from majority and/or minority gates. The majority and/or minority gates include non-linear polar material (e.g., ferroelectric or paraelectric material). A reset mechanism is provided to reset the nodes across the non-linear polar material. The multiplier cell is a hybrid of majority and/or minority gates and complementary metal oxide semiconductor (CMOS) based inverters and/or buffers. The adder uses a non-linear polar capacitor to retain charge with fewer transistors than traditional CMOS sequential circuits. The non-linear polar capacitor includes ferroelectric material, paraelectric material, or non-linear dielectric. Input signals are received by respective terminals of capacitors having non-linear polar material. The other terminals of these capacitors are coupled to a node where the majority function takes place for the inputs.
    Type: Grant
    Filed: October 1, 2021
    Date of Patent: October 7, 2025
    Assignee: Kepler Computing Inc.
    Inventors: Amrita Mathuriya, Rafael Rios, Ikenna Odinaka, Rajeev Kumar Dokania, Sasikanth Manipatruni
  • Patent number: 12411657
    Abstract: Asynchronous full-adder circuit is described. The full-adder includes majority and/or minority gates some of which receive two first inputs (A.t, A.f), two second inputs (B.t, B.f), two carry inputs (Cin.t, Cin.f), third acknowledgement input (Cout.e), and fourth acknowledgement input (Sum.e), and generate controls to control gates of transistors, wherein the transistors are coupled to generate two carry outputs (Cout.t, Cout.e), two sum outputs (Sum.t, Sum.e), first acknowledgement output (A.e), second acknowledgement output (B.e), and third acknowledgement output (Cin.e). The majority and/or minority gates comprise CMOS gates or multi-input capacitive circuitries. The multi-input capacitive circuitries include capacitive structures that may comprise linear dielectric, paraelectric dielectric, or ferroelectric dielectric. The capacitors can be planar or non-planar. The capacitors may be stacked vertically to reduce footprint of the asynchronous full-adder circuit.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: September 9, 2025
    Assignee: Kepler Computing Inc.
    Inventors: Nabil Imam, Amrita Mathuriya, Ikenna Odinaka, Rafael Rios, Rajeev Kumar Dokania, Sasikanth Manipatruni
  • Patent number: 12405768
    Abstract: Asynchronous full-adder circuit is described. The full-adder includes majority and/or minority gates some of which receive two first inputs (A.t, A.f), two second inputs (B.t, B.f), two carry inputs (Cin.t, Cin.f), third acknowledgement input (Cout.e), and fourth acknowledgement input (Sum.e), and generate controls to control gates of transistors, wherein the transistors are coupled to generate two carry outputs (Cout.t, Cout.e), two sum outputs (Sum.t, Sum.e), first acknowledgement output (A.e), second acknowledgement output (B.e), and third acknowledgement output (Cin.e). The majority and/or minority gates comprise CMOS gates or multi-input capacitive circuitries. The multi-input capacitive circuitries include capacitive structures that may comprise linear dielectric, paraelectric dielectric, or ferroelectric dielectric. The capacitors can be planar or non-planar. The capacitors may be stacked vertically to reduce footprint of the asynchronous full-adder circuit.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: September 2, 2025
    Assignee: Kepler Computing Inc.
    Inventors: Nabil Imam, Amrita Mathuriya, Ikenna Odinaka, Rafael Rios, Rajeev Kumar Dokania, Sasikanth Manipatruni
  • Patent number: 12399684
    Abstract: In some aspects of the present disclosure, an adder tree circuit is disclosed. In some aspects, the adder tree circuit includes a plurality of full adders (FAs) including: a first subgroup of FAs, wherein each FA of the first subgroup includes a first number of transistors; and a second subgroup of FAs, wherein each FA of the second subgroup includes a second number of transistors, the first number being greater than the second number; wherein each FA of the first subgroup receives a first input from a first one of the second subgroup of FAs and a second input from a second one of the second subgroup of FAs, and each FA provides a first output to a third one of the second subgroup of FAs and a second output to a fourth one of the second subgroup of FAs.
    Type: Grant
    Filed: November 22, 2021
    Date of Patent: August 26, 2025
    Assignee: Taiwan Semiconductor Manufacturing Company, Ltd.
    Inventors: Chia-Fu Lee, Po-Hao Lee, Yi-Chun Shih, Yu-Der Chih
  • Patent number: 12379898
    Abstract: Asynchronous full-adder circuit is described. The full-adder includes majority and/or minority gates some of which receive two first inputs (A.t, A.f), two second inputs (B.t, B.f), two carry inputs (Cin.t, Cin.f), third acknowledgement input (Cout.e), and fourth acknowledgement input (Sum.e), and generate controls to control gates of transistors, wherein the transistors are coupled to generate two carry outputs (Cout.t, Cout.e), two sum outputs (Sum.t, Sum.e), first acknowledgement output (A.e), second acknowledgement output (B.e), and third acknowledgement output (Cin.e). The majority and/or minority gates comprise CMOS gates or multi-input capacitive circuitries. The multi-input capacitive circuitries include capacitive structures that may comprise linear dielectric, paraelectric dielectric, or ferroelectric dielectric. The capacitors can be planar or non-planar. The capacitors may be stacked vertically to reduce footprint of the asynchronous full-adder circuit.
    Type: Grant
    Filed: February 7, 2022
    Date of Patent: August 5, 2025
    Assignee: Kepler Computing Inc.
    Inventors: Nabil Imam, Amrita Mathuriya, Ikenna Odinaka, Rafael Rios, Rajeev Kumar Dokania, Sasikanth Manipatruni
  • Patent number: 12379897
    Abstract: A processing unit for multiplying a first value by a first multiplicand, or for multiplying the first value by, in each instance, a second and third multiplicand. The processing unit receives the multiplicands in a logarithmic number format, so that the multiplicands are each present in the form of at least one exponent at a specifiable base. The processing unit includes a first register, in which either two exponents of the first multiplicand or the exponent of the second and the exponent of the third multiplicand are stored. A set configuration bit indicates whether either the two exponents of the first multiplicand or the exponent of the second and the exponent of the third multiplicand are stored in the first register. The processing unit includes at least two bitshift operators. A method and a computer program for multiplying the value by the multiplicand are also described.
    Type: Grant
    Filed: July 14, 2020
    Date of Patent: August 5, 2025
    Assignee: ROBERT BOSCH GMBH
    Inventor: Sebastian Vogel
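The two modes described in this abstract can be sketched with integer shifts. The sketch assumes base 2, non-negative integer exponents, and a multiplicand in the two-exponent case equal to the sum of two powers of two; the function name and the meaning assigned to the configuration flag are hypothetical.

```python
from typing import Tuple


def shift_multiply(value: int, exp_a: int, exp_b: int,
                   two_multiplicands: bool) -> Tuple[int, ...]:
    """Multiply `value` using two shifts, interpreted according to a config flag."""
    shifted_a = value << exp_a  # value * 2**exp_a
    shifted_b = value << exp_b  # value * 2**exp_b
    if two_multiplicands:
        # The register holds one exponent for each of two separate multiplicands.
        return (shifted_a, shifted_b)
    # The register holds two exponents of a single multiplicand 2**exp_a + 2**exp_b.
    return (shifted_a + shifted_b,)
```

With `value = 5` and exponents 3 and 1, the single-multiplicand mode gives `(50,)` because 5 * (8 + 2) = 50, and the two-multiplicand mode gives `(40, 10)`.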
  • Patent number: 12333269
    Abstract: A dot product array comprises dot product circuits each to process a respective pair of first and second input vectors to generate a respective dot product result. In a real number mode, each dot product result and vector element represents a respective real number. In a hypercomplex number mode, an input vector manipulation is applied to at least one of the first/second input vectors to be supplied to each dot product circuit, to cause the dot product array to generate hypercomplex dot product results each indicating a sum of hypercomplex products of corresponding pairs of hypercomplex numbers. In the hypercomplex number mode, respective subsets of elements of the first/second input vectors represent respective hypercomplex numbers, for which respective components are represented by different elements of the subset, and each hypercomplex dot product result comprises components represented by the dot product results generated by a corresponding group of at least two dot product circuits.
    Type: Grant
    Filed: September 14, 2021
    Date of Patent: June 17, 2025
    Assignee: Arm Limited
    Inventors: Dominic Hugo Symes, Fredrik Peter Stolt
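For complex numbers, the simplest hypercomplex case, the input vector manipulation described in this abstract can be sketched as follows. Real and imaginary components are interleaved in one real vector, and two ordinary real dot products on manipulated copies of the second vector yield the two components of the result. The interleaved layout and all function names are assumptions made for illustration.

```python
from typing import List, Sequence


def real_dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Ordinary real dot product, standing in for one dot product circuit."""
    return sum(a * b for a, b in zip(u, v))


def interleave(zs: Sequence[complex]) -> List[float]:
    """Lay out complex numbers as [re0, im0, re1, im1, ...]."""
    out: List[float] = []
    for z in zs:
        out += [z.real, z.imag]
    return out


def complex_dot(xs: Sequence[complex], ys: Sequence[complex]) -> complex:
    """Sum of complex products computed with two real dot products."""
    a = interleave(xs)
    b = interleave(ys)
    # Real part: pair (re, im) with (re, -im), giving re*re - im*im.
    b_for_real = [(-v if i % 2 else v) for i, v in enumerate(b)]
    # Imaginary part: swap each pair, giving re*im + im*re.
    b_for_imag = [b[i ^ 1] for i in range(len(b))]
    return complex(real_dot(a, b_for_real), real_dot(a, b_for_imag))
```

For `xs = [1+2j, 3-1j]` and `ys = [2-1j, 1+4j]`, the two real dot products give 11 and 14, matching the direct sum of products `11+14j`.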