Patents by Inventor Martin Langhammer

Martin Langhammer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230222275
    Abstract: A method is provided for processing code for a circuit design for an integrated circuit using a computer system. The method includes receiving at least a portion of the code for the circuit design for the integrated circuit, wherein the portion of the code comprises an error or has incomplete constraints, making an assumption about the error and the missing constraints using a computer aided design tool, and generating a revised circuit design for the integrated circuit with the error corrected and any missing constraints added based on the assumption and based on the code using the computer aided design tool and a library of components for circuit designs.
    Type: Application
    Filed: March 16, 2023
    Publication date: July 13, 2023
    Applicant: Intel Corporation
    Inventors: Gregg Baeckler, Mahesh A. Iyer, Martin Langhammer
  • Publication number: 20230195416
    Abstract: An integrated circuit is provided that includes via-configured structured logic circuitry and an embedded arithmetic block that interfaces with the via-configured structured logic circuitry to perform an arithmetic function. The embedded arithmetic block includes at least one monolithic arithmetic circuit that can perform the arithmetic function more efficiently or taking up less die space than a comparable circuit formed from the via-configured structured logic circuitry.
    Type: Application
    Filed: December 22, 2021
    Publication date: June 22, 2023
    Inventors: Sankaran Menon, Martin Langhammer, Mustansir Fanaswalla, Kuldeep Simha
  • Patent number: 11662979
    Abstract: An integrated circuit that includes very large adder circuitry is provided. The very large adder circuitry receives more than two inputs, each of which has hundreds or thousands of bits. The very large adder circuitry includes multiple adder nodes arranged in a tree-like network. The adder nodes divide the input operands into segments, compute the sum for each segment, and compute the carry for each segment independently from the segment sums. The carries at each level in the tree are accumulated using population counters. After the last node in the tree, the segment sums can then be combined with the carries to determine the final sum output. An adder tree network implemented in this way asymptotically approaches the area and performance latency of an adder network that uses infinite speed ripple carry adders.
    Type: Grant
    Filed: November 19, 2020
    Date of Patent: May 30, 2023
    Assignee: Intel Corporation
    Inventor: Martin Langhammer
  • Patent number: 11656872
    Abstract: The present disclosure describes a digital signal processing (DSP) block that includes a plurality of columns of weight registers and a plurality of inputs configured to receive a first plurality of values and a second plurality of values. The first plurality of values is stored in the plurality of columns of weight registers after being received. In a first mode of operation, the first and second pluralities of values are received via a first portion of the plurality of inputs. In a second mode of operation, the first plurality of values is received via a second portion of the plurality of inputs, and the second plurality of values is received via the first portion of the plurality of inputs. Additionally, the DSP block includes a plurality of multipliers configured to simultaneously multiply each value of the first plurality of values by each value of the second plurality of values.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: May 23, 2023
    Assignee: Intel Corporation
    Inventor: Martin Langhammer
  • Publication number: 20230116554
    Abstract: A processor circuit includes a compiler configured to receive a software program that comprises software code coded in an assembly language and inline software code coded in a high-level programming language, compile the inline software code coded in the high-level programming language within the software program into assembly code in the assembly language, and compile the assembly code and the software code coded in the assembly language into machine code for the processor circuit. A method includes determining if first and second instructions in a software program are combinable into one instruction word, combining the first and the second instructions in the software program into one instruction word if the first and the second instructions are combinable, and fetching the instruction word into a single register by storing the instruction word in the single register.
    Type: Application
    Filed: December 7, 2022
    Publication date: April 13, 2023
    Applicant: Intel Corporation
    Inventors: Gregg Baeckler, Martin Langhammer
  • Publication number: 20230037575
    Abstract: An integrated circuit is provided that includes compression or decompression circuitry along a datapath. An integrated circuit system may include first memory to store data, data utilization circuitry to operate on the data, and a configurable data distribution path to transfer data between the first memory and the data utilization circuitry. Compression or decompression circuitry may be disposed along the data distribution path between the first memory and the data utilization circuitry to enable the first memory to store the data in compressed form and to enable the data utilization circuitry to operate on the data in uncompressed form. The compression or decompression circuitry may use lossless sparse encoding, lossless multi-precision encoding, lossless prefix lookup table-based encoding, Huffman encoding, selective compression, or lossy compression.
    Type: Application
    Filed: September 30, 2022
    Publication date: February 9, 2023
    Inventors: Michael Wu, Nihat Engin Tunali, Martin Langhammer
  • Publication number: 20230027064
    Abstract: Systems and methods of the present disclosure provide techniques for reducing power consumption of a large combinational circuit using register insertion. In particular, a large circuit may be analyzed to determine the amount of signal switching at various logical points (e.g., stages in the computation) of the circuit. A clock sequence with many pulses in the period of a clock that runs the large combinational circuit may be generated. To balance the amount of signal switching at various logical points in the circuit, registers may be inserted at certain points in the large circuit with the clock pulses of the clock sequence assigned to the registers that may not have a constant frequency or may be phase shifted versions of the main clock.
    Type: Application
    Filed: September 30, 2022
    Publication date: January 26, 2023
    Inventors: Martin Langhammer, Gregg William Baeckler, Sergey Vladimirovich Gribok, Mahesh A. Iyer
  • Publication number: 20230021396
    Abstract: A method for implementing an artificial neural network in a computing system that comprises performing a compute operation using an input activation and a weight to generate an output activation, and modifying the output activation using a noise value to increase activation sparsity.
    Type: Application
    Filed: September 27, 2022
    Publication date: January 26, 2023
    Applicant: Intel Corporation
    Inventors: Nihat Tunali, Arnab Raha, Bogdan Pasca, Martin Langhammer, Michael Wu, Deepak Mathaikutty
  • Publication number: 20230026331
    Abstract: A circuit system for performing modular reduction of a modular multiplication includes multiplier circuits that receive a first subset of coefficients that are generated by summing partial products of a multiplication operation that is part of the modular multiplication. The multiplier circuits multiply the coefficients in the first subset by constants that equal remainders of divisions to generate products. Adder circuits add a second subset of the coefficients and segments of bits of the products that are aligned with respective ones of the second subset of the coefficients to generate sums.
    Type: Application
    Filed: September 23, 2022
    Publication date: January 26, 2023
    Applicant: Intel Corporation
    Inventors: Sergey Gribok, Bogdan Pasca, Martin Langhammer
  • Publication number: 20230018414
    Abstract: The present disclosure describes techniques for incorporating pipelined DSP blocks or other types of embedded functions into a logic circuit with a slower clock rate without any clock crossing complexities, and at the same time managing the power consumption of the more complex design that results from it. The techniques include generating a faster clock or several faster clocks that may have a faster clock rate than the clock used by the logic circuit and that may be used as clock input to the embedded pipelined DSP blocks. In addition, the present disclosure describes techniques for generating, improving, and using the faster clock to sample the output of a logic circuit using pulses of the generated faster clock, which may allow the clock frequency of the circuit to be increased to an optimal level while maintaining functional correctness.
    Type: Application
    Filed: September 29, 2022
    Publication date: January 19, 2023
    Inventors: Martin Langhammer, Gregg William Baeckler, Sergey Vladimirovich Gribok, Mahesh A. Iyer
  • Patent number: 11556692
    Abstract: Techniques for designing and implementing networks-on-chip (NoCs) are provided. For example, a computer-implemented method for programming a network-on-chip (NoC) onto an integrated circuit includes determining a first portion of a plurality of registers to potentially be included in a NoC design, determining routing information regarding datapaths between registers of the first portion of the plurality of registers, and determining an expected performance associated with the first portion of the plurality of registers. The method also includes determining whether the expected performance is within a threshold range, including the first portion of the plurality of registers and the datapaths in the NoC design after determining that the expected performance is within the threshold range, and generating instructions configured to cause circuitry corresponding to the NoC design to be implemented on the integrated circuit.
    Type: Grant
    Filed: December 24, 2020
    Date of Patent: January 17, 2023
    Assignee: Intel Corporation
    Inventors: Gregg William Baeckler, Martin Langhammer, Sergey Vladimirovich Gribok
  • Publication number: 20220405005
    Abstract: A three dimensional circuit system includes a first integrated circuit die having a core logic region that has first memory circuits and logic circuits. The three dimensional circuit system includes a second integrated circuit die that has second memory circuits. The first and second integrated circuit dies are coupled together in a vertically stacked configuration. The three dimensional circuit system includes third memory circuits coupled to the first integrated circuit die. The third memory circuits reside in a plane of the first integrated circuit die. The logic circuits are coupled to access the first, second, and third memory circuits and data can move between the first, second, and third memories. The third memory circuits have a larger memory capacity and a smaller memory access bandwidth than the second memory circuits. The second memory circuits have a larger memory capacity and a smaller memory access bandwidth than the first memory circuits.
    Type: Application
    Filed: June 16, 2021
    Publication date: December 22, 2022
    Applicant: Intel Corporation
    Inventors: Scott Weber, Jawad Khan, Ilya Ganusov, Martin Langhammer, Matthew Adiletta, Terence Magee, Albert Fazio, Richard Coulson, Ravi Gutala, Aravind Dasu, Mahesh Iyer
  • Patent number: 11520584
    Abstract: The present disclosure describes a digital signal processing (DSP) block that includes a plurality of columns of weight registers and a plurality of inputs configured to receive a first plurality of values and a second plurality of values. The first plurality of values is stored in the plurality of columns of weight registers after being received. Additionally, the DSP block includes a plurality of multipliers configured to simultaneously multiply each value of the first plurality of values by each value of the second plurality of values.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: December 6, 2022
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Dongdong Chen, Jason R. Bergendahl
  • Patent number: 11494186
    Abstract: The present disclosure describes a digital signal processing (DSP) block that includes a plurality of columns of weight registers and a plurality of inputs configured to receive a first plurality of values and a second plurality of values. The first plurality of values is stored in the plurality of columns of weight registers after being received. Additionally, the DSP block includes a plurality of multipliers configured to simultaneously multiply each value of the first plurality of values by each value of the second plurality of values.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: November 8, 2022
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Dongdong Chen, Jason R. Bergendahl
  • Patent number: 11467804
    Abstract: A computer-implemented method for programming an integrated circuit includes receiving a program design and determining one or more addition operations based on the program design. The method also includes performing geometric synthesis based on the one or more addition operations by determining a plurality of bits associated with the one or more addition operations and defining a plurality of counters that includes the plurality of bits. Furthermore, the method includes generating instructions configured to cause circuitry configured to perform the one or more addition operations to be implemented on the integrated circuit based on the plurality of counters. The circuitry includes first adder circuitry configured to add a portion of the plurality of bits and produce a carry-out value. The circuitry also includes second adder circuitry configured to determine a sum of a second portion of the plurality of bits and the carry-out value.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: October 11, 2022
    Assignee: Intel Corporation
    Inventors: Sergey Vladimirovich Gribok, Gregg William Baeckler, Martin Langhammer
  • Publication number: 20220292366
    Abstract: Methods, apparatus, systems, and articles of manufacture to perform low overhead sparsity acceleration logic for multi-precision dataflow in deep neural network accelerators are disclosed. An example apparatus includes a first buffer to store data corresponding to a first precision; a second buffer to store data corresponding to a second precision; and hardware control circuitry to: process a first multibit bitmap to determine an activation precision of an activation value, the first multibit bitmap including values corresponding to different precisions; process a second multibit bitmap to determine a weight precision of a weight value, the second multibit bitmap including values corresponding to different precisions; and store the activation value and the weight value in the second buffer when at least one of the activation precision or the weight precision corresponds to the second precision.
    Type: Application
    Filed: March 30, 2022
    Publication date: September 15, 2022
    Inventors: Arnab Raha, Martin Langhammer, Debabrata Mohapatra, Nihat Tunali, Michael Wu
  • Patent number: 11436399
    Abstract: A method for implementing a multiplier on a programmable logic device (PLD) is disclosed. Partial product bits of the multiplier are identified, and how the partial product bits are to be summed to generate a final product from a multiplier and multiplicand is determined. Chains of PLD cells and cells in the chains of PLD cells for generating and summing the partial product bits are assigned. It is determined whether a bit in an assigned cell in an assigned chain of PLD cells is under-utilized. In response to determining that a bit is under-utilized, the assigning of the chains of PLD cells and cells for generating and summing the partial product bits is changed to improve an overall utilization of the chains of PLD cells and cells in the chains of PLD cells.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: September 6, 2022
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Sergey Gribok, Gregg William Baeckler
  • Publication number: 20220230057
    Abstract: The present disclosure relates generally to techniques for enhancing recurrent neural networks (RNNs) implemented on an integrated circuit. In particular, approximations of activation functions used in an RNN, such as sigmoid and hyperbolic tangent, may be implemented in an integrated circuit, which may result in increased efficiencies, reduced latency, increased accuracy, and reduced resource consumption involved with implementing machine learning.
    Type: Application
    Filed: February 22, 2022
    Publication date: July 21, 2022
    Inventors: Bogdan Pasca, Martin Langhammer
  • Publication number: 20220222040
    Abstract: The present disclosure relates generally to techniques for adjusting the number representation (e.g., format) of a variable before and/or after performing one or more arithmetic operations on the variable. In particular, the present disclosure relates to scaling the range of a variable to a suitable representation based on available hardware (e.g., hard logic) in an integrated circuit device. For example, an input in a first number format (e.g., bfloat16) may be scaled to a second number format (e.g., half-precision floating-point) so that circuitry implemented to receive inputs in the second number format may perform one or more arithmetic operations on the input. Further, the output produced by the circuitry may be scaled back to the first number format. Accordingly, arithmetic operations, such as a dot-product, performed in a first format may be emulated by scaling the inputs to and/or the outputs from arithmetic operations performed in another format.
    Type: Application
    Filed: April 1, 2022
    Publication date: July 14, 2022
    Inventors: Bogdan Mihai Pasca, Martin Langhammer
  • Publication number: 20220206747
    Abstract: Systems and methods related to performing arithmetic operations on floating-point numbers. Floating-point arithmetic circuitry is configured to receive two floating-point numbers. The floating-point arithmetic circuitry includes a first path configured to perform a first operation on the two floating-point numbers based at least in part on a difference in size between the two floating-point numbers. The floating-point arithmetic circuitry includes a second path configured to perform a second operation on the two floating-point numbers based at least in part on the difference in size between the two floating-point numbers. The first path and the second path diverge from each other after receipt of the floating-point numbers in the floating-point arithmetic circuitry and converge on a shared adder that is used for the first operation and the second operation.
    Type: Application
    Filed: December 24, 2020
    Publication date: June 30, 2022
    Inventors: Martin Langhammer, Theo Drane