Patents by Inventor Martin Langhammer
Martin Langhammer has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230222275
Abstract: A method is provided for processing code for a circuit design for an integrated circuit using a computer system. The method includes receiving at least a portion of the code for the circuit design for the integrated circuit, wherein the portion of the code comprises an error or has incomplete constraints, making an assumption about the error and the missing constraints using a computer-aided design tool, and generating a revised circuit design for the integrated circuit with the error corrected and any missing constraints added, based on the assumption and based on the code, using the computer-aided design tool and a library of components for circuit designs.
Type: Application
Filed: March 16, 2023
Publication date: July 13, 2023
Applicant: Intel Corporation
Inventors: Gregg Baeckler, Mahesh A. Iyer, Martin Langhammer
-
Publication number: 20230195416
Abstract: An integrated circuit is provided that includes via-configured structured logic circuitry and an embedded arithmetic block that interfaces with the via-configured structured logic circuitry to perform an arithmetic function. The embedded arithmetic block includes at least one monolithic arithmetic circuit that can perform the arithmetic function more efficiently, or while taking up less die space, than a comparable circuit formed from the via-configured structured logic circuitry.
Type: Application
Filed: December 22, 2021
Publication date: June 22, 2023
Inventors: Sankaran Menon, Martin Langhammer, Mustansir Fanaswalla, Kuldeep Simha
-
Patent number: 11662979
Abstract: An integrated circuit that includes very large adder circuitry is provided. The very large adder circuitry receives more than two inputs, each of which has hundreds or thousands of bits. The very large adder circuitry includes multiple adder nodes arranged in a tree-like network. The adder nodes divide the input operands into segments, compute the sum for each segment, and compute the carry for each segment independently from the segment sums. The carries at each level in the tree are accumulated using population counters. After the last node in the tree, the segment sums can then be combined with the carries to determine the final sum output. An adder tree network implemented in this way asymptotically approaches the area and latency of an adder network that uses infinite-speed ripple-carry adders.
Type: Grant
Filed: November 19, 2020
Date of Patent: May 30, 2023
Assignee: Intel Corporation
Inventor: Martin Langhammer
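The segment-wise decomposition described in this abstract can be illustrated in software. The sketch below is a hedged model, not the patented circuit: the 64-bit segment width is an illustrative assumption, and the sequential carry loop stands in for the hardware's parallel population-counter accumulation.

```python
# Hedged sketch of segmented wide addition: each segment's partial sum is
# computed independently, then per-segment overflows (the "carries") are
# folded in afterwards. SEG_BITS is an illustrative assumption.

SEG_BITS = 64
MASK = (1 << SEG_BITS) - 1

def segmented_add(operands, width):
    """Add several very wide integers segment by segment."""
    n_seg = (width + SEG_BITS - 1) // SEG_BITS
    # 1) Per-segment partial sums, each computable independently (parallel
    #    in hardware).
    seg_sums = [sum((op >> (s * SEG_BITS)) & MASK for op in operands)
                for s in range(n_seg)]
    # 2) Fold each segment's overflow into the next segment; in hardware
    #    this is where population counters accumulate the carries.
    result, carry = 0, 0
    for s in range(n_seg):
        total = seg_sums[s] + carry
        result |= (total & MASK) << (s * SEG_BITS)
        carry = total >> SEG_BITS
    return result

a = (1 << 1000) - 1
b = (1 << 999) + 12345
c = 7 << 500
assert segmented_add([a, b, c], 1024) == a + b + c
```

Because each segment sum is independent, the first step has no carry chain at all; only the narrow per-segment overflow counts need to be combined.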
-
Patent number: 11656872
Abstract: The present disclosure describes a digital signal processing (DSP) block that includes a plurality of columns of weight registers and a plurality of inputs configured to receive a first plurality of values and a second plurality of values. The first plurality of values is stored in the plurality of columns of weight registers after being received. In a first mode of operation, the first and second pluralities of values are received via a first portion of the plurality of inputs. In a second mode of operation, the first plurality of values is received via a second portion of the plurality of inputs, and the second plurality of values is received via the first portion of the plurality of inputs. Additionally, the DSP block includes a plurality of multipliers configured to simultaneously multiply each value of the first plurality of values by each value of the second plurality of values.
Type: Grant
Filed: June 26, 2020
Date of Patent: May 23, 2023
Assignee: Intel Corporation
Inventor: Martin Langhammer
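The "each value times each value" behavior amounts to an outer-product step of a matrix multiply. The sketch below is an illustrative software model only; the preload helper, the data shapes, and the column layout are assumptions, not the DSP block's actual interface.

```python
# Hedged sketch: weights preloaded into columns of registers, then a
# multiplier array forms every product of a stored weight with every
# incoming activation (an outer-product step). Shapes are assumptions.

def preload_weights(columns):
    # columns: list of weight columns, as they would sit in the registers
    return [list(col) for col in columns]

def multiply_all(weights, activations):
    # every stored value times every incoming value, "simultaneously"
    return [[w * a for a in activations] for col in weights for w in col]

regs = preload_weights([[2, 3], [5, 7]])
prods = multiply_all(regs, [10, 100])
assert prods == [[20, 200], [30, 300], [50, 500], [70, 700]]
```

Keeping one operand resident in registers while the other streams through the inputs is what lets the same weights be reused across many activation vectors.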
-
Publication number: 20230116554
Abstract: A processor circuit includes a compiler configured to receive a software program that comprises software code coded in an assembly language and inline software code coded in a high-level programming language, compile the inline software code coded in the high-level programming language within the software program into assembly code in the assembly language, and compile the assembly code and the software code coded in the assembly language into machine code for the processor circuit. A method includes determining if first and second instructions in a software program are combinable into one instruction word, combining the first and the second instructions in the software program into one instruction word if the first and the second instructions are combinable, and fetching the instruction word into a single register by storing the instruction word in the single register.
Type: Application
Filed: December 7, 2022
Publication date: April 13, 2023
Applicant: Intel Corporation
Inventors: Gregg Baeckler, Martin Langhammer
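The instruction-combining step in the second part of the abstract can be sketched as a pack-two-halves operation. This is a hedged toy model: the 16-bit encodings, the 32-bit word, and the "no dependency" combinability rule are all illustrative assumptions, not the patent's encoding.

```python
# Hedged sketch: decide whether two short instructions can share one
# instruction word and, if so, pack them into a single register-width word.
# Encoding widths and the combinability rule are assumptions.

WORD_BITS = 32
HALF_BITS = 16

def combinable(insn_a, insn_b):
    # toy rule: both fit in a half word and b does not read a's destination
    return (insn_a["enc"] < (1 << HALF_BITS)
            and insn_b["enc"] < (1 << HALF_BITS)
            and insn_a["dst"] not in insn_b["srcs"])

def pack(insn_a, insn_b):
    assert combinable(insn_a, insn_b)
    return (insn_b["enc"] << HALF_BITS) | insn_a["enc"]

add_ = {"enc": 0x1234, "dst": "r1", "srcs": ["r2", "r3"]}
mov_ = {"enc": 0x5678, "dst": "r4", "srcs": ["r5"]}
word = pack(add_, mov_)
assert word == 0x56781234
assert word < (1 << WORD_BITS)
```

Packing two instructions per word halves the number of fetches for combinable pairs, which is the payoff of storing the combined word in a single register.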
-
Publication number: 20230037575
Abstract: An integrated circuit is provided that includes compression or decompression circuitry along a datapath. An integrated circuit system may include first memory to store data, data utilization circuitry to operate on the data, and a configurable data distribution path to transfer data between the first memory and the data utilization circuitry. Compression or decompression circuitry may be disposed along the data distribution path between the first memory and the data utilization circuitry to enable the first memory to store the data in compressed form and to enable the data utilization circuitry to operate on the data in uncompressed form. The compression or decompression circuitry may use lossless sparse encoding, lossless multi-precision encoding, lossless prefix lookup table-based encoding, Huffman encoding, selective compression, or lossy compression.
Type: Application
Filed: September 30, 2022
Publication date: February 9, 2023
Inventors: Michael Wu, Nihat Engin Tunali, Martin Langhammer
-
Publication number: 20230027064
Abstract: Systems and methods of the present disclosure provide techniques for reducing power consumption of a large combinational circuit using register insertion. In particular, a large circuit may be analyzed to determine the amount of signal switching at various logical points (e.g., stages in the computation) of the circuit. A clock sequence with many pulses in the period of a clock that runs the large combinational circuit may be generated. To balance the amount of signal switching at various logical points in the circuit, registers may be inserted at certain points in the large circuit, with clock pulses of the clock sequence, which may not have a constant frequency or may be phase-shifted versions of the main clock, assigned to those registers.
Type: Application
Filed: September 30, 2022
Publication date: January 26, 2023
Inventors: Martin Langhammer, Gregg William Baeckler, Sergey Vladimirovich Gribok, Mahesh A. Iyer
-
Publication number: 20230021396
Abstract: A method for implementing an artificial neural network in a computing system that comprises performing a compute operation using an input activation and a weight to generate an output activation, and modifying the output activation using a noise value to increase activation sparsity.
Type: Application
Filed: September 27, 2022
Publication date: January 26, 2023
Applicant: Intel Corporation
Inventors: Nihat Tunali, Arnab Raha, Bogdan Pasca, Martin Langhammer, Michael Wu, Deepak Mathaikutty
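One plausible reading of "modifying the output activation using a noise value to increase activation sparsity" is a perturb-and-clamp rule. The sketch below is speculative: the subtract-noise-then-clamp-at-zero rule, the noise scale, and the seeding are all assumptions, not the claimed method.

```python
# Hedged, speculative sketch: subtract a small noise value from each output
# activation and clamp at zero, so more activations become exactly zero
# (higher sparsity). The specific rule is an assumption.

import random

def sparsify(activations, noise_scale, seed=0):
    rng = random.Random(seed)
    out = []
    for a in activations:
        noisy = a - noise_scale * rng.random()
        out.append(noisy if noisy > 0 else 0.0)
    return out

acts = [0.05, 1.2, 0.01, 0.7, 0.03]
sparse = sparsify(acts, noise_scale=0.1)
before = sum(1 for a in acts if a == 0)
after = sum(1 for a in sparse if a == 0.0)
assert after > before   # small activations were pushed to exact zeros
```

Exact zeros are what downstream sparsity-aware hardware (e.g. zero-skipping multipliers) can exploit, which is why nudging near-zero activations all the way to zero pays off.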
-
Publication number: 20230026331
Abstract: A circuit system for performing modular reduction of a modular multiplication includes multiplier circuits that receive a first subset of coefficients that are generated by summing partial products of a multiplication operation that is part of the modular multiplication. The multiplier circuits multiply the coefficients in the first subset by constants that equal remainders of divisions to generate products. Adder circuits add a second subset of the coefficients and segments of bits of the products that are aligned with respective ones of the second subset of the coefficients to generate sums.
Type: Application
Filed: September 23, 2022
Publication date: January 26, 2023
Applicant: Intel Corporation
Inventors: Sergey Gribok, Bogdan Pasca, Martin Langhammer
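The core trick, multiplying high coefficients by precomputed constants that are "remainders of divisions" (i.e. 2^(k·w) mod p) and adding the products back into the low coefficients, can be shown directly. The segment width, the toy modulus, and the stopping point (a congruent but not fully reduced result) are assumptions of this sketch.

```python
# Hedged sketch of constant-based partial modular reduction: coefficients
# above the modulus width are multiplied by precomputed constants
# 2**(k*w) mod p and folded back into the low part. W and P are assumptions.

W = 16                      # coefficient width
P = 0xFFF1                  # illustrative 16-bit modulus

def partial_reduce(x, p, w):
    mask = (1 << w) - 1
    total = x & mask        # lowest coefficient passes through
    x >>= w
    k = 1
    while x:
        coeff = x & mask
        const = pow(2, k * w, p)    # precomputed constant in hardware
        total += coeff * const      # product aligned back into low range
        x >>= w
        k += 1
    return total

x = 0x123456789ABC
reduced = partial_reduce(x, P, W)
assert reduced % P == x % P     # congruent to the input mod P
assert reduced < x              # but far narrower than the input
```

Since x = Σ coeff_k · 2^(k·w) and each 2^(k·w) is replaced by its residue mod p, the sum stays congruent while shrinking to roughly the width of one coefficient times one constant, after which a small final reduction suffices.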
-
Publication number: 20230018414
Abstract: The present disclosure describes techniques for incorporating pipelined DSP blocks or other types of embedded functions into a logic circuit with a slower clock rate without any clock crossing complexities, while at the same time managing the power consumption of the more complex design that results. The techniques include generating a faster clock, or several faster clocks, that may have a faster clock rate than the clock used by the logic circuit and that may be used as clock input to the embedded pipelined DSP blocks. In addition, the present disclosure describes techniques for generating, improving, and using the faster clock to sample the output of a logic circuit using pulses of the generated faster clock, which may allow the clock frequency of the circuit to be increased to an optimal level while maintaining functional correctness.
Type: Application
Filed: September 29, 2022
Publication date: January 19, 2023
Inventors: Martin Langhammer, Gregg William Baeckler, Sergey Vladimirovich Gribok, Mahesh A. Iyer
-
Patent number: 11556692
Abstract: Techniques for designing and implementing networks-on-chip (NoCs) are provided. For example, a computer-implemented method for programming a network-on-chip (NoC) onto an integrated circuit includes determining a first portion of a plurality of registers to potentially be included in a NoC design, determining routing information regarding datapaths between registers of the first portion of the plurality of registers, and determining an expected performance associated with the first portion of the plurality of registers. The method also includes determining whether the expected performance is within a threshold range, including the first portion of the plurality of registers and the datapaths in the NoC design after determining that the expected performance is within the threshold range, and generating instructions configured to cause circuitry corresponding to the NoC design to be implemented on the integrated circuit.
Type: Grant
Filed: December 24, 2020
Date of Patent: January 17, 2023
Assignee: Intel Corporation
Inventors: Gregg William Baeckler, Martin Langhammer, Sergey Vladimirovich Gribok
-
Publication number: 20220405005
Abstract: A three dimensional circuit system includes a first integrated circuit die having a core logic region that has first memory circuits and logic circuits. The three dimensional circuit system includes a second integrated circuit die that has second memory circuits. The first and second integrated circuit dies are coupled together in a vertically stacked configuration. The three dimensional circuit system includes third memory circuits coupled to the first integrated circuit die. The third memory circuits reside in a plane of the first integrated circuit die. The logic circuits are coupled to access the first, second, and third memory circuits, and data can move between the first, second, and third memory circuits. The third memory circuits have a larger memory capacity and a smaller memory access bandwidth than the second memory circuits. The second memory circuits have a larger memory capacity and a smaller memory access bandwidth than the first memory circuits.
Type: Application
Filed: June 16, 2021
Publication date: December 22, 2022
Applicant: Intel Corporation
Inventors: Scott Weber, Jawad Khan, Ilya Ganusov, Martin Langhammer, Matthew Adiletta, Terence Magee, Albert Fazio, Richard Coulson, Ravi Gutala, Aravind Dasu, Mahesh Iyer
-
Patent number: 11520584
Abstract: The present disclosure describes a digital signal processing (DSP) block that includes a plurality of columns of weight registers and a plurality of inputs configured to receive a first plurality of values and a second plurality of values. The first plurality of values is stored in the plurality of columns of weight registers after being received. Additionally, the DSP block includes a plurality of multipliers configured to simultaneously multiply each value of the first plurality of values by each value of the second plurality of values.
Type: Grant
Filed: June 26, 2020
Date of Patent: December 6, 2022
Assignee: Intel Corporation
Inventors: Martin Langhammer, Dongdong Chen, Jason R. Bergendahl
-
Patent number: 11494186
Abstract: The present disclosure describes a digital signal processing (DSP) block that includes a plurality of columns of weight registers and a plurality of inputs configured to receive a first plurality of values and a second plurality of values. The first plurality of values is stored in the plurality of columns of weight registers after being received. Additionally, the DSP block includes a plurality of multipliers configured to simultaneously multiply each value of the first plurality of values by each value of the second plurality of values.
Type: Grant
Filed: June 26, 2020
Date of Patent: November 8, 2022
Assignee: Intel Corporation
Inventors: Martin Langhammer, Dongdong Chen, Jason R. Bergendahl
-
Patent number: 11467804
Abstract: A computer-implemented method for programming an integrated circuit includes receiving a program design and determining one or more addition operations based on the program design. The method also includes performing geometric synthesis based on the one or more addition operations by determining a plurality of bits associated with the one or more addition operations and defining a plurality of counters that includes the plurality of bits. Furthermore, the method includes generating instructions configured to cause circuitry configured to perform the one or more addition operations to be implemented on the integrated circuit based on the plurality of counters. The circuitry includes first adder circuitry configured to add a portion of the plurality of bits and produce a carry-out value. The circuitry also includes second adder circuitry configured to determine a sum of a second portion of the plurality of bits and the carry-out value.
Type: Grant
Filed: June 28, 2019
Date of Patent: October 11, 2022
Assignee: Intel CorporationInventors
Inventors: Sergey Vladimirovich Gribok, Gregg William Baeckler, Martin Langhammer
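The two-stage adder structure at the end of the abstract (a first adder producing a carry-out, a second adder consuming it) can be modeled directly. The 8-bit split point in this sketch is an illustrative assumption.

```python
# Hedged sketch of the split-adder structure: the first adder sums the low
# bits and produces a carry-out; the second adder sums the high bits plus
# that carry. The SPLIT position is an assumption.

SPLIT = 8
LOW_MASK = (1 << SPLIT) - 1

def split_add(a, b):
    low_sum = (a & LOW_MASK) + (b & LOW_MASK)            # first adder
    carry_out = low_sum >> SPLIT
    high_sum = (a >> SPLIT) + (b >> SPLIT) + carry_out   # second adder
    return (high_sum << SPLIT) | (low_sum & LOW_MASK)

assert split_add(0x1FF, 0x001) == 0x200
assert split_add(12345, 67890) == 12345 + 67890
```

Splitting the carry chain this way shortens each adder's critical path, which is the usual motivation for feeding a synthesized counter network into two shorter adders rather than one long one.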
-
Publication number: 20220292366
Abstract: Methods, apparatus, systems, and articles of manufacture to perform low overhead sparsity acceleration logic for multi-precision dataflow in deep neural network accelerators are disclosed. An example apparatus includes a first buffer to store data corresponding to a first precision; a second buffer to store data corresponding to a second precision; and hardware control circuitry to: process a first multibit bitmap to determine an activation precision of an activation value, the first multibit bitmap including values corresponding to different precisions; process a second multibit bitmap to determine a weight precision of a weight value, the second multibit bitmap including values corresponding to different precisions; and store the activation value and the weight value in the second buffer when at least one of the activation precision or the weight precision corresponds to the second precision.
Type: Application
Filed: March 30, 2022
Publication date: September 15, 2022
Inventors: Arnab Raha, Martin Langhammer, Debabrata Mohapatra, Nihat Tunali, Michael Wu
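The routing decision in the abstract (a pair goes to the second, higher-precision buffer when either side needs the second precision) can be sketched as below. The two-level precision codes and the tuple buffers are illustrative assumptions.

```python
# Hedged sketch of the precision-based buffering decision: per-value bitmaps
# record each value's precision, and an (activation, weight) pair lands in
# the second buffer when either side uses the second precision.

LOW, HIGH = 0, 1    # assumed codes, e.g. 4-bit vs 8-bit

def route(activations, act_prec, weights, wt_prec):
    low_buf, high_buf = [], []
    for a, ap, w, wp in zip(activations, act_prec, weights, wt_prec):
        if ap == HIGH or wp == HIGH:
            high_buf.append((a, w))   # second buffer, second precision
        else:
            low_buf.append((a, w))    # first buffer, first precision
    return low_buf, high_buf

low_buf, high_buf = route([3, 200, 7], [LOW, HIGH, LOW],
                          [5, 2, 9],   [LOW, LOW, LOW])
assert low_buf == [(3, 5), (7, 9)]
assert high_buf == [(200, 2)]
```

Keeping most pairs in the narrow buffer is where the "low overhead" comes from: only pairs that genuinely need the wider precision pay for it.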
-
Patent number: 11436399
Abstract: A method for implementing a multiplier on a programmable logic device (PLD) is disclosed. Partial product bits of the multiplier are identified, and how the partial product bits are to be summed to generate a final product from a multiplier and multiplicand is determined. Chains of PLD cells, and cells in the chains of PLD cells, for generating and summing the partial product bits are assigned. It is determined whether a bit in an assigned cell in an assigned chain of PLD cells is under-utilized. In response to determining that a bit is under-utilized, the assignment of the chains of PLD cells and cells for generating and summing the partial product bits is changed to improve the overall utilization of the chains of PLD cells and the cells in the chains of PLD cells.
Type: Grant
Filed: December 12, 2018
Date of Patent: September 6, 2022
Assignee: Intel Corporation
Inventors: Martin Langhammer, Sergey Gribok, Gregg William Baeckler
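The starting point of the method, enumerating the partial product bits of an unsigned multiply and summing them by weight, is easy to show. This sketch models only that step; the cell assignment and the utilization check are not modeled.

```python
# Hedged sketch of partial-product enumeration: for an unsigned multiply,
# bit i of a AND bit j of b contributes a partial product bit of weight
# 2**(i+j). The PLD cell chains then sum exactly these bits.

def partial_product_bits(a, b, width):
    bits = []
    for i in range(width):
        for j in range(width):
            if (a >> i) & 1 and (b >> j) & 1:
                bits.append(i + j)      # a bit of weight 2**(i+j)
    return bits

def sum_bits(bits):
    return sum(1 << w for w in bits)

a, b = 13, 11
assert sum_bits(partial_product_bits(a, b, 4)) == a * b
```

Because several bits share the same weight, the real freedom, and the subject of the patent, lies in how those equal-weight bits are grouped onto adder chains without leaving cell bits idle.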
-
Publication number: 20220230057
Abstract: The present disclosure relates generally to techniques for enhancing recurrent neural networks (RNNs) implemented on an integrated circuit. In particular, approximations of activation functions used in an RNN, such as sigmoid and hyperbolic tangent, may be implemented in an integrated circuit, which may result in increased efficiencies, reduced latency, increased accuracy, and reduced resource consumption involved with implementing machine learning.
Type: Application
Filed: February 22, 2022
Publication date: July 21, 2022
Inventors: Bogdan Pasca, Martin Langhammer
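The abstract does not disclose the approximation itself, so the sketch below shows one common hardware-friendly choice: a piecewise-linear sigmoid with saturation. The breakpoints and slopes are assumptions picked only to stay within a loose error bound, not the patented scheme.

```python
# Hedged sketch: piecewise-linear sigmoid approximation with saturation,
# the kind of cheap activation approximation an FPGA implementation might
# use. Segment boundaries and slopes are illustrative assumptions.

import math

def sigmoid_pwl(x):
    if x <= -5.0:
        return 0.0                       # saturate low
    if x >= 5.0:
        return 1.0                       # saturate high
    if -1.0 <= x <= 1.0:
        return 0.5 + 0.25 * x            # tangent line at the origin
    if x > 1.0:
        return 0.75 + 0.05 * (x - 1.0)   # shallow segment toward 1
    return 0.25 - 0.05 * (-x - 1.0)      # mirrored segment toward 0

for x in [-4.0, -2.0, -0.5, 0.0, 0.5, 2.0, 4.0]:
    assert abs(sigmoid_pwl(x) - 1.0 / (1.0 + math.exp(-x))) < 0.12
```

Each segment needs only a multiply-add (and the slopes can be chosen as powers of two to reduce that to a shift-add), which is where the latency and resource savings over evaluating exp() come from.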
-
Publication number: 20220222040
Abstract: The present disclosure relates generally to techniques for adjusting the number representation (e.g., format) of a variable before and/or after performing one or more arithmetic operations on the variable. In particular, the present disclosure relates to scaling the range of a variable to a suitable representation based on available hardware (e.g., hard logic) in an integrated circuit device. For example, an input in a first number format (e.g., bfloat16) may be scaled to a second number format (e.g., half-precision floating-point) so that circuitry implemented to receive inputs in the second number format may perform one or more arithmetic operations on the input. Further, the output produced by the circuitry may be scaled back to the first number format. Accordingly, arithmetic operations, such as a dot-product, performed in a first format may be emulated by scaling the inputs to and/or the outputs from arithmetic operations performed in another format.
Type: Application
Filed: April 1, 2022
Publication date: July 14, 2022
Inventors: Bogdan Mihai Pasca, Martin Langhammer
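The scale-in, compute, scale-out pattern can be shown with a squaring operation whose input exceeds half-precision range. This is a hedged sketch: Python doubles stand in for the hardware formats, power-of-two scales keep the round trip exact, and the range bound is the only fp16 property actually used.

```python
# Hedged sketch of range scaling between formats: scale the input by a
# power of two so the operation fits the narrower format's range, compute,
# then scale the output back. Formats and ranges are illustrative.

FP16_MAX = 65504.0          # largest finite half-precision value

def scaled_square(x):
    # pick a power-of-two scale so that x*x stays within fp16 range
    scale = 1.0
    while abs(x / scale) > FP16_MAX ** 0.5:
        scale *= 2.0
    y = (x / scale) * (x / scale)   # "performed in the narrower format"
    return y * scale * scale        # scale the output back

x = 1.0e6                  # too large to square directly in fp16
assert scaled_square(x) == x * x
```

Power-of-two scales are the natural choice because they only adjust the exponent field, so they introduce no rounding of their own on the way in or out.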
-
Publication number: 20220206747
Abstract: Systems and methods related to performing arithmetic operations on floating-point numbers. Floating-point arithmetic circuitry is configured to receive two floating-point numbers. The floating-point arithmetic circuitry includes a first path configured to perform a first operation on the two floating-point numbers based at least in part on a difference in size between the two floating-point numbers. The floating-point arithmetic circuitry includes a second path configured to perform a second operation on the two floating-point numbers based at least in part on the difference in size between the two floating-point numbers. The first path and the second path diverge from each other after receipt of the floating-point numbers in the floating-point arithmetic circuitry and converge on a shared adder that is used for the first operation and the second operation.
Type: Application
Filed: December 24, 2020
Publication date: June 30, 2022
Inventors: Martin Langhammer, Theo Drane
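A structure consistent with this abstract is the classic dual-path floating-point adder: a "near" path for close exponents (where cancellation can occur) and a "far" path for a large alignment shift, both converging on one shared significand adder. The sketch below is an assumption-laden model: exact integer significands stand in for real mantissas, and rounding is omitted entirely.

```python
# Hedged sketch of a dual-path add: the paths diverge on the exponent
# difference and converge on one shared adder. Values are (exponent,
# signed integer significand) pairs; rounding is not modeled.

def shared_adder(a, b):
    return a + b                   # the one adder both paths converge on

def dual_path_add(ea, ma, eb, mb):
    if ea < eb:                    # ensure the first operand is larger
        ea, ma, eb, mb = eb, mb, ea, ma
    diff = ea - eb                 # the "difference in size"
    if diff <= 1:
        # near path: tiny alignment shift; massive cancellation possible,
        # so renormalize the (possibly small) sum afterwards
        s = shared_adder(ma << diff, mb)
        e = eb
        while s != 0 and s % 2 == 0:
            s //= 2
            e += 1
        return e, s
    # far path: one large alignment shift, no cancellation handling needed
    s = shared_adder(ma << diff, mb)
    return eb, s

e, m = dual_path_add(5, 13, 3, -12)        # far path
assert m * 2**e == 13 * 2**5 - 12 * 2**3
e, m = dual_path_add(5, 13, 5, -12)        # near path, heavy cancellation
assert (e, m) == (5, 1)
```

The point of the split is that each path handles only one hard case (cancellation or a long shift), while the expensive significand adder is instantiated once and shared.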