Patents by Inventor Gregg William Baeckler

Gregg William Baeckler has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11003446
    Abstract: Adder trees may be constructed for efficient packing of arithmetic operators into an integrated circuit. The operands of the trees may be truncated to pack an integer number of nodes per logic array block. As a result, arithmetic operations may pack more efficiently onto the integrated circuit while providing increased precision and performance.
    Type: Grant
    Filed: December 14, 2017
    Date of Patent: May 11, 2021
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Gregg William Baeckler, Bogdan Pasca
  • Publication number: 20210117607
    Abstract: Techniques for designing and implementing networks-on-chip (NoCs) are provided. For example, a computer-implemented method for programming a network-on-chip (NoC) onto an integrated circuit includes determining a first portion of a plurality of registers to potentially be included in a NoC design, determining routing information regarding datapaths between registers of the first portion of the plurality of registers, and determining an expected performance associated with the first portion of the plurality of registers. The method also includes determining whether the expected performance is within a threshold range, including the first portion of the plurality of registers and the datapaths in the NoC design after determining that the expected performance is within the threshold range, and generating instructions configured to cause circuitry corresponding to the NoC design to be implemented on the integrated circuit.
    Type: Application
    Filed: December 24, 2020
    Publication date: April 22, 2021
    Inventors: Gregg William Baeckler, Martin Langhammer, Sergey Vladimirovich Gribok
  • Patent number: 10922471
    Abstract: Techniques for designing and implementing networks-on-chip (NoCs) are provided. For example, a computer-implemented method for programming a network-on-chip (NoC) onto an integrated circuit includes determining a first portion of a plurality of registers to potentially be included in a NoC design, determining routing information regarding datapaths between registers of the first portion of the plurality of registers, and determining an expected performance associated with the first portion of the plurality of registers. The method also includes determining whether the expected performance is within a threshold range, including the first portion of the plurality of registers and the datapaths in the NoC design after determining that the expected performance is within the threshold range, and generating instructions configured to cause circuitry corresponding to the NoC design to be implemented on the integrated circuit.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: February 16, 2021
    Assignee: Intel Corporation
    Inventors: Gregg William Baeckler, Martin Langhammer, Sergey Vladimirovich Gribok
  • Patent number: 10871946
    Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit (e.g., an 18×18 or 18×19 multiplier circuit) may be used to support two or more smaller multiplication operations sharing one or two sets of multiplier operands, a complex multiplication, and a sum of two multiplications. If the multiplier products overflow and interfere with one another, correction operations can be performed. Partial products from two or more larger multiplier circuits can be used to combine decomposed partial products. A large multiplier circuit can also be used to support two floating-point mantissa multipliers.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: December 22, 2020
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Gregg William Baeckler, Sergey Gribok, Dmitry N. Denisenko, Bogdan Pasca
  • Patent number: 10867090
    Abstract: A method for designing a system on a target device is disclosed. The system is synthesized from a register transfer level description. The system is placed on the target device. The system is routed on the target device. A configuration file is generated that reflects the synthesizing, placing, and routing of the system for programming the target device. A modification for the system is identified. The configuration file is modified to effectuate the modification for the system without changing the placing and routing of the system.
    Type: Grant
    Filed: March 18, 2019
    Date of Patent: December 15, 2020
    Assignee: Intel Corporation
    Inventors: Gregg William Baeckler, Martin Langhammer, Sergey Gribok, Scott J. Weber, Gregory Steinke
  • Patent number: 10790829
    Abstract: Integrated circuits with programmable logic regions are provided. The programmable logic regions may be organized into smaller logic units sometimes referred to as a logic element. A logic element may include four lookup tables coupled to an adder carry chain. At least some of the lookup tables are configured to output combinatorial outputs, whereas the adder carry chain are used to output sum outputs. Both the combinatorial outputs and the sum outputs may be used simultaneously to support a multiplication operation, three or more logic operations, or arithmetic and combinatorial operations in parallel.
    Type: Grant
    Filed: September 27, 2018
    Date of Patent: September 29, 2020
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Sergey Gribok, Gregg William Baeckler
  • Publication number: 20200293707
    Abstract: A method for implementing a programmable device is provided. The method may include extracting an underlay from an existing routing network on the programmable device and then mapping a user design to the extracted underlay. The underlay may represent a subset of fast routing wires satisfying predetermined constraints. The underlay may be composed of multiple repeating adjacent logic blocks, each implementing some datapath reduction operation. Implementing circuit designs in this way can dramatically improve circuit performance while cutting down compile times by more than half.
    Type: Application
    Filed: June 1, 2020
    Publication date: September 17, 2020
    Applicant: Intel Corporation
    Inventors: Gregg William Baeckler, Martin Langhammer
  • Publication number: 20200278865
    Abstract: Integrated circuits that include lightweight processor cores are provided. Each processor core may be configured to execute a series of instructions. At least one of the instructions may include an embedded delay field with a value specifying the amount of time that instruction needs to wait before proceeding to the next instruction to avoid a data hazard. The value of the delay field may be determined by a compiler during software compile time. Such delay field may also be used in conjunction with branch instructions to specify a number of no-operations (NOPs) for one or more associated branch delay slots and may also be used to reduce data forwarding cost.
    Type: Application
    Filed: May 15, 2020
    Publication date: September 3, 2020
    Applicant: Intel Corporation
    Inventors: Martin Langhammer, Gregg William Baeckler
  • Patent number: 10732932
    Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit such as an 18×18 multiplier circuit may be used to support two or more smaller multiplication operations such as two 8×8 integer multiplications or two 9×9 integer multiplications. To implement the two 8×8 or 9×9 unsigned/signed multiplications, the 18×18 multiplier may be configured to support two 8×8 multiplications with one shared operand, two 6×6 multiplications without any shared operand, or two 7×7 multiplications without any shared operand. Any potential overlap of partial product terms may be subtracted out using correction logic. The multiplication of the remaining most significant bits can be computed using associated multiplier extension logic and appended to the other least significant bits using merging logic.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: August 4, 2020
    Assignee: Intel Corporation
    Inventors: Bogdan Pasca, Martin Langhammer, Sergey Gribok, Gregg William Baeckler
  • Publication number: 20200186149
    Abstract: A programmable device may have logic circuitry formed in a top die and memory and specialized processing blocks formed in a bottom die, where the top die is stacked directly on top of the bottom die in a face-to-face configuration. The logic circuitry may include logic sectors, logic array blocks, logic elements, and other types of logic regions. The memory blocks may include large banks of multiport memory for storing data. The specialized processing blocks may include multipliers, adders, and other arithmetic components. The logic circuitry may access the memory and specialized processing blocks via an address encoded scheme. Configured in this way, the maximum operating frequency of the programmable device can be optimized such that critical paths will no longer need to traverse any unused memory and specialized processing blocks.
    Type: Application
    Filed: February 12, 2020
    Publication date: June 11, 2020
    Applicant: Intel Corporation
    Inventors: Dheeraj Subbareddy, MD Altaf Hossain, Ankireddy Nalamalpu, Robert Sankman, Ravindranath Mahajan, Gregg William Baeckler
  • Publication number: 20200142671
    Abstract: Integrated circuits with digital signal processing (DSP) blocks are provided. A DSP block may include one or more large multiplier circuits. A large multiplier circuit such as an 18×18 multiplier circuit may be used to support two or more smaller multiplication operations such as two 8×8 integer multiplications or two 9×9 integer multiplications. To implement the two 8×8 or 9×9 unsigned/signed multiplications, the 18×18 multiplier may be configured to support two 8×8 multiplications with one shared operand, two 6×6 multiplications without any shared operand, or two 7×7 multiplications without any shared operand. Any potential overlap of partial product terms may be subtracted out using correction logic. The multiplication of the remaining most significant bits can be computed using associated multiplier extension logic and appended to the other least significant bits using merging logic.
    Type: Application
    Filed: December 21, 2018
    Publication date: May 7, 2020
    Applicant: Intel Corporation
    Inventors: Bogdan Pasca, Martin Langhammer, Sergey Gribok, Gregg William Baeckler
  • Publication number: 20200125780
    Abstract: Methods and apparatus for increasing the random logic utilization on a programmable device are provided. Although not completely homogeneous, the programmable device has many components that are repeated many times in an array. To help improve repeatability and packing, computer-aided design tools for compiling a circuit design for the programmable device may first lock down a synthesis cell netlist with stable naming, create location solution files (files with desired clustering granularity for stabilizing performance and reducing compile times) for selected regions of interest on the programmable device, and compose a final design with only the best solutions some of which can be imported from one location to another. Compiling a design in this way can help improve random logic utilization beyond 85% while improving circuit performance by 20% or more.
    Type: Application
    Filed: December 19, 2019
    Publication date: April 23, 2020
    Applicant: Intel Corporation
    Inventors: Gregg William Baeckler, Martin Langhammer
  • Publication number: 20200106442
    Abstract: Integrated circuits with programmable logic regions are provided. The programmable logic regions may be organized into smaller logic units sometimes referred to as a logic element. A logic element may include four lookup tables coupled to an adder carry chain. At least some of the lookup tables are configured to output combinatorial outputs, whereas the adder carry chain are used to output sum outputs. Both the combinatorial outputs and the sum outputs may be used simultaneously to support a multiplication operation, three or more logic operations, or arithmetic and combinatorial operations in parallel.
    Type: Application
    Filed: September 27, 2018
    Publication date: April 2, 2020
    Applicant: Intel Corporation
    Inventors: Martin Langhammer, Sergey Gribok, Gregg William Baeckler
  • Patent number: 10601426
    Abstract: A programmable device may have logic circuitry formed in a top die and memory and specialized processing blocks formed in a bottom die, where the top die is stacked directly on top of the bottom die in a face-to-face configuration. The logic circuitry may include logic sectors, logic array blocks, logic elements, and other types of logic regions. The memory blocks may include large banks of multiport memory for storing data. The specialized processing blocks may include multipliers, adders, and other arithmetic components. The logic circuitry may access the memory and specialized processing blocks via an address encoded scheme. Configured in this way, the maximum operating frequency of the programmable device can be optimized such that critical paths will no longer need to traverse any unused memory and specialized processing blocks.
    Type: Grant
    Filed: September 6, 2018
    Date of Patent: March 24, 2020
    Assignee: Intel Corporation
    Inventors: Dheeraj Subbareddy, Md Altaf Hossain, Ankireddy Nalamalpu, Robert Sankman, Ravindranath Mahajan, Gregg William Baeckler
  • Publication number: 20200083890
    Abstract: A programmable device may have logic circuitry formed in a top die and memory and specialized processing blocks formed in a bottom die, where the top die is stacked directly on top of the bottom die in a face-to-face configuration. The logic circuitry may include logic sectors, logic array blocks, logic elements, and other types of logic regions. The memory blocks may include large banks of multiport memory for storing data. The specialized processing blocks may include multipliers, adders, and other arithmetic components. The logic circuitry may access the memory and specialized processing blocks via an address encoded scheme. Configured in this way, the maximum operating frequency of the programmable device can be optimized such that critical paths will no longer need to traverse any unused memory and specialized processing blocks.
    Type: Application
    Filed: September 6, 2018
    Publication date: March 12, 2020
    Applicant: Intel Corporation
    Inventors: Dheeraj Subbareddy, MD Altaf Hossain, Ankireddy Nalamalpu, Robert Sankman, Ravindranath Mahajan, Gregg William Baeckler
  • Publication number: 20200026494
    Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry. The hybrid dot-product circuitry has a hard data path that uses digital signal processing (DSP) blocks operating in floating-point mode and a hard/soft data path that uses DSP blocks operating in fixed-point mode operated in conjunction with general purpose soft logic. The hard/soft data path includes 2-element dot-product circuits that feed an adder tree. Results from the hard data path are combined with the adder tree using format conversion and normalization circuitry. Inputs to the hybrid dot-product circuitry may be in the BFLOAT16 format. The hard data path may be in the single precision format. The hard/soft data path uses a custom format that is similar to but different than BFLOAT16.
    Type: Application
    Filed: September 27, 2019
    Publication date: January 23, 2020
    Applicant: Intel Corporation
    Inventors: Martin Langhammer, Bogdan Pasca, Sergey Gribok, Gregg William Baeckler, Andrei Hagiescu
  • Publication number: 20190324724
    Abstract: A computer-implemented method for programming an integrated circuit includes receiving a program design and determining one or more addition operations based on the program design. The method also includes performing geometric synthesis based on the one or more addition operations by determining a plurality of bits associated with the one or more addition operations and defining a plurality of counters that includes the plurality of bits. Furthermore, the method includes generating instructions configured to cause circuitry configured to perform the one or more addition operations to be implemented on the integrated circuit based on the plurality of counters. The circuitry includes first adder circuitry configured to add a portion of the plurality of bits and produce a carry-out value. The circuitry also includes second adder circuitry configured to determine a sum of a second portion of the plurality of bits and the carry-out value.
    Type: Application
    Filed: June 28, 2019
    Publication date: October 24, 2019
    Inventors: Sergey Vladimirovich Gribok, Gregg William Baeckler, Martin Langhammer
  • Publication number: 20190318058
    Abstract: Techniques for designing and implementing networks-on-chip (NoCs) are provided. For example, a computer-implemented method for programming a network-on-chip (NoC) onto an integrated circuit includes determining a first portion of a plurality of registers to potentially be included in a NoC design, determining routing information regarding datapaths between registers of the first portion of the plurality of registers, and determining an expected performance associated with the first portion of the plurality of registers. The method also includes determining whether the expected performance is within a threshold range, including the first portion of the plurality of registers and the datapaths in the NoC design after determining that the expected performance is within the threshold range, and generating instructions configured to cause circuitry corresponding to the NoC design to be implemented on the integrated circuit.
    Type: Application
    Filed: June 28, 2019
    Publication date: October 17, 2019
    Inventors: Gregg William Baeckler, Martin Langhammer, Sergey Vladimirovich Gribok
  • Publication number: 20190213289
    Abstract: A method for designing a system on a target device is disclosed. The system is synthesized from a register transfer level description. The system is placed on the target device. The system is routed on the target device. A configuration file is generated that reflects the synthesizing, placing, and routing of the system for programming the target device. A modification for the system is identified. The configuration file is modified to effectuate the modification for the system without changing the placing and routing of the system.
    Type: Application
    Filed: March 18, 2019
    Publication date: July 11, 2019
    Inventors: Gregg William BAECKLER, Martin LANGHAMMER, Sergey GRIBOK, Scott J. WEBER, Gregory STEINKE
  • Patent number: 10296479
    Abstract: The present disclosure provides an innovative circuit structure for control insertion into a multiple-word wide data stream. The control-insertion circuit structure is advantageously scalable as the data width increases. An exemplary implementation of the control-insertion circuit structure includes a multiple-layer shifting circuit. The multiple-layer shifting circuit has some similarities with a barrel shifter. However, unlike a barrel shifter, the multiple-layer shifting circuit moves data words in both directions and moves portions of the data to create spaces or holes in the data (rather than moving the entire width as a barrel shifter does). The output of the multiple-layer shifting circuit is a “swiss-cheese-like” structure of data, where the spaces or holes in the data are available for control insertion. Other features, aspects and embodiments are also disclosed.
    Type: Grant
    Filed: December 18, 2015
    Date of Patent: May 21, 2019
    Assignee: Altera Corporation
    Inventors: Gregg William Baeckler, David W. Mendel