Patents by Inventor Gregg William Baeckler

Gregg William Baeckler has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240394448
    Abstract: A method for implementing a programmable device is provided. The method may include extracting an underlay from an existing routing network on the programmable device and then mapping a user design to the extracted underlay. The underlay may represent a subset of fast routing wires satisfying predetermined constraints. The underlay may be composed of multiple repeating adjacent logic blocks, each implementing some datapath reduction operation. Implementing circuit designs in this way can dramatically improve circuit performance while cutting down compile times by more than half.
    Type: Application
    Filed: August 6, 2024
    Publication date: November 28, 2024
    Inventors: Gregg William Baeckler, Martin Langhammer
  • Patent number: 12086518
    Abstract: A method for implementing a programmable device is provided. The method may include extracting an underlay from an existing routing network on the programmable device and then mapping a user design to the extracted underlay. The underlay may represent a subset of fast routing wires satisfying predetermined constraints. The underlay may be composed of multiple repeating adjacent logic blocks, each implementing some datapath reduction operation. Implementing circuit designs in this way can dramatically improve circuit performance while cutting down compile times by more than half.
    Type: Grant
    Filed: June 1, 2020
    Date of Patent: September 10, 2024
    Assignee: Intel Corporation
    Inventors: Gregg William Baeckler, Martin Langhammer
  • Publication number: 20240020449
    Abstract: Systems or methods of the present disclosure may provide a library including multiple macros that may be pre-compiled prior to implementation of the design. For example, a design may be mapped to one or more macros in the library, and the one or more macros may be placed into and routed between a portion of a region, one region, one or more regions of the integrated circuit device to implement the design. Since the macros may be pre-compiled, compilation time experienced by the designer may correspond to the placement and routing of the one or more macros, which may be less than compilation time for fine-grained operations. The pre-compiled logic within the macros may be set using a lookup table mask to set and/or adjust a functionality of the macro. Additionally or alternatively, the place and route operation may be performed at finer granularities to reduce bottle necks.
    Type: Application
    Filed: September 27, 2023
    Publication date: January 18, 2024
    Inventors: Byron Sinclair, Deshanand P. Singh, Gregg William Baeckler, Mahesh A. Iyer, Michael Kinsner, Chengping Liang, Victor Tzi-on Zhang
  • Publication number: 20230239136
    Abstract: Integrated circuits, methods, and circuitry are provided for performing multiplication such as that used in Galois field counter mode (GCM) hash computations. An integrated circuit may include selection circuitry to provide one of several powers of a hash key. A Galois field multiplier may receive the one of the powers of the hash key and a hash sequence and generate one or more values. The Galois field multiplier may include multiple levels of pipeline stages. An adder may receive the one or more values and provide a summation of the one or more values in computing a GCM hash.
    Type: Application
    Filed: March 31, 2023
    Publication date: July 27, 2023
    Inventors: Sergey Vladimirovich Gribok, Gregg William Baeckler, Bogdan Pasca, Martin Langhammer
  • Publication number: 20230198526
    Abstract: A programmable device may have logic circuitry formed in a top die and memory and specialized processing blocks formed in a bottom die, where the top die is stacked directly on top of the bottom die in a face-to-face configuration. The logic circuitry may include logic sectors, logic array blocks, logic elements, and other types of logic regions. The memory blocks may include large banks of multiport memory for storing data. The specialized processing blocks may include multipliers, adders, and other arithmetic components. The logic circuitry may access the memory and specialized processing blocks via an address encoded scheme. Configured in this way, the maximum operating frequency of the programmable device can be optimized such that critical paths will no longer need to traverse any unused memory and specialized processing blocks.
    Type: Application
    Filed: February 16, 2023
    Publication date: June 22, 2023
    Inventors: Dheeraj Subbareddy, MD Altaf Hossain, Ankireddy Nalamalpu, Robert Sankman, Ravindranath Mahajan, Gregg William Baeckler
  • Patent number: 11595045
    Abstract: A programmable device may have logic circuitry formed in a top die and memory and specialized processing blocks formed in a bottom die, where the top die is stacked directly on top of the bottom die in a face-to-face configuration. The logic circuitry may include logic sectors, logic array blocks, logic elements, and other types of logic regions. The memory blocks may include large banks of multiport memory for storing data. The specialized processing blocks may include multipliers, adders, and other arithmetic components. The logic circuitry may access the memory and specialized processing blocks via an address encoded scheme. Configured in this way, the maximum operating frequency of the programmable device can be optimized such that critical paths will no longer need to traverse any unused memory and specialized processing blocks.
    Type: Grant
    Filed: June 25, 2021
    Date of Patent: February 28, 2023
    Assignee: Intel Corporation
    Inventors: Dheeraj Subbareddy, MD Altaf Hossain, Ankireddy Nalamalpu, Robert Sankman, Ravindranath Mahajan, Gregg William Baeckler
  • Publication number: 20230027064
    Abstract: Systems and methods of the present disclosure provide techniques for reducing power consumption of a large combinational circuit using register insertion. In particular, a large circuit may be analyzed to determine the amount of signal switching at various logical points (e.g., stages in the computation) of the circuit. A clock sequence with many pulses in the period of a clock that runs the large combinatorial circuit may be generated. To balance the amount of signal switching at various logical points in the circuit, registers may be inserted at certain points in the large circuit with the clock pulses of the clock sequence assigned to the registers that may not have a constant frequency or may be phase shifted versions of the main clock.
    Type: Application
    Filed: September 30, 2022
    Publication date: January 26, 2023
    Inventors: Martin Langhammer, Gregg William Baeckler, Sergey Vladimirovich Gribok, Mahesh A. Iyer
  • Publication number: 20230018414
    Abstract: The present disclosure describes techniques for incorporating pipelined DSP blocks or other types of embedded functions into a logic circuit with a slower clock rate without any clock crossing complexities, and at the same time managing the power consumption of the more complex design that results from it. The techniques include generating a faster clock or several faster clocks that may have a faster clock rate than the clock used by the logic circuit and that may be used as clock input to the embedded pipelined DSP blocks. In addition, the present disclosure describes techniques for generating, improving, and using the faster clock to sample the output of a logic circuit using pulses of generated faster clock, which may allow to increase the clock frequency of the circuit to an optimal level, while maintaining functional correctness.
    Type: Application
    Filed: September 29, 2022
    Publication date: January 19, 2023
    Inventors: Martin Langhammer, Gregg William Baeckler, Sergey Vladimirovich Gribok, Mahesh A. Iyer
  • Patent number: 11556692
    Abstract: Techniques for designing and implementing networks-on-chip (NoCs) are provided. For example, a computer-implemented method for programming a network-on-chip (NoC) onto an integrated circuit includes determining a first portion of a plurality of registers to potentially be included in a NoC design, determining routing information regarding datapaths between registers of the first portion of the plurality of registers, and determining an expected performance associated with the first portion of the plurality of registers. The method also includes determining whether the expected performance is within a threshold range, including the first portion of the plurality of registers and the datapaths in the NoC design after determining that the expected performance is within the threshold range, and generating instructions configured to cause circuitry corresponding to the NoC design to be implemented on the integrated circuit.
    Type: Grant
    Filed: December 24, 2020
    Date of Patent: January 17, 2023
    Assignee: Intel Corporation
    Inventors: Gregg William Baeckler, Martin Langhammer, Sergey Vladimirovich Gribok
  • Patent number: 11467804
    Abstract: A computer-implemented method for programming an integrated circuit includes receiving a program design and determining one or more addition operations based on the program design. The method also includes performing geometric synthesis based on the one or more addition operations by determining a plurality of bits associated with the one or more addition operations and defining a plurality of counters that includes the plurality of bits. Furthermore, the method includes generating instructions configured to cause circuitry configured to perform the one or more addition operations to be implemented on the integrated circuit based on the plurality of counters. The circuitry includes first adder circuitry configured to add a portion of the plurality of bits and produce a carry-out value. The circuitry also includes second adder circuitry configured to determine a sum of a second portion of the plurality of bits and the carry-out value.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: October 11, 2022
    Assignee: Intel Corporation
    Inventors: Sergey Vladimirovich Gribok, Gregg William Baeckler, Martin Langhammer
  • Patent number: 11436399
    Abstract: A method for implementing a multiplier on a programmable logic device (PLD) is disclosed. Partial product bits of the multiplier are identified and how the partial product bits are to be summed to generate a final product from a multiplier and multiplicand are determined. Chains of PLD cells and cells in the chains of PLD cells for generating and summing the partial product bits are assigned. It is determined whether a bit in an assigned cell in an assigned chain of PLD cells is under-utilized. In response to determining that a bit is under-utilized, the assigning of the chains of PLD cells and cells for generating and summing the partial product bits are changed to improve an overall utilization of the chains of PLD cells and cells in the chains of PLD cells.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: September 6, 2022
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Sergey Gribok, Gregg William Baeckler
  • Patent number: 11301611
    Abstract: Methods and apparatus for increasing the random logic utilization on a programmable device are provided. Although not completely homogeneous, the programmable device has many components that are repeated many times in an array. To help improve repeatability and packing, computer-aided design tools for compiling a circuit design for the programmable device may first lock down a synthesis cell netlist with stable naming, create location solution files (files with desired clustering granularity for stabilizing performance and reducing compile times) for selected regions of interest on the programmable device, and compose a final design with only the best solutions some of which can be imported from one location to another. Compiling a design in this way can help improve random logic utilization beyond 85% while improving circuit performance by 20% or more.
    Type: Grant
    Filed: December 19, 2019
    Date of Patent: April 12, 2022
    Assignee: Intel Corporation
    Inventors: Gregg William Baeckler, Martin Langhammer
  • Publication number: 20220107783
    Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry. In some embodiments, the multiplication is implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry.
    Type: Application
    Filed: December 16, 2021
    Publication date: April 7, 2022
    Inventors: Martin Langhammer, Bogdan Pasca, Sergey Gribok, Gregg William Baeckler, Andrei Hagiescu
  • Patent number: 11275998
    Abstract: The present disclosure relates generally to techniques for improving the implementation of certain operations on an integrated circuit. In particular, deep learning techniques, which may use a deep neural network (DNN) topology, may be implemented more efficiently using low-precision weights and activation values by efficiently performing down conversion of data to a lower precision and by preventing data overflow during suitable computations. Further, by more efficiently mapping multipliers to programmable logic on the integrated circuit device, the resources used by the DNN topology to perform, for example, inference tasks may be reduced, resulting in improved integrated circuit operating speeds.
    Type: Grant
    Filed: May 31, 2018
    Date of Patent: March 15, 2022
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Sudarshan Srinivasan, Gregg William Baeckler, Duncan Moss, Sasikanth Avancha, Dipankar Das
  • Patent number: 11216249
    Abstract: A method for designing a system on a target device includes identifying a length for a carry chain that is supported by predefined quanta of a resource on the target device. A plurality of logical adders is mapped onto a single logical adder implemented on the carry chain subject to the identified length to increase logic utilization in a design for the system.
    Type: Grant
    Filed: March 27, 2018
    Date of Patent: January 4, 2022
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Gregg William Baeckler
  • Patent number: 11210063
    Abstract: A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry. The hybrid dot-product circuitry has a hard data path that uses digital signal processing (DSP) blocks operating in floating-point mode and a hard/soft data path that uses DSP blocks operating in fixed-point mode operated in conjunction with general purpose soft logic. The hard/soft data path includes 2-element dot-product circuits that feed an adder tree. Results from the hard data path are combined with the adder tree using format conversion and normalization circuitry. Inputs to the hybrid dot-product circuitry may be in the BFLOAT16 format. The hard data path may be in the single precision format. The hard/soft data path uses a custom format that is similar to but different than BFLOAT16.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: December 28, 2021
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Bogdan Pasca, Sergey Gribok, Gregg William Baeckler, Andrei Hagiescu
  • Patent number: 11163530
    Abstract: Multiplier circuitry includes first combinatorial circuitry configured to perform a combinatorial function, based at least in part on redundant form arithmetic, to generate a first subset of two or more partial products. The two or more partial products are based at least in part on a first input to the multiplier circuitry and a second input to the multiplier circuitry. The multiplier circuitry also includes a carry chain that includes a second combinatorial circuitry configured to generate a second subset of the two or more partial products based at least in part on the first input and the second input. Furthermore, the carry chain includes one or more binary ripple-carry adders configured to generate a product of the multiplier circuitry based at least in part on a sum of the two or more partial products.
    Type: Grant
    Filed: March 22, 2018
    Date of Patent: November 2, 2021
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Gregg William Baeckler
  • Publication number: 20210328589
    Abstract: A programmable device may have logic circuitry formed in a top die and memory and specialized processing blocks formed in a bottom die, where the top die is stacked directly on top of the bottom die in a face-to-face configuration. The logic circuitry may include logic sectors, logic array blocks, logic elements, and other types of logic regions. The memory blocks may include large banks of multiport memory for storing data. The specialized processing blocks may include multipliers, adders, and other arithmetic components. The logic circuitry may access the memory and specialized processing blocks via an address encoded scheme. Configured in this way, the maximum operating frequency of the programmable device can be optimized such that critical paths will no longer need to traverse any unused memory and specialized processing blocks.
    Type: Application
    Filed: June 25, 2021
    Publication date: October 21, 2021
    Inventors: Dheeraj Subbareddy, MD Altaf Hossain, Ankireddy Nalamalpu, Robert Sankman, Ravindranath Mahajan, Gregg William Baeckler
  • Patent number: 11080019
    Abstract: A method for designing and configuring a system on a field programmable gate array (FPGA) is disclosed. A portion of the system that is implemented greater than a predetermined number of times is identified. A structural netlist that describes how to implement the portion of the system a plurality of times on the FPGA and that leverages a repetitive nature of implementing the portion is generated. The identifying and generating is performed prior to synthesizing and placing other portions of the system that are not implemented greater than the predetermined number of time. Synthesizing, placing, and routing the other portions of the system on the FPGA is performed in accordance with the structural netlist. The FPGA is configured with a configuration file that includes a design for the system that reflects the synthesizing, placing, and routing, wherein the configuring physically transforms resources on the FPGA to implement the system.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: August 3, 2021
    Assignee: Intel Corporation
    Inventors: Martin Langhammer, Gregg William Baeckler, Sergey Gribok
  • Patent number: 11070209
    Abstract: A programmable device may have logic circuitry formed in a top die and memory and specialized processing blocks formed in a bottom die, where the top die is stacked directly on top of the bottom die in a face-to-face configuration. The logic circuitry may include logic sectors, logic array blocks, logic elements, and other types of logic regions. The memory blocks may include large banks of multiport memory for storing data. The specialized processing blocks may include multipliers, adders, and other arithmetic components. The logic circuitry may access the memory and specialized processing blocks via an address encoded scheme. Configured in this way, the maximum operating frequency of the programmable device can be optimized such that critical paths will no longer need to traverse any unused memory and specialized processing blocks.
    Type: Grant
    Filed: February 12, 2020
    Date of Patent: July 20, 2021
    Assignee: Intel Corporation
    Inventors: Dheeraj Subbareddy, Md Altaf Hossain, Ankireddy Nalamalpu, Robert Sankman, Ravindranath Mahajan, Gregg William Baeckler