Patents by Inventor Andrew Lines

Andrew Lines has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11908542
    Abstract: Prior knowledge of the access pattern is leveraged to improve energy dissipation for general matrix operations. This improves memory access energy for a multitude of applications such as image processing, deep neural networks, and scientific computing workloads. In some embodiments, prior knowledge of the access pattern allows for burst read and/or write operations. As such, a burst mode solution can provide energy savings in both READ (RD) and WRITE (WR) operations. For machine learning or inference, the weight values are known ahead of time (e.g., inference operation), and so the unused bytes in the cache line are exploited to store a sparsity map that is used for disabling reads from either the upper or lower half of the cache line, thus saving dynamic capacitance.
    Type: Grant
    Filed: December 23, 2019
    Date of Patent: February 20, 2024
    Assignee: Intel Corporation
    Inventors: Charles Augustine, Somnath Paul, Turbo Majumder, Iqbal Rajwani, Andrew Lines, Altug Koker, Lakshminarayanan Striramassarma, Muhammad Khellah
  • Publication number: 20210193196
    Abstract: Prior knowledge of the access pattern is leveraged to improve energy dissipation for general matrix operations. This improves memory access energy for a multitude of applications such as image processing, deep neural networks, and scientific computing workloads. In some embodiments, prior knowledge of the access pattern allows for burst read and/or write operations. As such, a burst mode solution can provide energy savings in both READ (RD) and WRITE (WR) operations. For machine learning or inference, the weight values are known ahead of time (e.g., inference operation), and so the unused bytes in the cache line are exploited to store a sparsity map that is used for disabling reads from either the upper or lower half of the cache line, thus saving dynamic capacitance.
    Type: Application
    Filed: December 23, 2019
    Publication date: June 24, 2021
    Applicant: Intel Corporation
    Inventors: Charles Augustine, Somnath Paul, Turbo Majumder, Iqbal Rajwani, Andrew Lines, Altug Koker, Lakshminarayanan Striramassarma, Muhammad Khellah
  • Patent number: 8954661
    Abstract: Efficient hardware implementations of a binary search algorithm are provided.
    Type: Grant
    Filed: November 11, 2010
    Date of Patent: February 10, 2015
    Assignee: Intel Corporation
    Inventor: Andrew Lines
  • Patent number: 8495543
    Abstract: Techniques are described for generating asynchronous circuits (e.g., in the form of one or more netlists) for implementation, e.g., in integrated circuitry/chips. Embodiments are directed to an asynchronous multi-level domino design template and several variants, including a mixture of domino and single-rail data logic. The templates can provide high throughput, low latency, and area efficiency. A multi-level domino template is partitioned into pipeline stages in which each stage consists of potentially multiple levels of domino logic controlled by a single controller that communicates with other controllers via handshaking. Each stage is composed of two parts: a data path and a control path. The data path implements the computational logic, both combinational and sequential, using efficient dual-rail domino logic. The control path implements a unique four-phase handshake to ensure correctness and the preservation of logical dependencies between pipeline stages.
    Type: Grant
    Filed: June 17, 2009
    Date of Patent: July 23, 2013
    Assignees: University of Southern California, Fulcrum Microsystems, Inc.
    Inventors: Georgios Dimou, Peter A. Beerel, Andrew Lines
  • Patent number: 8448105
    Abstract: Techniques are described for generating asynchronous circuits from any arbitrary HDL representation of a synchronous circuit by automatically clustering the synthesized gates into pipeline stages that are then slack-matched to meet performance goals while minimizing area. Automatic pipelining can be provided in which the throughput of the overall design is not limited to the clock frequency or the level of pipelining in the original RTL specification. The techniques are applicable to many asynchronous design styles. A model and infrastructure can be designed that guides clustering to avoid the introduction of deadlocks and achieve a target circuit performance. Slack matching models can be used to take advantage of fanout optimizations of buffer trees that improve the quality of the results.
    Type: Grant
    Filed: April 24, 2009
    Date of Patent: May 21, 2013
    Assignees: University of Southern California, Fulcrum Microsystems, Inc.
    Inventors: Georgios Dimou, Peter A. Beerel, Andrew Lines
  • Patent number: 8370557
    Abstract: A memory is described which includes a main memory array made up of multiple single-ported memory banks connected by parallel read and write buses, and a sideband memory equivalent to a single dual-ported memory bank. Control logic and tag state facilitate a pattern of access to the main memory and the sideband memory such that the memory performs like a fully provisioned dual-ported memory capable of reading and writing any two arbitrary addresses on the same cycle.
    Type: Grant
    Filed: December 19, 2008
    Date of Patent: February 5, 2013
    Assignee: Intel Corporation
    Inventors: Jonathan Dama, Andrew Lines
  • Publication number: 20120110049
    Abstract: Efficient hardware implementations of a binary search algorithm are provided.
    Type: Application
    Filed: November 11, 2010
    Publication date: May 3, 2012
    Applicant: FULCRUM MICROSYSTEMS
    Inventor: Andrew Lines
  • Patent number: 8086975
    Abstract: Techniques are described for converting netlists for synchronous circuits, such as combinational modules, flip flops (or latches), and clock gating modules, to netlists of asynchronous modules. Processes including algorithms are described that bundle multiple modules in an enable domain, so that they are activated only if the incoming enable token to the enable domain has the UPDATE value. The modules can be clustered inside an enable domain, so that each cluster has a separate controller. The objective function of bundling and clustering can minimize power consumption with respect to a given cycle time. Exemplary embodiments can include a gated multilevel domino template.
    Type: Grant
    Filed: April 10, 2009
    Date of Patent: December 27, 2011
    Assignees: University of Southern California, Fulcrum Microsystems, Inc.
    Inventors: Ken Shiring, Peter A. Beerel, Andrew Lines, Arash Saifhashemi
  • Patent number: 8051396
    Abstract: Methods and apparatus are described for optimizing a circuit design. A gate level circuit description corresponding to the circuit design is generated. The gate level circuit description includes a plurality of pipelines across a plurality of levels. Using a linear programming technique, a minimal number of buffers is added to selected ones of the pipelines such that a performance constraint is satisfied.
    Type: Grant
    Filed: May 4, 2009
    Date of Patent: November 1, 2011
    Assignee: Fulcrum Microsystems, Inc.
    Inventors: Peter Beerel, Andrew Lines, Michael Davies
  • Publication number: 20110029941
    Abstract: Techniques are described for generating asynchronous circuits (e.g., in the form of one or more netlists) for implementation, e.g., in integrated circuitry/chips. Embodiments are directed to an asynchronous multi-level domino design template and several variants, including a mixture of domino and single-rail data logic. The templates can provide high throughput, low latency, and area efficiency. A multi-level domino template is partitioned into pipeline stages in which each stage consists of potentially multiple levels of domino logic controlled by a single controller that communicates with other controllers via handshaking. Each stage is composed of two parts: a data path and a control path. The data path implements the computational logic, both combinational and sequential, using efficient dual-rail domino logic. The control path implements a unique four-phase handshake to ensure correctness and the preservation of logical dependencies between pipeline stages.
    Type: Application
    Filed: June 17, 2009
    Publication date: February 3, 2011
    Applicants: UNIVERSITY OF SOUTHERN CALIFORNIA, FULCRUM MICROSYSTEMS, INC.
    Inventors: Georgios Dimou, Peter A. Beerel, Andrew Lines
  • Publication number: 20100325370
    Abstract: A shared memory is described having a plurality of receive ports and a plurality of transmit ports characterized by a first data rate. A memory includes a plurality of memory banks organized in rows and columns. Operation of the memory array is characterized by a second data rate. Non-blocking receive crossbar circuitry is operable to connect any of the receive ports with any of the memory banks. Non-blocking transmit crossbar circuitry is operable to connect any of the memory banks with any of the transmit ports. Buffering is operable to decouple operation of the receive and transmit ports at the first data rate from operation of the memory array at the second data rate.
    Type: Application
    Filed: August 24, 2010
    Publication date: December 23, 2010
    Applicant: FULCRUM MICROSYSTEMS INC.
    Inventors: Uri Cummings, Andrew Lines, Patrick Pelletier, Robert Southworth
  • Patent number: 7814280
    Abstract: A shared memory is described having a plurality of receive ports and a plurality of transmit ports characterized by a first data rate. A memory includes a plurality of memory banks organized in rows and columns. Operation of the memory array is characterized by a second data rate. Non-blocking receive crossbar circuitry is operable to connect any of the receive ports with any of the memory banks. Non-blocking transmit crossbar circuitry is operable to connect any of the memory banks with any of the transmit ports. Buffering is operable to decouple operation of the receive and transmit ports at the first data rate from operation of the memory array at the second data rate. Scheduling circuitry is operable to control interaction of the ports, crossbar circuitry, and memory array to effect storage and retrieval of the data segments in the shared memory.
    Type: Grant
    Filed: August 18, 2005
    Date of Patent: October 12, 2010
    Assignee: Fulcrum Microsystems Inc.
    Inventors: Uri Cummings, Andrew Lines, Patrick Pelletier, Robert Southworth
  • Publication number: 20100161892
    Abstract: A memory is described which includes a main memory array made up of multiple single-ported memory banks connected by parallel read and write buses, and a sideband memory equivalent to a single dual-ported memory bank. Control logic and tag state facilitate a pattern of access to the main memory and the sideband memory such that the memory performs like a fully provisioned dual-ported memory capable of reading and writing any two arbitrary addresses on the same cycle.
    Type: Application
    Filed: December 19, 2008
    Publication date: June 24, 2010
    Applicant: FULCRUM MICROSYSTEMS, INC.
    Inventors: Jonathan Dama, Andrew Lines
  • Patent number: 7698535
    Abstract: An asynchronous circuit is described for processing units of data having a program order associated therewith. The circuit includes an N-way-issue resource comprising N parallel pipelines. Each pipeline is operable to transmit a subset of the units of data in a first-in-first-out manner. The asynchronous circuit is operable to sequentially control transmission of the units of data in the pipelines such that the program order is maintained.
    Type: Grant
    Filed: September 16, 2003
    Date of Patent: April 13, 2010
    Assignee: Fulcrum Microsystems, Inc.
    Inventors: Andrew Lines, Robert Southworth, Uri Cummings
  • Publication number: 20090288059
    Abstract: Techniques are described for generating asynchronous circuits from any arbitrary HDL representation of a synchronous circuit by automatically clustering the synthesized gates into pipeline stages that are then slack-matched to meet performance goals while minimizing area. Automatic pipelining can be provided in which the throughput of the overall design is not limited to the clock frequency or the level of pipelining in the original RTL specification. The techniques are applicable to many asynchronous design styles. A model and infrastructure can be designed that guides clustering to avoid the introduction of deadlocks and achieve a target circuit performance. Slack matching models can be used to take advantage of fanout optimizations of buffer trees that improve the quality of the results.
    Type: Application
    Filed: April 24, 2009
    Publication date: November 19, 2009
    Applicants: UNIVERSITY OF SOUTHERN CALIFORNIA, FULCRUM MICROSYSTEMS, INC.
    Inventors: Georgios Dimou, Peter A. Beerel, Andrew Lines
  • Publication number: 20090288058
    Abstract: Techniques are described for converting netlists for synchronous circuits, such as combinational modules, flip flops (or latches), and clock gating modules, to netlists of asynchronous modules. Processes including algorithms are described that bundle multiple modules in an enable domain, so that they are activated only if the incoming enable token to the enable domain has the UPDATE value. The modules can be clustered inside an enable domain, so that each cluster has a separate controller. The objective function of bundling and clustering can minimize power consumption with respect to a given cycle time. Exemplary embodiments can include a gated multilevel domino template.
    Type: Application
    Filed: April 10, 2009
    Publication date: November 19, 2009
    Applicant: UNIVERSITY OF SOUTHERN CALIFORNIA
    Inventors: Ken Shiring, Peter A. Beerel, Andrew Lines, Arash Saifhashemi
  • Patent number: 7584449
    Abstract: Methods and apparatus are described for optimizing a circuit design. A gate level circuit description corresponding to the circuit design is generated. The gate level circuit description includes a plurality of pipelines across a plurality of levels. Using a linear programming technique, a minimal number of buffers is added to selected ones of the pipelines such that a performance constraint is satisfied.
    Type: Grant
    Filed: November 10, 2005
    Date of Patent: September 1, 2009
    Assignee: Fulcrum Microsystems, Inc.
    Inventors: Peter Beerel, Andrew Lines, Michael Davies
  • Publication number: 20090217232
    Abstract: Methods and apparatus are described for optimizing a circuit design. A gate level circuit description corresponding to the circuit design is generated. The gate level circuit description includes a plurality of pipelines across a plurality of levels. Using a linear programming technique, a minimal number of buffers is added to selected ones of the pipelines such that a performance constraint is satisfied.
    Type: Application
    Filed: May 4, 2009
    Publication date: August 27, 2009
    Applicant: FULCRUM MICROSYSTEMS, INC.
    Inventors: Peter Beerel, Andrew Lines, Michael Davies
  • Patent number: 7283557
    Abstract: Methods and apparatus are described relating to a crossbar which is operable to route data from any of a first number of input channels to any of a second number of output channels according to routing control information. Each combination of an input channel and an output channel corresponds to one of a plurality of links. The crossbar circuitry is operable to route the data in a deterministic manner on each of the links thereby preserving a partial ordering represented by the routing control information. Events on different links are uncorrelated.
    Type: Grant
    Filed: April 30, 2002
    Date of Patent: October 16, 2007
    Assignee: Fulcrum Microsystems, Inc.
    Inventors: Uri Cummings, Andrew Lines
  • Patent number: 7274710
    Abstract: Methods and apparatus are described relating to a crossbar which is operable to route data from any of a first number of input channels to any of a second number of output channels according to routing control information. Each combination of an input channel and an output channel corresponds to one of a plurality of links. The crossbar circuitry is operable to route the data in a deterministic manner on each of the links thereby preserving a partial ordering represented by the routing control information. Events on different links are uncorrelated.
    Type: Grant
    Filed: September 6, 2002
    Date of Patent: September 25, 2007
    Assignee: Fulcrum Microsystems, Inc.
    Inventors: Uri Cummings, Andrew Lines