Patents by Inventor Babak Falsafi

Babak Falsafi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11748254
    Abstract: A data transformer apparatus comprising a dispatcher module, a reader module, a converter module and a writer module. The dispatcher module is configured to receive a data transformation request including first and second information items. The reader module is configured to: retrieve data to be transformed, according to said first information item; obtain the type attribute of the data to be transformed, based on said first information item; and send the data to be transformed and the type attribute to the converter module. The converter module is configured to: select transformation instructions based on said type attribute; execute, on the data to be transformed, the selected transformation instructions, thereby obtaining transformed data; and send the transformed data to the writer module. The writer module is configured to write the transformed data in an output buffer according to said second information item.
    Type: Grant
    Filed: August 27, 2019
    Date of Patent: September 5, 2023
    Assignee: Ecole Polytechnique Federale de Lausanne (EPFL)
    Inventors: Arash Pourhabibi Zarandi, Siddharth Gupta, Hussein Kassir, Mark Sutherland, Zilu Tian, Mario Paulo Drumond Lages De Oliveira, Babak Falsafi, Christoph Koch
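
The dispatcher, reader, converter and writer flow described in the abstract above can be illustrated with a small software sketch. This is only an analogy for the claimed hardware modules, not the patented design: the `Request` class, the dictionary standing in for memory, and the `CONVERTERS` table are hypothetical names introduced here, and the first information item is assumed to carry both a source location and a type attribute.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Request:
    """Hypothetical data transformation request."""
    source: Tuple[str, str]  # first information item: (input location, type attribute)
    dest: str                # second information item: output buffer location


# Hypothetical table of transformation routines, keyed by type attribute.
CONVERTERS: Dict[str, Callable[[bytes], bytes]] = {
    "text": lambda raw: raw.upper(),
    "u8": lambda raw: bytes(reversed(raw)),
}


def reader(memory: Dict[str, bytes], source: Tuple[str, str]) -> Tuple[bytes, str]:
    """Retrieve the data and its type attribute from the first information item."""
    location, type_attr = source
    return memory[location], type_attr


def converter(data: bytes, type_attr: str) -> bytes:
    """Select a transformation by type attribute and apply it to the data."""
    return CONVERTERS[type_attr](data)


def writer(memory: Dict[str, bytes], dest: str, data: bytes) -> None:
    """Write the transformed data to the location named by the second item."""
    memory[dest] = data


def dispatcher(memory: Dict[str, bytes], request: Request) -> None:
    """Receive a request and drive it through reader, converter and writer."""
    data, type_attr = reader(memory, request.source)
    writer(memory, request.dest, converter(data, type_attr))


memory = {"in": b"abc"}
dispatcher(memory, Request(source=("in", "text"), dest="out"))
```

Selecting the routine from the type attribute keeps the reader and writer independent of the data format, which is the separation the abstract describes.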
  • Publication number: 20220327048
    Abstract: Data transformer apparatus comprising at least a dispatcher module (D), a reader module (R), a converter module (C) and a writer module (W). The dispatcher module (D) is configured to: receive a data transformation request (DTR) including: a first information item (X1) associated to a memory address where data to be transformed (Data1) are stored and to a type attribute (T) of said data to be transformed (Data1); a second information item (X2) indicating a memory address where transformed data (Data2), obtained from said data to be transformed (Data1), have to be written. The reader module (R) is configured to: retrieve the data to be transformed, according to said first information item (X1); obtain the type attribute (T) of the data to be transformed (Data1), based on said first information item (X1); send the data to be transformed (Data1) and the type attribute (T) thereof to the converter module (C).
    Type: Application
    Filed: August 27, 2019
    Publication date: October 13, 2022
    Applicant: Ecole Polytechnique Federale de Lausanne (EPFL)
    Inventors: Arash POURHABIBI ZARANDI, Siddharth GUPTA, Hussein KASSIR, Mark SUTHERLAND, Zilu TIAN, Mario Paulo DRUMOND LAGES DE OLIVEIRA, Babak FALSAFI, Christoph KOCH
  • Patent number: 10929174
    Abstract: A distributed memory system including a plurality of chips, a plurality of nodes that are distributed across the plurality of chips such that each node is comprised within a chip, each node includes a dedicated local memory and a processor core, and each local memory is configured to be accessible over network communication, a network interface for each node, the network interface configured such that a corresponding network interface of each node is integrated in a coherence domain of the chip of the corresponding node, wherein each of the network interfaces are configured to support a one-sided operation, the network interface directly reading or writing in the dedicated local memory of the corresponding node without involving a processor core, and the one-sided operation is configured such that the processor core of a corresponding node uses a protocol to directly inject a remote memory access for read or write request to the network interface of the node, the remote memory access request allowing to read
    Type: Grant
    Filed: December 12, 2017
    Date of Patent: February 23, 2021
    Assignee: ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE (EPFL)
    Inventors: Alexandros Daglis, Boris Robert Grot, Babak Falsafi
  • Publication number: 20180173673
    Abstract: A distributed memory system including a plurality of chips, a plurality of nodes that are distributed across the plurality of chips such that each node is comprised within a chip, each node includes a dedicated local memory and a processor core, and each local memory is configured to be accessible over network communication, a network interface for each node, the network interface configured such that a corresponding network interface of each node is integrated in a coherence domain of the chip of the corresponding node, wherein each of the network interfaces are configured to support a one-sided operation, the network interface directly reading or writing in the dedicated local memory of the corresponding node without involving a processor core, and wherein the one-sided operation is configured such that the processor core of a corresponding node uses a protocol to directly inject a remote memory access for read or write request to the network interface of the node, the remote memory access request allowing
    Type: Application
    Filed: December 12, 2017
    Publication date: June 21, 2018
    Inventors: Alexandros Daglis, Boris Robert Grot, Babak Falsafi
  • Patent number: 9996358
    Abstract: A system and method of coupling a Branch Target Buffer (BTB) content of a BTB with an instruction cache content of an instruction cache. The method includes: tagging a plurality of target buffer entries that belong to branches within a same instruction block with a corresponding instruction block address and a branch bitmap to indicate individual branches in the block; coupling an overflow buffer with the BTB to accommodate further target buffer entries of instruction blocks, distinct from the plurality of target buffer entries, which have more branches than the bundle is configured to accommodate in the corresponding instruction's bundle in the BTB; and predicting the instructions or the instruction blocks that are likely to be fetched by the core in the future and fetch those instructions from the lower levels of the memory hierarchy proactively by means of a prefetcher.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: June 12, 2018
    Assignee: ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE
    Inventors: Babak Falsafi, Ilknur Cansu Kaynak, Boris Robert Grot
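
As a rough illustration of the block-tagged organization described in the abstract above, the sketch below keeps one bundle of branch targets per instruction block, a per-block bitmap marking which positions hold branches, and an overflow buffer for blocks with more branches than a bundle holds. The class name `BlockBTB`, the default of two bundle slots, and the dictionary-based storage are assumptions made for this example. The prefetcher mentioned in the abstract is not modeled.

```python
class BlockBTB:
    """Toy model of a BTB organized by instruction block (illustrative only)."""

    def __init__(self, bundle_slots=2):
        self.bundle_slots = bundle_slots  # assumed bundle capacity per block
        self.bitmaps = {}    # block address -> bitmap of branch positions
        self.bundles = {}    # block address -> {offset: target}
        self.overflow = {}   # (block address, offset) -> target

    def insert(self, block_addr, offset, target):
        """Record a branch at `offset` within the block at `block_addr`."""
        self.bitmaps[block_addr] = self.bitmaps.get(block_addr, 0) | (1 << offset)
        bundle = self.bundles.setdefault(block_addr, {})
        if offset in bundle or len(bundle) < self.bundle_slots:
            bundle[offset] = target
        else:
            # The bundle is full, so the extra entry goes to the overflow buffer.
            self.overflow[(block_addr, offset)] = target

    def lookup(self, block_addr, offset):
        """Return the predicted target, or None if no branch is known here."""
        if not (self.bitmaps.get(block_addr, 0) >> offset) & 1:
            return None  # the bitmap shows no branch at this position
        bundle = self.bundles.get(block_addr, {})
        if offset in bundle:
            return bundle[offset]
        return self.overflow.get((block_addr, offset))


btb = BlockBTB(bundle_slots=2)
btb.insert(0x40, 1, 0x100)
btb.insert(0x40, 5, 0x200)
btb.insert(0x40, 9, 0x300)  # third branch in the block spills to overflow
```

Tagging entries by block address means one tag covers every branch in the block, and the bitmap lets a lookup reject non-branch positions without searching the bundle.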
  • Patent number: 9734063
    Abstract: A computing system that uses a Scale-Out NUMA (“soNUMA”) architecture, programming model, and/or communication protocol provides for low-latency, distributed in-memory processing. Using soNUMA, a programming model is layered directly on top of a NUMA memory fabric via a stateless messaging protocol. To facilitate interactions between the application, OS, and the fabric, soNUMA uses a remote memory controller—an architecturally-exposed hardware block integrated into the node's local coherence hierarchy.
    Type: Grant
    Filed: February 27, 2015
    Date of Patent: August 15, 2017
    Assignee: ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE (EPFL)
    Inventors: Stanko Novakovic, Alexandros Daglis, Boris Robert Grot, Edouard Bugnion, Babak Falsafi
  • Patent number: 9703707
    Abstract: A NOC comprises a die having a cache area and a core area, a plurality of core tiles arranged in the core area in a plurality of subsets, at least one cache memory bank arranged in the cache area, whereby the at least one cache memory bank is distinct from each of the plurality of core tiles. The NOC further comprises an interconnect fabric comprising a request tree to connect to a first cache memory bank of the at least one cache memory bank, each core tile of a first one of the subsets, the first subset corresponding to the first cache memory bank, such that each core tile is connected to the first cache memory bank only, and a reply tree to connect the first cache memory bank to each core tile of the first subset.
    Type: Grant
    Filed: December 4, 2012
    Date of Patent: July 11, 2017
    Assignee: Ecole Polytechnique Fédérale de Lausanne (EPFL)
    Inventors: Babak Falsafi, Boris Grot, Pejman Lotfi Kamran
  • Publication number: 20170090935
    Abstract: A system and method of coupling a Branch Target Buffer (BTB) content of a BTB with an instruction cache content of an instruction cache. The method includes: tagging a plurality of target buffer entries that belong to branches within a same instruction block with a corresponding instruction block address and a branch bitmap to indicate individual branches in the block; coupling an overflow buffer with the BTB to accommodate further target buffer entries of instruction blocks, distinct from the plurality of target buffer entries, which have more branches than the bundle is configured to accommodate in the corresponding instruction's bundle in the BTB; and predicting the instructions or the instruction blocks that are likely to be fetched by the core in the future and fetch those instructions from the lower levels of the memory hierarchy proactively by means of a prefetcher.
    Type: Application
    Filed: September 30, 2015
    Publication date: March 30, 2017
    Inventors: Babak FALSAFI, Ilknur Cansu KAYNAK, Boris Robert GROT
  • Publication number: 20150242324
    Abstract: A computing system that uses a Scale-Out NUMA (“soNUMA”) architecture, programming model, and/or communication protocol provides for low-latency, distributed in-memory processing. Using soNUMA, a programming model is layered directly on top of a NUMA memory fabric via a stateless messaging protocol. To facilitate interactions between the application, OS, and the fabric, soNUMA uses a remote memory controller—an architecturally-exposed hardware block integrated into the node's local coherence hierarchy.
    Type: Application
    Filed: February 27, 2015
    Publication date: August 27, 2015
    Inventors: Stanko Novakovic, Alexandros Daglis, Boris Robert Grot, Edouard Bugnion, Babak Falsafi
  • Publication number: 20140156929
    Abstract: A Network-On-Chip (NOC) organization comprises a die having a cache area and a core area, a plurality of core tiles arranged in the core area in a plurality of subsets, at least one cache memory bank arranged in the cache area, whereby the at least one cache memory bank is distinct from each of the plurality of core tiles. The NOC organization further comprises an interconnect fabric comprising a request tree to connect to a first cache memory bank of the at least one cache memory bank, each core tile of a first one of the subsets, the first subset corresponding to the first cache memory bank, such that each core tile of the first subset is connected to the first cache memory bank only, and allow guiding data packets from each core tile of the first subset to the first memory bank, and a reply tree to connect the first cache memory bank to each core tile of the first subset, and allow guiding data packets from the first cache memory bank to a core tile of the first subset.
    Type: Application
    Filed: December 4, 2012
    Publication date: June 5, 2014
    Applicant: ECOLE POLYTECHNIQUE FEDERALE DE LAUSANNE (EPFL)
    Inventors: Babak FALSAFI, Boris GROT, Pejman LOTFI KAMRAN
  • Patent number: 5951657
    Abstract: A device interface for communicating between a processor system and a separate device employs cacheable control registers, both to indicate the receipt of a message and to receive messages to be transmitted. The data structure of the cacheable control registers may be that of a queue, minimizing the need for routine handshaking signals to clear the queue after each message. Communication of queue pointers is minimized by the use of a shadow pointer relied on as long as adequate queue space exists and queue entry valid flags which are interpreted with alternate sense for each cycling through the queue.
    Type: Grant
    Filed: June 9, 1997
    Date of Patent: September 14, 1999
    Assignee: Wisconsin Alumni Research Foundation
    Inventors: David A. Wood, Steven K. Reinhardt, Shubhendu S. Mukherjee, Babak Falsafi, Mark D. Hill, Robert W. Pfile
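
The two queue techniques named in the last abstract above (valid flags read with alternating sense, and a shadow pointer) can be sketched in software. This is a single-threaded toy model under stated assumptions, not the patented interface: the class `SenseQueue` and its fields are hypothetical, and `pointer_reads` exists only to count how often the producer has to fetch the consumer's real position.

```python
class SenseQueue:
    """Toy circular queue with sense-reversing valid flags (illustrative only)."""

    def __init__(self, size):
        self.size = size
        self.slots = [(False, None)] * size  # (valid flag, payload)
        # Producer-side state.
        self.tail = 0
        self.tail_sense = True   # flag value that means "valid" on this lap
        self.sent = 0
        self.shadow_received = 0  # producer's cached copy of consumer progress
        self.pointer_reads = 0    # times the real consumer count was fetched
        # Consumer-side state.
        self.head = 0
        self.head_sense = True
        self.received = 0

    def send(self, msg):
        """Enqueue msg; return False if the queue is genuinely full."""
        if self.sent - self.shadow_received == self.size:
            # The shadow copy suggests the queue is full, so refresh it.
            self.shadow_received = self.received
            self.pointer_reads += 1
            if self.sent - self.shadow_received == self.size:
                return False
        self.slots[self.tail] = (self.tail_sense, msg)
        self.sent += 1
        self.tail += 1
        if self.tail == self.size:
            self.tail = 0
            self.tail_sense = not self.tail_sense  # flip meaning on each lap
        return True

    def receive(self):
        """Dequeue one message, or return None if the queue is empty."""
        flag, msg = self.slots[self.head]
        if flag != self.head_sense:
            return None  # entry is stale from the previous lap
        self.received += 1
        self.head += 1
        if self.head == self.size:
            self.head = 0
            self.head_sense = not self.head_sense
        return msg


q = SenseQueue(2)
q.send("a")
q.send("b")
first = q.receive()
```

Because the meaning of the flag flips on every lap, the consumer never has to clear an entry after reading it, and the producer only fetches the consumer's position when its shadow copy indicates the queue is full.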