Patents by Inventor Fabrizio Petrini

Fabrizio Petrini has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240163221
    Abstract: Examples described herein relate to a router. In some examples, the router includes an interface and circuitry coupled to the interface. In some examples, the circuitry is to: based on detection of a drop of a packet of a flow, drop subsequently received packets of the flow; and, based on receipt of a packet associated with the dropped packet of the flow, forward that packet and subsequent received packets of the flow.
    Type: Application
    Filed: December 20, 2023
    Publication date: May 16, 2024
    Inventors: Hossein FARROKHBAKHT, Fabrizio PETRINI
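The drop-and-resume behavior in this abstract can be sketched as a small per-flow state machine. A minimal illustration, assuming (hypothetically) that the packet which ends the dropping phase carries a retransmit flag; none of these names come from the filing:

```python
# Sketch of the per-flow drop-and-resume policy described in the abstract.
# The class name, dict-based packets, and "retransmit" flag are hypothetical.
class FlowState:
    def __init__(self):
        self.dropping = False  # True after a packet of this flow was dropped

    def on_drop_detected(self):
        self.dropping = True   # start discarding later packets of the flow

    def handle(self, packet):
        """Return 'forward' or 'drop' for a packet of this flow."""
        if packet.get("retransmit"):  # packet associated with the dropped one
            self.dropping = False     # resume forwarding from here on
            return "forward"
        if self.dropping:
            return "drop"             # drop subsequently received packets
        return "forward"
```

Once a drop is detected, the flow stays in the dropping state, so later packets of that flow are discarded until the packet associated with the dropped one arrives.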
  • Publication number: 20240129234
    Abstract: Examples described herein relate to a router interface device. In some examples, the router interface device includes an interface and circuitry. In some examples, the circuitry is to: proactively drop a packet and send a negative acknowledgement (NACK) message to a sender based on lack of buffer space for a response associated with the packet and sent from a downstream network interface device that received the packet and also based on one or more of: congestion at a downstream switch or congestion at an endpoint receiver.
    Type: Application
    Filed: December 20, 2023
    Publication date: April 18, 2024
    Inventors: Hossein FARROKHBAKHT, Fabrizio PETRINI
  • Publication number: 20240129260
    Abstract: Examples described herein relate to a router. In some examples, the router includes an interface and circuitry coupled to the interface. In some examples, the circuitry is to reserve a memory region in a buffer for a response sent by a receiver of a forwarded packet.
    Type: Application
    Filed: December 20, 2023
    Publication date: April 18, 2024
    Inventors: Hossein FARROKHBAKHT, Fabrizio PETRINI
  • Publication number: 20240129235
    Abstract: Examples described herein relate to a router. In some examples, the router includes an interface and circuitry coupled to the interface. In some examples, the circuitry is to determine whether an incoming packet is to reach a faulty link based on a fault location received in a negative acknowledgment (NACK) message and, based on a determination that the incoming packet is to reach the faulty link, drop the packet one or multiple hops before reaching the faulty link.
    Type: Application
    Filed: December 20, 2023
    Publication date: April 18, 2024
    Inventors: Hossein FARROKHBAKHT, Fabrizio PETRINI
  • Publication number: 20230367640
    Abstract: An offload analyzer analyzes a program for porting to a heterogeneous computing system by identifying code objects for offloading to an accelerator. Runtime metrics generated by executing the program on a host processor unit are provided to an accelerator model that models the performance of the accelerator and generates estimated accelerator metrics for the program. A code object offload selector selects code objects for offloading based on whether estimated accelerated times of the code objects, which comprise estimated accelerator times and offload overhead times, are better than their host processor unit execution times. The code object offload selector selects additional code objects for offloading using a dynamic-programming-like performance estimation approach that performs a bottom-up traversal of a call tree. A heterogeneous version of the program can be generated for execution on the heterogeneous computing system.
    Type: Application
    Filed: April 23, 2021
    Publication date: November 16, 2023
    Applicant: Intel Corporation
    Inventors: Kermin E. ChoFleming, Jr., Egor A. Kazachkov, Daya Shanker Khudia, Zakhar A. Matveev, Sergey U. Kokljuev, Fabrizio Petrini, Dmitry S. Petrov, Swapna Raj
  • Publication number: 20230325185
    Abstract: Systems, apparatus, articles of manufacture, and methods are disclosed for performance of sparse matrix times dense matrix operations. Example instructions cause programmable circuitry to control execution of the sparse matrix times dense matrix operation using a sparse matrix and a dense matrix stored in memory, and transmit a plurality of instructions to execute the sparse matrix times dense matrix operation to DMA engine circuitry, the plurality of instructions to cause DMA engine circuitry to create an output matrix in the memory, the creation of the output matrix in the memory performed without the programmable circuitry computing the output matrix.
    Type: Application
    Filed: March 31, 2023
    Publication date: October 12, 2023
    Inventors: Jesmin Jahan Tithi, Fabio Checconi, Ahmed Helal, Fabrizio Petrini
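The underlying arithmetic of a sparse-times-dense (SpMM) operation, independent of the DMA offload the filing claims, can be sketched in plain Python over a CSR-encoded sparse matrix; the offload to DMA engine circuitry itself is hardware-specific and not modeled here:

```python
# CSR sparse-times-dense multiply: C = A @ B, with A sparse (CSR arrays
# indptr/indices/data) and B a dense row-major matrix. This shows only the
# arithmetic the abstract offloads, not the DMA mechanics.
def spmm_csr(indptr, indices, data, B, n_cols):
    n_rows = len(indptr) - 1
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # nonzeros of row i live in positions indptr[i]..indptr[i+1]-1
        for k in range(indptr[i], indptr[i + 1]):
            j, a = indices[k], data[k]
            for c in range(n_cols):
                C[i][c] += a * B[j][c]  # accumulate a_ij * b_jc
    return C
```

For example, with A = [[1, 0], [0, 2]] in CSR form, multiplying by B = [[1, 2], [3, 4]] yields [[1, 2], [6, 8]].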
  • Publication number: 20230095207
    Abstract: A memory architecture may provide support for any number of direct memory access (DMA) operations at least partially independent of the CPU coupled to the memory. DMA operations may involve data movement between two or more memory locations and may involve minor computations. At least some DMA operations may include any number of atomic functions, and at least some of the atomic functions may include a corresponding return value. A system includes a first direct memory access (DMA) engine to request a DMA operation. The DMA operation includes an atomic operation. The system also includes a second DMA engine to receive a return value associated with the atomic operation and store the return value at a source memory.
    Type: Application
    Filed: September 24, 2021
    Publication date: March 30, 2023
    Inventors: Robert Pawlowski, Fabio Checconi, Fabrizio Petrini
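As a rough illustration of a DMA atomic with a return value, the sketch below models memory as a dict and shows a fetch-and-add whose pre-update value is written back to a source-side address, mirroring the two-engine flow in the abstract; all names are illustrative, not the patented interface:

```python
# Hypothetical model of a DMA operation that includes an atomic function
# with a return value: the atomic updates the destination, and the
# pre-update value is stored back at a source-memory address.
def dma_atomic_fetch_add(memory, dst_addr, operand, src_addr):
    old = memory[dst_addr]            # read-modify-write at the destination
    memory[dst_addr] = old + operand
    memory[src_addr] = old            # return value stored at source memory
    return old
```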
  • Patent number: 11593295
    Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
    Type: Grant
    Filed: December 14, 2021
    Date of Patent: February 28, 2023
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Jr., Simon C. Steely, Jr., Kent D. Glossop, Mitchell Diamond, Benjamin Keen, Dennis Bradford, Fabrizio Petrini, Barry Tannenbaum, Yongzhi Zhang
  • Patent number: 11526483
    Abstract: Methods, apparatus, systems and articles of manufacture to build a storage architecture for graph data are disclosed herein. Disclosed example apparatus include a neighbor identifier to identify respective sets of neighboring vertices of a graph. The neighboring vertices included in the respective sets are adjacent to respective ones of a plurality of vertices of the graph and respective sets of neighboring vertices are represented as respective lists of neighboring vertex identifiers. The apparatus also includes an element creator to create, in a cache memory, an array of elements that are unpopulated. The array elements have lengths equal to a length of a cache line. In addition, the apparatus includes an element populator to populate the elements with neighboring vertex identifiers. Each of the elements store neighboring vertex identifiers of respective ones of the list of neighboring vertex identifiers.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: December 13, 2022
    Assignee: Intel Corporation
    Inventors: Stijn Eyerman, Jason M. Howard, Ibrahim Hur, Ivan B. Ganev, Fabrizio Petrini, Joshua B. Fryman
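The layout this abstract describes amounts to splitting each vertex's neighbor list into fixed-size elements sized to a cache line, so a traversal touches whole lines. A sketch, assuming 64-byte cache lines and 8-byte vertex identifiers (both assumptions for illustration, not values from the patent):

```python
# Pack one vertex's neighbor-id list into cache-line-sized elements.
# Assumes 64-byte lines and 8-byte vertex ids, i.e. 8 ids per element.
IDS_PER_ELEMENT = 64 // 8  # cache-line bytes / bytes per vertex id

def pack_neighbors(neighbors):
    """Split a neighbor-id list into elements of at most IDS_PER_ELEMENT ids."""
    return [neighbors[i:i + IDS_PER_ELEMENT]
            for i in range(0, len(neighbors), IDS_PER_ELEMENT)]
```

A vertex with ten neighbors would occupy two elements: one full cache line of eight identifiers and a second, partially filled line.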
  • Patent number: 11360809
    Abstract: Embodiments of apparatuses, methods, and systems for scheduling tasks to hardware threads are described. In an embodiment, a processor includes multiple hardware threads and a task manager. The task manager is to issue a task to a hardware thread. The task manager includes a hardware task queue to store a descriptor for the task. The descriptor is to include a field to store a value to indicate whether the task is a single task, a collection of iterative tasks, or a linked list of tasks.
    Type: Grant
    Filed: June 29, 2018
    Date of Patent: June 14, 2022
    Assignee: Intel Corporation
    Inventors: William Paul Griffin, Joshua Fryman, Jason Howard, Sang Phill Park, Robert Pawlowski, Michael Abbott, Scott Cline, Samkit Jain, Ankit More, Vincent Cave, Fabrizio Petrini, Ivan Ganev
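The descriptor field described here distinguishes three task shapes. A hypothetical encoding (field names are illustrative, not the patented format) with a helper that enumerates the concrete tasks a descriptor represents:

```python
# Illustrative model of the three descriptor shapes named in the abstract:
# a single task, a collection of iterative tasks, or a linked list of tasks.
from dataclasses import dataclass
from typing import Optional

@dataclass
class TaskDescriptor:
    kind: str                       # "single" | "iterative" | "linked"
    entry: int                      # task entry point (e.g., a code address)
    iter_count: int = 1             # >1 only for a collection of iterative tasks
    next_desc: Optional["TaskDescriptor"] = None  # set only in a linked list

def expand(desc):
    """Enumerate (entry, iteration) pairs for every task a descriptor covers."""
    tasks = []
    while desc is not None:
        for i in range(desc.iter_count):
            tasks.append((desc.entry, i))
        desc = desc.next_desc
    return tasks
```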
  • Publication number: 20220107911
    Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
    Type: Application
    Filed: December 14, 2021
    Publication date: April 7, 2022
    Inventors: Kermin E. FLEMING, Simon C. STEELY, Kent D. GLOSSOP, Mitchell DIAMOND, Benjamin KEEN, Dennis BRADFORD, Fabrizio Petrini, Barry TANNENBAUM, Yongzhi ZHANG
  • Publication number: 20210406214
    Abstract: Methods and apparatus for in-network parallel prefix scan. In one aspect, a dual binary tree topology is embedded in a network to compute prefix scan calculations as data packets traverse the binary tree topology. The dual binary tree topology includes up and down aggregation trees. Input values for a prefix scan are provided at leaves of the up tree. Prefix scan operations such as sum, multiplication, max, etc. are performed at aggregation nodes within the up tree as packets containing associated data propagate from the leaves to the root of the up tree. Output from aggregation nodes in the up tree is provided as input to aggregation nodes in the down tree. In the down tree, the packets containing associated data propagate from the root to its leaves. Output values for the prefix scan are provided at the leaves of the down tree.
    Type: Application
    Filed: September 8, 2021
    Publication date: December 30, 2021
    Inventors: Fabrizio PETRINI, Kartik LAKHOTIA
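The up/down aggregation trees here follow the classic two-phase (up-sweep/down-sweep) parallel scan. A sequential sketch of that tree computation with an associative operation such as sum; the in-network packet mechanics are not modeled:

```python
# Two-phase tree scan (exclusive): an up-sweep aggregates partial results
# toward the root, then a down-sweep pushes prefixes back to the leaves --
# the same dataflow the abstract maps onto the up and down trees.
# Requires len(values) to be a power of two and op to be associative.
def tree_scan_exclusive(values, op, identity):
    a = list(values)
    n = len(a)
    d = 1
    while d < n:                       # up-sweep: leaves toward root
        for i in range(d * 2 - 1, n, d * 2):
            a[i] = op(a[i - d], a[i])
        d *= 2
    a[n - 1] = identity                # clear the root
    d = n // 2
    while d >= 1:                      # down-sweep: root toward leaves
        for i in range(d * 2 - 1, n, d * 2):
            left = a[i - d]
            a[i - d] = a[i]
            a[i] = op(left, a[i])
        d //= 2
    return a
```

With sum as the operation, the exclusive scan of [1, 2, 3, 4] is [0, 1, 3, 6]; using max with identity -inf instead yields a running maximum.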
  • Publication number: 20210409265
    Abstract: Examples described herein relate to a first group of core nodes to couple with a group of switch nodes and a second group of core nodes to couple with the group of switch nodes, wherein: a core node of the first or second group of core nodes includes circuitry to execute one or more message passing instructions that indicate a configuration of a network to transmit data toward two or more endpoint core nodes and a switch node of the group of switch nodes includes circuitry to execute one or more message passing instructions that indicate the configuration to transmit data toward the two or more endpoint core nodes.
    Type: Application
    Filed: September 13, 2021
    Publication date: December 30, 2021
    Inventors: Robert PAWLOWSKI, Vincent CAVE, Shruti SHARMA, Fabrizio PETRINI, Joshua B. FRYMAN, Ankit MORE
  • Patent number: 11200186
    Abstract: Systems, methods, and apparatuses relating to operations in a configurable spatial accelerator are described. In one embodiment, a configurable spatial accelerator includes a first processing element that includes a configuration register within the first processing element to store a configuration value that causes the first processing element to perform an operation according to the configuration value, a plurality of input queues, an input controller to control enqueue and dequeue of values into the plurality of input queues according to the configuration value, a plurality of output queues, and an output controller to control enqueue and dequeue of values into the plurality of output queues according to the configuration value.
    Type: Grant
    Filed: June 30, 2018
    Date of Patent: December 14, 2021
    Assignee: Intel Corporation
    Inventors: Kermin E. Fleming, Jr., Simon C. Steely, Jr., Kent D. Glossop, Mitchell Diamond, Benjamin Keen, Dennis Bradford, Fabrizio Petrini, Barry Tannenbaum, Yongzhi Zhang
  • Publication number: 20210149683
    Abstract: Examples include techniques for an in-network acceleration of a parallel prefix-scan operation. Examples include configuring registers of a node included in a plurality of nodes on a same semiconductor package. The registers are to be configured responsive to receiving an instruction that indicates a logical tree to map to a network topology that includes the node. The instruction is associated with a prefix-scan operation to be executed by at least a portion of the plurality of nodes.
    Type: Application
    Filed: December 21, 2020
    Publication date: May 20, 2021
    Inventors: Ankit MORE, Fabrizio PETRINI, Robert PAWLOWSKI, Shruti SHARMA, Sowmya PITCHAIMOORTHY
  • Patent number: 10983793
    Abstract: The present disclosure is directed to systems and methods of performing one or more broadcast or reduction operations using direct memory access (DMA) control circuitry. The DMA control circuitry executes a modified instruction set architecture (ISA) that facilitates the broadcast distribution of data to a plurality of destination addresses in system memory circuitry. The broadcast instruction may include broadcast of a single data value to each destination address. The broadcast instruction may include broadcast of a data array to each destination address. The DMA control circuitry may also execute a reduction instruction that facilitates the retrieval of data from a plurality of source addresses in system memory and performing one or more operations using the retrieved data. Since the DMA control circuitry, rather than the processor circuitry, performs the broadcast and reduction operations, system speed and efficiency are beneficially enhanced.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: April 20, 2021
    Assignee: Intel Corporation
    Inventors: Joshua Fryman, Ankit More, Jason Howard, Robert Pawlowski, Yigit Demir, Nick Pepperling, Fabrizio Petrini, Sriram Aananthakrishnan, Shaden Smith
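Functionally, the two ISA extensions behave like the sketch below, which models system memory as a dict; the function names and signatures are illustrative, not the patented ISA:

```python
# Hypothetical model of the broadcast and reduction instructions: broadcast
# copies one value to every destination address, reduction gathers values
# from source addresses and combines them with an operation.
def dma_broadcast(memory, value, dest_addrs):
    for addr in dest_addrs:
        memory[addr] = value           # same value written to each destination

def dma_reduce(memory, src_addrs, op, dest_addr):
    acc = memory[src_addrs[0]]
    for addr in src_addrs[1:]:
        acc = op(acc, memory[addr])    # combine values gathered from sources
    memory[dest_addr] = acc            # result stored at the destination
    return acc
```

In both cases the processor only issues the instruction; the address-by-address work modeled in the loops is what the DMA control circuitry performs.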
  • Publication number: 20200310795
    Abstract: The present disclosure is directed to systems and methods of performing one or more broadcast or reduction operations using direct memory access (DMA) control circuitry. The DMA control circuitry executes a modified instruction set architecture (ISA) that facilitates the broadcast distribution of data to a plurality of destination addresses in system memory circuitry. The broadcast instruction may include broadcast of a single data value to each destination address. The broadcast instruction may include broadcast of a data array to each destination address. The DMA control circuitry may also execute a reduction instruction that facilitates the retrieval of data from a plurality of source addresses in system memory and performing one or more operations using the retrieved data. Since the DMA control circuitry, rather than the processor circuitry, performs the broadcast and reduction operations, system speed and efficiency are beneficially enhanced.
    Type: Application
    Filed: March 29, 2019
    Publication date: October 1, 2020
    Applicant: INTEL CORPORATION
    Inventors: Joshua Fryman, Ankit More, Jason Howard, Robert Pawlowski, Yigit Demir, Nick Pepperling, Fabrizio Petrini, Sriram Aananthakrishnan, Shaden Smith
  • Publication number: 20200004587
    Abstract: Embodiments of apparatuses, methods, and systems for a multithreaded processor core with hardware-assisted task scheduling are described. In an embodiment, a processor includes a first hardware thread, a second hardware thread, and a task manager. The task manager is to issue a task to the first hardware thread. The task manager includes a hardware task queue in which to store a plurality of task descriptors. Each of the task descriptors is to represent one of a single task, a collection of iterative tasks, or a linked list of tasks.
    Type: Application
    Filed: June 29, 2018
    Publication date: January 2, 2020
    Inventors: Paul Griffin, Joshua Fryman, Jason Howard, Sang Phill Park, Robert Pawlowski, Michael Abbott, Scott Cline, Samkit Jain, Ankit More, Vincent Cave, Fabrizio Petrini, Ivan Ganev
  • Patent number: 10484125
    Abstract: A communication system includes first and second devices. The first device includes a first transmitter and a first receiver. The first transmitter transmits one data of a first type using one or more first channels over a first communication link to the second device. The first receiver receives one data of a second type, from the second device, using one or more second channels over the first communication link. The second device includes a second transmitter and a second receiver. The second receiver receives the one data of the first type using the one or more first channels over the first communication link, and to generate the one data of the second type based on the one data of the first type. The second transmitter transmits the one data of the second type using one or more second channels over the first communication link to the first device.
    Type: Grant
    Filed: November 2, 2018
    Date of Patent: November 19, 2019
    Assignee: International Business Machines Corporation
    Inventors: Alan F. Benner, Douglas M. Freimuth, Benjamin G. Lee, Fabrizio Petrini, Laurent Schares, Clint L. Schow, Mehmet Soyuer
  • Patent number: 10476492
    Abstract: Embodiments herein may present an integrated circuit including a switch, where the switch together with other switches forms a network of switches to perform a sequence of operations according to a structure of a collective tree. The switch includes a first number of input ports, a second number of output ports, a configurable crossbar to selectively couple the first number of input ports to the second number of output ports, and a computation engine coupled to the first number of input ports, the second number of output ports, and the crossbar. The computation engine of the switch performs an operation corresponding to an operation represented by a node of the collective tree. The switch further includes one or more registers to selectively configure the first number of input ports and the configurable crossbar. Other embodiments may be described and/or claimed.
    Type: Grant
    Filed: November 27, 2018
    Date of Patent: November 12, 2019
    Assignee: Intel Corporation
    Inventors: Ankit More, Jason M. Howard, Robert Pawlowski, Fabrizio Petrini, Shaden Smith