Patents by Inventor Felix A. Marti

Felix A. Marti has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

DATA PROCESSING UNIT FOR STREAM PROCESSING

Publication number: 20240061713

Abstract: A new processing architecture is described that utilizes a data processing unit (DPU). Unlike conventional compute models that are centered around a central processing unit (CPU), the DPU is designed for a data-centric computing model in which the data processing tasks are centered around the DPU. The DPU may be viewed as a highly programmable, high-performance I/O and data-processing hub designed to aggregate and process network and storage I/O to and from other devices. The DPU comprises a network interface to connect to a network, one or more host interfaces to connect to one or more application processors or storage devices, and a multi-core processor with two or more processing cores executing a run-to-completion data plane operating system and one or more processing cores executing a multi-tasking control plane operating system. The data plane operating system is configured to support software functions for performing the data processing tasks.

Type: Application

Filed: November 1, 2023

Publication date: February 22, 2024

Applicant: Microsoft Technology Licensing, LLC

Inventors: Pradeep Sindhu, Jean-Marc Frailong, Wael Noureddine, Felix A. Marti, Deepak Goel, Rajan Goyal, Bertrand Serlet

Data processing unit for stream processing

Patent number: 11842216

Abstract: A new processing architecture is described that utilizes a data processing unit (DPU). Unlike conventional compute models that are centered around a central processing unit (CPU), the DPU is designed for a data-centric computing model in which the data processing tasks are centered around the DPU. The DPU may be viewed as a highly programmable, high-performance I/O and data-processing hub designed to aggregate and process network and storage I/O to and from other devices. The DPU comprises a network interface to connect to a network, one or more host interfaces to connect to one or more application processors or storage devices, and a multi-core processor with two or more processing cores executing a run-to-completion data plane operating system and one or more processing cores executing a multi-tasking control plane operating system. The data plane operating system is configured to support software functions for performing the data processing tasks.

Type: Grant

Filed: July 27, 2020

Date of Patent: December 12, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Pradeep Sindhu, Jean-Marc Frailong, Wael Noureddine, Felix A. Marti, Deepak Goel, Rajan Goyal, Bertrand Serlet

Efficient work unit processing in a multicore system

Patent number: 11829295

Abstract: Techniques are described in which a system having multiple processing units processes a series of work units in a processing pipeline, where some or all of the work units access or manipulate data stored in non-coherent memory. In one example, this disclosure describes a method that includes identifying, prior to completing processing of a first work unit with a processing unit of a processor having multiple processing units, a second work unit that is expected to be processed by the processing unit after the first work unit. The method also includes processing the first work unit, and prefetching, from non-coherent memory, data associated with the second work unit into a second cache segment of the buffer cache, wherein prefetching the data associated with the second work unit occurs concurrently with at least a portion of the processing of the first work unit by the processing unit.

Type: Grant

Filed: February 27, 2023

Date of Patent: November 28, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Wael Noureddine, Jean-Marc Frailong, Felix A. Marti, Charles Edward Gray, Paul Kim
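The overlap of processing and prefetching described in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the dictionary standing in for non-coherent memory, the two-segment list standing in for the buffer cache, and the names `prefetch`, `process`, and `run_pipeline` are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for non-coherent memory: work unit id -> payload.
NON_COHERENT_MEMORY = {0: [1, 2, 3], 1: [4, 5], 2: [6]}


def prefetch(wu_id, cache, segment):
    """Copy one work unit's data into the given cache segment."""
    cache[segment] = list(NON_COHERENT_MEMORY[wu_id])


def process(data):
    """Toy processing step: sum the payload."""
    return sum(data)


def run_pipeline(wu_ids):
    """Process work units in order, prefetching the next one concurrently."""
    cache = [None, None]  # assumed two-segment buffer cache
    results = []
    if not wu_ids:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        active = 0
        # The first work unit has nothing to overlap with, so load it directly.
        prefetch(wu_ids[0], cache, active)
        for i, wu_id in enumerate(wu_ids):
            pending = None
            if i + 1 < len(wu_ids):
                # Identify the next work unit before finishing the current one
                # and start loading its data into the inactive segment.
                pending = pool.submit(prefetch, wu_ids[i + 1], cache, 1 - active)
            results.append(process(cache[active]))
            if pending is not None:
                pending.result()  # make sure the prefetch has landed
            active = 1 - active  # swap the roles of the two segments
    return results


print(run_pipeline([0, 1, 2]))
```

Alternating between two segments means the data being processed is never the data being overwritten, which is what allows the load of the second work unit to run while the first is still in progress.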
Data processing unit for compute nodes and storage nodes

Patent number: 11824683

Abstract: A new processing architecture is described in which a data processing unit (DPU) is utilized within a device. Unlike conventional compute models that are centered around a central processing unit (CPU), example implementations described herein leverage a DPU that is specially designed and optimized for a data-centric computing model in which the data processing tasks are centered around, and the primary responsibility of, the DPU. For example, various data processing tasks, such as networking, security, and storage, as well as related work acceleration, distribution and scheduling, and other such tasks are the domain of the DPU. The DPU may be viewed as a highly programmable, high-performance input/output (I/O) and data-processing hub designed to aggregate and process network and storage I/O to and from multiple other components and/or devices. This frees resources of the CPU, if present, for computing-intensive tasks.

Type: Grant

Filed: March 29, 2022

Date of Patent: November 21, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Pradeep Sindhu, Jean-Marc Frailong, Bertrand Serlet, Wael Noureddine, Felix A. Marti, Deepak Goel, Rajan Goyal

Efficient work unit processing in a multicore system

Patent number: 11734179

Abstract: Techniques are described in which a system having multiple processing units processes a series of work units in a processing pipeline, where some or all of the work units access or manipulate data stored in non-coherent memory. In one example, this disclosure describes a method that includes identifying, prior to completing processing of a first work unit with a processing unit of a processor having multiple processing units, a second work unit that is expected to be processed by the processing unit after the first work unit. The method also includes processing the first work unit, and prefetching, from non-coherent memory, data associated with the second work unit into a second cache segment of the buffer cache, wherein prefetching the data associated with the second work unit occurs concurrently with at least a portion of the processing of the first work unit by the processing unit.

Type: Grant

Filed: June 28, 2021

Date of Patent: August 22, 2023

Assignee: Fungible, Inc.

Inventors: Wael Noureddine, Jean-Marc Frailong, Felix A. Marti, Charles Edward Gray, Paul Kim

SCALED-OUT TRANSPORT AS CONNECTION PROXY FOR DEVICE-TO-DEVICE COMMUNICATIONS

Publication number: 20230224247

Abstract: Techniques are described for providing a scaled-out transport supported by interconnected data processing units (DPUs) that operates as a single system bus connection proxy for device-to-device communications within a data center. As one example, this disclosure describes techniques for providing a Peripheral Component Interconnect Express (PCIe) proxy for device-to-device communications employing the PCIe standard. The disclosed techniques include adding PCIe proxy logic on top of a host unit of a DPU to expose a PCIe proxy model to application processors, storage devices, network interface controllers, field programmable gate arrays, or other PCIe endpoint devices. The PCIe proxy model may be implemented as a physically distributed Ethernet-based switch fabric with PCIe proxy logic at the edge and fronting the PCIe endpoint devices. The interconnected DPUs and the distributed Ethernet-based switch fabric together provide a reliable, low-latency, and scaled-out transport that operates as a PCIe proxy.

Type: Application

Filed: March 23, 2023

Publication date: July 13, 2023

Inventors: Wael Noureddine, Felix A. Marti, Aibing Zhou, Dmitriy Leonidovich Budko, Gaurav Gupte, Hoai Vu Thanh Tran, Aravind Vidhyasagar Lappasi, Leith Alan Leedom, Rajesh G. Nair

EFFICIENT WORK UNIT PROCESSING IN A MULTICORE SYSTEM

Publication number: 20230205702

Abstract: Techniques are described in which a system having multiple processing units processes a series of work units in a processing pipeline, where some or all of the work units access or manipulate data stored in non-coherent memory. In one example, this disclosure describes a method that includes identifying, prior to completing processing of a first work unit with a processing unit of a processor having multiple processing units, a second work unit that is expected to be processed by the processing unit after the first work unit. The method also includes processing the first work unit, and prefetching, from non-coherent memory, data associated with the second work unit into a second cache segment of the buffer cache, wherein prefetching the data associated with the second work unit occurs concurrently with at least a portion of the processing of the first work unit by the processing unit.

Type: Application

Filed: February 27, 2023

Publication date: June 29, 2023

Inventors: Wael Noureddine, Jean-Marc Frailong, Felix A. Marti, Charles Edward Gray, Paul Kim

Scaled-out transport as connection proxy for device-to-device communications

Patent number: 11637773

Abstract: Techniques are described for providing a scaled-out transport supported by interconnected data processing units (DPUs) that operates as a single system bus connection proxy for device-to-device communications within a data center. As one example, this disclosure describes techniques for providing a Peripheral Component Interconnect Express (PCIe) proxy for device-to-device communications employing the PCIe standard. The disclosed techniques include adding PCIe proxy logic on top of a host unit of a DPU to expose a PCIe proxy model to application processors, storage devices, network interface controllers, field programmable gate arrays, or other PCIe endpoint devices. The PCIe proxy model may be implemented as a physically distributed Ethernet-based switch fabric with PCIe proxy logic at the edge and fronting the PCIe endpoint devices. The interconnected DPUs and the distributed Ethernet-based switch fabric together provide a reliable, low-latency, and scaled-out transport that operates as a PCIe proxy.

Type: Grant

Filed: February 9, 2021

Date of Patent: April 25, 2023

Assignee: Fungible, Inc.

Inventors: Wael Noureddine, Felix A. Marti, Aibing Zhou, Dmitriy Leonidovich Budko, Gaurav Gupte, Hoai Vu Thanh Tran, Aravind Vidhyasagar Lappasi, Leith Alan Leedom, Rajesh G. Nair

Access node for data centers

Patent number: 11546189

Abstract: An access node is described that can be configured and optimized to perform input and output (I/O) tasks, such as storage and retrieval of data to and from network devices (such as solid state drives), networking, data processing, and the like. For example, the access node may be configured to receive data to be processed, wherein the access node includes a plurality of processing cores, a data network fabric, and a control network fabric; receive, over the control network fabric, a work unit message indicating a processing task to be performed by a processing core; and process the work unit message, wherein processing the work unit message includes retrieving data associated with the work unit message over the data network fabric.

Type: Grant

Filed: May 18, 2020

Date of Patent: January 3, 2023

Assignee: Fungible, Inc.

Inventors: Pradeep Sindhu, Jean-Marc Frailong, Bertrand Serlet, Wael Noureddine, Felix A. Marti, Deepak Goel, Paul Kim, Rajan Goyal, Aibing Zhou

DATA PROCESSING UNIT FOR COMPUTE NODES AND STORAGE NODES

Publication number: 20220224564

Abstract: A new processing architecture is described in which a data processing unit (DPU) is utilized within a device. Unlike conventional compute models that are centered around a central processing unit (CPU), example implementations described herein leverage a DPU that is specially designed and optimized for a data-centric computing model in which the data processing tasks are centered around, and the primary responsibility of, the DPU. For example, various data processing tasks, such as networking, security, and storage, as well as related work acceleration, distribution and scheduling, and other such tasks are the domain of the DPU. The DPU may be viewed as a highly programmable, high-performance input/output (I/O) and data-processing hub designed to aggregate and process network and storage I/O to and from multiple other components and/or devices. This frees resources of the CPU, if present, for computing-intensive tasks.

Type: Application

Filed: March 29, 2022

Publication date: July 14, 2022

Inventors: Pradeep Sindhu, Jean-Marc Frailong, Bertrand Serlet, Wael Noureddine, Felix A. Marti, Deepak Goel, Rajan Goyal

Data processing unit for compute nodes and storage nodes

Patent number: 11303472

Abstract: A new processing architecture is described in which a data processing unit (DPU) is utilized within a device. Unlike conventional compute models that are centered around a central processing unit (CPU), example implementations described herein leverage a DPU that is specially designed and optimized for a data-centric computing model in which the data processing tasks are centered around, and the primary responsibility of, the DPU. For example, various data processing tasks, such as networking, security, and storage, as well as related work acceleration, distribution and scheduling, and other such tasks are the domain of the DPU. The DPU may be viewed as a highly programmable, high-performance input/output (I/O) and data-processing hub designed to aggregate and process network and storage I/O to and from multiple other components and/or devices. This frees resources of the CPU, if present, for computing-intensive tasks.

Type: Grant

Filed: July 10, 2018

Date of Patent: April 12, 2022

Assignee: Fungible, Inc.

Inventors: Pradeep Sindhu, Jean-Marc Frailong, Bertrand Serlet, Wael Noureddine, Felix A. Marti, Deepak Goel, Rajan Goyal

EFFICIENT WORK UNIT PROCESSING IN A MULTICORE SYSTEM

Publication number: 20210349824

Abstract: Techniques are described in which a system having multiple processing units processes a series of work units in a processing pipeline, where some or all of the work units access or manipulate data stored in non-coherent memory. In one example, this disclosure describes a method that includes identifying, prior to completing processing of a first work unit with a processing unit of a processor having multiple processing units, a second work unit that is expected to be processed by the processing unit after the first work unit. The method also includes processing the first work unit, and prefetching, from non-coherent memory, data associated with the second work unit into a second cache segment of the buffer cache, wherein prefetching the data associated with the second work unit occurs concurrently with at least a portion of the processing of the first work unit by the processing unit.

Type: Application

Filed: June 28, 2021

Publication date: November 11, 2021

Inventors: Wael Noureddine, Jean-Marc Frailong, Felix A. Marti, Charles Edward Gray, Paul Kim

SCALED-OUT TRANSPORT AS CONNECTION PROXY FOR DEVICE-TO-DEVICE COMMUNICATIONS

Publication number: 20210250285

Abstract: Techniques are described for providing a scaled-out transport supported by interconnected data processing units (DPUs) that operates as a single system bus connection proxy for device-to-device communications within a data center. As one example, this disclosure describes techniques for providing a Peripheral Component Interconnect Express (PCIe) proxy for device-to-device communications employing the PCIe standard. The disclosed techniques include adding PCIe proxy logic on top of a host unit of a DPU to expose a PCIe proxy model to application processors, storage devices, network interface controllers, field programmable gate arrays, or other PCIe endpoint devices. The PCIe proxy model may be implemented as a physically distributed Ethernet-based switch fabric with PCIe proxy logic at the edge and fronting the PCIe endpoint devices. The interconnected DPUs and the distributed Ethernet-based switch fabric together provide a reliable, low-latency, and scaled-out transport that operates as a PCIe proxy.

Type: Application

Filed: February 9, 2021

Publication date: August 12, 2021

Inventors: Wael Noureddine, Felix A. Marti, Aibing Zhou, Dmitriy Leonidovich Budko, Gaurav Gupte, Hoai Vu Thanh Tran, Aravind Vidhyasagar Lappasi, Leith Alan Leedom, Rajesh G. Nair

Efficient work unit processing in a multicore system

Patent number: 11048634

Abstract: Techniques are described in which a system having multiple processing units processes a series of work units in a processing pipeline, where some or all of the work units access or manipulate data stored in non-coherent memory. In one example, this disclosure describes a method that includes identifying, prior to completing processing of a first work unit with a processing unit of a processor having multiple processing units, a second work unit that is expected to be processed by the processing unit after the first work unit. The method also includes processing the first work unit, and prefetching, from non-coherent memory, data associated with the second work unit into a second cache segment of the buffer cache, wherein prefetching the data associated with the second work unit occurs concurrently with at least a portion of the processing of the first work unit by the processing unit.

Type: Grant

Filed: January 17, 2020

Date of Patent: June 29, 2021

Assignee: Fungible, Inc.

Inventors: Wael Noureddine, Jean-Marc Frailong, Felix A. Marti, Charles Edward Gray, Paul Kim

Work unit stack data structures in multiple core processor system for stream data processing

Patent number: 10841245

Abstract: Techniques are described in which a device, such as a network device, compute node or storage device, is configured to utilize a work unit (WU) stack data structure in a multiple core processor system to help manage an event driven, run-to-completion programming model of an operating system executed by the multiple core processor system. The techniques may be particularly useful when processing streams of data at high rates. The WU stack may be viewed as a stack of continuation work units used to supplement a typical program stack as an efficient means of moving the program stack between cores. The work unit data structure itself is a building block in the WU stack to compose a processing pipeline and services execution. The WU stack structure carries state, memory, and other information in auxiliary variables.

Type: Grant

Filed: November 20, 2018

Date of Patent: November 17, 2020

Assignee: Fungible, Inc.

Inventors: Charles Edward Gray, Bertrand Serlet, Felix A. Marti, Wael Noureddine, Pratapa Reddy Vaka
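The idea of a WU stack as a stack of continuations, as described in the abstract above, can be sketched roughly as follows. This is an illustrative toy model under stated assumptions, not the patented design: the `Frame` and `WorkUnit` classes, the per-core queues, the round-robin hand-off, and the handlers `double` and `add_offset` are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Frame:
    """One continuation: the handler to run next plus its auxiliary variables."""
    handler: Callable
    aux: dict


@dataclass
class WorkUnit:
    """A unit of work: a handler, its auxiliary state, the WU stack, and a value."""
    handler: Callable
    aux: dict
    stack: List[Frame]
    value: int


def double(value, aux):
    return value * 2


def add_offset(value, aux):
    return value + aux["offset"]


def run(initial, num_cores=2):
    """Run handlers to completion, passing each continuation to another core."""
    queues = [deque() for _ in range(num_cores)]
    queues[0].append(initial)
    trace = []
    result = None
    while any(queues):
        for core, queue in enumerate(queues):
            if not queue:
                continue
            wu = queue.popleft()
            # Each handler runs to completion; nothing preempts it.
            value = wu.handler(wu.value, wu.aux)
            trace.append((core, wu.handler.__name__))
            if wu.stack:
                # Pop the continuation and hand it to a different core. The
                # state travels with the work unit, not with a program stack.
                frame = wu.stack.pop()
                target = (core + 1) % num_cores
                queues[target].append(
                    WorkUnit(frame.handler, frame.aux, wu.stack, value)
                )
            else:
                result = value
    return result, trace


stack = [Frame(add_offset, {"offset": 3})]
print(run(WorkUnit(double, {}, stack, 5)))
```

Because each frame carries its own handler and auxiliary variables, a continuation can resume on any core without copying a conventional call stack.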
DATA PROCESSING UNIT FOR STREAM PROCESSING

Publication number: 20200356414

Abstract: A new processing architecture is described that utilizes a data processing unit (DPU). Unlike conventional compute models that are centered around a central processing unit (CPU), the DPU is designed for a data-centric computing model in which the data processing tasks are centered around the DPU. The DPU may be viewed as a highly programmable, high-performance I/O and data-processing hub designed to aggregate and process network and storage I/O to and from other devices. The DPU comprises a network interface to connect to a network, one or more host interfaces to connect to one or more application processors or storage devices, and a multi-core processor with two or more processing cores executing a run-to-completion data plane operating system and one or more processing cores executing a multi-tasking control plane operating system. The data plane operating system is configured to support software functions for performing the data processing tasks.

Type: Application

Filed: July 27, 2020

Publication date: November 12, 2020

Inventors: Pradeep Sindhu, Jean-Marc Frailong, Wael Noureddine, Felix A. Marti, Deepak Goel, Rajan Goyal, Bertrand Serlet

ACCESS NODE FOR DATA CENTERS

Publication number: 20200280462

Abstract: An access node is described that can be configured and optimized to perform input and output (I/O) tasks, such as storage and retrieval of data to and from network devices (such as solid state drives), networking, data processing, and the like. For example, the access node may be configured to receive data to be processed, wherein the access node includes a plurality of processing cores, a data network fabric, and a control network fabric; receive, over the control network fabric, a work unit message indicating a processing task to be performed by a processing core; and process the work unit message, wherein processing the work unit message includes retrieving data associated with the work unit message over the data network fabric.

Type: Application

Filed: May 18, 2020

Publication date: September 3, 2020

Inventors: Pradeep Sindhu, Jean-Marc Frailong, Bertrand Serlet, Wael Noureddine, Felix A. Marti, Deepak Goel, Paul Kim, Rajan Goyal, Aibing Zhou

Data processing unit for stream processing

Patent number: 10725825

Abstract: A new processing architecture is described that utilizes a data processing unit (DPU). Unlike conventional compute models that are centered around a central processing unit (CPU), the DPU is designed for a data-centric computing model in which the data processing tasks are centered around the DPU. The DPU may be viewed as a highly programmable, high-performance I/O and data-processing hub designed to aggregate and process network and storage I/O to and from other devices. The DPU comprises a network interface to connect to a network, one or more host interfaces to connect to one or more application processors or storage devices, and a multi-core processor with two or more processing cores executing a run-to-completion data plane operating system and one or more processing cores executing a multi-tasking control plane operating system. The data plane operating system is configured to support software functions for performing the data processing tasks.

Type: Grant

Filed: July 10, 2018

Date of Patent: July 28, 2020

Assignee: Fungible, Inc.

Inventors: Pradeep Sindhu, Jean-Marc Frailong, Wael Noureddine, Felix A. Marti, Deepak Goel, Rajan Goyal, Bertrand Serlet

Access node integrated circuit for data centers which includes a networking unit, a plurality of host units, processing clusters, a data network fabric, and a control network fabric

Patent number: 10659254

Abstract: A highly-programmable access node is described that can be configured and optimized to perform input and output (I/O) tasks, such as storage and retrieval of data to and from storage devices (such as solid state drives), networking, data processing, and the like. For example, the access node may be configured to execute a large number of data I/O processing tasks relative to a number of instructions that are processed. The access node may be highly programmable such that the access node may expose hardware primitives for selecting and programmatically configuring data processing operations. As one example, the access node may be used to provide high-speed connectivity and I/O operations between and on behalf of computing devices and storage components of a network, such as for providing interconnectivity between those devices and a switch fabric of a data center.

Type: Grant

Filed: July 10, 2018

Date of Patent: May 19, 2020

Assignee: Fungible, Inc.

Inventors: Pradeep Sindhu, Jean-Marc Frailong, Bertrand Serlet, Wael Noureddine, Felix A. Marti, Deepak Goel, Paul Kim, Rajan Goyal, Aibing Zhou

EFFICIENT WORK UNIT PROCESSING IN A MULTICORE SYSTEM

Publication number: 20200151101

Abstract: Techniques are described in which a system having multiple processing units processes a series of work units in a processing pipeline, where some or all of the work units access or manipulate data stored in non-coherent memory. In one example, this disclosure describes a method that includes identifying, prior to completing processing of a first work unit with a processing unit of a processor having multiple processing units, a second work unit that is expected to be processed by the processing unit after the first work unit. The method also includes processing the first work unit, and prefetching, from non-coherent memory, data associated with the second work unit into a second cache segment of the buffer cache, wherein prefetching the data associated with the second work unit occurs concurrently with at least a portion of the processing of the first work unit by the processing unit.

Type: Application

Filed: January 17, 2020

Publication date: May 14, 2020

Inventors: Wael Noureddine, Jean-Marc Frailong, Felix A. Marti, Charles Edward Gray, Paul Kim
