Patents by Inventor Nathan L. Wichmann
Nathan L. Wichmann has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20250284415
Abstract: A network interface card (NIC) receives a stream of commands, a respective command comprising memory-operation requests, each request associated with a destination NIC. The NIC buffers asynchronously the requests into queues based on the destination NIC, each queue specific to a corresponding destination NIC. When first queue requests reach a threshold, the NIC aggregates the first queue requests into a first packet and sends the first packet to the destination NIC. The NIC receives a plurality of packets, a second packet comprising memory-operation requests, each request associated with a same destination NIC and a destination core. The NIC buffers asynchronously the requests of the second packet into queues based on the destination core, each queue specific to a corresponding destination core. When second queue requests reach the threshold, the NIC aggregates the second queue requests into a third packet and sends the third packet to the destination core.
Type: Application
Filed: May 27, 2025
Publication date: September 11, 2025
Inventors: Duncan Roweth, Robert L. Alverson, Nathan L. Wichmann, Eric P. Lundberg
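The two-stage buffering described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the `Aggregator` class, the `demo` function, the string payloads, and the threshold of 2 are all hypothetical names and values chosen for illustration.

```python
from collections import defaultdict


class Aggregator:
    """Hypothetical model: one queue per destination key, flushed at a threshold."""

    def __init__(self, threshold, send):
        self.threshold = threshold      # number of queued requests that triggers a send
        self.send = send                # callback(key, batch) standing in for the wire
        self.queues = defaultdict(list)

    def submit(self, key, request):
        queue = self.queues[key]
        queue.append(request)
        if len(queue) >= self.threshold:
            # Aggregate the queued requests into one packet and send it.
            self.send(key, tuple(queue))
            queue.clear()


def demo():
    delivered = []

    # Destination side: re-bucket incoming requests by destination core.
    core_stage = Aggregator(2, lambda core, batch: delivered.append((core, batch)))

    def on_packet(dest_nic, packet):
        # Each request in the packet carries its destination core.
        for core, payload in packet:
            core_stage.submit(core, payload)

    # Source side: bucket outgoing requests by destination NIC.
    nic_stage = Aggregator(2, on_packet)

    requests = [
        ("nicB", 0, "put a"),
        ("nicB", 1, "put b"),
        ("nicB", 0, "put c"),
        ("nicB", 1, "put d"),
    ]
    for dest_nic, core, payload in requests:
        nic_stage.submit(dest_nic, (core, payload))
    return delivered
```

In this model the demo uses a single destination NIC, so one core-level aggregator is enough; a fuller model would keep one per destination NIC. Requests left below the threshold simply stay queued, since the abstract does not say how partial queues are drained.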
-
Patent number: 12346274
Abstract: An apparatus is provided that includes a network interface to transmit and receive data packets over a network; a memory including one or more buffers; an arithmetic logic unit to perform arithmetic operations for organizing and combining the data packets; and a circuitry to receive, via the network interface, data packets from the network; aggregate, via the arithmetic logic unit, the received data packets in the one or more buffers at a network rate; and transmit, via the network interface, the aggregated data packets to one or more compute nodes in the network, thereby optimizing latency incurred in combining the received data packets and transmitting the aggregated data packets, and hence accelerating a bulk data allreduce operation. One embodiment provides a system and method for performing the allreduce operation. During operation, the system performs the allreduce operation by pacing network operations for enhancing performance of the allreduce operation.
Type: Grant
Filed: July 17, 2023
Date of Patent: July 1, 2025
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Keith D. Underwood, Robert L. Alverson, Duncan Roweth, Nathan L. Wichmann
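For readers unfamiliar with the term, the result an allreduce must produce can be shown in a few lines. This is a minimal sketch of the operation's semantics only, assuming an element-wise sum as the combining operation; the function name `allreduce_sum` is hypothetical, and the sketch does not model the apparatus, its network-rate aggregation, or the pacing mentioned in the abstract.

```python
def allreduce_sum(contributions):
    """Combine equal-length vectors element-wise and give every node the result.

    `contributions` holds one vector per compute node (an assumption of this
    sketch); the running `buffer` stands in for the aggregation buffer.
    """
    buffer = [0] * len(contributions[0])
    for packet in contributions:
        for i, value in enumerate(packet):
            buffer[i] += value          # combine the incoming data into the buffer
    # Every node receives its own copy of the fully reduced vector.
    return [list(buffer) for _ in contributions]
```

For example, three nodes contributing `[1, 2]`, `[10, 20]`, and `[100, 200]` would each end up holding `[111, 222]`.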
-
Publication number: 20250181268
Abstract: A system for performing a broadcast operation on a first process in a plurality of processes is provided. During operation, the system can select, from the plurality of processes, a subset of processes based on a plurality of selection conditions. The system can initiate a broadcast operation for the subset of processes and identify a source buffer of a root process storing data to be distributed by the broadcast operation. The system can then determine a first segment of the data for which the first process is responsible for broadcasting based on a number of processes in the subset of processes. The system can obtain the first segment from the source buffer based on remote memory access and store the first segment in a first destination buffer of the first process. The system can send the first segment to respective destination buffers of other processes in the subset of processes.
Type: Application
Filed: November 30, 2023
Publication date: June 5, 2025
Inventors: Naveen Namashivayam Ravichandrasekaran, Nathan L. Wichmann
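The segment-per-process idea in this abstract can be modelled in plain Python. This is a sketch under assumptions: the abstract says only that the segment depends on the number of processes in the subset, so the near-even split used here, and the names `segment_bounds` and `simulate_broadcast`, are hypothetical. List slicing stands in for remote memory access.

```python
def segment_bounds(total, nprocs, rank):
    """Return (start, end) of the slice this rank is responsible for.

    Assumes a near-even split: the first `total % nprocs` ranks get one
    extra element each.
    """
    base, extra = divmod(total, nprocs)
    start = rank * base + min(rank, extra)
    length = base + (1 if rank < extra else 0)
    return start, start + length


def simulate_broadcast(source, nprocs):
    """Model every process fetching its segment and forwarding it to the others."""
    dest = [[None] * len(source) for _ in range(nprocs)]
    for rank in range(nprocs):
        start, end = segment_bounds(len(source), nprocs, rank)
        segment = source[start:end]          # stands in for a remote get from the root
        dest[rank][start:end] = segment      # store in this process's own buffer
        for other in range(nprocs):
            if other != rank:
                dest[other][start:end] = segment   # send to the other processes
    return dest
```

With 10 elements and 3 processes, the ranks would cover the half-open ranges (0, 4), (4, 7), and (7, 10), and every destination buffer ends up with the full source data.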
-
Patent number: 12321619
Abstract: A network interface card (NIC) receives a stream of commands, a respective command comprising memory-operation requests, each request associated with a destination NIC. The NIC buffers asynchronously the requests into queues based on the destination NIC, each queue specific to a corresponding destination NIC. When first queue requests reach a threshold, the NIC aggregates the first queue requests into a first packet and sends the first packet to the destination NIC. The NIC receives a plurality of packets, a second packet comprising memory-operation requests, each request associated with a same destination NIC and a destination core. The NIC buffers asynchronously the requests of the second packet into queues based on the destination core, each queue specific to a corresponding destination core. When second queue requests reach the threshold, the NIC aggregates the second queue requests into a third packet and sends the third packet to the destination core.
Type: Grant
Filed: October 28, 2022
Date of Patent: June 3, 2025
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Duncan Roweth, Robert L. Alverson, Nathan L. Wichmann, Eric P. Lundberg
-
Publication number: 20240143198
Abstract: A network interface card (NIC) receives a stream of commands, a respective command comprising memory-operation requests, each request associated with a destination NIC. The NIC buffers asynchronously the requests into queues based on the destination NIC, each queue specific to a corresponding destination NIC. When first queue requests reach a threshold, the NIC aggregates the first queue requests into a first packet and sends the first packet to the destination NIC. The NIC receives a plurality of packets, a second packet comprising memory-operation requests, each request associated with a same destination NIC and a destination core. The NIC buffers asynchronously the requests of the second packet into queues based on the destination core, each queue specific to a corresponding destination core. When second queue requests reach the threshold, the NIC aggregates the second queue requests into a third packet and sends the third packet to the destination core.
Type: Application
Filed: October 28, 2022
Publication date: May 2, 2024
Inventors: Duncan Roweth, Robert L. Alverson, Nathan L. Wichmann, Eric P. Lundberg
-
Publication number: 20230359574
Abstract: An apparatus is provided that includes a network interface to transmit and receive data packets over a network; a memory including one or more buffers; an arithmetic logic unit to perform arithmetic operations for organizing and combining the data packets; and a circuitry to receive, via the network interface, data packets from the network; aggregate, via the arithmetic logic unit, the received data packets in the one or more buffers at a network rate; and transmit, via the network interface, the aggregated data packets to one or more compute nodes in the network, thereby optimizing latency incurred in combining the received data packets and transmitting the aggregated data packets, and hence accelerating a bulk data allreduce operation. One embodiment provides a system and method for performing the allreduce operation. During operation, the system performs the allreduce operation by pacing network operations for enhancing performance of the allreduce operation.
Type: Application
Filed: July 17, 2023
Publication date: November 9, 2023
Inventors: Keith D. Underwood, Robert L. Alverson, Duncan Roweth, Nathan L. Wichmann
-
Patent number: 11714765
Abstract: An apparatus is provided that includes a network interface to transmit and receive data packets over a network; a memory including one or more buffers; an arithmetic logic unit to perform arithmetic operations for organizing and combining the data packets; and a circuitry to receive, via the network interface, data packets from the network; aggregate, via the arithmetic logic unit, the received data packets in the one or more buffers at a network rate; and transmit, via the network interface, the aggregated data packets to one or more compute nodes in the network, thereby optimizing latency incurred in combining the received data packets and transmitting the aggregated data packets, and hence accelerating a bulk data allreduce operation. One embodiment provides a system and method for performing the allreduce operation. During operation, the system performs the allreduce operation by pacing network operations for enhancing performance of the allreduce operation.
Type: Grant
Filed: July 23, 2021
Date of Patent: August 1, 2023
Assignee: Hewlett Packard Enterprise Development LP
Inventors: Keith D. Underwood, Robert L. Alverson, Duncan Roweth, Nathan L. Wichmann
-
Publication number: 20230035657
Abstract: An apparatus is provided that includes a network interface to transmit and receive data packets over a network; a memory including one or more buffers; an arithmetic logic unit to perform arithmetic operations for organizing and combining the data packets; and a circuitry to receive, via the network interface, data packets from the network; aggregate, via the arithmetic logic unit, the received data packets in the one or more buffers at a network rate; and transmit, via the network interface, the aggregated data packets to one or more compute nodes in the network, thereby optimizing latency incurred in combining the received data packets and transmitting the aggregated data packets, and hence accelerating a bulk data allreduce operation. One embodiment provides a system and method for performing the allreduce operation. During operation, the system performs the allreduce operation by pacing network operations for enhancing performance of the allreduce operation.
Type: Application
Filed: July 23, 2021
Publication date: February 2, 2023
Inventors: Keith D. Underwood, Robert L. Alverson, Duncan Roweth, Nathan L. Wichmann