Patents by Inventor Larry Robert Dennison

Larry Robert Dennison has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

MECHANISM FOR DETECTING AND MITIGATING CONGESTION IN A DRAGONFLY NETWORK

Publication number: 20250097153

Abstract: A process to manage congestion in a network involves converting traffic received from the local endpoints to a bandwidth demand for one or more destination endpoints in a remote group, and determining a sum over the destination endpoints of a minimum of a maximum bandwidth of a link and a bandwidth demand to one or more of the remote endpoints.
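The sum described in this abstract can be illustrated with a minimal sketch: each destination's bandwidth demand is capped at the link's maximum bandwidth, and the capped values are added together. The function name, the dictionary of per-endpoint demands, and the Gb/s units are assumptions made for illustration and are not taken from the application.

```python
def capped_demand_sum(demands, link_max_bandwidth):
    """Sum over destination endpoints of min(link maximum bandwidth, demand).

    `demands` maps a (hypothetical) destination endpoint name to its
    bandwidth demand, in the same units as `link_max_bandwidth`.
    """
    return sum(min(link_max_bandwidth, demand) for demand in demands.values())


# Hypothetical demands toward three endpoints in a remote group, in Gb/s.
demands_gbps = {"ep0": 40.0, "ep1": 120.0, "ep2": 10.0}
total = capped_demand_sum(demands_gbps, link_max_bandwidth=100.0)
print(total)  # 150.0 (40 + 100 + 10; ep1 is capped at the link maximum)
```

How such a sum would then be used to detect or mitigate congestion is not stated in the abstract, so the sketch stops at the sum itself.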

Type: Application

Filed: April 25, 2024

Publication date: March 20, 2025

Applicant: NVIDIA Corp.

Inventors: John Martin Snyder, Nan Jiang, Dennis Charles Abts, Larry Robert Dennison

TECHNIQUES FOR REDUCING NETWORK CONGESTION DUE TO MULTICAST COMMUNICATIONS

Publication number: 20240305577

Abstract: One embodiment of a method for reducing network congestion caused by multicast communications includes receiving, via a network, first data associated with one or more multicast operations, determining a congestion state of the network based on the first data, and performing one or more operations to reduce an amount of second data that is transmitted via the network based on the congestion state of the network.

Type: Application

Filed: October 24, 2023

Publication date: September 12, 2024

Inventors: John Martin Snyder, Nan Jiang, Alan Lynn Davis, Larry Robert Dennison

EPOCH-BASED MECHANISM FOR PROVIDING DATA INTEGRITY AND RELIABILITY IN A MESSAGING SYSTEM

Publication number: 20240184927

Abstract: Messaging protocols used by components in a messaging system to exchange messages conventionally use a reliability mechanism to ensure that each message sent by a sender is received, without compromise, by the intended receiver. Typically, this reliability mechanism involves use of a returned acknowledgement message to the message sender, with automatic retransmission of the message by the sender when the acknowledgement message is not received (e.g. within a defined timeframe). However, existing acknowledgement-based reliability mechanisms require that a sender identifier be included in the message header, which increases the overhead of the message. The present disclosure provides an epoch-based reliability mechanism that allows the sender identifier to be omitted from the message header to minimize overhead and maximize the efficient use of the available bandwidth.

Type: Application

Filed: December 2, 2022

Publication date: June 6, 2024

Inventors: Benjamin Klenk, Al Davis, Larry Robert Dennison

Sparse convolutional neural network accelerator

Patent number: 11847550

Abstract: A method, computer program product, and system perform computations using a processor. A first instruction including a first index vector operand and a second index vector operand is received and the first index vector operand is decoded to produce first coordinate sets for a first array, each first coordinate set including at least a first coordinate and a second coordinate of a position of a non-zero element in the first array. The second index vector operand is decoded to produce second coordinate sets for a second array, each second coordinate set including at least a third coordinate and a fourth coordinate of a position of a non-zero element in the second array. The first coordinate sets are summed with the second coordinate sets to produce output coordinate sets and the output coordinate sets are converted into a set of linear indices.
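The final two steps of this abstract, summing coordinate sets and converting the results to linear indices, can be sketched as follows. This is only an illustration under stated assumptions: every first coordinate set is paired with every second coordinate set, each set holds exactly two coordinates (row, column), and linearization is row-major with a hypothetical `out_width`. None of these choices, nor the function name, comes from the patent text.

```python
from itertools import product


def output_linear_indices(first_coords, second_coords, out_width):
    """Sum coordinate pairs and convert each sum to a row-major linear index.

    Assumption: every first coordinate set is combined with every second
    coordinate set (a Cartesian product). `out_width` is a hypothetical
    output array width used for linearization.
    """
    indices = []
    for (r1, c1), (r2, c2) in product(first_coords, second_coords):
        row, col = r1 + r2, c1 + c2           # output coordinate set
        indices.append(row * out_width + col)  # row-major linear index
    return indices


# Hypothetical positions of non-zero elements in two small arrays.
first = [(0, 1), (2, 0)]
second = [(0, 0), (1, 1)]
print(output_linear_indices(first, second, out_width=4))  # [1, 6, 8, 13]
```

The decoding of the index vector operands into coordinate sets is left out because the abstract does not describe the encoding.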

Type: Grant

Filed: December 4, 2020

Date of Patent: December 19, 2023

Assignee: NVIDIA Corporation

Inventors: William J. Dally, Angshuman Parashar, Joel Springer Emer, Stephen William Keckler, Larry Robert Dennison

Use of stashing buffers to improve the efficiency of crossbar switches

Patent number: 11799799

Abstract: A switch architecture enables ports to stash packets in unused buffers on other ports, exploiting excess internal bandwidth that may exist, for example, in a tiled switch. This architecture leverages unused port buffer memory to improve features such as congestion handling and error recovery.

Type: Grant

Filed: July 16, 2021

Date of Patent: October 24, 2023

Assignee: NVIDIA Corp.

Inventors: Matthias Augustin Blumrich, Nan Jiang, Larry Robert Dennison

IN-NETWORK MESSAGE AGGREGATION FOR EFFICIENT SMALL MESSAGE TRANSPORT

Publication number: 20230327996

Abstract: Aggregation of small payloads from multiple packets may improve bandwidth efficiency of a network, particularly a high-performance compute cluster with thousands of network endpoints and distributed data. Aggregation is context-based and a packet header is reduced because the common components that are shared by the aggregated messages are included once within the header. Execution contexts are explicitly created and destroyed by application programs. Each participating endpoint stores context-specific properties until the context is destroyed, so that the properties are not included in the header. Aggregation may be performed at different hierarchical levels by switches and/or endpoints.
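The header saving described in this abstract can be pictured with a small sketch in which fields common to all aggregated messages are written once into a single header. The message layout (plain dictionaries with a `payload` key), the field names, and the `aggregate` function are hypothetical and chosen only for illustration; the application's actual packet format is not given here.

```python
def aggregate(messages, shared_fields):
    """Combine small messages into one packet that carries shared fields once.

    `messages` is a list of dicts, each with the shared fields plus a
    "payload" entry. All names here are illustrative assumptions.
    """
    if not messages:
        raise ValueError("nothing to aggregate")
    # Take the shared header fields from the first message.
    header = {field: messages[0][field] for field in shared_fields}
    # Only messages that agree on every shared field can be aggregated.
    for message in messages:
        if any(message[field] != header[field] for field in shared_fields):
            raise ValueError("messages do not share the same header fields")
    return {"header": header, "payloads": [m["payload"] for m in messages]}


small_messages = [
    {"context": 7, "dest": "ep3", "payload": "x"},
    {"context": 7, "dest": "ep3", "payload": "y"},
]
packet = aggregate(small_messages, shared_fields=["context", "dest"])
print(packet["header"], len(packet["payloads"]))
```

The abstract goes further than this sketch: context-specific properties are stored at each participating endpoint until the context is destroyed, so they need not appear in the header at all.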

Type: Application

Filed: January 4, 2023

Publication date: October 12, 2023

Inventors: Benjamin Klenk, Alan Lynn Davis, Larry Robert Dennison

Transceiver system with end-to-end reliability and ordering protocols

Patent number: 11770215

Abstract: Packet flows between a transmitter and a receiver in an unreliable and unordered switched packet network may be established as a result of receiving a second packet comprising a second memory operation on a memory address. The transmission of memory load command packets followed by memory store command packets in the packet flow may be serialized, and a synchronization operation may be executed between the transmitter and the receiver when a packet count at the receiver satisfies a number of data packets in the packet flow.

Type: Grant

Filed: February 17, 2022

Date of Patent: September 26, 2023

Assignee: NVIDIA CORP.

Inventors: Hans Eberle, Larry Robert Dennison, John Martin Snyder

TRANSCEIVER SYSTEM WITH END-TO-END RELIABILITY AND ORDERING PROTOCOLS

Publication number: 20230261794

Abstract: Packet flows between a transmitter and a receiver in an unreliable and unordered switched packet network may be established as a result of receiving a second packet comprising a second memory operation on a memory address. The transmission of memory load command packets followed by memory store command packets in the packet flow may be serialized, and a synchronization operation may be executed between the transmitter and the receiver when a packet count at the receiver satisfies a number of data packets in the packet flow.

Type: Application

Filed: February 17, 2022

Publication date: August 17, 2023

Applicant: NVIDIA Corp.

Inventors: Hans Eberle, Larry Robert Dennison, John Martin Snyder

CROSSBAR MULTIPATHING FOR MULTICAST PERFORMANCE IN TILED SWITCHES

Publication number: 20220417176

Abstract: A method is provided for operating a network switch comprising a plurality of input ports and a plurality of output ports. The method comprises receiving a first data packet received via a first input port and a second data packet received via a second input port to be delivered to an egress endpoint connected to a first output port, configuring a plurality of crossbar switch units arranged in a tiled architecture to pass the first data packet to the first output port via a primary path and pass the second data packet to the first output port via a secondary path, and transmitting the first data packet and the second data packet to the egress endpoint. The first data packet and the second data packet pass through the plurality of crossbar switch units simultaneously.

Type: Application

Filed: June 23, 2022

Publication date: December 29, 2022

Inventors: Glenn Alan Dearth, Nan Jiang, Mark D. Hummel, Gregory Michael Thorson, Karan Gupta, Dane Thomas Mrazek, Eric Anderson, Larry Robert Dennison

Injection limiting and wave synchronization for scalable in-network computation

Patent number: 11502867

Abstract: A network device configured to perform scalable, in-network computations is described. The network device is configured to process pull requests and/or push requests from a plurality of endpoints connected to the network. A collective communication primitive from a particular endpoint can be received at a network device. The collective communication primitive is associated with a multicast region of a shared global address space and is mapped to a plurality of participating endpoints. The network device is configured to perform an in-network computation based on information received from the participating endpoints before forwarding a response to the collective communication primitive back to one or more of the participating endpoints.
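The in-network computation described in this abstract can be sketched as a toy model of a reduction: a device waits for a contribution from every participating endpoint, combines the contributions, and sends the result back. The class name, the `push` method, and the choice of a sum as the reduction are assumptions for illustration only; the patent's address mapping, pull requests, and hardware behavior are not modeled.

```python
class ReductionSwitch:
    """Toy model of a network device that reduces values from endpoints."""

    def __init__(self, participants):
        self.participants = set(participants)
        self.pending = {}  # endpoint -> contributed value

    def push(self, endpoint, value):
        """Record one contribution; return responses once all have arrived."""
        if endpoint not in self.participants:
            raise KeyError(f"{endpoint!r} is not a participating endpoint")
        self.pending[endpoint] = value
        if set(self.pending) != self.participants:
            return None  # still waiting for other endpoints
        result = sum(self.pending.values())  # the in-network computation
        self.pending.clear()
        # Forward the same result back to every participating endpoint.
        return {ep: result for ep in sorted(self.participants)}


switch = ReductionSwitch(["a", "b", "c"])
print(switch.push("a", 1))  # None
print(switch.push("b", 2))  # None
print(switch.push("c", 3))  # {'a': 6, 'b': 6, 'c': 6}
```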

Type: Grant

Filed: July 24, 2020

Date of Patent: November 15, 2022

Assignee: NVIDIA Corporation

Inventors: Benjamin Klenk, Nan Jiang, Larry Robert Dennison

Scalable light-weight protocols for wire-speed packet ordering

Patent number: 11470394

Abstract: A communication method between a source device and a target device utilizes speculative connection setup between the source device and the target device, target-device-side packet ordering, and fine-grained ordering to remove packet dependencies.

Type: Grant

Filed: July 21, 2020

Date of Patent: October 11, 2022

Assignee: NVIDIA CORP.

Inventors: Hans Eberle, Larry Robert Dennison

Scalable in-network computation for massively-parallel shared-memory processors

Patent number: 11463272

Abstract: A network device configured to perform scalable, in-network computations is described. The network device is configured to process pull requests and/or push requests from a plurality of endpoints connected to the network. A collective communication primitive from a particular endpoint can be received at a network device. The collective communication primitive is associated with a multicast region of a shared global address space and is mapped to a plurality of participating endpoints. The network device is configured to perform an in-network computation based on information received from the participating endpoints before forwarding a response to the collective communication primitive back to one or more of the participating endpoints. The endpoints can inject pull requests (e.g., load commands) and/or push requests (e.g., store commands) into the network. A multicast capability enables tasks, such as a reduction operation, to be offloaded to hardware in the network device.

Type: Grant

Filed: October 6, 2021

Date of Patent: October 4, 2022

Assignee: NVIDIA Corporation

Inventors: Benjamin Klenk, Nan Jiang, Larry Robert Dennison, Gregory M. Thorson

Scalable light-weight protocols for wire-speed packet ordering

Patent number: 11363339

Abstract: A communication method between a source device and a target device utilizes speculative connection setup between the source device and the target device, target-device-side packet ordering, and fine-grained ordering to remove packet dependencies.

Type: Grant

Filed: July 20, 2020

Date of Patent: June 14, 2022

Assignee: NVIDIA Corp.

Inventors: Hans Eberle, Larry Robert Dennison

Distributed batch normalization using partial populations

Patent number: 11341369

Abstract: A technique for performing data parallel training of a neural network model is disclosed that incorporates batch normalization techniques using partial populations to generate normalization parameters. The technique involves processing, by each processor of a plurality of processors in parallel, a first portion of a sub-batch of training samples allocated to the processor to generate activations for the first portion of the sub-batch. Each processor analyzes the activations and transmits statistical measures for the first portion to an additional processor that reduces the statistical measures from multiple processors to generate normalization parameters for a partial population of the training samples that includes the first portion from each of the plurality of processors. The normalization parameters are then transmitted back to each of the processors to normalize the activations for both the first portion and a second portion of the sub-batch of training samples allocated to each processor.
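The flow in this abstract can be sketched in a few lines, simulating the processors as entries in a list. Each processor computes statistics on the first portion of its sub-batch only, a reducer combines them into a mean and variance for the partial population, and those parameters normalize both portions. The helper names, the use of count, sum, and sum of squares as the statistical measures, and the `eps` term are assumptions for illustration and do not come from the patent.

```python
import math


def local_stats(values):
    """Per-processor measures for one portion: count, sum, sum of squares."""
    return len(values), sum(values), sum(v * v for v in values)


def reduce_stats(stats):
    """Combine per-processor measures into a mean and variance."""
    n = sum(s[0] for s in stats)
    total = sum(s[1] for s in stats)
    total_sq = sum(s[2] for s in stats)
    mean = total / n
    var = max(total_sq / n - mean * mean, 0.0)  # guard against rounding
    return mean, var


def normalize(values, mean, var, eps=1e-5):
    """Normalize activations with the shared parameters."""
    scale = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * scale for v in values]


# Two simulated processors; each sub-batch has a first and a second portion.
sub_batches = [
    {"first": [1.0, 3.0], "second": [2.0, 4.0]},
    {"first": [5.0, 7.0], "second": [6.0, 8.0]},
]
# Only the first portions contribute to the normalization parameters.
mean, var = reduce_stats([local_stats(sb["first"]) for sb in sub_batches])
# The parameters are applied to both portions on every processor.
normalized = [normalize(sb["first"] + sb["second"], mean, var) for sb in sub_batches]
print(mean, var)  # 4.0 5.0
```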

Type: Grant

Filed: October 31, 2019

Date of Patent: May 24, 2022

Assignee: NVIDIA Corporation

Inventors: Larry Robert Dennison, Benjamin Klenk

Scalable in-network computation for massively-parallel shared-memory processors

Patent number: 11336476

Abstract: A network device configured to perform scalable, in-network computations is described. The network device is configured to process pull requests and/or push requests from a plurality of endpoints connected to the network. A collective communication primitive from a particular endpoint can be received at a network device. The collective communication primitive is associated with a multicast region of a shared global address space and is mapped to a plurality of participating endpoints. The network device is configured to perform an in-network computation based on information received from the participating endpoints before forwarding a response to the collective communication primitive back to one or more of the participating endpoints. The endpoints can inject pull requests (e.g., load commands) and/or push requests (e.g., store commands) into the network. A multicast capability enables tasks, such as a reduction operation, to be offloaded to hardware in the network device.

Type: Grant

Filed: July 24, 2020

Date of Patent: May 17, 2022

Assignee: NVIDIA Corporation

Inventors: Benjamin Klenk, Nan Jiang, Larry Robert Dennison, Gregory M. Thorson

Securing memory accesses in a virtualized environment

Patent number: 11327900

Abstract: Multiprocessor clusters in a virtualized environment conventionally fail to provide memory access security, which is frequently a requirement for efficient utilization in multi-client settings. Without adequate access security, a malicious process may access what might be confidential data that belongs to a different client sharing the multiprocessor cluster. Furthermore, an inadvertent programming error in the code for one client process may accidentally corrupt data that belongs to the different client. Neither scenario is acceptable. Embodiments of the present disclosure provide access security by enabling each processing node within a multiprocessor cluster to virtualize and manage local memory access and only process access requests possessing proper access credentials. In this way, different applications executing on a multiprocessor cluster may be isolated from each other while advantageously sharing the hardware resources of the multiprocessor cluster.

Type: Grant

Filed: July 23, 2020

Date of Patent: May 10, 2022

Assignee: NVIDIA Corporation

Inventors: Samuel Hammond Duncan, Sanjeev Jain, Mark Douglas Hummel, Vyas Venkataraman, Olivier Giroux, Larry Robert Dennison, Alexander Toichi Ishii, Hemayet Hossain, Nir Haim Arad

SCALABLE LIGHT-WEIGHT PROTOCOLS FOR WIRE-SPEED PACKET ORDERING

Publication number: 20220095017

Abstract: A communication method between a source device and a target device utilizes speculative connection setup between the source device and the target device, target-device-side packet ordering, and fine-grained ordering to remove packet dependencies.

Type: Application

Filed: December 1, 2021

Publication date: March 24, 2022

Applicant: NVIDIA Corp.

Inventors: Hans Eberle, Larry Robert Dennison

SCALABLE IN-NETWORK COMPUTATION FOR MASSIVELY-PARALLEL SHARED-MEMORY PROCESSORS

Publication number: 20220029845

Abstract: A network device configured to perform scalable, in-network computations is described. The network device is configured to process pull requests and/or push requests from a plurality of endpoints connected to the network. A collective communication primitive from a particular endpoint can be received at a network device. The collective communication primitive is associated with a multicast region of a shared global address space and is mapped to a plurality of participating endpoints. The network device is configured to perform an in-network computation based on information received from the participating endpoints before forwarding a response to the collective communication primitive back to one or more of the participating endpoints. The endpoints can inject pull requests (e.g., load commands) and/or push requests (e.g., store commands) into the network. A multicast capability enables tasks, such as a reduction operation, to be offloaded to hardware in the network device.

Type: Application

Filed: October 6, 2021

Publication date: January 27, 2022

Inventors: Benjamin Klenk, Nan Jiang, Larry Robert Dennison, Gregory M. Thorson

Scalable in-network computation for massively-parallel shared-memory processors

Patent number: 11171798

Abstract: A network device configured to perform scalable, in-network computations is described. The network device is configured to process pull requests and/or push requests from a plurality of endpoints connected to the network. A collective communication primitive from a particular endpoint can be received at a network device. The collective communication primitive is associated with a multicast region of a shared global address space and is mapped to a plurality of participating endpoints. The network device is configured to perform an in-network computation based on information received from the participating endpoints before forwarding a response to the collective communication primitive back to one or more of the participating endpoints. The endpoints can inject pull requests (e.g., load commands) and/or push requests (e.g., store commands) into the network. A multicast capability enables tasks, such as a reduction operation, to be offloaded to hardware in the network device.

Type: Grant

Filed: July 24, 2020

Date of Patent: November 9, 2021

Assignee: NVIDIA Corporation

Inventors: Benjamin Klenk, Nan Jiang, Larry Robert Dennison, Gregory M. Thorson

Distributed batch normalization using estimates and rollback

Patent number: 11170263

Abstract: A technique utilizing speculative execution and rollback for performing data parallel training of a neural network model is disclosed. Activations for a layer of the neural network model are normalized during a speculative normalization operation using estimated normalization parameters associated with a partial population of a set of training data allocated to a particular processor. Normalization parameters associated with the total population of the set of training data are generated by a distributed reduce operation in parallel with the speculative normalization operation. An optional rollback operation can revert the activations to a pre-normalization state if the estimated normalization parameters for the partial population are subsequently determined to be inaccurate compared to the normalization parameters for the population of the set of training data distributed across a plurality of processors.

Type: Grant

Filed: October 31, 2019

Date of Patent: November 9, 2021

Assignee: NVIDIA Corporation

Inventors: Larry Robert Dennison, Benjamin Klenk
