Patents by Inventor Torsten HOEFLER

Torsten HOEFLER has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Recycled entropies packet spraying

Patent number: 12255824

Abstract: An entropy value is generated for a data packet to be transmitted on a computing network. The entropy value is usable to select or change a network path for the data packet. In response to receiving an acknowledgement message for the data packet, the entropy value is saved in a storage structure if the entropy value is acknowledged as not congested. When transmitting an additional data packet, the oldest saved entropy from the storage structure is reused and the oldest saved entropy value is invalidated.

Type: Grant

Filed: November 13, 2023

Date of Patent: March 18, 2025

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Abdul Kabbani, Torsten Hoefler
COORDINATING CONGESTION CONTROL AND ADAPTIVE LOAD BALANCING

Publication number: 20250047604

Abstract: A computing network implements a congestion control mechanism and a load balancing mechanism. It is determined which available routes in the network are congested. Activation of the congestion control mechanism is limited until a threshold number of the available routes are determined to be congested, which prevents over-attenuation of an overall sending rate or window by the congestion control mechanism.

Type: Application

Filed: November 13, 2023

Publication date: February 6, 2025

Inventors: Abdul KABBANI, Torsten HOEFLER
SCALABLE COORDINATION OF CONGESTION CONTROL AND ADAPTIVE LOAD BALANCING

Publication number: 20250047603

Abstract: A computing network implements a congestion control mechanism and a load balancing mechanism. The load balancing mechanism is run at the packet level. A connection-level measure is generated for congestion in the computing network. The connection-level measure is accumulated at the packet level. Activation of the congestion control mechanism is limited until the accumulated connection-level measure reaches a threshold.

Type: Application

Filed: November 13, 2023

Publication date: February 6, 2025

Inventors: Abdul KABBANI, Torsten HOEFLER
FAIRNESS AND CONGESTION CONTROL CONVERGENCE

Publication number: 20250047600

Abstract: A first ratio of a sending rate limit to a full line rate for a link in the computing network is accessed. A second ratio of a sending window size to W_max for the link is accessed. W_max is the maximum allowed window size or the window size that utilizes an end-to-end path for the link. One or more of the first or second ratio is used to determine an amount to reduce the sending rate or window for the link in response to an indication of network congestion in the link.

Type: Application

Filed: November 13, 2023

Publication date: February 6, 2025

Inventors: Abdul KABBANI, Torsten HOEFLER
RECYCLED ENTROPIES PACKET SPRAYING

Publication number: 20250047610

Abstract: An entropy value is generated for a data packet to be transmitted on a computing network. The entropy value is usable to select or change a network path for the data packet. In response to receiving an acknowledgement message for the data packet, the entropy value is saved in a storage structure if the entropy value is acknowledged as not congested. When transmitting an additional data packet, the oldest saved entropy from the storage structure is reused and the oldest saved entropy value is invalidated.

Type: Application

Filed: November 13, 2023

Publication date: February 6, 2025

Inventors: Abdul KABBANI, Torsten HOEFLER
ROUND-TRIP TIME AND EXPLICIT CONGESTION NOTIFICATION SIGNALS FOR BANDWIDTH UTILIZATION

Publication number: 20250047613

Abstract: Acknowledgement messages for a link in a computing network are accessed. Round trip time (RTT) measurements for the link are accessed. In response to determining that none of the acknowledgement messages are Explicit Congestion Notification (ECN)-marked and none of the accessed RTT measurements exceeded a minimum expected latency threshold, it is determined that an end-to-end path of the link is not congested.

Type: Application

Filed: November 13, 2023

Publication date: February 6, 2025

Inventors: Abdul KABBANI, Torsten HOEFLER
LOW OVERHEAD SEND/RECEIVE DATA DELIVERY INTO USER MEMORY

Publication number: 20250047616

Abstract: A match table is used to match a concatenation of a source address of a data packet and a consecutive message sequence number (MSN) to a match data structure. The matched concatenation is inserted into the match data structure based on a source and a running counter for a MSN to each source. The currently active PDC associated with the source is attached or an empty PDC is atomically created and attached. In response to a packet arriving before the semantic layer has posted a recv( ), A canary value “failed match” is atomically inserted into the match data structure and a request to wait (RTW) message is sent to the source. In response to the semantic layer posting the recv( ), the “failed match” entry in the match data structure is identified, the “failed match” entry is atomically updated with match information, and a request to send (RTS) message is sent to the source.

Type: Application

Filed: November 13, 2023

Publication date: February 6, 2025

Inventors: Torsten HOEFLER, Abdul KABBANI
WEIGHTED RANDOM EARLY BACK-TO-SENDER NOTIFICATION

Publication number: 20250047598

Abstract: It is determined that a computing node is contributing to a network congestion event. A congestion notification message is generated. A timing profile is determined for sending the congestion notification message based on the level of the network congestion event. Based on the timing profile, the congestion notification message is forwarded to the computing node determined to be contributing to the network congestion event.

Type: Application

Filed: November 13, 2023

Publication date: February 6, 2025

Inventors: Abdul KABBANI, Torsten HOEFLER
ADAPTIVE PACKET ROUTING

Publication number: 20240314073

Abstract: A computer implemented method includes encoding a packet in a source endpoint of a multi-path communication network, the packet having a hash seed for use by routers to route the packet through the multi-path communication network to a destination endpoint. Network performance is tracked for the packet at the source endpoint. The hash seed is modified as a function of the network performance. The packet is re-sent such that the modified hash seed is used to route the packet to the destination endpoint.

Type: Application

Filed: March 16, 2023

Publication date: September 19, 2024

Inventors: Abdul KABBANI, Torsten HOEFLER
ASSIGNING WORKLOADS TO PHYSICAL RESOURCES IN SPATIAL ARCHITECTURES

Publication number: 20240303117

Abstract: Operations of a workload are assigned to physical resources of a physical device array. The workload includes a graph of operations to be performed on a physical device array. The graph of operations is partitioned into subgraphs. Partitioning includes at least minimizing the quantity of subgraphs and maximizing resource utilization per subgraph. A logical mapping of the subgraph to logical processing engine (PE) units is generated using features of the subgraph and tiling factors of the logical PE units. The logical mapping is assigned to physical PE units of the physical device array at least by minimizing network traffic across the physical PE units. The operations of the subgraph are performed using the physical PE units to which the logical mapping is assigned. This process enhances the computational efficiency of the array when executing the workload.

Type: Application

Filed: February 24, 2023

Publication date: September 12, 2024

Inventors: Fanny NINA PARAVECINO, Michael Eric DAVIES, Abhishek Dilip KULKARNI, Md Aamir RAIHAN, Ankit MORE, Aayush ANKIT, Torsten HOEFLER, Douglas Christopher BURGER
Method and apparatus for compressing and decompressing sparse data sets

Patent number: 11989414

Abstract: Embodiments of the present disclosure include a digital circuit and method for multi-stage compression. Digital data values are compressed using a multi-stage compression algorithm and stored in a memory. A decompression circuit receives the values and performs a partial decompression. The partially compressed values are provided to a processor, which performs the final decompression. In one embodiment, a vector of N length compressed values are decompressed using a first bit mask into two N length sets having non-zero values. The two N length sets are further decompressed using two M length bit masks into M length sparse vectors, each having non-zero values.

Type: Grant

Filed: June 23, 2023

Date of Patent: May 21, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Mattheus C. Heddes, Ankit More, Nishit Shah, Torsten Hoefler
Message communication between integrated computing devices

Patent number: 11886938

Abstract: One example provides an integrated computing device, comprising one or more computing clusters, and one or more network controllers, each network controller comprising a local data notification queue to queue send message notifications originating from the computing clusters on the integrated computing device, a remote data notification queue to queue receive message notifications originating from network controllers on remote integrated computing devices, a local no-data notification queue to queue receive message notifications originating from computing clusters on the integrated computing device, and a connection scheduler configured to schedule sending of data from memory on the integrated computing device when a send message notification in the local data notification queue is matched with a receive message notification in the remote data notification queue, and to schedule sending of receive message notifications from the local no-data notification queue.

Type: Grant

Filed: March 11, 2021

Date of Patent: January 30, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Deepak Goel, Mattheus C Heddes, Torsten Hoefler, Xiaoling Xu
METHOD AND APPARATUS FOR COMPRESSING AND DECOMPRESSING SPARSE DATA SETS

Publication number: 20230333739

Abstract: Embodiments of the present disclosure include a digital circuit and method for multi-stage compression. Digital data values are compressed using a multi-stage compression algorithm and stored in a memory. A decompression circuit receives the values and performs a partial decompression. The partially compressed values are provided to a processor, which performs the final decompression. In one embodiment, a vector of N length compressed values are decompressed using a first bit mask into two N length sets having non-zero values. The two N length sets are further decompressed using two M length bit masks into M length sparse vectors, each having non-zero values.

Type: Application

Filed: June 23, 2023

Publication date: October 19, 2023

Inventors: Mattheus C. HEDDES, Ankit MORE, Nishit SHAH, Torsten HOEFLER
Method and apparatus for compressing and decompressing sparse data sets

Patent number: 11720252

Abstract: Embodiments of the present disclosure include a digital circuit and method for multi-stage compression. Digital data values are compressed using a multi-stage compression algorithm and stored in a memory. A decompression circuit receives the values and performs a partial decompression. The partially compressed values are provided to a processor, which performs the final decompression. In one embodiment, a vector of N length compressed values are decompressed using a first bit mask into two N length sets having non-zero values. The two N length sets are further decompressed using two M length bit masks into M length sparse vectors, each having non-zero values.

Type: Grant

Filed: March 4, 2022

Date of Patent: August 8, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Mattheus C. Heddes, Ankit More, Nishit Shah, Torsten Hoefler
Distributed processing architecture

Patent number: 11580388

Abstract: Embodiments of the present disclosure include techniques for processing neural networks. Various forms of parallelism may be implemented using topology that combines sequences of processors. In one embodiment, the present disclosure includes a computer system comprising a plurality of processor groups, the processor groups each comprising a plurality of processors. A plurality of network switches are coupled to subsets of the plurality of processor groups. A subset of the processors in the processor groups may be configurable to form sequences, and the network switches are configurable to form at least one sequence across one or more of the plurality of processor groups to perform neural network computations. Various alternative configurations for creating Hamiltonian cycles are disclosed to support data parallelism, pipeline parallelism, layer parallelism, or combinations thereof.

Type: Grant

Filed: January 3, 2020

Date of Patent: February 14, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Torsten Hoefler, Mattheus C. Heddes, Deepak Goel, Jonathan R Belk
MESSAGE COMMUNICATION BETWEEN INTEGRATED COMPUTING DEVICES

Publication number: 20220291976

Abstract: One example provides an integrated computing device, comprising one or more computing clusters, and one or more network controllers, each network controller comprising a local data notification queue to queue send message notifications originating from the computing clusters on the integrated computing device, a remote data notification queue to queue receive message notifications originating from network controllers on remote integrated computing devices, a local no-data notification queue to queue receive message notifications originating from computing clusters on the integrated computing device, and a connection scheduler configured to schedule sending of data from memory on the integrated computing device when a send message notification in the local data notification queue is matched with a receive message notification in the remote data notification queue, and to schedule sending of receive message notifications from the local no-data notification queue.

Type: Application

Filed: March 11, 2021

Publication date: September 15, 2022

Applicant: Microsoft Technology Licensing, LLC

Inventors: Deepak GOEL, Mattheus C. HEDDES, Torsten HOEFLER, Xiaoling XU
DIGITAL CIRCUITRY FOR NORMALIZATION FUNCTIONS

Publication number: 20220244911

Abstract: The present disclosure includes digital circuits that generate values of a power of two (2) raised to an input value. For example, a digital circuit may include combinational logic that receives first digital bits representing an input mantissa of an input value and second digital bits representing an input exponent of the input value. The combinational logic generates a plurality of output mantissas and plurality of output exponents corresponding to an approximate value of a power of two (2) raised to a power of the input value when the input value is positive and negative and when the input exponent is above and below a first value. Selection circuits are configured to receive output mantissas and output exponents. The selection circuits include selection control inputs coupled to the input exponent and an input sign bit of the input value to select one of the output mantissas and one output exponents.

Type: Application

Filed: January 29, 2021

Publication date: August 4, 2022

Inventors: Torsten Hoefler, Mattheus C Heddes
TRAINING NEURAL NETWORKS BASED ON DUAL PIPELINE ARCHITECTURES

Publication number: 20220138524

Abstract: Embodiments of the present disclosure include systems and methods for training neural networks based on dual pipeline architectures. In some embodiments, a first set of compute elements are configured to implement a first set of layers of a first instance of a neural network. A second set of compute elements are configured to implement a second set of layers of the first instance of the neural network. The second set of compute elements are further configured to implement a first set of layers of a second instance of the neural network. The first set of compute elements are further configured to implement a second set of layers of the second instance of the neural network. The first set of layers of the first instance of the neural network and the first set of layers of the second instance of the neural network are each configured to receive training data.

Type: Application

Filed: January 15, 2021

Publication date: May 5, 2022

Inventors: Mattheus HEDDES, Torsten HOEFLER, Kenneth Andrew COLWELL, Amar PHANISHAYEE
Distributed processing architecture

Patent number: 11076210

Abstract: Embodiments of the present disclosure include techniques for processing neural networks. Various forms of parallelism may be implemented using topology that combines sequences of processors. In one embodiment, the present disclosure includes a computer system comprising one or more processor groups, the processor groups each comprising a plurality of processors. A plurality of network switches are coupled to subsets of the plurality of processor groups. In one embodiment, the switches may be optical network switches. Processors in the processor groups may be configurable to form sequences, and the network switches are configurable to form at least one sequence across one or more of the plurality of processor groups to perform neural network computations. Various alternative configurations for creating Hamiltonian cycles are disclosed to support data parallelism, pipeline parallelism, layer parallelism, or combinations thereof.

Type: Grant

Filed: May 26, 2020

Date of Patent: July 27, 2021

Assignee: Microsoft Technology Licensing, LLC

Inventors: Torsten Hoefler, Mattheus C. Heddes, Jonathan R. Belk
DISTRIBUTED PROCESSING ARCHITECTURE

Publication number: 20210211787

Abstract: Embodiments of the present disclosure include techniques for processing neural networks. Various forms of parallelism may be implemented using topology that combines sequences of processors. In one embodiment, the present disclosure includes a computer system comprising one or more processor groups, the processor groups each comprising a plurality of processors. A plurality of network switches are coupled to subsets of the plurality of processor groups. In one embodiment, the switches may be optical network switches. Processors in the processor groups may be configurable to form sequences, and the network switches are configurable to form at least one sequence across one or more of the plurality of processor groups to perform neural network computations. Various alternative configurations for creating Hamiltonian cycles are disclosed to support data parallelism, pipeline parallelism, layer parallelism, or combinations thereof.

Type: Application

Filed: May 26, 2020

Publication date: July 8, 2021

Inventors: Torsten HOEFLER, Mattheus C. HEDDES, Jonathan R. BELK

1 2 next