Patents by Inventor James Dinan

James Dinan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20190050274
    Abstract: Technologies for synchronizing triggered operations include a host fabric interface (HFI) of a compute device configured to receive an operation execution command associated with a triggered operation that has been fired and determine whether the operation execution command includes an instruction to update a table entry of a table managed by the HFI. Additionally, the HFI is configured to issue, in response to a determination that the operation execution command includes the instruction to update the table entry, a triggered list enable (TLE) operation and a triggered list disable (TLD) operation to a table manager of the HFI, and to disable, in response to the TLD operation having been triggered, the identified table entry. The HFI is further configured to execute one or more command operations associated with the received operation execution command and re-enable, in response to the TLE operation having been triggered, the table entry. Other embodiments are described herein.
    Type: Application
    Filed: March 30, 2018
    Publication date: February 14, 2019
    Inventors: James Dinan, Mario Flajslik, Timo Schneider, Keith D. Underwood
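    Sketch: the disable/execute/re-enable sequence described in this abstract might look roughly like the following C fragment. The structure and function names are hypothetical illustrations, not taken from the patent or from any HFI API.
    ```c
    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical table entry managed by the HFI's table manager. */
    struct table_entry {
        int  id;
        bool enabled;
    };

    /* Triggered list disable (TLD): take the entry out of service. */
    static void tld(struct table_entry *e) { e->enabled = false; }

    /* Triggered list enable (TLE): put the entry back in service. */
    static void tle(struct table_entry *e) { e->enabled = true; }

    /* Handle one operation execution command that updates a table entry. */
    static void execute_command(struct table_entry *e)
    {
        tld(e);                              /* disable before the update   */
        printf("updating entry %d while disabled\n", e->id);
        /* ... command operations associated with the execution command ... */
        tle(e);                              /* re-enable once complete     */
    }

    int main(void)
    {
        struct table_entry entry = { .id = 7, .enabled = true };
        execute_command(&entry);
        printf("entry %d enabled again: %s\n", entry.id,
               entry.enabled ? "yes" : "no");
        return 0;
    }
    ```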
  • Publication number: 20190042337
    Abstract: Technologies for extending triggered operations include a host fabric interface (HFI) of a compute device configured to detect a triggering event associated with a counter, increment the counter, and determine whether a value of the counter matches a trigger threshold of a triggered operation in a triggered operation queue associated with the counter. The HFI is further configured to execute one or more commands associated with the triggered operation upon determining that the value of the counter matches the trigger threshold, and determine, subsequent to the execution of the one or more commands, whether the triggered operation corresponds to a recurring triggered operation. The HFI is additionally configured to increment, in response to a determination that the triggered operation corresponds to a recurring triggered operation, the value of the trigger threshold by a threshold increment and re-insert the triggered operation into the triggered operation queue. Other embodiments are described herein.
    Type: Application
    Filed: December 30, 2017
    Publication date: February 7, 2019
    Inventors: James Dinan, Mario Flajslik, Timo Schneider, Keith D. Underwood
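    Sketch: the counter/threshold loop for a recurring triggered operation, as described above, reduces to roughly the following hypothetical C fragment; the types and the re-insertion into the triggered operation queue are only modeled, not reproduced from the patent.
    ```c
    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical triggered operation: fires when the counter hits its threshold. */
    struct triggered_op {
        unsigned threshold;           /* trigger threshold                       */
        unsigned threshold_increment; /* added after each firing if recurring    */
        bool     recurring;
    };

    /* Called when a triggering event increments the counter. */
    static void on_event(unsigned *counter, struct triggered_op *op)
    {
        ++*counter;
        if (*counter != op->threshold)
            return;                                   /* not yet triggered */

        printf("executing commands at counter=%u\n", *counter);

        if (op->recurring) {
            /* Advance the threshold and conceptually re-insert the operation
             * into the triggered operation queue for the next firing. */
            op->threshold += op->threshold_increment;
        }
    }

    int main(void)
    {
        unsigned counter = 0;
        struct triggered_op op = { .threshold = 2, .threshold_increment = 2,
                                   .recurring = true };
        for (int i = 0; i < 6; i++)
            on_event(&counter, &op);   /* fires at counter == 2, 4, 6 */
        return 0;
    }
    ```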
  • Publication number: 20190042946
    Abstract: An embodiment of a semiconductor package apparatus may include technology to embed one or more trigger operations in one or more messages related to collective operations for a neural network, and issue the one or more messages related to the collective operations to a hardware-based message scheduler in a desired order of execution. Other embodiments are disclosed and claimed.
    Type: Application
    Filed: September 11, 2018
    Publication date: February 7, 2019
    Applicant: Intel Corporation
    Inventors: Sayantan Sur, James Dinan, Maria Garzaran, Anupama Kurpad, Andrew Friedley, Nusrat Islam, Robert Zak
  • Publication number: 20190042335
    Abstract: Technologies for generating triggered conditional event operations include a host fabric interface (HFI) of a compute device configured to receive an operation execution command message associated with a triggered operation that has been fired, process the received operation execution command message to extract and store argument information from the received operation execution command, and increment an event counter associated with the fired triggered operation. The HFI is further configured to perform a triggered compare-and-generate event (TCAGE) operation as a function of the extracted argument information, determine whether to generate a triggering event, generate the triggering event as a function of the performed TCAGE operation, insert the generated triggering event into a triggered operation queue, and update the value of the event counter. Other embodiments are described herein.
    Type: Application
    Filed: March 30, 2018
    Publication date: February 7, 2019
    Inventors: Mario Flajslik, Keith D. Underwood, Timo Schneider, James Dinan
  • Patent number: 10200472
    Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for improved coordination between sender and receiver nodes in a one-sided memory access to a partitioned global address space (PGAS) in a distributed computing environment. The system may include a transceiver module configured to receive a message over a network, the message comprising a data portion and a data size indicator, and an offset handler module configured to calculate a destination address from a base address of a memory buffer and an offset counter. The transceiver module may further be configured to write the data portion to the memory buffer at the destination address; and the offset handler module may further be configured to update the offset counter based on the data size indicator.
    Type: Grant
    Filed: December 24, 2014
    Date of Patent: February 5, 2019
    Assignee: Intel Corporation
    Inventors: Mario Flajslik, James Dinan
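    Sketch: the receiver-side append behavior described above, in which the destination address is the buffer base plus a running offset counter that advances by each message's data size, might be modeled as below. The message layout and names are invented for illustration.
    ```c
    #include <stdio.h>
    #include <string.h>

    #define BUF_SIZE 64

    /* Hypothetical receive buffer with an offset counter maintained by the
     * offset handler rather than by the sender. */
    struct recv_buffer {
        char   base[BUF_SIZE];
        size_t offset;               /* next free position in the buffer */
    };

    /* Deliver one message: write at base + offset, then advance the offset
     * by the message's data size indicator. */
    static int deliver(struct recv_buffer *b, const char *data, size_t size)
    {
        if (b->offset + size > BUF_SIZE)
            return -1;                        /* would overflow the buffer */
        memcpy(b->base + b->offset, data, size);
        b->offset += size;                    /* update the offset counter */
        return 0;
    }

    int main(void)
    {
        struct recv_buffer buf = { .offset = 0 };
        deliver(&buf, "hello ", 6);
        deliver(&buf, "world", 6);            /* includes the terminating NUL */
        printf("buffer: %s (offset=%zu)\n", buf.base, buf.offset);
        return 0;
    }
    ```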
  • Patent number: 10200310
    Abstract: In an example, there is disclosed a compute node, comprising: first one or more logic elements comprising a data producer engine to produce a datum; and a host fabric interface to communicatively couple the compute node to a fabric, the host fabric interface comprising second one or more logic elements comprising a data pulling engine, the data pulling engine to: publish the datum as available; receive a pull request for the datum, the pull request comprising a node identifier for a data consumer; and send the datum to the data consumer via the fabric. There is also disclosed a method of providing a data pulling engine.
    Type: Grant
    Filed: December 24, 2015
    Date of Patent: February 5, 2019
    Assignee: Intel Corporation
    Inventors: James Dinan, Mario Flajslik, Keith Underwood, David Keppel, Ulf Rainer Hanebutte
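    Sketch: the publish / pull-request / send exchange described above can be modeled roughly as follows; the structures and the printed "send" are hypothetical stand-ins for the fabric and host fabric interface described in the patent.
    ```c
    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical producer-side state for one published datum. */
    struct producer {
        int  datum;
        bool published;
    };

    /* A pull request carries the consumer's node identifier. */
    struct pull_request {
        int consumer_node_id;
    };

    /* Data pulling engine: answer a pull request by sending the datum
     * to the requesting consumer (here, just printed). */
    static void handle_pull(const struct producer *p, struct pull_request req)
    {
        if (!p->published)
            return;                      /* nothing advertised yet */
        printf("sending datum %d to node %d\n", p->datum, req.consumer_node_id);
    }

    int main(void)
    {
        struct producer p = { .datum = 42, .published = false };
        p.published = true;              /* publish the datum as available */
        handle_pull(&p, (struct pull_request){ .consumer_node_id = 3 });
        return 0;
    }
    ```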
  • Patent number: 10178041
    Abstract: Technologies for aggregation-based message processing include multiple computing nodes in communication over a network. A computing node receives a message from a remote computing node, increments an event counter in response to receiving the message, determines whether an event trigger is satisfied in response to incrementing the counter, and writes a completion event to an event queue if the event trigger is satisfied. An application of the computing node monitors the event queue for the completion event. The application may be executed by a processor core of the computing node, and the other operations may be performed by a host fabric interface of the computing node. The computing node may be a target node and count one-sided messages received from an initiator node, or the computing node may be an initiator node and count acknowledgement messages received from a target node. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 23, 2015
    Date of Patent: January 8, 2019
    Assignee: Intel Corporation
    Inventors: James Dinan, Mario Flajslik, David Keppel, Ulf R. Hanebutte
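    Sketch: the aggregation mechanism above (count incoming messages or acknowledgements, compare against an event trigger, and post a single completion event for the application to poll) might be modeled like this; the event queue is an ordinary array used only for illustration.
    ```c
    #include <stdio.h>

    #define QUEUE_CAP 8

    /* Hypothetical per-counter aggregation state kept by the HFI. */
    struct aggregator {
        unsigned counter;        /* messages (or acknowledgements) seen so far */
        unsigned trigger;        /* count that satisfies the event trigger     */
        int      event_queue[QUEUE_CAP];
        unsigned events;
    };

    /* Called once per received message or acknowledgement. */
    static void on_message(struct aggregator *a)
    {
        a->counter++;
        if (a->counter == a->trigger && a->events < QUEUE_CAP)
            a->event_queue[a->events++] = 1;   /* write a completion event */
    }

    int main(void)
    {
        struct aggregator agg = { .counter = 0, .trigger = 4, .events = 0 };
        for (int i = 0; i < 4; i++)
            on_message(&agg);
        /* The application polls the event queue instead of counting messages. */
        printf("completion events posted: %u\n", agg.events);
        return 0;
    }
    ```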
  • Patent number: 10135708
    Abstract: Technologies for monitoring communication performance of a high performance computing (HPC) network include a performance probing engine of a source endpoint node of the HPC network. The performance probing engine is configured to generate a probe request that includes a timestamp of the probe request and transmit the probe request to a destination endpoint node of the HPC network communicatively coupled to the source endpoint node via the HPC network. The performance probing engine is additionally configured to receive a probe response from the destination endpoint node via the HPC network and to generate another timestamp that corresponds to the probe response having been received. Further, the performance probing engine is configured to determine a round-trip latency as a function of the probe request and probe response timestamps. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 25, 2015
    Date of Patent: November 20, 2018
    Assignee: Intel Corporation
    Inventors: James Dinan, David Keppel
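    Sketch: the round-trip measurement described above amounts to timestamping the probe request on transmit and timestamping again when the probe response arrives. A hypothetical host-side model using a monotonic clock follows; the real mechanism lives in the performance probing engine, not in application code.
    ```c
    #define _POSIX_C_SOURCE 200809L  /* for clock_gettime and nanosleep */
    #include <stdio.h>
    #include <time.h>

    /* Return the current time in nanoseconds (hypothetical timestamp source). */
    static long long now_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
    }

    int main(void)
    {
        long long t_request = now_ns();   /* timestamp placed in the probe request */

        /* Stand-in for the probe traveling to the destination endpoint node
         * and the probe response traveling back. */
        struct timespec delay = { .tv_sec = 0, .tv_nsec = 1000000 };
        nanosleep(&delay, NULL);

        long long t_response = now_ns();  /* timestamp when the response arrives */
        printf("round-trip latency: %lld ns\n", t_response - t_request);
        return 0;
    }
    ```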
  • Patent number: 10135711
    Abstract: Technologies for tracing network performance include a network computing device configured to receive a network packet from a source endpoint node, process the received network packet, capture trace data corresponding to the network packet as it is processed by the network computing device, and transmit the received network packet to a target endpoint node. The network computing device is further configured to generate a trace data network packet that includes at least a portion of the captured trace data and transmit the trace data network packet to the destination endpoint node. The destination endpoint node is configured to monitor performance of the network by reconstructing a trace of the network packet based on the trace data of the trace data network packet. Other embodiments are described herein.
    Type: Grant
    Filed: December 22, 2015
    Date of Patent: November 20, 2018
    Assignee: Intel Corporation
    Inventors: Robert C. Zak, David Keppel, James Dinan
  • Publication number: 20180287954
    Abstract: Technologies for offloaded management of communication are disclosed. In order to manage communication with information that may be available to applications in a compute device, the compute device may offload communication management to a host fabric interface using a credit management system. A credit limit is established, and each message to be sent is added to a queue with a corresponding number of credits required to send the message. The host fabric interface of the compute device may send out messages as credits become available and decrease the number of available credits based on the number of credits required to send a particular message. When an acknowledgement of receipt of a message is received, the number of credits required to send the corresponding message may be added back to an available credit pool.
    Type: Application
    Filed: March 29, 2017
    Publication date: October 4, 2018
    Inventors: James Dinan, Sayantan Sur, Mario Flajslik, Keith D. Underwood
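    Sketch: the credit scheme above can be modeled as a queue in which each message consumes a stated number of credits when sent and returns them when its acknowledgement arrives. All names and the ring-buffer layout are illustrative.
    ```c
    #include <stdio.h>

    #define MAX_PENDING 16

    /* Hypothetical credit-managed send queue offloaded to the HFI. */
    struct credit_queue {
        unsigned available;                 /* credits currently available  */
        unsigned cost[MAX_PENDING];         /* credits required per message */
        unsigned head, tail;
    };

    static void enqueue(struct credit_queue *q, unsigned credits_required)
    {
        q->cost[q->tail++ % MAX_PENDING] = credits_required;
    }

    /* Send queued messages while enough credits are available. */
    static void try_send(struct credit_queue *q)
    {
        while (q->head != q->tail && q->cost[q->head % MAX_PENDING] <= q->available) {
            q->available -= q->cost[q->head % MAX_PENDING];
            printf("sent message costing %u credits (remaining %u)\n",
                   q->cost[q->head % MAX_PENDING], q->available);
            q->head++;
        }
    }

    /* An acknowledgement returns the message's credits to the pool. */
    static void on_ack(struct credit_queue *q, unsigned credits)
    {
        q->available += credits;
        try_send(q);
    }

    int main(void)
    {
        struct credit_queue q = { .available = 4, .head = 0, .tail = 0 };
        enqueue(&q, 3);
        enqueue(&q, 3);      /* must wait until credits become available */
        try_send(&q);        /* sends the first message only             */
        on_ack(&q, 3);       /* ack frees credits; second message goes out */
        return 0;
    }
    ```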
  • Publication number: 20180267742
    Abstract: Technologies for fine-grained completion tracking of memory buffer accesses include a compute device. The compute device is to establish multiple counter pairs for a memory buffer. Each counter pair includes a locally managed offset and a completion counter. The compute device is also to receive a request from a remote compute device to access the memory buffer, assign one of the counter pairs to the request, advance the locally managed offset of the assigned counter pair by the amount of data to be read or written, and advance the completion counter of the assigned counter pair as the data is read from or written to the memory buffer. Other embodiments are also described and claimed.
    Type: Application
    Filed: March 20, 2017
    Publication date: September 20, 2018
    Inventors: James Dinan, Keith D. Underwood, Sayantan Sur, Charles A. Giefer, Mario Flajslik
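    Sketch: a counter pair as described above can be modeled with two counters per buffer region: the locally managed offset reserves space when a request is assigned, and the completion counter advances as data is actually read or written. Names are invented for illustration.
    ```c
    #include <stdio.h>

    /* Hypothetical counter pair for one region of a shared memory buffer. */
    struct counter_pair {
        unsigned long long offset;      /* locally managed offset (reserved) */
        unsigned long long completion;  /* bytes actually read or written    */
    };

    /* A remote access request is assigned: reserve space up front. */
    static unsigned long long reserve(struct counter_pair *c, unsigned long long len)
    {
        unsigned long long start = c->offset;
        c->offset += len;               /* advance by the amount to transfer */
        return start;
    }

    /* Data movement progresses: advance the completion counter. */
    static void complete(struct counter_pair *c, unsigned long long len)
    {
        c->completion += len;
    }

    int main(void)
    {
        struct counter_pair cp = { 0, 0 };
        unsigned long long at = reserve(&cp, 4096);   /* request for 4 KiB */
        complete(&cp, 4096);                          /* transfer finishes */
        printf("wrote at %llu; offset=%llu completion=%llu (quiescent: %s)\n",
               at, cp.offset, cp.completion,
               cp.offset == cp.completion ? "yes" : "no");
        return 0;
    }
    ```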
  • Patent number: 10073809
    Abstract: Technologies for one-side remote memory access communication include multiple computing nodes in communication over a network. A receiver computing node receives a message from a sender node and extracts a segment identifier from the message. The receiver computing node determines, based on the segment identifier, a segment start address associated with a partitioned global address space (PGAS) segment of its local memory. The receiver computing node may index a segment table stored in the local memory or in a host fabric interface. The receiver computing node determines a local destination address within the PGAS segment based on the segment start address and an offset included in the message. The receiver computing node performs a remote memory access operation at the local destination address. The receiver computing node may perform those operations in hardware by the host fabric interface of the receiver computing node. Other embodiments are described and claimed.
    Type: Grant
    Filed: April 27, 2015
    Date of Patent: September 11, 2018
    Assignee: Intel Corporation
    Inventors: James Dinan, Mario Flajslik
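    Sketch: the address calculation above reduces to indexing a segment table with the segment identifier extracted from the message and adding the message's offset to the segment start address. The table layout below is hypothetical.
    ```c
    #include <stdio.h>
    #include <stdint.h>

    #define NUM_SEGMENTS 4

    /* Hypothetical segment table mapping segment IDs to PGAS segment bases. */
    static uintptr_t segment_table[NUM_SEGMENTS];

    /* Fields extracted from an incoming one-sided message. */
    struct rma_message {
        unsigned  segment_id;
        uintptr_t offset;
    };

    /* Compute the local destination address for the remote memory access. */
    static uintptr_t destination(const struct rma_message *m)
    {
        uintptr_t start = segment_table[m->segment_id % NUM_SEGMENTS];
        return start + m->offset;
    }

    int main(void)
    {
        static char heap_segment[1024];                 /* stand-in PGAS segment */
        segment_table[1] = (uintptr_t)heap_segment;

        struct rma_message msg = { .segment_id = 1, .offset = 128 };
        printf("local destination: %p\n", (void *)destination(&msg));
        return 0;
    }
    ```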
  • Publication number: 20180234347
    Abstract: Technologies for endpoint congestion avoidance are disclosed. In order to avoid congestion caused by a network fabric that can transport data to a compute device faster than the compute device can store the data in a particular type of memory, the compute device may in the illustrative embodiment determine a suitable data transfer rate and communicate an indication of the data transfer rate to the remote compute device which is sending the data. The remote compute device may then send the data at the indicated data transfer rate, thus avoiding congestion.
    Type: Application
    Filed: February 10, 2017
    Publication date: August 16, 2018
    Inventors: James Dinan, Mario Flajslik, Robert C. Zak
  • Publication number: 20180225144
    Abstract: Technologies for managing a queue on a compute device are disclosed. In the illustrative embodiment, the queue is managed by a host fabric interface of the compute device. Queue operations such as enqueuing data onto the queue and dequeuing data from the queue may be requested by remote compute devices by sending queue operations which may be processed by the host fabric interface. The host fabric interface may, in some embodiments, fully manage the queue without any assistance from the processor of the compute device. In other embodiments, the processor of the compute device may be responsible for certain tasks, such as garbage collection.
    Type: Application
    Filed: February 9, 2017
    Publication date: August 9, 2018
    Inventors: James Dinan, Mario Flajslik, Timo Schneider
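    Sketch: the remotely operated queue above might behave like this simple ring buffer, with enqueue and dequeue requests handled on behalf of remote compute devices; the layout and return conventions are illustrative only.
    ```c
    #include <stdio.h>

    #define RING_CAP 4

    /* Hypothetical queue managed by the host fabric interface. */
    struct hfi_queue {
        int      slots[RING_CAP];
        unsigned head, tail;
    };

    /* Enqueue request from a remote compute device. Returns 0 on success. */
    static int remote_enqueue(struct hfi_queue *q, int value)
    {
        if (q->tail - q->head == RING_CAP)
            return -1;                               /* queue full */
        q->slots[q->tail++ % RING_CAP] = value;
        return 0;
    }

    /* Dequeue request from a remote compute device. Returns 0 on success. */
    static int remote_dequeue(struct hfi_queue *q, int *value)
    {
        if (q->head == q->tail)
            return -1;                               /* queue empty */
        *value = q->slots[q->head++ % RING_CAP];
        return 0;
    }

    int main(void)
    {
        struct hfi_queue q = { .head = 0, .tail = 0 };
        remote_enqueue(&q, 10);
        remote_enqueue(&q, 20);
        int v;
        while (remote_dequeue(&q, &v) == 0)
            printf("dequeued %d\n", v);
        return 0;
    }
    ```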
  • Patent number: 9916178
    Abstract: Technologies for integrated thread scheduling include a computing device having a network interface controller (NIC). The NIC is configured to detect and suspend a thread that is being blocked by one or more communication operations. A thread scheduling engine of the NIC is configured to move the suspended thread from a running queue of the system thread scheduler to a pending queue of the thread scheduling engine. The thread scheduling engine is further configured to move the suspended thread from the pending queue to a ready queue of the thread scheduling engine upon determining any dependencies and/or blocking communications operations have completed. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 25, 2015
    Date of Patent: March 13, 2018
    Assignee: Intel Corporation
    Inventors: James Dinan, Mario Flajslik, Tom St. John
  • Publication number: 20170289242
    Abstract: Technologies for dynamic work queue management include a producer computing device communicatively coupled to a consumer computing device. The consumer computing device is configured to transmit a pop request (e.g., a one-sided pull request) that includes consumption constraints indicating an amount of work (e.g., a range of acceptable fraction of work elements to return from a work queue of the producer computing device) to pull from the producer computing device. The producer computing device is configured to determine whether the pop request can be satisfied and generate a response that includes an indication of the result of the determination and one or more producer metrics usable by the consumer computing device to determine a subsequent action to be performed by the consumer computing device upon receipt of the response message. Other embodiments are described and claimed herein.
    Type: Application
    Filed: March 31, 2016
    Publication date: October 5, 2017
    Inventors: David Keppel, Ulf R. Hanebutte, Mario Flajslik, James Dinan
  • Patent number: 9733995
    Abstract: A method comprising receiving control information at a first processing element from a second processing element, synchronizing objects within a shared global memory space of the first processing element with a shared global memory space of the second processing element in response to receiving the control information, and generating a completion event indicating the first processing element has been synchronized with the second processing element.
    Type: Grant
    Filed: December 17, 2014
    Date of Patent: August 15, 2017
    Assignee: Intel Corporation
    Inventors: Clement T. Cole, James Dinan, Gabriele Jost, Stanley C. Smith, Robert W. Wisniewski, Keith D. Underwood
  • Publication number: 20170187587
    Abstract: Technologies for tracing network performance in a high performance computing (HPC) network include a network computing device configured to receive a network packet from a source endpoint node and store the header and trace data of the received network packet to a trace buffer of the network computing device. The network computing device is further configured to retrieve updated trace data from the trace buffer and update the trace data portion of the network packet to include the retrieved updated trace data from the trace buffer. Additionally, the network computing device is configured to transmit the updated network packet to a target endpoint node, in which the trace data of the updated network packet is usable by the target endpoint node to determine inline performance of the network relative to a flow of the network packet. Other embodiments are described and claimed herein.
    Type: Application
    Filed: December 26, 2015
    Publication date: June 29, 2017
    Inventors: David Keppel, James Dinan, Robert C. Zak
  • Publication number: 20170185561
    Abstract: In an example, there is disclosed a compute node, comprising: first one or more logic elements comprising a data producer engine to produce a datum; and a host fabric interface to communicatively couple the compute node to a fabric, the host fabric interface comprising second one or more logic elements comprising a data pulling engine, the data pulling engine to: publish the datum as available; receive a pull request for the datum, the pull request comprising a node identifier for a data consumer; and send the datum to the data consumer via the fabric. There is also disclosed a method of providing a data pulling engine.
    Type: Application
    Filed: December 24, 2015
    Publication date: June 29, 2017
    Applicant: Intel Corporation
    Inventors: James Dinan, Mario Flajslik, Keith Underwood, David Keppel, Ulf Rainer Hanebutte
  • Publication number: 20170180235
    Abstract: Technologies for tracing network performance include a network computing device configured to receive a network packet from a source endpoint node, process the received network packet, capture trace data corresponding to the network packet as it is processed by the network computing device, and transmit the received network packet to a target endpoint node. The network computing device is further configured to generate a trace data network packet that includes at least a portion of the captured trace data and transmit the trace data network packet to the destination endpoint node. The destination endpoint node is configured to monitor performance of the network by reconstructing a trace of the network packet based on the trace data of the trace data network packet. Other embodiments are described herein.
    Type: Application
    Filed: December 22, 2015
    Publication date: June 22, 2017
    Inventors: Robert C. Zak, David Keppel, James Dinan