Patents by Inventor James Dinan

James Dinan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11677839
    Abstract: Apparatuses, systems, and techniques are directed to automatic coalescing of GPU-initiated network communications. In one method, a communication engine receives, from a shared memory application executing on a first graphics processing unit (GPU), a first communication request to be processed that is assigned to, or has as its destination, a second GPU. The communication engine determines that the first communication request satisfies a coalescing criterion and stores the first communication request in association with a group of requests that have a common property. The communication engine coalesces the group of requests into a coalesced request and transports the coalesced request to the second GPU over a network.
    Type: Grant
    Filed: June 17, 2021
    Date of Patent: June 13, 2023
    Assignee: NVIDIA Corporation
    Inventors: James Dinan, Akhil Langer, Sreeram Potluri
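
The coalescing flow summarized in the abstract of patent 11677839 (and the related publication 20220407920 below) can be pictured with a minimal C sketch: requests are grouped by their destination GPU and flushed as one coalesced request once a group fills up. All structure names, limits, and the flush criterion are illustrative assumptions, not details from the patent.

```c
#include <stdio.h>

#define MAX_DESTS   4   /* illustrative number of peer GPUs   */
#define GROUP_LIMIT 3   /* flush a group once it holds 3 reqs */

struct request { int dest; int bytes; };

/* One pending group per destination (the "common property" here). */
static struct request groups[MAX_DESTS][GROUP_LIMIT];
static int group_len[MAX_DESTS];

/* Stand-in for transporting one coalesced request over the network. */
static void flush_group(int dest) {
    int total = 0;
    for (int i = 0; i < group_len[dest]; i++)
        total += groups[dest][i].bytes;
    printf("coalesced %d requests (%d bytes) -> GPU %d\n",
           group_len[dest], total, dest);
    group_len[dest] = 0;
}

/* The coalescing criterion here is simply "group by destination until
 * the group is full"; real criteria would be richer. */
static void submit(struct request r) {
    groups[r.dest][group_len[r.dest]++] = r;
    if (group_len[r.dest] == GROUP_LIMIT)
        flush_group(r.dest);
}

int main(void) {
    struct request reqs[] = { {1, 64}, {1, 128}, {2, 32}, {1, 256}, {2, 64} };
    for (int i = 0; i < 5; i++)
        submit(reqs[i]);
    for (int d = 0; d < MAX_DESTS; d++)   /* flush whatever is left */
        if (group_len[d] > 0)
            flush_group(d);
    return 0;
}
```
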
  • Patent number: 11645534
    Abstract: An embodiment of a semiconductor package apparatus may include technology to embed one or more trigger operations in one or more messages related to collective operations for a neural network, and issue the one or more messages related to the collective operations to a hardware-based message scheduler in a desired order of execution. Other embodiments are disclosed and claimed.
    Type: Grant
    Filed: September 11, 2018
    Date of Patent: May 9, 2023
    Assignee: Intel Corporation
    Inventors: Sayantan Sur, James Dinan, Maria Garzaran, Anupama Kurpad, Andrew Friedley, Nusrat Islam, Robert Zak
  • Publication number: 20220407920
    Abstract: Apparatuses, systems, and techniques are directed to automatic coalescing of GPU-initiated network communications. In one method, a communication engine receives, from a shared memory application executing on a first graphics processing unit (GPU), a first communication request to be processed that is assigned to, or has as its destination, a second GPU. The communication engine determines that the first communication request satisfies a coalescing criterion and stores the first communication request in association with a group of requests that have a common property. The communication engine coalesces the group of requests into a coalesced request and transports the coalesced request to the second GPU over a network.
    Type: Application
    Filed: June 17, 2021
    Publication date: December 22, 2022
    Inventors: James Dinan, Akhil Langer, Sreeram Potluri
  • Publication number: 20220334948
    Abstract: Methods, apparatus, systems and articles of manufacture to improve performance data collection are disclosed. An example apparatus includes a performance data comparator of a source node to collect the performance data of an application of the source node from the host fabric interface at a polling frequency; an interface to transmit a write back instruction to the host fabric interface, the write back instruction to cause data to be written to a memory address location of memory of the source node to trigger a wake-up mode; and a frequency selector to: set the polling frequency to a first polling frequency for a sleep mode; and increase the polling frequency to a second polling frequency in response to the data in the memory address location identifying the wake-up mode.
    Type: Application
    Filed: July 1, 2022
    Publication date: October 20, 2022
    Inventors: David Ozog, Md. Wasi-ur Rahman, James Dinan
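
A minimal sketch of the adaptive-polling idea in this publication (the same abstract appears under publication 20190188111 below): performance data is polled slowly while in a sleep mode, and the polling frequency increases once a write-back to a watched memory location signals the wake-up mode. The names, intervals, and tick-driven loop are illustrative assumptions, not the claimed design.

```c
#include <stdio.h>

/* Memory location the host fabric interface would write data back to. */
static volatile int wake_flag = 0;

int main(void) {
    int poll_interval = 1000;      /* slow polling period for sleep mode */
    const int fast_interval = 10;  /* faster period once woken up        */
    int polls = 0;

    for (int tick = 1; tick <= 5000; tick++) {
        if (tick == 2500)          /* simulate the HFI's write-back */
            wake_flag = 1;

        if (tick % poll_interval != 0)
            continue;              /* not time to poll yet */

        polls++;                   /* performance data would be collected here */
        if (wake_flag && poll_interval != fast_interval) {
            poll_interval = fast_interval;   /* raise the polling frequency */
            printf("tick %d: wake-up seen, switching to fast polling\n", tick);
        }
    }
    printf("performed %d polls over 5000 ticks\n", polls);
    return 0;
}
```
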
  • Patent number: 11194636
    Abstract: Technologies for generating triggered conditional events operations include a host fabric interface (HFI) of a compute device configured to receive an operation execution command message associated with a triggered operation that has been fired, process the received operation execution command message to extract and store argument information from the received operation execution command, and increment an event counter associated with the fired triggered operation. The HFI is further configured to perform a triggered compare-and-generate event (TCAGE) operation as a function of the extracted argument information, determine whether to generate a triggering event, generate the triggering event as a function of the performed TCAGE operation, insert the generated triggered event into a triggered operation queue, and update the value of the event counter. Other embodiments are described herein.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: December 7, 2021
    Assignee: Intel Corporation
    Inventors: Mario Flajslik, Keith D. Underwood, Timo Schneider, James Dinan
  • Patent number: 11188394
    Abstract: Technologies for synchronizing triggered operations include a host fabric interface (HFI) of a compute device configured to receive an operation execution command associated with a triggered operation that has been fired and determine whether the operation execution command includes an instruction to update a table entry of a table managed by the HFI. Additionally, the HFI is configured to issue, in response to a determination that the operation execution command includes the instruction to update the table entry, a triggered list enable (TLE) operation and a triggered list disable (TLD) operation to a table manager of the HFI, and to disable, in response to the TLD operation having been triggered, the identified table entry. The HFI is further configured to execute one or more command operations associated with the received operation execution command and re-enable, in response to the TLE operation having been triggered, the table entry. Other embodiments are described herein.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: November 30, 2021
    Assignee: Intel Corporation
    Inventors: James Dinan, Mario Flajslik, Timo Schneider, Keith D. Underwood
  • Patent number: 11157336
    Abstract: Technologies for extending triggered operations include a host fabric interface (HFI) of a compute device configured to detect a triggering event associated with a counter, increment the counter, and determine whether a value of the counter matches a trigger threshold of a triggered operation in a triggered operation queue associated with the counter. The HFI is further configured to execute one or more commands associated with the triggered operation upon determining that the value of the counter matches the trigger threshold, and determine, subsequent to the execution of the one or more commands, whether the triggered operation corresponds to a recurring triggered operation. The HFI is additionally configured to increment, in response to a determination that the triggered operation corresponds to a recurring triggered operation, the value of the trigger threshold by a threshold increment and re-insert the triggered operation into the triggered operation queue. Other embodiments are described herein.
    Type: Grant
    Filed: December 30, 2017
    Date of Patent: October 26, 2021
    Assignee: Intel Corporation
    Inventors: James Dinan, Mario Flajslik, Timo Schneider, Keith D. Underwood
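
The recurring-trigger behavior described in the abstract of patent 11157336 can be sketched roughly as follows: each entry in a triggered operation queue fires when a counter reaches its threshold, and recurring entries re-arm themselves by bumping the threshold by an increment. The structures and values below are illustrative, not taken from the patent.

```c
#include <stdio.h>

/* One entry in a (very simplified) triggered operation queue. */
struct triggered_op {
    int  threshold;   /* counter value at which the op fires        */
    int  increment;   /* threshold bump for recurring ops, 0 if not */
    int  recurring;
    const char *name;
};

static void check_queue(struct triggered_op *q, int n, int counter) {
    for (int i = 0; i < n; i++) {
        if (counter == q[i].threshold) {
            printf("counter=%d fires '%s'\n", counter, q[i].name);
            if (q[i].recurring)                /* re-arm: bump the threshold  */
                q[i].threshold += q[i].increment; /* stands in for re-insertion */
        }
    }
}

int main(void) {
    struct triggered_op queue[] = {
        { 3, 3, 1, "recurring put" },   /* fires at 3, 6, 9, ... */
        { 5, 0, 0, "one-shot send" },   /* fires once at 5       */
    };
    int counter = 0;
    for (int event = 0; event < 10; event++) {
        counter++;                      /* a triggering event increments the counter */
        check_queue(queue, 2, counter);
    }
    return 0;
}
```
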
  • Publication number: 20210255910
    Abstract: Systems, apparatuses and methods may provide for detecting an outbound communication and identifying a context of the outbound communication. Additionally, a completion status of the outbound communication may be tracked relative to the context. In one example, tracking the completion status includes incrementing a sent messages counter associated with the context in response to the outbound communication, detecting an acknowledgement of the outbound communication based on a network response to the outbound communication, incrementing a received acknowledgements counter associated with the context in response to the acknowledgement, comparing the sent messages counter to the received acknowledgements counter, and triggering a per-context memory ordering operation if the sent messages counter and the received acknowledgements counter have matching values.
    Type: Application
    Filed: May 21, 2020
    Publication date: August 19, 2021
    Applicant: Intel Corporation
    Inventors: Mario Flajslik, James Dinan
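
A rough sketch of the per-context completion tracking in this publication (the same abstract appears under patent 10671457 below): a sent-messages counter and a received-acknowledgements counter are kept for each context, and a per-context memory ordering step is triggered when the two match. All names are illustrative assumptions.

```c
#include <stdio.h>

/* Per-context completion tracking: compare messages sent against
 * acknowledgements received for that same context. */
struct context {
    int id;
    int sent;    /* sent-messages counter             */
    int acked;   /* received-acknowledgements counter */
};

static void on_send(struct context *c) { c->sent++; }

static void on_ack(struct context *c) {
    c->acked++;
    if (c->acked == c->sent)   /* all outstanding traffic is complete */
        printf("context %d quiesced: issue per-context memory ordering\n",
               c->id);
}

int main(void) {
    struct context ctx = { .id = 7, .sent = 0, .acked = 0 };
    on_send(&ctx);
    on_send(&ctx);
    on_ack(&ctx);   /* 1 of 2 acknowledged: no ordering operation yet */
    on_ack(&ctx);   /* counters match: ordering operation triggered   */
    return 0;
}
```
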
  • Patent number: 11023275
    Abstract: Technologies for managing a queue on a compute device are disclosed. In the illustrative embodiment, the queue is managed by a host fabric interface of the compute device. Queue operations such as enqueuing data onto the queue and dequeuing data from the queue may be requested by remote compute devices by sending queue operations which may be processed by the host fabric interface. The host fabric interface may, in some embodiments, fully manage the queue without any assistance from the processor of the compute device. In other embodiments, the processor of the compute device may be responsible for certain tasks, such as garbage collection.
    Type: Grant
    Filed: February 9, 2017
    Date of Patent: June 1, 2021
    Assignee: Intel Corporation
    Inventors: James Dinan, Mario Flajslik, Timo Schneider
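
One way to picture the HFI-managed queue of patent 11023275 is a small ring buffer that services enqueue and dequeue operations as they arrive from remote nodes. The sketch below is an illustrative stand-in under that assumption, not the patented design.

```c
#include <stdio.h>

#define QCAP 4

/* A queue the host fabric interface could manage on behalf of the host,
 * servicing enqueue/dequeue operations received from remote nodes. */
static int ring[QCAP];
static int head, tail, count;

/* Handle a remote enqueue operation; returns 0 on success. */
static int remote_enqueue(int value) {
    if (count == QCAP) return -1;      /* queue full */
    ring[tail] = value;
    tail = (tail + 1) % QCAP;
    count++;
    return 0;
}

/* Handle a remote dequeue operation; returns 0 on success. */
static int remote_dequeue(int *value) {
    if (count == 0) return -1;         /* queue empty */
    *value = ring[head];
    head = (head + 1) % QCAP;
    count--;
    return 0;
}

int main(void) {
    remote_enqueue(10);                /* operations as they might arrive  */
    remote_enqueue(20);                /* from two different remote nodes  */
    int v;
    while (remote_dequeue(&v) == 0)
        printf("dequeued %d\n", v);
    return 0;
}
```
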
  • Patent number: 10963183
    Abstract: Technologies for fine-grained completion tracking of memory buffer accesses include a compute device. The compute device is to establish multiple counter pairs for a memory buffer. Each counter pair includes a locally managed offset and a completion counter. The compute device is also to receive a request from a remote compute device to access the memory buffer, assign one of the counter pairs to the request, advance the locally managed offset of the assigned counter pair by the amount of data to be read or written, and advance the completion counter of the assigned counter pair as the data is read from or written to the memory buffer. Other embodiments are also described and claimed.
    Type: Grant
    Filed: March 20, 2017
    Date of Patent: March 30, 2021
    Assignee: Intel Corporation
    Inventors: James Dinan, Keith D. Underwood, Sayantan Sur, Charles A. Giefer, Mario Flajslik
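
The counter pairs of patent 10963183 can be sketched as a locally managed offset that reserves buffer space for each access and a completion counter that advances as data lands; completion is detected when the two match. Function and field names below are illustrative assumptions.

```c
#include <stdio.h>

/* One counter pair tracking accesses to a shared memory buffer. */
struct counter_pair {
    long offset;     /* locally managed offset: space reserved so far  */
    long completed;  /* completion counter: bytes actually transferred */
};

/* Reserve room for an incoming access by advancing the offset. */
static long reserve(struct counter_pair *cp, long nbytes) {
    long start = cp->offset;
    cp->offset += nbytes;
    return start;
}

/* Advance the completion counter as data lands in the buffer. */
static void complete(struct counter_pair *cp, long nbytes) {
    cp->completed += nbytes;
    if (cp->completed == cp->offset)
        printf("all reserved data (%ld bytes) is in place\n", cp->completed);
}

int main(void) {
    struct counter_pair cp = { 0, 0 };
    long a = reserve(&cp, 4096);   /* request 1 gets bytes [0, 4096)     */
    long b = reserve(&cp, 1024);   /* request 2 gets bytes [4096, 5120)  */
    printf("request offsets: %ld and %ld\n", a, b);
    complete(&cp, 1024);           /* request 2 finishes first           */
    complete(&cp, 4096);           /* request 1 finishes: counters match */
    return 0;
}
```
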
  • Patent number: 10958589
    Abstract: Technologies for offloaded management of communication are disclosed. In order to manage communication with information that may be available to applications in a compute device, the compute device may offload communication management to a host fabric interface using a credit management system. A credit limit is established, and each message to be sent is added to a queue with a corresponding number of credits required to send the message. The host fabric interface of the compute device may send out messages as credits become available and decrease the number of available credits based on the number of credits required to send a particular message. When an acknowledgement of receipt of a message is received, the number of credits required to send the corresponding message may be added back to an available credit pool.
    Type: Grant
    Filed: March 29, 2017
    Date of Patent: March 23, 2021
    Assignee: Intel Corporation
    Inventors: James Dinan, Sayantan Sur, Mario Flajslik, Keith D. Underwood
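
A hypothetical sketch of the credit scheme summarized in the abstract of patent 10958589: each queued message carries a credit cost, messages are sent while enough credits remain, and an acknowledgement returns the message's credits to the pool. The credit limit and per-message costs are made-up values.

```c
#include <stdio.h>

#define QLEN 4

/* A queued message and the number of credits needed to send it. */
struct msg { const char *name; int credits; };

static int available = 5;           /* illustrative credit limit */
static struct msg queue[QLEN] = {
    { "A", 2 }, { "B", 3 }, { "C", 4 }, { "D", 1 },
};
static int head = 0;                /* next unsent message */

/* Send queued messages while enough credits remain. */
static void try_send(void) {
    while (head < QLEN && queue[head].credits <= available) {
        available -= queue[head].credits;
        printf("sent %s (credits left: %d)\n", queue[head].name, available);
        head++;
    }
}

/* An acknowledgement returns the message's credits to the pool. */
static void on_ack(const struct msg *m) {
    available += m->credits;
    printf("ack for %s (credits left: %d)\n", m->name, available);
    try_send();
}

int main(void) {
    try_send();          /* sends A and B; C must wait (needs 4, have 0) */
    on_ack(&queue[0]);   /* A acked: 2 credits back, still not enough    */
    on_ack(&queue[1]);   /* B acked: 5 credits, C and D can now go out   */
    return 0;
}
```
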
  • Patent number: 10693787
    Abstract: Techniques are disclosed to throttle bandwidth imbalanced data transfers. In some examples, an example computer-implemented method may include splitting a payload of a data transfer operation over a network fabric into multiple chunk get operations, starting the execution of a threshold number of the chunk get operations, and scheduling the remaining chunk get operations for subsequent execution. The method may also include executing a scheduled chunk get operation in response to determining a completion of an executing chunk get operation. In some embodiments, the chunk get operations may be implemented as triggered operations.
    Type: Grant
    Filed: August 25, 2017
    Date of Patent: June 23, 2020
    Assignee: Intel Corporation
    Inventors: Timo Schneider, Keith D. Underwood, Mario Flajslik, Sayantan Sur, James Dinan
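
The throttling approach of patent 10693787 (the same abstract appears under publication 20190068501 below) can be illustrated with a simple sliding window: a threshold number of chunk get operations start immediately, and each completion starts one of the remaining scheduled chunks. The chunk count and threshold are illustrative values.

```c
#include <stdio.h>

#define CHUNKS     8   /* payload split into 8 chunk get operations */
#define THRESHOLD  3   /* at most 3 chunk gets in flight at once    */

int main(void) {
    int next = 0;        /* next chunk get to start    */
    int in_flight = 0;   /* currently executing chunks */
    int done = 0;

    /* Start the first THRESHOLD chunk gets immediately. */
    while (next < CHUNKS && in_flight < THRESHOLD) {
        printf("start chunk %d\n", next);
        next++;
        in_flight++;
    }

    /* Each completion schedules (starts) one of the remaining chunks,
     * keeping the number of outstanding gets bounded. */
    while (done < CHUNKS) {
        printf("chunk completed\n");   /* stand-in for a completion event */
        done++;
        in_flight--;
        if (next < CHUNKS) {
            printf("start chunk %d\n", next);
            next++;
            in_flight++;
        }
    }
    return 0;
}
```
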
  • Patent number: 10671457
    Abstract: Systems, apparatuses and methods may provide for detecting an outbound communication and identifying a context of the outbound communication. Additionally, a completion status of the outbound communication may be tracked relative to the context. In one example, tracking the completion status includes incrementing a sent messages counter associated with the context in response to the outbound communication, detecting an acknowledgement of the outbound communication based on a network response to the outbound communication, incrementing a received acknowledgements counter associated with the context in response to the acknowledgement, comparing the sent messages counter to the received acknowledgements counter, and triggering a per-context memory ordering operation if the sent messages counter and the received acknowledgements counter have matching values.
    Type: Grant
    Filed: March 27, 2015
    Date of Patent: June 2, 2020
    Assignee: Intel Corporation
    Inventors: Mario Flajslik, James Dinan
  • Patent number: 10652353
    Abstract: Technologies for communication with direct data placement include a number of computing nodes in communication over a network. Each computing node includes a many-core processor having an integrated host fabric interface (HFI) that maintains an association table (AT). In response to receiving a message from a remote device, the HFI determines whether the AT includes an entry associating one or more parameters of the message to a destination processor core. If so, the HFI causes a data transfer agent (DTA) of the destination core to receive the message data. The DTA may place the message data in a private cache of the destination core. Message parameters may include a destination process identifier or other network address and a virtual memory address range. The HFI may automatically update the AT based on communication operations generated by software executed by the processor cores. Other embodiments are described and claimed.
    Type: Grant
    Filed: September 24, 2015
    Date of Patent: May 12, 2020
    Assignee: Intel Corporation
    Inventors: James Dinan, Venkata Krishnan, Srinivas Sridharan, David A. Webb
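
The association table of patent 10652353 is, at its core, a lookup from message parameters (such as a destination process identifier and a virtual address range) to a destination core. The sketch below shows such a lookup with invented entries and field names; it is not the patented table format.

```c
#include <stdio.h>

#define AT_SIZE 4

/* One association table entry: message parameters -> destination core. */
struct at_entry {
    int      valid;
    int      process_id;        /* destination process identifier           */
    unsigned addr_lo, addr_hi;  /* virtual address range of interest        */
    int      core;              /* core whose cache should receive the data */
};

static struct at_entry table[AT_SIZE] = {
    { 1, 42, 0x1000, 0x1fff, 3 },
    { 1, 42, 0x8000, 0x8fff, 5 },
};

/* Look up the destination core for an incoming message, or -1 if the
 * table has no matching entry (fall back to normal delivery). */
static int lookup_core(int process_id, unsigned addr) {
    for (int i = 0; i < AT_SIZE; i++)
        if (table[i].valid && table[i].process_id == process_id &&
            addr >= table[i].addr_lo && addr <= table[i].addr_hi)
            return table[i].core;
    return -1;
}

int main(void) {
    printf("message for pid 42, addr 0x1200 -> core %d\n",
           lookup_core(42, 0x1200));
    printf("message for pid 42, addr 0x5000 -> core %d\n",
           lookup_core(42, 0x5000));
    return 0;
}
```
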
  • Patent number: 10574733
    Abstract: Technologies for handling message passing interface receive operations include a compute node to determine a plurality of parameters of a receive entry to be posted and determine whether the plurality of parameters includes a wildcard entry. The compute node generates a hash based on at least one parameter of the plurality of parameters in response to determining that the plurality of parameters does not include the wildcard entry and appends the receive entry to a list in a bin of a posted receive data structure, wherein the bin is determined based on the generated hash. The compute node further tracks the wildcard entry in the posted receive data structure in response to determining the plurality of parameters includes the wildcard entry and appends the receive entry to a wildcard list of the posted receive data structure in response to tracking the wildcard entry.
    Type: Grant
    Filed: September 18, 2015
    Date of Patent: February 25, 2020
    Assignee: Intel Corporation
    Inventors: James Dinan, Mario Flajslik, Keith D. Underwood
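
A rough sketch of the posted-receive handling in patent 10574733: receives with fully specified parameters are hashed into bins of a posted receive data structure, while receives containing a wildcard go onto a separate wildcard list. The bin count, hash function, and wildcard encoding are illustrative assumptions.

```c
#include <stdio.h>

#define BINS      8
#define WILDCARD  -1   /* e.g. an MPI_ANY_SOURCE-style wildcard */
#define MAX_ENTRY 16

/* A posted receive keyed by (source, tag); either may be a wildcard. */
struct recv_entry { int source; int tag; };

static struct recv_entry bins[BINS][MAX_ENTRY];
static int bin_len[BINS];
static struct recv_entry wildcards[MAX_ENTRY];
static int wild_len;

static unsigned hash_params(int source, int tag) {
    return ((unsigned)source * 31u + (unsigned)tag) % BINS;
}

static void post_receive(int source, int tag) {
    struct recv_entry e = { source, tag };
    if (source == WILDCARD || tag == WILDCARD) {
        wildcards[wild_len++] = e;          /* tracked on the wildcard list */
        printf("posted wildcard receive\n");
    } else {
        unsigned b = hash_params(source, tag);
        bins[b][bin_len[b]++] = e;          /* appended to its hash bin */
        printf("posted receive in bin %u\n", b);
    }
}

int main(void) {
    post_receive(3, 42);
    post_receive(5, 7);
    post_receive(WILDCARD, 42);   /* wildcard source */
    return 0;
}
```
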
  • Patent number: 10554568
    Abstract: Technologies for estimating network round-trip times include a sender computing node in network communication with a set of neighboring computing nodes. The sender computing node is configured to determine the set of neighboring computing nodes, as well as a plurality of subsets of the set of neighboring computing nodes. Accordingly, the sender computing node generates a message queue for each of the plurality of subsets, each message queue including a probe message for each neighboring node in the subset to which the message queue corresponds. The sender computing node is further configured to determine a round-trip time for each message queue (i.e., subset of neighboring computing nodes) based on a duration of time between the first probe message of the message queue being transmitted and an acknowledgment being received in response to the last probe message of the message queue being transmitted.
    Type: Grant
    Filed: September 25, 2015
    Date of Patent: February 4, 2020
    Assignee: Intel Corporation
    Inventors: Mario Flajslik, James Dinan
  • Patent number: 10439946
    Abstract: Technologies for endpoint congestion avoidance are disclosed. In order to avoid congestion caused by a network fabric that can transport data to a compute device faster than the compute device can store the data in a particular type of memory, the compute device may in the illustrative embodiment determine a suitable data transfer rate and communicate an indication of the data transfer rate to the remote compute device which is sending the data. The remote compute device may then send the data at the indicated data transfer rate, thus avoiding congestion.
    Type: Grant
    Filed: February 10, 2017
    Date of Patent: October 8, 2019
    Assignee: Intel Corporation
    Inventors: James Dinan, Mario Flajslik, Robert C. Zak
  • Publication number: 20190188111
    Abstract: Methods, apparatus, systems and articles of manufacture to improve performance data collection are disclosed. An example apparatus includes a performance data comparator of a source node to collect the performance data of an application of the source node from the host fabric interface at a polling frequency; an interface to transmit a write back instruction to the host fabric interface, the write back instruction to cause data to be written to a memory address location of memory of the source node to trigger a wake-up mode; and a frequency selector to: set the polling frequency to a first polling frequency for a sleep mode; and increase the polling frequency to a second polling frequency in response to the data in the memory address location identifying the wake-up mode.
    Type: Application
    Filed: February 26, 2019
    Publication date: June 20, 2019
    Inventors: David Ozog, Md. Wasi-ur Rahman, James Dinan
  • Publication number: 20190068501
    Abstract: Techniques are disclosed to throttle bandwidth imbalanced data transfers. In some examples, an example computer-implemented method may include splitting a payload of a data transfer operation over a network fabric into multiple chunk get operations, starting the execution of a threshold number of the chunk get operations, and scheduling the remaining chunk get operations for subsequent execution. The method may also include executing a scheduled chunk get operation in response to determining a completion of an executing chunk get operation. In some embodiments, the chunk get operations may be implemented as triggered operations.
    Type: Application
    Filed: August 25, 2017
    Publication date: February 28, 2019
    Applicant: Intel Corporation
    Inventors: Timo Schneider, Keith D. Underwood, Mario Flajslik, Sayantan Sur, James Dinan
  • Publication number: 20190050274
    Abstract: Technologies for synchronizing triggered operations include a host fabric interface (HFI) of a compute device configured to receive an operation execution command associated with a triggered operation that has been fired and determine whether the operation execution command includes an instruction to update a table entry of a table managed by the HFI. Additionally, the HFI is configured to issue, in response to a determination that the operation execution command includes the instruction to update the table entry, a triggered list enable (TLE) operation and a triggered list disable (TLD) operation to a table manager of the HFI, and to disable, in response to the TLD operation having been triggered, the identified table entry. The HFI is further configured to execute one or more command operations associated with the received operation execution command and re-enable, in response to the TLE operation having been triggered, the table entry. Other embodiments are described herein.
    Type: Application
    Filed: March 30, 2018
    Publication date: February 14, 2019
    Inventors: James Dinan, Mario Flajslik, Timo Schneider, Keith D. Underwood