Patents by Inventor Mario Flajslik
Mario Flajslik has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10250524
Abstract: Technologies for increasing the bandwidth of partitioned hierarchical networks are disclosed. If each partition of the network groups of a computer network is isolated, then the connections between the network groups of different partitions may go unused. However, careful selection of the network connections between partitions of different network groups may allow for a pseudo-direct connection between two network groups of the same partition using a single non-blocking switch in a network group of a different partition. Such a configuration can increase the effective bandwidth available within a partition without affecting the bandwidth available in another partition.
Type: Grant
Filed: September 23, 2016
Date of Patent: April 2, 2019
Assignee: Intel Corporation
Inventors: Mario Flajslik, Gene Wu, Michael A. Parker
-
Publication number: 20190097935
Abstract: Technologies for improving throughput in a network include a node switch. The node switch is to obtain expected performance data indicative of an expected data transfer performance of the node switch. The node switch is also to obtain measured performance data indicative of a measured data transfer performance of the node switch, compare the measured performance data to the expected performance data to determine whether the measured data transfer performance satisfies the expected data transfer performance, determine, as a function of whether the measured data transfer performance satisfies the expected data transfer performance, whether to force a unit of data through a non-minimal path to a destination, and send, in response to a determination to force the unit of data to be sent through a non-minimal path, the unit of data to an output port of the node switch associated with the non-minimal path. Other embodiments are also described.
Type: Application
Filed: September 27, 2017
Publication date: March 28, 2019
Inventors: Mario Flajslik, Eric R. Borch, Timo Schneider, Michael A. Parker
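The routing decision this publication describes can be sketched as a simple comparison: when measured throughput falls short of the expected throughput, the switch routes a unit of data through a non-minimal (longer but less congested) path. The function below is a minimal illustration; the tolerance factor and port names are assumptions for the example, not details from the publication.

```python
def choose_output_port(measured_bw, expected_bw, minimal_port, non_minimal_port,
                       tolerance=0.9):
    """Pick an output port for a unit of data (illustrative sketch).

    If measured throughput falls below the expected throughput (scaled by a
    tolerance factor), force the data through the non-minimal path;
    otherwise keep the minimal path.
    """
    if measured_bw < tolerance * expected_bw:
        return non_minimal_port  # congestion suspected: route around it
    return minimal_port
```

In practice such a check would run per output port in switch hardware; the sketch only shows the shape of the decision.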
-
Publication number: 20190068501
Abstract: Techniques are disclosed to throttle bandwidth-imbalanced data transfers. In some examples, an example computer-implemented method may include splitting a payload of a data transfer operation over a network fabric into multiple chunk get operations, starting the execution of a threshold number of the chunk get operations, and scheduling the remaining chunk get operations for subsequent execution. The method may also include executing a scheduled chunk get operation in response to determining a completion of an executing chunk get operation. In some embodiments, the chunk get operations may be implemented as triggered operations.
Type: Application
Filed: August 25, 2017
Publication date: February 28, 2019
Applicant: Intel Corporation
Inventors: Timo Schneider, Keith D. Underwood, Mario Flajslik, Sayantan Sur, James Dinan
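The windowed chunk-get scheme described above can be sketched as follows: the payload is split into chunk descriptors, at most a threshold number are in flight at once, and completing one chunk starts the next scheduled one. Class and method names are illustrative assumptions, not the patent's terminology.

```python
from collections import deque

def split_into_chunks(payload_size, chunk_size):
    """Split a payload into (offset, length) chunk-get descriptors."""
    return [(off, min(chunk_size, payload_size - off))
            for off in range(0, payload_size, chunk_size)]

class ThrottledGetter:
    """Keep at most `threshold` chunk gets in flight (illustrative sketch)."""
    def __init__(self, chunks, threshold):
        self.pending = deque(chunks)
        self.in_flight = set()
        # Start the initial window of chunk gets.
        while self.pending and len(self.in_flight) < threshold:
            self.in_flight.add(self.pending.popleft())

    def on_chunk_complete(self, chunk):
        """A chunk get finished; start the next scheduled one, if any."""
        self.in_flight.discard(chunk)
        if self.pending:
            self.in_flight.add(self.pending.popleft())
```

The abstract notes the chunk gets may be implemented as triggered operations, in which case `on_chunk_complete` would correspond to a counter increment firing the next get rather than host-side code.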
-
Publication number: 20190050274
Abstract: Technologies for synchronizing triggered operations include a host fabric interface (HFI) of a compute device configured to receive an operation execution command associated with a triggered operation that has been fired and determine whether the operation execution command includes an instruction to update a table entry of a table managed by the HFI. Additionally, the HFI is configured to issue, in response to a determination that the operation execution command includes the instruction to update the table entry, a triggered list enable (TLE) operation and a triggered list disable (TLD) operation to a table manager of the HFI, and to disable, in response to the TLD operation having been triggered, the identified table entry. The HFI is further configured to execute one or more command operations associated with the received operation execution command and re-enable, in response to the TLE operation having been triggered, the table entry. Other embodiments are described herein.
Type: Application
Filed: March 30, 2018
Publication date: February 14, 2019
Inventors: James Dinan, Mario Flajslik, Timo Schneider, Keith D. Underwood
-
Publication number: 20190042335
Abstract: Technologies for generating triggered conditional events operations include a host fabric interface (HFI) of a compute device configured to receive an operation execution command message associated with a triggered operation that has been fired, process the received operation execution command message to extract and store argument information from the received operation execution command, and increment an event counter associated with the fired triggered operation. The HFI is further configured to perform a triggered compare-and-generate event (TCAGE) operation as a function of the extracted argument information, determine whether to generate a triggering event, generate the triggering event as a function of the performed TCAGE operation, insert the generated triggered event into a triggered operation queue, and update the value of the event counter. Other embodiments are described herein.
Type: Application
Filed: March 30, 2018
Publication date: February 7, 2019
Inventors: Mario Flajslik, Keith D. Underwood, Timo Schneider, James Dinan
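The core of the TCAGE operation described above is a conditional: compare a counter against an extracted argument and, if the comparison holds, generate a triggering event into a queue. A minimal sketch, with the comparison operators and function signature chosen for illustration rather than taken from the publication:

```python
def tcage(counter, compare_value, op, event_queue, event):
    """Triggered compare-and-generate-event sketch (names are illustrative).

    Compares the event counter against an argument extracted from the
    operation execution command and, if the comparison holds, inserts a
    triggering event into the triggered operation queue.
    """
    comparisons = {
        "eq": counter == compare_value,
        "ge": counter >= compare_value,
        "lt": counter < compare_value,
    }
    if comparisons[op]:
        event_queue.append(event)
        return True
    return False
```

Generating events conditionally like this lets chains of triggered operations branch without involving the host processor.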
-
Publication number: 20190042337
Abstract: Technologies for extending triggered operations include a host fabric interface (HFI) of a compute device configured to detect a triggering event associated with a counter, increment the counter, and determine whether a value of the counter matches a trigger threshold of a triggered operation in a triggered operation queue associated with the counter. The HFI is further configured to execute one or more commands associated with the triggered operation upon determining that the value of the counter matches the trigger threshold, and determine, subsequent to the execution of the one or more commands, whether the triggered operation corresponds to a recurring triggered operation. The HFI is additionally configured to increment, in response to a determination that the triggered operation corresponds to a recurring triggered operation, the value of the trigger threshold by a threshold increment and re-insert the triggered operation into the triggered operation queue. Other embodiments are described herein.
Type: Application
Filed: December 30, 2017
Publication date: February 7, 2019
Inventors: James Dinan, Mario Flajslik, Timo Schneider, Keith D. Underwood
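The recurring-trigger mechanism above can be sketched in a few lines: an operation fires when the counter reaches its threshold, and a recurring operation advances its threshold by an increment and goes back into the queue. The class layout and field names are assumptions made for the sketch.

```python
class TriggeredOp:
    """One queued operation, fired when a counter hits its threshold."""
    def __init__(self, threshold, command, recurring=False, increment=0):
        self.threshold = threshold
        self.command = command
        self.recurring = recurring
        self.increment = increment

class TriggeredQueue:
    """Sketch of a counter-driven triggered-operation queue."""
    def __init__(self):
        self.counter = 0
        self.ops = []

    def add(self, op):
        self.ops.append(op)

    def on_event(self):
        """A triggering event arrived: bump the counter, fire matching ops."""
        self.counter += 1
        fired = []
        for op in list(self.ops):
            if self.counter == op.threshold:
                fired.append(op.command())
                self.ops.remove(op)
                if op.recurring:
                    # Recurring op: advance its threshold and re-insert it.
                    op.threshold += op.increment
                    self.ops.append(op)
        return fired
```

A recurring operation with increment 2 thus fires on counts 2, 4, 6, ... without the host re-arming it each time.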
-
Patent number: 10200472
Abstract: Generally, this disclosure provides systems, devices, methods and computer readable media for improved coordination between sender and receiver nodes in a one-sided memory access to a PGAS in a distributed computing environment. The system may include a transceiver module configured to receive a message over a network, the message comprising a data portion and a data size indicator, and an offset handler module configured to calculate a destination address from a base address of a memory buffer and an offset counter. The transceiver module may further be configured to write the data portion to the memory buffer at the destination address; and the offset handler module may further be configured to update the offset counter based on the data size indicator.
Type: Grant
Filed: December 24, 2014
Date of Patent: February 5, 2019
Assignee: Intel Corporation
Inventors: Mario Flajslik, James Dinan
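The offset-handler behavior described above reduces to two steps per message: compute the destination as base address plus the running offset, then advance the offset by the message's data size. A minimal sketch, with names chosen for illustration:

```python
class OffsetHandler:
    """Sketch of receiver-side offset management for incoming messages.

    Each message carries a data portion and a data size indicator; the
    handler computes the destination from the buffer's base address plus a
    running offset counter, then advances the counter by the data size so
    consecutive messages land back to back in the buffer.
    """
    def __init__(self, base_address):
        self.base = base_address
        self.offset = 0

    def destination(self, data_size):
        dest = self.base + self.offset
        self.offset += data_size
        return dest
```

Because the receiver tracks the offset itself, senders need not coordinate on where in the buffer each message lands.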
-
Patent number: 10200310
Abstract: In an example, there is disclosed a compute node, comprising: first one or more logic elements comprising a data producer engine to produce a datum; and a host fabric interface to communicatively couple the compute node to a fabric, the host fabric interface comprising second one or more logic elements comprising a data pulling engine, the data pulling engine to: publish the datum as available; receive a pull request for the datum, the pull request comprising a node identifier for a data consumer; and send the datum to the data consumer via the fabric. There is also disclosed a method of providing a data pulling engine.
Type: Grant
Filed: December 24, 2015
Date of Patent: February 5, 2019
Assignee: Intel Corporation
Inventors: James Dinan, Mario Flajslik, Keith Underwood, David Keppel, Ulf Rainer Hanebutte
-
Patent number: 10178041
Abstract: Technologies for aggregation-based message processing include multiple computing nodes in communication over a network. A computing node receives a message from a remote computing node, increments an event counter in response to receiving the message, determines whether an event trigger is satisfied in response to incrementing the counter, and writes a completion event to an event queue if the event trigger is satisfied. An application of the computing node monitors the event queue for the completion event. The application may be executed by a processor core of the computing node, and the other operations may be performed by a host fabric interface of the computing node. The computing node may be a target node and count one-sided messages received from an initiator node, or the computing node may be an initiator node and count acknowledgement messages received from a target node. Other embodiments are described and claimed.
Type: Grant
Filed: September 23, 2015
Date of Patent: January 8, 2019
Assignee: Intel Corporation
Inventors: James Dinan, Mario Flajslik, David Keppel, Ulf R. Hanebutte
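The aggregation idea above is that the application sees one completion event in its event queue in place of many individual message events. A minimal sketch of the counting side, with illustrative names:

```python
class MessageAggregator:
    """Sketch: count incoming messages and post a single completion event
    when the count reaches a trigger threshold (names are illustrative)."""
    def __init__(self, trigger_count, event_queue):
        self.count = 0
        self.trigger = trigger_count
        self.event_queue = event_queue

    def on_message(self):
        """Called per received message (or per acknowledgement, on an
        initiator node); fires once the event trigger is satisfied."""
        self.count += 1
        if self.count == self.trigger:
            # One completion event stands in for `trigger` messages.
            self.event_queue.append("completion")
```

The application then polls only the event queue, so per-message processing stays on the host fabric interface rather than the processor core.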
-
Publication number: 20190007224
Abstract: Technologies for densely packaging network components for large scale indirect topologies include a group of switches. The group of switches includes a stack of node switches that includes a first set of ports and a stack of global switches that includes a second set of ports. The stack of node switches is oriented orthogonally to the stack of global switches. Additionally, the first set of ports is oriented towards the second set of ports and the node switches are connected to the global switches through the first and second sets of ports. Other embodiments are also described and claimed.
Type: Application
Filed: June 29, 2017
Publication date: January 3, 2019
Inventors: Mario Flajslik, Eric R. Borch, Michael A. Parker, Richard J. Dischler
-
Publication number: 20180351812
Abstract: Technologies for dynamic bandwidth management of an interconnect fabric include a compute device configured to calculate a predicted fabric bandwidth demand which is expected to be used by the interconnect fabric in a next epoch subsequent to a present epoch. The compute device is additionally configured to determine whether any global links and/or local links of the interconnect fabric can be disabled during the next epoch as a function of the calculated predicted fabric bandwidth demand and a number of redundant paths associated with the links of the interconnect fabric. The compute device is further configured to disable the one or more global links and/or local links that can be disabled. Other embodiments are described herein.
Type: Application
Filed: March 30, 2018
Publication date: December 6, 2018
Inventors: Eric R. Borch, Robert C. Zak, Mario Flajslik, Jonathan M. Eastep, Michael A. Parker
-
Publication number: 20180287858
Abstract: Technologies for efficiently managing link faults between switches include a fabric monitor. The fabric monitor is to generate routing rules indicative of an ordering of a plurality of global switches connected to a plurality of node switches in a group, monitor a status of links between the global switches and the node switches to determine whether one or more downlinks have failed in the group, adjust, in response to a determination that one or more downlinks have failed in the group, the ordering of the global switches in the routing rules, and send the adjusted routing rules to the group. Other embodiments are also described and claimed.
Type: Application
Filed: March 31, 2017
Publication date: October 4, 2018
Inventors: Mario Flajslik, Eric R. Borch, Michael A. Parker
-
Publication number: 20180287954
Abstract: Technologies for offloaded management of communication are disclosed. In order to manage communication with information that may be available to applications in a compute device, the compute device may offload communication management to a host fabric interface using a credit management system. A credit limit is established, and each message to be sent is added to a queue with a corresponding number of credits required to send the message. The host fabric interface of the compute device may send out messages as credits become available and decrease the number of available credits based on the number of credits required to send a particular message. When an acknowledgement of receipt of a message is received, the number of credits required to send the corresponding message may be added back to an available credit pool.
Type: Application
Filed: March 29, 2017
Publication date: October 4, 2018
Inventors: James Dinan, Sayantan Sur, Mario Flajslik, Keith D. Underwood
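The credit cycle described above (queue a message with its credit cost, send while credits suffice, return credits on acknowledgement) can be sketched as follows. Class and method names are assumptions for the example, not the publication's interface.

```python
from collections import deque

class CreditSender:
    """Sketch of credit-based send offload (names are illustrative)."""
    def __init__(self, credit_limit):
        self.credits = credit_limit
        self.queue = deque()   # (message, cost) pairs awaiting send
        self.sent = []

    def enqueue(self, message, cost):
        """Queue a message with the credits required to send it."""
        self.queue.append((message, cost))
        self._drain()

    def on_ack(self, cost):
        """Receipt acknowledged: return the message's credits to the pool."""
        self.credits += cost
        self._drain()

    def _drain(self):
        # Send queued messages in order while the credit pool suffices.
        while self.queue and self.queue[0][1] <= self.credits:
            message, cost = self.queue.popleft()
            self.credits -= cost
            self.sent.append(message)
```

Because draining happens on enqueue and on acknowledgement, the host fabric interface can pace sends without involving the application after the initial enqueue.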
-
Publication number: 20180267742
Abstract: Technologies for fine-grained completion tracking of memory buffer accesses include a compute device. The compute device is to establish multiple counter pairs for a memory buffer. Each counter pair includes a locally managed offset and a completion counter. The compute device is also to receive a request from a remote compute device to access the memory buffer, assign one of the counter pairs to the request, advance the locally managed offset of the assigned counter pair by the amount of data to be read or written, and advance the completion counter of the assigned counter pair as the data is read from or written to the memory buffer. Other embodiments are also described and claimed.
Type: Application
Filed: March 20, 2017
Publication date: September 20, 2018
Inventors: James Dinan, Keith D. Underwood, Sayantan Sur, Charles A. Giefer, Mario Flajslik
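A single counter pair from the scheme above can be sketched in a few lines: the locally managed offset advances eagerly when a request is assigned, and the completion counter catches up as data actually moves. The class and method names are illustrative assumptions.

```python
class CounterPair:
    """Sketch of one offset/completion counter pair for a memory buffer."""
    def __init__(self):
        self.reserved = 0    # locally managed offset: bytes assigned so far
        self.completed = 0   # completion counter: bytes actually transferred

    def reserve(self, nbytes):
        """Assign a region to a request by advancing the managed offset."""
        start = self.reserved
        self.reserved += nbytes
        return start

    def complete(self, nbytes):
        """Advance the completion counter as data is read or written."""
        self.completed += nbytes

    def fully_complete(self):
        """All assigned transfers on this pair have finished."""
        return self.completed == self.reserved
```

With multiple such pairs per buffer, each in-flight request can be tracked to completion independently rather than through one coarse counter.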
-
Patent number: 10073809
Abstract: Technologies for one-sided remote memory access communication include multiple computing nodes in communication over a network. A receiver computing node receives a message from a sender node and extracts a segment identifier from the message. The receiver computing node determines, based on the segment identifier, a segment start address associated with a partitioned global address space (PGAS) segment of its local memory. The receiver computing node may index a segment table stored in the local memory or in a host fabric interface. The receiver computing node determines a local destination address within the PGAS segment based on the segment start address and an offset included in the message. The receiver computing node performs a remote memory access operation at the local destination address. The receiver computing node may perform those operations in hardware by the host fabric interface of the receiver computing node. Other embodiments are described and claimed.
Type: Grant
Filed: April 27, 2015
Date of Patent: September 11, 2018
Assignee: Intel Corporation
Inventors: James Dinan, Mario Flajslik
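The address-resolution step described above is a table lookup plus an addition: the segment identifier indexes the segment table to find the segment's start address, and the message's offset is added to produce the local destination. A minimal sketch with an illustrative function name:

```python
def resolve_address(segment_table, segment_id, offset):
    """Sketch of receiver-side PGAS address resolution.

    The incoming message carries a segment identifier and an offset; the
    receiver indexes its segment table to find the segment's start address
    and adds the offset to get the local destination address.
    """
    start = segment_table[segment_id]
    return start + offset
```

Keeping the table in the host fabric interface lets this resolution happen in hardware, without interrupting the receiver's processor.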
-
Publication number: 20180234347
Abstract: Technologies for endpoint congestion avoidance are disclosed. In order to avoid congestion caused by a network fabric that can transport data to a compute device faster than the compute device can store the data in a particular type of memory, the compute device may in the illustrative embodiment determine a suitable data transfer rate and communicate an indication of the data transfer rate to the remote compute device which is sending the data. The remote compute device may then send the data at the indicated data transfer rate, thus avoiding congestion.
Type: Application
Filed: February 10, 2017
Publication date: August 16, 2018
Inventors: James Dinan, Mario Flajslik, Robert C. Zak
-
Publication number: 20180225144
Abstract: Technologies for managing a queue on a compute device are disclosed. In the illustrative embodiment, the queue is managed by a host fabric interface of the compute device. Queue operations such as enqueuing data onto the queue and dequeuing data from the queue may be requested by remote compute devices by sending queue operations which may be processed by the host fabric interface. The host fabric interface may, in some embodiments, fully manage the queue without any assistance from the processor of the compute device. In other embodiments, the processor of the compute device may be responsible for certain tasks, such as garbage collection.
Type: Application
Filed: February 9, 2017
Publication date: August 9, 2018
Inventors: James Dinan, Mario Flajslik, Timo Schneider
-
Publication number: 20180091437
Abstract: Technologies for increasing the bandwidth of partitioned hierarchical networks are disclosed. If each partition of the network groups of a computer network is isolated, then the connections between the network groups of different partitions may go unused. However, careful selection of the network connections between partitions of different network groups may allow for a pseudo-direct connection between two network groups of the same partition using a single non-blocking switch in a network group of a different partition. Such a configuration can increase the effective bandwidth available within a partition without affecting the bandwidth available in another partition.
Type: Application
Filed: September 23, 2016
Publication date: March 29, 2018
Inventors: Mario Flajslik, Gene Wu, Michael A. Parker
-
Publication number: 20180089127
Abstract: Technologies for a system of communicatively coupled network switches in a hierarchical interconnect network topology include two or more groups that each include two or more first and second level switches, in which each of the first level switches is communicatively coupled to each of the plurality of second level switches to form a complete bipartite graph. Additionally, each of the groups is interconnected to each of the other groups via a corresponding global link connecting a second level switch of one group to a corresponding second level switch of another group. Further, each of the first level switches is communicatively coupled to one or more computing nodes. Other embodiments are described herein.
Type: Application
Filed: September 29, 2016
Publication date: March 29, 2018
Inventors: Mario Flajslik, Eric R. Borch, Michael A. Parker, Wayne A. Downer
-
Patent number: 9916178
Abstract: Technologies for integrated thread scheduling include a computing device having a network interface controller (NIC). The NIC is configured to detect and suspend a thread that is being blocked by one or more communication operations. A thread scheduling engine of the NIC is configured to move the suspended thread from a running queue of the system thread scheduler to a pending queue of the thread scheduling engine. The thread scheduling engine is further configured to move the suspended thread from the pending queue to a ready queue of the thread scheduling engine upon determining any dependencies and/or blocking communication operations have completed. Other embodiments are described and claimed.
Type: Grant
Filed: September 25, 2015
Date of Patent: March 13, 2018
Assignee: Intel Corporation
Inventors: James Dinan, Mario Flajslik, Tom St. John
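The queue movement described above (suspended thread goes to a pending queue, then to a ready queue once its blocking operations complete) can be sketched as follows. The class layout and method names are assumptions made for the illustration.

```python
from collections import deque

class ThreadSchedulingEngine:
    """Sketch of NIC-side queues for threads blocked on communication."""
    def __init__(self):
        self.pending = {}       # thread -> set of incomplete blocking ops
        self.ready = deque()    # threads whose blocking ops have completed

    def suspend(self, thread, blocking_ops):
        """Move a blocked thread off the running queue into pending."""
        self.pending[thread] = set(blocking_ops)

    def on_op_complete(self, thread, op):
        """A blocking operation finished; when all of a thread's blocking
        ops are done, move the thread from pending to ready."""
        ops = self.pending.get(thread)
        if ops is not None:
            ops.discard(op)
            if not ops:
                del self.pending[thread]
                self.ready.append(thread)
```

The ready queue is what the system scheduler would then drain to resume threads, so blocked threads never spin on the processor while communication is outstanding.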