Patents by Inventor Gengbin ZHENG

Gengbin ZHENG has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20220217071
    Abstract: Methods and apparatus for an efficient topology-aware tree search algorithm for a broadcast operation. A broadcast tree is built for a broadcast operation in a network having a hierarchical structure including nodes logically partitioned at group and switch levels. Lists of visited nodes (vnodes) and unvisited nodes (unodes) are initialized. Beginning at a root node, search iterations are performed in a progressive manner to build the tree, wherein a given search iteration finds a unode that can be reached earliest from a vnode, moves the unode that is found from the unode list to the vnode list and adds new unodes to the unode list based on the location of the unode. Beginning with the switch the root node is connected to, the algorithm progressively adds nodes from other switches in the root group and then from other groups and switches within those other groups and continues until all nodes have been visited.
    Type: Application
    Filed: March 23, 2022
    Publication date: July 7, 2022
    Inventors: Gengbin ZHENG, Maria GARZARAN
  • Publication number: 20210271536
    Abstract: Algorithms for optimizing small message collectives with hardware supported triggered operations and associated methods, apparatus, and systems. The algorithms are implemented in a distributed compute environment comprising a plurality of ranks including a root, a plurality of intermediate nodes, and a plurality of leaf nodes, where each of the plurality of ranks comprises a compute platform having a communication interface including embedded logic for implementing the algorithms. Collectives are employed to transfer data between parent ranks and child ranks. In connection with the collectives, control messages are sent from children of a collective to the parent of the collective informing the parent that the children of the collective have free buffers ready to receive data. The parent employs a counter to determine that a control message has been received from each of its children indicating each child has a free buffer prior to sending data to the children in the collective.
    Type: Application
    Filed: December 23, 2020
    Publication date: September 2, 2021
    Inventors: Maria Garzaran, Nusrat Islam, Gengbin Zheng, Sayantan Sur
  • Patent number: 10846245
    Abstract: Examples include a computing system having an input/output (I/O) device including a plurality of counters, each counter operating as one of a completion counter and a trigger counter; a processing device; and a memory device. The memory device stores instructions that, in response to execution by the processing device, cause the processing device to represent a plurality of triggered operations of collective communication for high-performance computing executable by the I/O device as a directed acyclic graph stored in the memory device, with triggered operations represented as vertices of the directed acyclic graph and dependencies between triggered operations represented as edges of the directed acyclic graph; traverse the directed acyclic graph using a first process to identify and mark vertices that can share a completion counter; and traverse the directed acyclic graph using a second process to assign a completion counter and a trigger counter for each vertex.
    Type: Grant
    Filed: March 14, 2019
    Date of Patent: November 24, 2020
    Assignee: Intel Corporation
    Inventors: Nusrat Islam, Gengbin Zheng, Sayantan Sur, Maria Garzaran, Akhil Langer
  • Publication number: 20190213146
    Abstract: Examples include a computing system having an input/output (I/O) device including a plurality of counters, each counter operating as one of a completion counter and a trigger counter; a processing device; and a memory device. The memory device stores instructions that, in response to execution by the processing device, cause the processing device to represent a plurality of triggered operations of collective communication for high-performance computing executable by the I/O device as a directed acyclic graph stored in the memory device, with triggered operations represented as vertices of the directed acyclic graph and dependencies between triggered operations represented as edges of the directed acyclic graph; traverse the directed acyclic graph using a first process to identify and mark vertices that can share a completion counter; and traverse the directed acyclic graph using a second process to assign a completion counter and a trigger counter for each vertex.
    Type: Application
    Filed: March 14, 2019
    Publication date: July 11, 2019
    Inventors: Nusrat ISLAM, Gengbin ZHENG, Sayantan SUR, Maria GARZARAN, Akhil LANGER
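
Illustrative sketch for publication 20220217071: the abstract describes a greedy search that repeatedly picks the unvisited node (unode) reachable earliest from a visited node (vnode), then widens the unode list according to where that node sits. The Python below is only one possible reading of that description, not the claimed method. The latency constants, the `SEND_GAP` sender-busy time, the `(group, switch, index)` node encoding, and the rule that the first node of a switch or group acts as its representative are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical cost model in arbitrary time units (not from the abstract).
SAME_SWITCH, SAME_GROUP, CROSS_GROUP = 1, 2, 5
SEND_GAP = 1  # assumed time a sender stays busy per message


def latency(a, b):
    """Latency between two nodes encoded as (group, switch, index)."""
    if a[0] != b[0]:
        return CROSS_GROUP
    return SAME_SWITCH if a[1] == b[1] else SAME_GROUP


def build_broadcast_tree(nodes, root):
    """Return a dict mapping each non-root node to its parent in the tree."""
    by_switch = defaultdict(list)  # (group, switch) -> nodes on that switch
    by_group = defaultdict(list)   # group -> switch keys in that group
    for n in nodes:
        key = (n[0], n[1])
        if key not in by_switch:
            by_group[n[0]].append(key)
        by_switch[key].append(n)

    ready = {root: 0}  # vnode list: node -> time it can next send
    unodes = set()     # unode list: candidates for the next iteration
    parent = {}

    def expand(node):
        """Add new unodes based on the location of a newly visited node."""
        group = node[0]
        candidates = list(by_switch[(group, node[1])])  # peers on its switch
        # One representative per switch in its group (assumed rule).
        candidates += [by_switch[key][0] for key in by_group[group]]
        # One representative per group (assumed rule).
        candidates += [by_switch[keys[0]][0] for keys in by_group.values()]
        unodes.update(c for c in candidates if c not in ready)

    expand(root)
    while unodes:
        # Find the unode that can be reached earliest from any vnode.
        arrival, v, u = min(
            (ready[v] + latency(v, u), v, u) for v in ready for u in unodes
        )
        unodes.remove(u)
        parent[u] = v
        ready[v] += SEND_GAP  # sender is busy before it can send again
        ready[u] = arrival    # receiver can forward once the data arrives
        expand(u)
    return parent


# Example topology: 2 groups x 2 switches x 2 nodes per switch.
nodes = [(g, s, i) for g in range(2) for s in range(2) for i in range(2)]
tree = build_broadcast_tree(nodes, root=(0, 0, 0))
for child in sorted(tree):
    print(child, "<-", tree[child])
```

Under this cost model the search fills in the root's switch first, then other switches in the root's group, then remote groups, which matches the ordering the abstract describes. A real implementation would take its costs from the fabric rather than from fixed constants.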
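
Illustrative sketch for publication 20210271536: the abstract describes a parent that counts "buffer free" control messages and sends data only after every child has reported. The Python below models that counter gate in software under stated assumptions. The class name `ParentRank`, its methods, and the in-memory `sent` log are hypothetical, and the patent targets hardware-supported triggered operations rather than host code like this.

```python
class ParentRank:
    """Software model of a parent gated by a ready-message counter."""

    def __init__(self, children):
        self.children = list(children)
        self.ready_count = 0  # control messages received so far
        self.sent = []        # (child, payload) pairs, stands in for the wire

    def on_ready_message(self, child):
        """A child reports that it has a free buffer."""
        self.ready_count += 1

    def try_send(self, payload):
        """Send to all children only if each has reported a free buffer."""
        if self.ready_count < len(self.children):
            return False
        # Consume one credit per child so the next send waits again.
        self.ready_count -= len(self.children)
        self.sent.extend((c, payload) for c in self.children)
        return True


parent = ParentRank(["child0", "child1"])
parent.on_ready_message("child0")
print(parent.try_send("chunk"))  # only one child has reported so far
parent.on_ready_message("child1")
print(parent.try_send("chunk"))  # both children have reported
```

With triggered operations, the comparison against the child count would be expressed as a counter threshold on the network interface, so the send fires without host involvement.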