Patents by Inventor Sameh S. Sharkawi

Sameh S. Sharkawi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10776185
    Abstract: Techniques are disclosed for efficient handling of messages in computing systems that include tag matching capable hardware. A message management module provides for handling message events including application receives and channel notifications such that hardware tag matching can continuously run in hardware channels, such as network adapters. When the message event is an application receive the message management module adds the application receive to a tracking queue and determines if the application receive can be posted to a hardware channel capable of tag matching. When the message event is a channel notification, the message management module determines a message action using the message tracking queue and the information in the channel notification.
    Type: Grant
    Filed: December 10, 2018
    Date of Patent: September 15, 2020
    Assignee: International Business Machines Corporation
    Inventors: Sameh S. Sharkawi, Sameer Kumar, Bryan S. Rosenburg
  • Publication number: 20200183764
    Abstract: Techniques are disclosed for efficient handling of messages in computing systems that include tag matching capable hardware. A message management module provides for handling message events including application receives and channel notifications such that hardware tag matching can continuously run in hardware channels, such as network adapters. When the message event is an application receive the message management module adds the application receive to a tracking queue and determines if the application receive can be posted to a hardware channel capable of tag matching. When the message event is a channel notification, the message management module determines a message action using the message tracking queue and the information in the channel notification.
    Type: Application
    Filed: December 10, 2018
    Publication date: June 11, 2020
    Inventors: Sameh S. SHARKAWI, Sameer KUMAR, Bryan S. ROSENBURG
  • Patent number: 10296395
    Abstract: Performing a rooted-v collective operation by an operational group of compute nodes in a parallel computer includes: upon encountering a rooted-v collection operation during execution, identifying, by a root node of an operational group of compute nodes, a count to use for the selection of a collective algorithm for effecting the rooted-v collective operation; broadcasting, by the root node to the other computer nodes in the operational group, an active message, wherein the active message includes the identified count to use for the selection of the collective algorithm; and selecting, by all the compute nodes of the operational group based on the identified count, a same collective algorithm to effect the rooted-v collective operation; and executing the rooted-v collective operation by all compute nodes of the operational group using the selected algorithm.
    Type: Grant
    Filed: May 9, 2016
    Date of Patent: May 21, 2019
    Assignee: International Business Machines Corporation
    Inventors: Nysal Jan K. A., Sameh S. Sharkawi
  • Patent number: 9830186
    Abstract: Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes, including: packing, by each task in an operational group of tasks, vectored contribution data from vectored storage in an all-to-allv contribution data buffer into an all-to-all contribution data buffer, wherein two or more entries in the all-to-allv contribution data buffer are different in size and each entry in the all-to-all contribution data buffer is identical in size; executing with the contribution data as stored in the all-to-all contribution data buffer an all-to-all collective operation by the operational group of tasks; and unpacking, by each task in the operational group of tasks, received contribution data from the all-to-all contribution data buffer into the vectored storage in an all-to-allv contribution data buffer.
    Type: Grant
    Filed: May 27, 2014
    Date of Patent: November 28, 2017
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
  • Publication number: 20170322835
    Abstract: Performing a rooted-v collective operation by an operational group of compute nodes in a parallel computer includes: upon encountering a rooted-v collection operation during execution, identifying, by a root node of an operational group of compute nodes, a count to use for the selection of a collective algorithm for effecting the rooted-v collective operation; broadcasting, by the root node to the other computer nodes in the operational group, an active message, wherein the active message includes the identified count to use for the selection of the collective algorithm; and selecting, by all the compute nodes of the operational group based on the identified count, a same collective algorithm to effect the rooted-v collective operation; and executing the rooted-v collective operation by all compute nodes of the operational group using the selected algorithm.
    Type: Application
    Filed: May 9, 2016
    Publication date: November 9, 2017
    Inventors: NYSAL JAN K.A., SAMEH S. SHARKAWI
  • Patent number: 9772876
    Abstract: Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes, including: packing, by each task in an operational group of tasks, vectored contribution data from vectored storage in an all-to-allv contribution data buffer into an all-to-all contribution data buffer, wherein two or more entries in the all-to-allv contribution data buffer are different in size and each entry in the all-to-all contribution data buffer is identical in size; executing with the contribution data as stored in the all-to-all contribution data buffer an all-to-all collective operation by the operational group of tasks; and unpacking, by each task in the operational group of tasks, received contribution data from the all-to-all contribution data buffer into the vectored storage in an all-to-allv contribution data buffer.
    Type: Grant
    Filed: January 6, 2014
    Date of Patent: September 26, 2017
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Nysal Jan K.A., Sameh S. Sharkawi
  • Patent number: 9513611
    Abstract: Adjusting environmental variables in an adaptive parameter adjustment runtime environment, including: executing a parallel program by the adaptive parameter adjustment runtime environment, including beginning operations with a set of default global parameter values; maintaining a list of configurable parameters; changing a parameter value for a parameter in the list of configurable parameters; determining whether an effect of changing the parameter value is positive, negative, or neutral; responsive to determining that the effect of changing the parameter value is positive, changing the parameter value for the parameter; responsive to determining that the effect of changing the parameter value is negative, changing the parameter value for the parameter to a previous value; and responsive to determining that the effect of changing the parameter value is neutral, performing a list management operation on the list of configurable parameters.
    Type: Grant
    Filed: May 15, 2014
    Date of Patent: December 6, 2016
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Nysal Jan K.A., Sameh S. Sharkawi
  • Patent number: 9495205
    Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each compute node includes a hardware acceleration unit and executes an identical number of tasks and the tasks of each node have a rank, includes: creating hardware acceleration groups, with each hardware acceleration group including one task from each node, where the one task from each node has the same rank; assigning one task of a root compute node as a global root of the logical tree topology; assigning tasks of the root compute node other than the global root as local children of the global root; and assigning each of the global root and local children of the root compute node as a root of a subtree of tasks, wherein each subtree comprises the tasks of a hardware acceleration group.
    Type: Grant
    Filed: April 30, 2014
    Date of Patent: November 15, 2016
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
  • Patent number: 9495204
    Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each compute node includes a hardware acceleration unit and executes an identical number of tasks and the tasks of each node have a rank, includes: creating hardware acceleration groups, with each hardware acceleration group including one task from each node, where the one task from each node has the same rank; assigning one task of a root compute node as a global root of the logical tree topology; assigning tasks of the root compute node other than the global root as local children of the global root; and assigning each of the global root and local children of the root compute node as a root of a subtree of tasks, wherein each subtree comprises the tasks of a hardware acceleration group.
    Type: Grant
    Filed: January 6, 2014
    Date of Patent: November 15, 2016
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
  • Patent number: 9454139
    Abstract: Adjusting environmental variables in an adaptive parameter adjustment runtime environment, including: executing a parallel program by the adaptive parameter adjustment runtime environment, including beginning operations with a set of default global parameter values; maintaining a list of configurable parameters; changing a parameter value for a parameter in the list of configurable parameters; determining whether an effect of changing the parameter value is positive, negative, or neutral; responsive to determining that the effect of changing the parameter value is positive, changing the parameter value for the parameter; responsive to determining that the effect of changing the parameter value is negative, changing the parameter value for the parameter to a previous value; and responsive to determining that the effect of changing the parameter value is neutral, performing a list management operation on the list of configurable parameters.
    Type: Grant
    Filed: January 7, 2014
    Date of Patent: September 27, 2016
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
  • Patent number: 9411777
    Abstract: In a parallel computer, performing a rooted-v collective operation by an operational group of compute nodes includes: identifying, in source code by a collective algorithm selection optimizing module, a gather operation followed by a rooted-v collective operation; replacing, by the collective algorithm selection optimizing module, the gather operation with an allgather operation; executing, by the compute nodes, the allgather operation; selecting, by each compute node in dependence upon results of the allgather operation, an algorithm for effecting the rooted-v collective operation; and executing, by each compute node, the rooted-v collective operation with the selected algorithm.
    Type: Grant
    Filed: September 16, 2014
    Date of Patent: August 9, 2016
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
  • Patent number: 9348651
    Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each node executes a number of tasks and at least one node executes a number of tasks different from another node includes: identifying a compute node executing a greatest number of tasks; selecting, as a global root, a task from the identified compute node, including assigning the task as a local root of the identified compute node and assigning each of the other tasks of the identified compute node as a child of the local root; selecting, from each of the other compute nodes, one task to be a local root, including assigning each task other than the local root as a child of the local root; and assigning each local root of the other compute nodes to be a child of one of the tasks of the identified compute node other than the global root.
    Type: Grant
    Filed: December 5, 2013
    Date of Patent: May 24, 2016
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
  • Patent number: 9336053
    Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each node executes a number of tasks and at least one node executes a number of tasks different from another node includes: identifying a compute node executing a greatest number of tasks; selecting, as a global root, a task from the identified compute node, including assigning the task as a local root of the identified compute node and assigning each of the other tasks of the identified compute node as a child of the local root; selecting, from each of the other compute nodes, one task to be a local root, including assigning each task other than the local root as a child of the local root; and assigning each local root of the other compute nodes to be a child of one of the tasks of the identified compute node other than the global root.
    Type: Grant
    Filed: May 29, 2014
    Date of Patent: May 10, 2016
    Assignee: International Business Machines Corporation
    Inventors: Charles J. Archer, Nysal Jan K.A., Sameh S. Sharkawi
  • Publication number: 20160077998
    Abstract: In a parallel computer, performing a rooted-v collective operation by an operational group of compute nodes includes: identifying, in source code by a collective algorithm selection optimizing module, a gather operation followed by a rooted-v collective operation; replacing, by the collective algorithm selection optimizing module, the gather operation with an allgather operation; executing, by the compute nodes, the allgather operation; selecting, by each compute node in dependence upon results of the allgather operation, an algorithm for effecting the rooted-v collective operation; and executing, by each compute node, the rooted-v collective operation with the selected algorithm.
    Type: Application
    Filed: September 16, 2014
    Publication date: March 17, 2016
    Inventors: CHARLES J. ARCHER, NYSAL JAN K.A., SAMEH S. SHARKAWI
  • Patent number: 9223505
    Abstract: Administering inter-core communication via shared memory may be carried out in a system in which each core is associated with a mailbox in a shared memory region. Such administration may include constructing a mailbox latency table describing latency of writing data from each core to each mailbox; constructing a locking latency table describing latency of each core in acquiring a lock for each of the mailboxes; identifying, from the tables, groups of cores having mailbox and locking latency within a predefined range of acceptable latency values; and for each identified group of cores, establishing, for every pair of cores in the group of cores, a private channel, including pinning, for each private channel established for a pair of cores, one local memory segment per core.
    Type: Grant
    Filed: September 18, 2013
    Date of Patent: December 29, 2015
    Assignee: GLOBALFOUNDRIES Inc.
    Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
  • Patent number: 9135142
    Abstract: A performance projection system includes a test IHS and a currently existing IHS. The performance projection system includes surrogate programs and user application software. The test IHS employs a memory that includes a virtual future IHS, currently existing IHS, surrogate programs, and user application software for determination of runtime and HW counter performance data. The user application software and surrogate programs execute on the currently existing IHS to provide designers with runtime data and HW counter or microarchitecture dependent data. Designers execute surrogate programs on the future IHS to provide runtime and HW counter data. Designers normalize and weight the runtime and HW counter data to provide a representative surrogate program for comparison to user application software performance on the future IHS. Using a scaling factor, designers may generate a projection of runtime performance for the user application software executing on the future IHS.
    Type: Grant
    Filed: December 24, 2008
    Date of Patent: September 15, 2015
    Assignee: International Business Machines Corporation
    Inventors: Robert H. Bell, Jr., Luigi Brochard, Donald Robert DeSota, Venkat R. Indukuru, Rajendra D. Panda, Sameh S. Sharkawi
  • Publication number: 20150193262
    Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each compute node includes a hardware acceleration unit and executes an identical number of tasks and the tasks of each node have a rank, includes: creating hardware acceleration groups, with each hardware acceleration group including one task from each node, where the one task from each node has the same rank; assigning one task of a root compute node as a global root of the logical tree topology; assigning tasks of the root compute node other than the global root as local children of the global root; and assigning each of the global root and local children of the root compute node as a root of a subtree of tasks, wherein each subtree comprises the tasks of a hardware acceleration group.
    Type: Application
    Filed: April 30, 2014
    Publication date: July 9, 2015
    Applicant: International Business Machines Corporation
    Inventors: CHARLES J. ARCHER, NYSAL JAN K.A., SAMEH S. SHARKAWI
  • Publication number: 20150193271
    Abstract: Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes, including: packing, by each task in an operational group of tasks, vectored contribution data from vectored storage in an all-to-allv contribution data buffer into an all-to-all contribution data buffer, wherein two or more entries in the all-to-allv contribution data buffer are different in size and each entry in the all-to-all contribution data buffer is identical in size; executing with the contribution data as stored in the all-to-all contribution data buffer an all-to-all collective operation by the operational group of tasks; and unpacking, by each task in the operational group of tasks, received contribution data from the all-to-all contribution data buffer into the vectored storage in an all-to-allv contribution data buffer.
    Type: Application
    Filed: January 6, 2014
    Publication date: July 9, 2015
    Applicant: International Business Machines Corporation
    Inventors: CHARLES J. ARCHER, NYSAL JAN K.A., SAMEH S. SHARKAWI
  • Publication number: 20150192910
    Abstract: Adjusting environmental variables in an adaptive parameter adjustment runtime environment, including: executing a parallel program by the adaptive parameter adjustment runtime environment, including beginning operations with a set of default global parameter values; maintaining a list of configurable parameters; changing a parameter value for a parameter in the list of configurable parameters; determining whether an effect of changing the parameter value is positive, negative, or neutral; responsive to determining that the effect of changing the parameter value is positive, changing the parameter value for the parameter; responsive to determining that the effect of changing the parameter value is negative, changing the parameter value for the parameter to a previous value; and responsive to determining that the effect of changing the parameter value is neutral, performing a list management operation on the list of configurable parameters.
    Type: Application
    Filed: May 15, 2014
    Publication date: July 9, 2015
    Applicant: International Business Machines Corporation
    Inventors: CHARLES J. ARCHER, NYSAL JAN K.A., SAMEH S. SHARKAWI
  • Publication number: 20150193269
    Abstract: Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes, including: packing, by each task in an operational group of tasks, vectored contribution data from vectored storage in an all-to-allv contribution data buffer into an all-to-all contribution data buffer, wherein two or more entries in the all-to-allv contribution data buffer are different in size and each entry in the all-to-all contribution data buffer is identical in size; executing with the contribution data as stored in the all-to-all contribution data buffer an all-to-all collective operation by the operational group of tasks; and unpacking, by each task in the operational group of tasks, received contribution data from the all-to-all contribution data buffer into the vectored storage in an all-to-allv contribution data buffer.
    Type: Application
    Filed: May 27, 2014
    Publication date: July 9, 2015
    Applicant: International Business Machines Corporation
    Inventors: CHARLES J. ARCHER, NYSAL JAN K.A., SAMEH S. SHARKAWI
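
Illustrative sketch (not taken from the patents): the all-to-allv abstracts listed above (patent numbers 9830186 and 9772876) describe packing variable-size contributions into equal-size entries, running an ordinary all-to-all, and unpacking the results. The Python below is a minimal single-process simulation of that idea under stated assumptions. The function names (pack, all_to_all, unpack, all_to_allv), the zero padding value, and the choice of the largest contribution as the common entry size are all hypothetical, and each task is assumed to know how many elements to expect from every sender, much as an MPI_Alltoallv caller supplies receive counts.

```python
from typing import List

# send[src][dst] is the (variable-length) data task `src` contributes to task `dst`.
Contributions = List[List[List[int]]]


def pack(entries: List[List[int]], slot_size: int, pad: int = 0) -> List[List[int]]:
    """Copy variable-size entries into equal-size slots, padding the tail."""
    return [entry + [pad] * (slot_size - len(entry)) for entry in entries]


def all_to_all(buffers: List[List[List[int]]]) -> List[List[List[int]]]:
    """Simulated fixed-size all-to-all: slot j of task i becomes slot i of task j."""
    n = len(buffers)
    return [[buffers[src][dst] for src in range(n)] for dst in range(n)]


def unpack(slots: List[List[int]], counts: List[int]) -> List[List[int]]:
    """Drop the padding, keeping only the element count expected from each sender."""
    return [slot[:count] for slot, count in zip(slots, counts)]


def all_to_allv(send: Contributions) -> Contributions:
    n = len(send)
    # Assumption: every slot is sized to the largest single contribution.
    slot_size = max(len(entry) for task in send for entry in task)
    packed = [pack(task, slot_size) for task in send]
    exchanged = all_to_all(packed)
    # recv_counts[dst][src] = number of elements `src` sends to `dst`.
    recv_counts = [[len(send[src][dst]) for src in range(n)] for dst in range(n)]
    return [unpack(exchanged[dst], recv_counts[dst]) for dst in range(n)]


# Three tasks with contributions of differing sizes.
send = [
    [[1], [2, 3], []],
    [[4, 5, 6], [7], [8]],
    [[], [9], [10, 11]],
]
result = all_to_allv(send)
```

Here result[dst][src] holds what task src sent to task dst. The sketch trades extra padding traffic for the ability to reuse a fixed-size all-to-all routine; whether that trade pays off depends on how uneven the contribution sizes are.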