Patents by Inventor Sameh S. Sharkawi
Sameh S. Sharkawi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 10776185
Abstract: Techniques are disclosed for efficient handling of messages in computing systems that include tag matching capable hardware. A message management module provides for handling message events including application receives and channel notifications such that hardware tag matching can continuously run in hardware channels, such as network adapters. When the message event is an application receive the message management module adds the application receive to a tracking queue and determines if the application receive can be posted to a hardware channel capable of tag matching. When the message event is a channel notification, the message management module determines a message action using the message tracking queue and the information in the channel notification.
Type: Grant
Filed: December 10, 2018
Date of Patent: September 15, 2020
Assignee: International Business Machines Corporation
Inventors: Sameh S. Sharkawi, Sameer Kumar, Bryan S. Rosenburg
-
Publication number: 20200183764
Abstract: Techniques are disclosed for efficient handling of messages in computing systems that include tag matching capable hardware. A message management module provides for handling message events including application receives and channel notifications such that hardware tag matching can continuously run in hardware channels, such as network adapters. When the message event is an application receive the message management module adds the application receive to a tracking queue and determines if the application receive can be posted to a hardware channel capable of tag matching. When the message event is a channel notification, the message management module determines a message action using the message tracking queue and the information in the channel notification.
Type: Application
Filed: December 10, 2018
Publication date: June 11, 2020
Inventors: Sameh S. SHARKAWI, Sameer KUMAR, Bryan S. ROSENBURG
-
Patent number: 10296395
Abstract: Performing a rooted-v collective operation by an operational group of compute nodes in a parallel computer includes: upon encountering a rooted-v collection operation during execution, identifying, by a root node of an operational group of compute nodes, a count to use for the selection of a collective algorithm for effecting the rooted-v collective operation; broadcasting, by the root node to the other computer nodes in the operational group, an active message, wherein the active message includes the identified count to use for the selection of the collective algorithm; and selecting, by all the compute nodes of the operational group based on the identified count, a same collective algorithm to effect the rooted-v collective operation; and executing the rooted-v collective operation by all compute nodes of the operational group using the selected algorithm.
Type: Grant
Filed: May 9, 2016
Date of Patent: May 21, 2019
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Nysal Jan K. A., Sameh S. Sharkawi
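The selection step in this abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions: the algorithm names, the `SMALL_MESSAGE_LIMIT` threshold, and the use of the maximum count as the "identified count" are hypothetical and do not come from the patent. The point it shows is that a deterministic selection function applied to one broadcast value yields the same choice on every node.

```python
from typing import Dict, List

# Hypothetical threshold and algorithm names, chosen only for illustration.
SMALL_MESSAGE_LIMIT = 1024


def identify_count(counts: List[int]) -> int:
    """Root-side step: reduce the per-task counts to one representative value.

    Taking the maximum is an assumption; the abstract does not say how the
    count is identified.
    """
    return max(counts)


def select_algorithm(count: int) -> str:
    """Deterministic choice, so every node given the same count agrees."""
    return "binomial_tree" if count <= SMALL_MESSAGE_LIMIT else "pipelined_ring"


def rooted_v_selection(counts: List[int], num_nodes: int) -> List[str]:
    """Simulate the root identifying a count, broadcasting it, and all nodes selecting."""
    active_message: Dict[str, int] = {"count": identify_count(counts)}
    # The broadcast is modeled by handing each node its own copy of the message.
    received = [dict(active_message) for _ in range(num_nodes)]
    return [select_algorithm(message["count"]) for message in received]


choices = rooted_v_selection([16, 4096, 64, 8], num_nodes=4)
print(choices)
```

Because only the broadcast count feeds `select_algorithm`, non-root nodes never need the full per-task count array to reach the same decision as the root.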
-
Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes
Patent number: 9830186
Abstract: Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes, including: packing, by each task in an operational group of tasks, vectored contribution data from vectored storage in an all-to-allv contribution data buffer into an all-to-all contribution data buffer, wherein two or more entries in the all-to-allv contribution data buffer are different in size and each entry in the all-to-all contribution data buffer is identical in size; executing with the contribution data as stored in the all-to-all contribution data buffer an all-to-all collective operation by the operational group of tasks; and unpacking, by each task in the operational group of tasks, received contribution data from the all-to-all contribution data buffer into the vectored storage in an all-to-allv contribution data buffer.
Type: Grant
Filed: May 27, 2014
Date of Patent: November 28, 2017
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
-
Publication number: 20170322835
Abstract: Performing a rooted-v collective operation by an operational group of compute nodes in a parallel computer includes: upon encountering a rooted-v collection operation during execution, identifying, by a root node of an operational group of compute nodes, a count to use for the selection of a collective algorithm for effecting the rooted-v collective operation; broadcasting, by the root node to the other computer nodes in the operational group, an active message, wherein the active message includes the identified count to use for the selection of the collective algorithm; and selecting, by all the compute nodes of the operational group based on the identified count, a same collective algorithm to effect the rooted-v collective operation; and executing the rooted-v collective operation by all compute nodes of the operational group using the selected algorithm.
Type: Application
Filed: May 9, 2016
Publication date: November 9, 2017
Inventors: NYSAL JAN K.A., SAMEH S. SHARKAWI
-
Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes
Patent number: 9772876
Abstract: Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes, including: packing, by each task in an operational group of tasks, vectored contribution data from vectored storage in an all-to-allv contribution data buffer into an all-to-all contribution data buffer, wherein two or more entries in the all-to-allv contribution data buffer are different in size and each entry in the all-to-all contribution data buffer is identical in size; executing with the contribution data as stored in the all-to-all contribution data buffer an all-to-all collective operation by the operational group of tasks; and unpacking, by each task in the operational group of tasks, received contribution data from the all-to-all contribution data buffer into the vectored storage in an all-to-allv contribution data buffer.
Type: Grant
Filed: January 6, 2014
Date of Patent: September 26, 2017
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Nysal Jan K.A., Sameh S. Sharkawi
-
Patent number: 9513611
Abstract: Adjusting environmental variables in an adaptive parameter adjustment runtime environment, including: executing a parallel program by the adaptive parameter adjustment runtime environment, including beginning operations with a set of default global parameter values; maintaining a list of configurable parameters; changing a parameter value for a parameter in the list of configurable parameters; determining whether an effect of changing the parameter value is positive, negative, or neutral; responsive to determining that the effect of changing the parameter value is positive, changing the parameter value for the parameter; responsive to determining that the effect of changing the parameter value is negative, changing the parameter value for the parameter to a previous value; and responsive to determining that the effect of changing the parameter value is neutral, performing a list management operation on the list of configurable parameters.
Type: Grant
Filed: May 15, 2014
Date of Patent: December 6, 2016
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Nysal Jan K.A., Sameh S. Sharkawi
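The positive, negative, and neutral branches described in this abstract can be sketched as a simple tuning loop. Everything concrete here is an assumption made for illustration: the `tune` function, the `measure` callback (lower is better), the fixed `step`, the parameter names, rotating to the next parameter after a negative result, and treating "list management" as dropping the parameter from the list. None of these details are stated in the patent.

```python
from collections import deque
from typing import Callable, Dict


def tune(params: Dict[str, int],
         measure: Callable[[Dict[str, int]], float],
         step: int = 1,
         rounds: int = 20,
         tolerance: float = 1e-9) -> Dict[str, int]:
    """Adjust one configurable parameter at a time; a lower measure is better."""
    current = dict(params)            # begin with the default global values
    configurable = deque(current)     # list of configurable parameter names
    baseline = measure(current)
    for _ in range(rounds):
        if not configurable:
            break
        name = configurable[0]
        previous = current[name]
        current[name] = previous + step          # change the parameter value
        result = measure(current)
        if result < baseline - tolerance:        # positive effect: keep the change
            baseline = result
        elif result > baseline + tolerance:      # negative effect: restore old value
            current[name] = previous
            configurable.rotate(-1)              # assumption: try another parameter next
        else:                                    # neutral effect: list management
            current[name] = previous
            configurable.popleft()               # assumption: stop tuning this parameter
    return current


# Hypothetical parameters: only "eager_limit" influences the measured cost.
defaults = {"eager_limit": 2, "unused_knob": 0}
tuned = tune(defaults, measure=lambda p: abs(p["eager_limit"] - 5))
print(tuned)
```

In this toy run the loop raises `eager_limit` while the cost keeps falling, reverts the first change that makes it worse, and removes `unused_knob` from the list once changing it proves neutral.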
-
Patent number: 9495205
Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each compute node includes a hardware acceleration unit and executes an identical number of tasks and the tasks of each node have a rank, includes: creating hardware acceleration groups, with each hardware acceleration group including one task from each node, where the one task from each node has the same rank; assigning one task of a root compute node as a global root of the logical tree topology; assigning tasks of the root compute node other than the global root as local children of the global root; and assigning each of the global root and local children of the root compute node as a root of a subtree of tasks, wherein each subtree comprises the tasks of a hardware acceleration group.
Type: Grant
Filed: April 30, 2014
Date of Patent: November 15, 2016
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
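A minimal sketch of the topology this abstract describes, written as a child-to-parent map. The function name `build_acceleration_tree`, the `(node, rank)` task representation, picking rank 0 on the root node as the global root, and making each subtree flat (every group member a direct child of its subtree root) are all assumptions for illustration; the abstract does not fix the shape of the subtrees.

```python
from typing import Dict, List, Optional, Tuple

Task = Tuple[int, int]  # (node index, rank of the task within its node)


def build_acceleration_tree(num_nodes: int,
                            tasks_per_node: int,
                            root_node: int = 0) -> Dict[Task, Optional[Task]]:
    """Return a child -> parent map; the global root maps to None."""
    # One hardware acceleration group per rank: the same-ranked task of every node.
    groups: List[List[Task]] = [
        [(node, rank) for node in range(num_nodes)]
        for rank in range(tasks_per_node)
    ]
    global_root: Task = (root_node, 0)
    parent: Dict[Task, Optional[Task]] = {global_root: None}
    # The other tasks of the root node become local children of the global root.
    for rank in range(1, tasks_per_node):
        parent[(root_node, rank)] = global_root
    # Each root-node task is the root of the subtree formed by its own group.
    # A flat subtree is an assumption made to keep the sketch short.
    for rank, group in enumerate(groups):
        subtree_root = (root_node, rank)
        for task in group:
            if task != subtree_root:
                parent[task] = subtree_root
    return parent


tree = build_acceleration_tree(num_nodes=3, tasks_per_node=2)
for task in sorted(tree):
    print(task, "->", tree[task])
```

With three nodes and two tasks per node, tasks of rank 0 hang under the global root and tasks of rank 1 hang under the root node's second task, so each subtree stays within a single acceleration group.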
-
Patent number: 9495204
Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each compute node includes a hardware acceleration unit and executes an identical number of tasks and the tasks of each node have a rank, includes: creating hardware acceleration groups, with each hardware acceleration group including one task from each node, where the one task from each node has the same rank; assigning one task of a root compute node as a global root of the logical tree topology; assigning tasks of the root compute node other than the global root as local children of the global root; and assigning each of the global root and local children of the root compute node as a root of a subtree of tasks, wherein each subtree comprises the tasks of a hardware acceleration group.
Type: Grant
Filed: January 6, 2014
Date of Patent: November 15, 2016
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
-
Patent number: 9454139
Abstract: Adjusting environmental variables in an adaptive parameter adjustment runtime environment, including: executing a parallel program by the adaptive parameter adjustment runtime environment, including beginning operations with a set of default global parameter values; maintaining a list of configurable parameters; changing a parameter value for a parameter in the list of configurable parameters; determining whether an effect of changing the parameter value is positive, negative, or neutral; responsive to determining that the effect of changing the parameter value is positive, changing the parameter value for the parameter; responsive to determining that the effect of changing the parameter value is negative, changing the parameter value for the parameter to a previous value; and responsive to determining that the effect of changing the parameter value is neutral, performing a list management operation on the list of configurable parameters.
Type: Grant
Filed: January 7, 2014
Date of Patent: September 27, 2016
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
-
Patent number: 9411777
Abstract: In a parallel computer, performing a rooted-v collective operation by an operational group of compute nodes includes: identifying, in source code by a collective algorithm selection optimizing module, a gather operation followed by a rooted-v collective operation; replacing, by the collective algorithm selection optimizing module, the gather operation with an allgather operation; executing, by the compute nodes, the allgather operation; selecting, by each compute node in dependence upon results of the allgather operation, an algorithm for effecting the rooted-v collective operation; and executing, by each compute node, the rooted-v collective operation with the selected algorithm.
Type: Grant
Filed: September 16, 2014
Date of Patent: August 9, 2016
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
-
Patent number: 9348651
Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each node executes a number of tasks and at least one node executes a number of tasks different from another node includes: identifying a compute node executing a greatest number of tasks; selecting, as a global root, a task from the identified compute node, including assigning the task as a local root of the identified compute node and assigning each of the other tasks of the identified compute node as a child of the local root; selecting, from each of the other compute nodes, one task to be a local root, including assigning each task other than the local root as a child of the local root; and assigning each local root of the other compute nodes to be a child of one of the tasks of the identified compute node other than the global root.
Type: Grant
Filed: December 5, 2013
Date of Patent: May 24, 2016
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
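The construction in this abstract can be sketched as follows for nodes with unequal task counts. The function name `build_uneven_tree`, the `(node, local index)` task representation, choosing local index 0 as each local root, and the round-robin assignment of remote local roots to the non-root tasks of the largest node are assumptions for illustration. The sketch also assumes every node runs at least one task and the largest node runs at least two.

```python
from typing import Dict, List, Optional, Tuple

Task = Tuple[int, int]  # (node index, local task index on that node)


def build_uneven_tree(tasks_on_node: List[int]) -> Dict[Task, Optional[Task]]:
    """Return a child -> parent map; the global root maps to None."""
    # Identify the compute node executing the greatest number of tasks.
    biggest = max(range(len(tasks_on_node)), key=lambda n: tasks_on_node[n])
    global_root: Task = (biggest, 0)   # also the local root of that node
    parent: Dict[Task, Optional[Task]] = {global_root: None}
    # The other tasks of the identified node are children of its local root.
    helpers = [(biggest, t) for t in range(1, tasks_on_node[biggest])]
    if not helpers:
        raise ValueError("the largest node needs at least two tasks")
    for helper in helpers:
        parent[helper] = global_root
    others = [n for n in range(len(tasks_on_node)) if n != biggest]
    for i, node in enumerate(others):
        local_root: Task = (node, 0)
        # Round-robin over the helper tasks is an assumption.
        parent[local_root] = helpers[i % len(helpers)]
        for t in range(1, tasks_on_node[node]):
            parent[(node, t)] = local_root
    return parent


uneven = build_uneven_tree([2, 4, 1])
for task in sorted(uneven):
    print(task, "->", uneven[task])
```

Attaching the remote local roots to the non-root tasks of the largest node, rather than to the global root itself, spreads the fan-in across that node's tasks.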
-
Patent number: 9336053
Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each node executes a number of tasks and at least one node executes a number of tasks different from another node includes: identifying a compute node executing a greatest number of tasks; selecting, as a global root, a task from the identified compute node, including assigning the task as a local root of the identified compute node and assigning each of the other tasks of the identified compute node as a child of the local root; selecting, from each of the other compute nodes, one task to be a local root, including assigning each task other than the local root as a child of the local root; and assigning each local root of the other compute nodes to be a child of one of the tasks of the identified compute node other than the global root.
Type: Grant
Filed: May 29, 2014
Date of Patent: May 10, 2016
Assignee: International Business Machines Corporation
Inventors: Charles J. Archer, Nysal Jan K.A., Sameh S. Sharkawi
-
Publication number: 20160077998
Abstract: In a parallel computer, performing a rooted-v collective operation by an operational group of compute nodes includes: identifying, in source code by a collective algorithm selection optimizing module, a gather operation followed by a rooted-v collective operation; replacing, by the collective algorithm selection optimizing module, the gather operation with an allgather operation; executing, by the compute nodes, the allgather operation; selecting, by each compute node in dependence upon results of the allgather operation, an algorithm for effecting the rooted-v collective operation; and executing, by each compute node, the rooted-v collective operation with the selected algorithm.
Type: Application
Filed: September 16, 2014
Publication date: March 17, 2016
Inventors: CHARLES J. ARCHER, NYSAL JAN K.A., SAMEH S. SHARKAWI
-
Patent number: 9223505
Abstract: Administering inter-core communication via shared memory may be carried out in a system in which each core is associated with a mailbox in a shared memory region. Such administration may include constructing a mailbox latency table describing latency of writing data from each core to each mailbox; constructing a locking latency table describing latency of each core in acquiring a lock for each of the mailboxes; identifying, from the tables, groups of cores having mailbox and locking latency within a predefined range of acceptable latency values; and for each identified group of cores, establishing, for every pair of cores in the group of cores, a private channel, including pinning, for each private channel established for a pair of cores, one local memory segment per core.
Type: Grant
Filed: September 18, 2013
Date of Patent: December 29, 2015
Assignee: GLOBALFOUNDRIES Inc.
Inventors: Charles J. Archer, Nysal Jan K. A., Sameh S. Sharkawi
-
Patent number: 9135142
Abstract: A performance projection system includes a test IHS and a currently existing IHS. The performance projection system includes surrogate programs and user application software. The test IHS employs a memory that includes a virtual future IHS, currently existing IHS, surrogate programs, and user application software for determination of runtime and HW counter performance data. The user application software and surrogate programs execute on the currently existing IHS to provide designers with runtime data and HW counter or microarchitecture dependent data. Designers execute surrogate programs on the future IHS to provide runtime and HW counter data. Designers normalize and weight the runtime and HW counter data to provide a representative surrogate program for comparison to user application software performance on the future IHS. Using a scaling factor, designers may generate a projection of runtime performance for the user application software executing on the future IHS.
Type: Grant
Filed: December 24, 2008
Date of Patent: September 15, 2015
Assignee: International Business Machines Corporation
Inventors: Robert H. Bell, Jr., Luigi Brochard, Donald Robert DeSota, Venkat R. Indukuru, Rajendra D. Panda, Sameh S. Sharkawi
-
Publication number: 20150193262
Abstract: Constructing a logical tree topology in a parallel computer that includes compute nodes, where each compute node includes a hardware acceleration unit and executes an identical number of tasks and the tasks of each node have a rank, includes: creating hardware acceleration groups, with each hardware acceleration group including one task from each node, where the one task from each node has the same rank; assigning one task of a root compute node as a global root of the logical tree topology; assigning tasks of the root compute node other than the global root as local children of the global root; and assigning each of the global root and local children of the root compute node as a root of a subtree of tasks, wherein each subtree comprises the tasks of a hardware acceleration group.
Type: Application
Filed: April 30, 2014
Publication date: July 9, 2015
Applicant: International Business Machines Corporation
Inventors: CHARLES J. ARCHER, NYSAL JAN K.A., SAMEH S. SHARKAWI
-
Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes
Publication number: 20150193271
Abstract: Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes, including: packing, by each task in an operational group of tasks, vectored contribution data from vectored storage in an all-to-allv contribution data buffer into an all-to-all contribution data buffer, wherein two or more entries in the all-to-allv contribution data buffer are different in size and each entry in the all-to-all contribution data buffer is identical in size; executing with the contribution data as stored in the all-to-all contribution data buffer an all-to-all collective operation by the operational group of tasks; and unpacking, by each task in the operational group of tasks, received contribution data from the all-to-all contribution data buffer into the vectored storage in an all-to-allv contribution data buffer.
Type: Application
Filed: January 6, 2014
Publication date: July 9, 2015
Applicant: International Business Machines Corporation
Inventors: CHARLES J. ARCHER, NYSAL JAN K.A., SAMEH S. SHARKAWI
-
Publication number: 20150192910
Abstract: Adjusting environmental variables in an adaptive parameter adjustment runtime environment, including: executing a parallel program by the adaptive parameter adjustment runtime environment, including beginning operations with a set of default global parameter values; maintaining a list of configurable parameters; changing a parameter value for a parameter in the list of configurable parameters; determining whether an effect of changing the parameter value is positive, negative, or neutral; responsive to determining that the effect of changing the parameter value is positive, changing the parameter value for the parameter; responsive to determining that the effect of changing the parameter value is negative, changing the parameter value for the parameter to a previous value; and responsive to determining that the effect of changing the parameter value is neutral, performing a list management operation on the list of configurable parameters.
Type: Application
Filed: May 15, 2014
Publication date: July 9, 2015
Applicant: International Business Machines Corporation
Inventors: CHARLES J. ARCHER, NYSAL JAN K.A., SAMEH S. SHARKAWI
-
Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes
Publication number: 20150193269
Abstract: Executing an all-to-allv operation on a parallel computer that includes a plurality of compute nodes, including: packing, by each task in an operational group of tasks, vectored contribution data from vectored storage in an all-to-allv contribution data buffer into an all-to-all contribution data buffer, wherein two or more entries in the all-to-allv contribution data buffer are different in size and each entry in the all-to-all contribution data buffer is identical in size; executing with the contribution data as stored in the all-to-all contribution data buffer an all-to-all collective operation by the operational group of tasks; and unpacking, by each task in the operational group of tasks, received contribution data from the all-to-all contribution data buffer into the vectored storage in an all-to-allv contribution data buffer.
Type: Application
Filed: May 27, 2014
Publication date: July 9, 2015
Applicant: International Business Machines Corporation
Inventors: CHARLES J. ARCHER, NYSAL JAN K.A., SAMEH S. SHARKAWI
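The pack, exchange, and unpack sequence in the all-to-allv abstracts can be illustrated with a single-process simulation. This is a sketch under assumptions, not the patented implementation: the helper names, the zero fill value, sizing every slot to the largest entry in the group, and modeling the exchange with nested lists are all hypothetical. It also assumes, as in MPI-style all-to-allv calls, that each receiver already knows how many elements to expect from each sender.

```python
from typing import List

Buffer = List[List[int]]  # one entry per peer task


def pack(vectored: Buffer, slot: int, fill: int = 0) -> Buffer:
    """Pad every variable-size entry so that all entries have the same size."""
    return [entry + [fill] * (slot - len(entry)) for entry in vectored]


def all_to_all(packed_by_task: List[Buffer]) -> List[Buffer]:
    """Plain all-to-all: task d receives entry d from every task s."""
    n = len(packed_by_task)
    return [[packed_by_task[s][d] for s in range(n)] for d in range(n)]


def unpack(received: Buffer, recv_counts: List[int]) -> Buffer:
    """Trim each fixed-size entry back to its true length."""
    return [entry[:count] for entry, count in zip(received, recv_counts)]


def all_to_allv(send: List[Buffer]) -> List[Buffer]:
    """send[s][d] holds the data that task s contributes to task d."""
    n = len(send)
    # Assumption: every slot is sized to the largest entry in the whole group.
    slot = max(len(entry) for task in send for entry in task)
    packed = [pack(task, slot) for task in send]
    exchanged = all_to_all(packed)
    recv_counts = [[len(send[s][d]) for s in range(n)] for d in range(n)]
    return [unpack(exchanged[d], recv_counts[d]) for d in range(n)]


send = [
    [[1], [2, 3]],      # task 0 sends [1] to task 0 and [2, 3] to task 1
    [[4, 5, 6], []],    # task 1 sends [4, 5, 6] to task 0 and nothing to task 1
]
result = all_to_allv(send)
print(result)
```

Padding trades extra bytes on the wire for the ability to reuse a fixed-size all-to-all exchange, which is the trade-off the abstract's packing step implies.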