Patents by Inventor William P. LePera
William P. LePera has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11711425
Abstract: According to an aspect, a computer-implemented method for performing distributed communication operations includes receiving a request, by a first computing system, to perform a distributed communication operation and obtaining, by the first computing system, a tree structure for performing the distributed communication operation, wherein the first computing system is a root node of the tree structure. The method also includes creating, by the first computing system, a message having header information and a payload for the distributed communication operation and transmitting, by the first computing system, a portion of the message to each child node of the first computing system, wherein the portion transmitted to each child node is unique.
Type: Grant
Filed: October 17, 2022
Date of Patent: July 25, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Joshua J. Hursey, Austen William Lauria, William P. LePera, Scott Miller, Robert Perricone
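The abstract above says the root sends each child a unique portion of the message. As an illustration only, the following minimal sketch assumes the portions are contiguous, non-overlapping slices of the payload, each tagged with a copy of the header plus an offset and length. The names `Message` and `split_for_children` and the slicing rule are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class Message:
    header: dict
    payload: bytes


def split_for_children(message, children):
    """Return {child: Message}, giving each child a distinct payload slice.

    Assumption: portions are contiguous slices of near-equal size.
    """
    n = len(children)
    if n == 0:
        return {}
    size, extra = divmod(len(message.payload), n)
    portions = {}
    start = 0
    for i, child in enumerate(children):
        # The first `extra` children receive one additional byte each.
        end = start + size + (1 if i < extra else 0)
        header = dict(message.header, offset=start, length=end - start)
        portions[child] = Message(header, message.payload[start:end])
        start = end
    return portions
```

Because the slices do not overlap and together cover the payload, concatenating them in child order reproduces the original payload.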
-
Patent number: 11620254
Abstract: An embodiment includes mapping, responsive to receiving a request for a container image from a container host, the requested container image to a first computer memory on a registry server. The embodiment also includes exposing a window storing the mapped container image to the container host using a collective window-creation call with the container host. The embodiment also includes processing a Remote Direct Memory Access (RDMA) data transfer request to select a lock type for the window during the RDMA data transfer. The embodiment also includes imposing the selected lock type on the window during the RDMA data transfer. The embodiment also includes releasing the selected lock type from the window upon detecting completion of the RDMA data transfer.
Type: Grant
Filed: June 3, 2020
Date of Patent: April 4, 2023
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Scott Miller, Austen William Lauria, Sameh Sherif Sharkawi, William P. LePera
-
Patent number: 11455191
Abstract: On a first compute resource, execution of a first task is triggered, execution of a portion of the first task being conditioned on a second task executing on a second compute resource. A state indicator of the second task is monitored, the state indicator indicating whether or not the second task is currently executing on the second compute resource. Responsive to the state indicator indicating that the second task is not currently executing, execution of the portion of the first task is suspended. A change in the state indicator is determined to have occurred. Responsive to the determining, received connection information for the second task is forwarded to the first task. Execution of the portion of the first task is re-triggered on the first compute resource.
Type: Grant
Filed: October 13, 2020
Date of Patent: September 27, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Scott Miller, Austen William Lauria, Sameh Sherif Sharkawi, William P. LePera
-
Patent number: 11347594
Abstract: A computer-implemented method and system for inter-processor communications fault handling in high performance computing networks. The method includes detecting that an InfiniBand (IB) queue pair has transitioned into an error state based on an unsuccessful completion status that relates to unsuccessful delivery of a message from an initiator endpoint at a first server device to at least one target endpoint at a second server device. The initiator and target endpoints are associated with at least one application under execution. An embodiment includes inferring, when the unsuccessful completion status is indicated as flushed, that the message was in a send queue of the IB queue pair when the IB queue pair transitioned into the error state. An embodiment includes establishing an IB Direct Connect queue pair connection between the target and initiator endpoints. An embodiment includes re-queueing the message in the IB queue pair for dispatch to the target endpoint.
Type: Grant
Filed: November 26, 2019
Date of Patent: May 31, 2022
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: William P. LePera, Sameh Sherif Sharkawi
-
Patent number: 11321068
Abstract: A computer implemented method uses memory coherence to enhance latency and bandwidth performance, the method including receiving, by a host, a call from an application. The method also includes, determining that the call includes a device allocation command, wherein the device allocation command is configured to allocate a set of data on a graphical processing unit. The method further includes intercepting the call. The method includes, initiating an alternate data allocation command; and returning the alternate data allocation command to the application. Further aspects of the present disclosure are directed to systems and computer program products containing functionality consistent with the method described above.
Type: Grant
Filed: September 5, 2019
Date of Patent: May 3, 2022
Assignee: International Business Machines Corporation
Inventors: William P. LePera, Austen William Lauria, Scott Miller, Sameh Sherif Sharkawi
-
Publication number: 20220114024
Abstract: On a first compute resource, execution of a first task is triggered, execution of a portion of the first task being conditioned on a second task executing on a second compute resource. A state indicator of the second task is monitored, the state indicator indicating whether or not the second task is currently executing on the second compute resource. Responsive to the state indicator indicating that the second task is not currently executing, execution of the portion of the first task is suspended. A change in the state indicator is determined to have occurred. Responsive to the determining, received connection information for the second task is forwarded to the first task. Execution of the portion of the first task is re-triggered on the first compute resource.
Type: Application
Filed: October 13, 2020
Publication date: April 14, 2022
Applicant: International Business Machines Corporation
Inventors: Scott Miller, Austen William Lauria, Sameh Sherif Sharkawi, William P. LePera
-
Patent number: 11221906
Abstract: Technology for determining whether an inter-process type message has been successfully sent from a first process to a second process running on a single computer with a single processor(s) set. A variable (for example, a bit value) is used to indicate whether the inter-process message has been communicated between the processes. A timer and a predetermined timeout threshold are used to determine if the inter-process message has been pending for too long without being successfully communicated.
Type: Grant
Filed: January 10, 2020
Date of Patent: January 11, 2022
Assignee: International Business Machines Corporation
Inventors: William P. LePera, Sameh Sherif Sharkawi, Austen William Lauria
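The flag-plus-timer idea in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the patented implementation: the class name `PendingMessage`, its methods, and the three status strings are hypothetical, and a boolean attribute stands in for the "bit value" the abstract mentions.

```python
import time


class PendingMessage:
    """Tracks one inter-process message with a delivered flag and a timeout."""

    def __init__(self, payload, timeout_s, clock=time.monotonic):
        self.payload = payload
        self.delivered = False        # flag: has the receiver taken the message?
        self._clock = clock           # injectable clock, useful for testing
        self._sent_at = clock()       # timer starts when the message is posted
        self._timeout_s = timeout_s   # predetermined timeout threshold

    def mark_delivered(self):
        # In this sketch the receiving side calls this to set the flag.
        self.delivered = True

    def status(self):
        if self.delivered:
            return "delivered"
        if self._clock() - self._sent_at > self._timeout_s:
            return "timed_out"
        return "pending"
```

A sender would poll `status()` and treat `"timed_out"` as the signal that the message has been pending too long without being communicated.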
-
Publication number: 20210382846
Abstract: An embodiment includes mapping, responsive to receiving a request for a container image from a container host, the requested container image to a first computer memory on a registry server. The embodiment also includes exposing a window storing the mapped container image to the container host using a collective window-creation call with the container host. The embodiment also includes processing a Remote Direct Memory Access (RDMA) data transfer request to select a lock type for the window during the RDMA data transfer. The embodiment also includes imposing the selected lock type on the window during the RDMA data transfer. The embodiment also includes releasing the selected lock type from the window upon detecting completion of the RDMA data transfer.
Type: Application
Filed: June 3, 2020
Publication date: December 9, 2021
Applicant: International Business Machines Corporation
Inventors: Scott Miller, Austen William Lauria, Sameh Sherif Sharkawi, William P. LePera
-
Patent number: 11074071
Abstract: An embodiment includes storing original environment data in a memory of a computing device, then sourcing a script in a child command shell that includes an environment variable set-up command for setting an environmental characteristic of a new computing environment associated with the child command shell. The new environment data is also stored in the memory of the computing device that defines the new computing environment associated with the child command shell. The original computing environment is then restored by terminating the child command shell and returning to the target command shell. The original environment data is compared to the new environment data to determine the differences between the two environments, and the original computing environment is then modified to match the new computing environment created by the sourced script.
Type: Grant
Filed: January 13, 2020
Date of Patent: July 27, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Scott Miller, Mark Allen, Austen William Lauria, William P. LePera
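The compare-and-apply step described in this abstract can be illustrated with plain dictionaries. The sketch below assumes both environments have already been captured as name-to-value mappings; how the child shell's environment is captured is outside its scope. The function names `diff_environments` and `apply_diff` are hypothetical.

```python
def diff_environments(original, new):
    """Return (changes, removed) needed to turn `original` into `new`."""
    # Variables that were added or whose value differs in the new environment.
    changes = {k: v for k, v in new.items() if original.get(k) != v}
    # Variables present originally but absent after the script was sourced.
    removed = [k for k in original if k not in new]
    return changes, removed


def apply_diff(env, changes, removed):
    """Modify `env` in place so that it matches the new environment."""
    env.update(changes)
    for k in removed:
        env.pop(k, None)
```

Applying the diff to a copy of the original mapping yields a mapping equal to the new one, which mirrors the abstract's final step of modifying the original environment to match the one created by the sourced script.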
-
Publication number: 20210216310
Abstract: An embodiment includes storing original environment data in a memory of a computing device, then sourcing a script in a child command shell that includes an environment variable set-up command for setting an environmental characteristic of a new computing environment associated with the child command shell. The new environment data is also stored in the memory of the computing device that defines the new computing environment associated with the child command shell. The original computing environment is then restored by terminating the child command shell and returning to the target command shell. The original environment data is compared to the new environment data to determine the differences between the two environments, and the original computing environment is then modified to match the new computing environment created by the sourced script.
Type: Application
Filed: January 13, 2020
Publication date: July 15, 2021
Applicant: International Business Machines Corporation
Inventors: Scott Miller, Mark Allen, Austen William Lauria, William P. LePera
-
Publication number: 20210216387
Abstract: Technology for determining whether an inter-process type message has been successfully sent from a first process to a second process running on a single computer with a single processor(s) set. A variable (for example, a bit value) is used to indicate whether the inter-process message has been communicated between the processes. A timer and a predetermined timeout threshold are used to determine if the inter-process message has been pending for too long without being successfully communicated.
Type: Application
Filed: January 10, 2020
Publication date: July 15, 2021
Inventors: William P. LePera, Sameh Sherif Sharkawi, Austen William Lauria
-
Publication number: 20210157691
Abstract: A computer-implemented method and system for inter-processor communications fault handling in high performance computing networks. The method includes detecting that an InfiniBand (IB) queue pair has transitioned into an error state based on an unsuccessful completion status that relates to unsuccessful delivery of a message from an initiator endpoint at a first server device to at least one target endpoint at a second server device. The initiator and target endpoints are associated with at least one application under execution. An embodiment includes inferring, when the unsuccessful completion status is indicated as flushed, that the message was in a send queue of the IB queue pair when the IB queue pair transitioned into the error state. An embodiment includes establishing an IB Direct Connect queue pair connection between the target and initiator endpoints. An embodiment includes re-queueing the message in the IB queue pair for dispatch to the target endpoint.
Type: Application
Filed: November 26, 2019
Publication date: May 27, 2021
Applicant: International Business Machines Corporation
Inventors: William P. LePera, Sameh Sherif Sharkawi
-
Publication number: 20210072967
Abstract: A computer implemented method uses memory coherence to enhance latency and bandwidth performance, the method including receiving, by a host, a call from an application. The method also includes, determining that the call includes a device allocation command, wherein the device allocation command is configured to allocate a set of data on a graphical processing unit. The method further includes intercepting the call. The method includes, initiating an alternate data allocation command; and returning the alternate data allocation command to the application. Further aspects of the present disclosure are directed to systems and computer program products containing functionality consistent with the method described above.
Type: Application
Filed: September 5, 2019
Publication date: March 11, 2021
Inventors: William P. LePera, Austen William Lauria, Scott Miller, Sameh Sherif Sharkawi
-
Patent number: 10901820
Abstract: An embodiment includes sending, via a queue pair (QP) at a first endpoint, a message to a second endpoint. The embodiment also includes detecting an error state of the QP caused by a failure at a third endpoint that automatically halts messages via the QP. The embodiment includes determining that communication between the first and second endpoints via the QP is viable, and placing messages to the second endpoint in a separate queue from messages to an unreachable endpoint. The embodiment also includes re-establishing communications between the first and second endpoints via the QP, and polling the second endpoint for an indication of a delivered message. Any messages indicated as having already been delivered are removed from the queue before re-starting communications with the second endpoint.
Type: Grant
Filed: January 7, 2020
Date of Patent: January 26, 2021
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: William P. LePera, Sameh Sherif Sharkawi
-
Patent number: 10379883
Abstract: A method, apparatus and program product simulate a high performance computing (HPC) application environment by creating a cluster of virtual nodes in one or more operating system instances executing on one or more physical computing node, thereby enabling a plurality of parallel tasks from an HPC application to be executed on the cluster of virtual nodes.
Type: Grant
Filed: August 13, 2014
Date of Patent: August 13, 2019
Assignee: International Business Machines Corporation
Inventors: Jun He, Tsai-Yang Jea, William P. LePera, Hanhong Xue
-
Patent number: 10360050
Abstract: A method, apparatus and program product simulate a high performance computing (HPC) application environment by creating a cluster of virtual nodes in one or more operating system instances executing on one or more physical computing node, thereby enabling a plurality of parallel tasks from an HPC application to be executed on the cluster of virtual nodes.
Type: Grant
Filed: January 17, 2014
Date of Patent: July 23, 2019
Assignee: International Business Machines Corporation
Inventors: Jun He, Tsai-Yang Jea, William P. LePera, Hanhong Xue
-
Patent number: 9104501
Abstract: A job may be divided into multiple tasks that may execute in parallel on one or more compute nodes. The tasks executing on the same compute node may be coordinated using barrier synchronization. However, to perform barrier synchronization, the tasks use (or attach) to a barrier synchronization register which establishes a common checkpoint for each of the tasks. A leader task may use a shared memory region to publish to follower tasks the location of the barrier synchronization register, i.e., a barrier synchronization register ID. The follower tasks may then monitor the shared memory to determine the barrier synchronization register ID. The leader task may also use a count to ensure all the tasks attach to the BSR. This advantageously avoids any task-to-task communication which may reduce overhead and improve performance.
Type: Grant
Filed: December 7, 2012
Date of Patent: August 11, 2015
Assignee: International Business Machines Corporation
Inventors: Tsai-Yang Jea, William P. Lepera, HanHong Xue, Zhi Zhang
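The leader and follower roles in this abstract can be sketched with threads standing in for tasks and an ordinary object standing in for the shared memory region. This is an illustration only: `SharedRegion`, `leader`, `follower`, and the use of `threading.Condition` to watch the shared state are assumptions for the sketch, and no real barrier synchronization register is involved.

```python
import threading


class SharedRegion:
    """Stand-in for the shared memory region visible to all tasks on a node."""

    def __init__(self):
        self.bsr_id = None      # published by the leader
        self.attached = 0       # count of tasks that have attached
        self.cond = threading.Condition()


def leader(region, bsr_id, n_tasks, timeout=5.0):
    """Publish the register ID, then wait until every task has attached."""
    with region.cond:
        region.bsr_id = bsr_id
        region.attached += 1    # the leader attaches too
        region.cond.notify_all()
        return region.cond.wait_for(
            lambda: region.attached == n_tasks, timeout=timeout
        )


def follower(region, seen, timeout=5.0):
    """Watch the shared region for the ID, attach, and bump the count."""
    with region.cond:
        if region.cond.wait_for(lambda: region.bsr_id is not None, timeout=timeout):
            seen.append(region.bsr_id)
            region.attached += 1
            region.cond.notify_all()
```

The followers never exchange messages with each other or with the leader directly; they only read and update the shared region, which is the property the abstract highlights.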
-
Patent number: 9092272
Abstract: A job may be divided into multiple tasks that may execute in parallel on one or more compute nodes. The tasks executing on the same compute node may be coordinated using barrier synchronization. However, to perform barrier synchronization, the tasks use (or attach) to a barrier synchronization register which establishes a common checkpoint for each of the tasks. A leader task may use a shared memory region to publish to follower tasks the location of the barrier synchronization register, i.e., a barrier synchronization register ID. The follower tasks may then monitor the shared memory to determine the barrier synchronization register ID. The leader task may also use a count to ensure all the tasks attach to the BSR. This advantageously avoids any task-to-task communication which may reduce overhead and improve performance.
Type: Grant
Filed: December 8, 2011
Date of Patent: July 28, 2015
Assignee: International Business Machines Corporation
Inventors: Tsai-Yang Jea, William P. LePera, Hanhong Xue, Zhi Zhang
-
Publication number: 20150205888
Abstract: A method, apparatus and program product simulate a high performance computing (HPC) application environment by creating a cluster of virtual nodes in one or more operating system instances executing on one or more physical computing node, thereby enabling a plurality of parallel tasks from an HPC application to be executed on the cluster of virtual nodes.
Type: Application
Filed: January 17, 2014
Publication date: July 23, 2015
Applicant: International Business Machines Corporation
Inventors: Jun He, Tsai-Yang Jea, William P. LePera, Hanhong Xue
-
Publication number: 20150205625
Abstract: A method, apparatus and program product simulate a high performance computing (HPC) application environment by creating a cluster of virtual nodes in one or more operating system instances executing on one or more physical computing node, thereby enabling a plurality of parallel tasks from an HPC application to be executed on the cluster of virtual nodes.
Type: Application
Filed: August 13, 2014
Publication date: July 23, 2015
Inventors: Jun He, Tsai-Yang Jea, William P. LePera, Hanhong Xue