Patents by Inventor Bryan S. Rosenburg
Bryan S. Rosenburg has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11301165
Abstract: A data management system and method for accelerating shared file checkpointing. Written application data is aggregated in an application data file created in a local burst buffer memory at a compute node, and an associated data mapping built index to maintain information related to the offsets into a shared file at which segments of the application data is to be stored in a parallel file system, and where in the buffer those segments are located. The node asynchronously transfers a data file containing the application data and the associated data mapping index to a file server for shared file storage. The data management system and method further accelerates shared file checkpointing in which a shared file, together with a map file that specifies how the shared file is to be distributed, is asynchronously transferred to local burst buffer memories at the nodes to accelerate reading of the shared file.
Type: Grant
Filed: April 26, 2018
Date of Patent: April 12, 2022
Assignee: International Business Machines Corporation
Inventors: Thomas Gooding, Pierre Lemarinier, Bryan S. Rosenburg
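The write path in this abstract (aggregate writes locally, keep an index of shared-file offsets and buffer locations, then transfer both) can be illustrated with a minimal in-memory sketch. This is not the patented implementation: the class name `BurstBufferLog`, its methods, and the use of a `bytearray` to stand in for both the burst buffer and the shared file are all assumptions made for illustration.

```python
class BurstBufferLog:
    """Hypothetical stand-in for a node-local burst buffer data file plus index."""

    def __init__(self):
        self.data = bytearray()   # aggregated application data, in arrival order
        self.index = []           # entries: (shared_offset, length, buffer_offset)

    def write(self, shared_offset, payload):
        """Append payload locally and record where it belongs in the shared file."""
        buffer_offset = len(self.data)
        self.data += payload
        self.index.append((shared_offset, len(payload), buffer_offset))

    def drain(self, shared_file):
        """Replay the index into the shared file (a bytearray in this sketch)."""
        for shared_offset, length, buffer_offset in self.index:
            end = shared_offset + length
            if len(shared_file) < end:
                # Grow the shared file with zero bytes so the segment fits.
                shared_file.extend(b"\0" * (end - len(shared_file)))
            shared_file[shared_offset:end] = self.data[buffer_offset:buffer_offset + length]
```

In this sketch, segments written out of order (for example offset 5 first, then offset 0) still land at the right shared-file offsets when `drain` runs, because placement is driven entirely by the index. A real system would perform the drain asynchronously on a file server.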
-
Patent number: 11121951
Abstract: A method for managing a network queue memory includes receiving sensor information about the network queue memory, predicting a memory failure in the network queue memory based on the sensor information, and outputting a notification through a plurality of nodes forming a network and using the network queue memory, the notification configuring communications between the nodes.
Type: Grant
Filed: November 19, 2017
Date of Patent: September 14, 2021
Assignee: International Business Machines Corporation
Inventors: Carlos H. Andrade Costa, Chen-Yong Cher, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu
-
Patent number: 10877847
Abstract: An illustrative embodiment includes a method for checkpointing and restarting an application executing at least in part on one or more central processing units coupled to one or more hardware accelerators. The method comprises checkpointing the application at least in part by: transferring checkpoint data of the application to the one or more hardware accelerators; performing distributed compression of the application checkpoint data at least in part using the one or more hardware accelerators; and writing the compressed application checkpoint data to a storage device. The method further comprises restarting the application at least in part by: reading the compressed application checkpoint data from the storage device; transferring the compressed checkpoint data to one or more hardware accelerators; and performing distributed decompression of the application checkpoint data at least in part using said one or more hardware accelerators.
Type: Grant
Filed: October 9, 2018
Date of Patent: December 29, 2020
Assignee: International Business Machines Corporation
Inventors: Fausto Artico, Bryan S. Rosenburg
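The checkpoint and restart flow described here can be approximated in a small sketch that splits checkpoint data into chunks and compresses them in parallel. Worker threads stand in for the hardware accelerators, `zlib` stands in for whatever compression the accelerators would run, and the storage step is omitted; all of these, along with the function names, are assumptions for illustration only.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_checkpoint(data, chunk_size=1 << 16, workers=4):
    """Split checkpoint bytes into chunks and compress each chunk in parallel."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, which restart relies on.
        return list(pool.map(zlib.compress, chunks))


def decompress_checkpoint(compressed_chunks, workers=4):
    """Decompress chunks in parallel and reassemble the original checkpoint."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(zlib.decompress, compressed_chunks))
```

A round trip through `compress_checkpoint` and `decompress_checkpoint` should return the original bytes; in a fuller version the compressed chunks would be written to and read back from a storage device between the two calls.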
-
Patent number: 10776185
Abstract: Techniques are disclosed for efficient handling of messages in computing systems that include tag matching capable hardware. A message management module provides for handling message events including application receives and channel notifications such that hardware tag matching can continuously run in hardware channels, such as network adapters. When the message event is an application receive the message management module adds the application receive to a tracking queue and determines if the application receive can be posted to a hardware channel capable of tag matching. When the message event is a channel notification, the message management module determines a message action using the message tracking queue and the information in the channel notification.
Type: Grant
Filed: December 10, 2018
Date of Patent: September 15, 2020
Assignee: International Business Machines Corporation
Inventors: Sameh S. Sharkawi, Sameer Kumar, Bryan S. Rosenburg
-
Publication number: 20200183764
Abstract: Techniques are disclosed for efficient handling of messages in computing systems that include tag matching capable hardware. A message management module provides for handling message events including application receives and channel notifications such that hardware tag matching can continuously run in hardware channels, such as network adapters. When the message event is an application receive the message management module adds the application receive to a tracking queue and determines if the application receive can be posted to a hardware channel capable of tag matching. When the message event is a channel notification, the message management module determines a message action using the message tracking queue and the information in the channel notification.
Type: Application
Filed: December 10, 2018
Publication date: June 11, 2020
Inventors: Sameh S. SHARKAWI, Sameer KUMAR, Bryan S. ROSENBURG
-
Publication number: 20200110670
Abstract: An illustrative embodiment includes a method for checkpointing and restarting an application executing at least in part on one or more central processing units coupled to one or more hardware accelerators. The method comprises checkpointing the application at least in part by: transferring checkpoint data of the application to the one or more hardware accelerators; performing distributed compression of the application checkpoint data at least in part using the one or more hardware accelerators; and writing the compressed application checkpoint data to a storage device. The method further comprises restarting the application at least in part by: reading the compressed application checkpoint data from the storage device; transferring the compressed checkpoint data to one or more hardware accelerators; and performing distributed decompression of the application checkpoint data at least in part using said one or more hardware accelerators.
Type: Application
Filed: October 9, 2018
Publication date: April 9, 2020
Inventors: FAUSTO ARTICO, BRYAN S. ROSENBURG
-
Publication number: 20190332318
Abstract: A data management system and method for accelerating shared file checkpointing. Written application data is aggregated in an application data file created in a local burst buffer memory at a compute node, and an associated data mapping built index to maintain information related to the offsets into a shared file at which segments of the application data is to be stored in a parallel file system, and where in the buffer those segments are located. The node asynchronously transfers a data file containing the application data and the associated data mapping index to a file server for shared file storage. The data management system and method further accelerates shared file checkpointing in which a shared file, together with a map file that specifies how the shared file is to be distributed, is asynchronously transferred to local burst buffer memories at the nodes to accelerate reading of the shared file.
Type: Application
Filed: April 26, 2018
Publication date: October 31, 2019
Inventors: Thomas Gooding, Pierre Lemarinier, Bryan S. Rosenburg
-
Patent number: 10289329
Abstract: A method, data processing system and program product utilize dynamic logical storage volume sizing for burst buffers or other local storage for computing nodes to optimize job stage in, execution and/or stage out.
Type: Grant
Filed: February 15, 2017
Date of Patent: May 14, 2019
Assignee: International Business Machines Corporation
Inventors: Thomas M. Gooding, David L. Hermsmeier, Jin Ma, Gary J. Mincher, Bryan S. Rosenburg
-
Patent number: 10268384
Abstract: Techniques for transferring files between machines include creating a zero-length target file on non-volatile storage, truncating the file to a desired size, and allocating storage on the non-volatile storage for each block of the target file. The technique also includes determining a logical block address (LBA) for each location in the target file. The technique further includes sending a request to an input/output (I/O) node to transfer a source file to the non-volatile storage, where the request includes a mapping between the LBAs and file offsets. The technique includes opening the source file and a block device at the I/O node. The technique further includes reading each block from the source file and writing each block to the target file on the non-volatile storage utilizing the block device, and then closing the source file and the block device.
Type: Grant
Filed: September 16, 2016
Date of Patent: April 23, 2019
Assignee: International Business Machines Corporation
Inventors: Michael E. Aho, Thomas M. Gooding, Bryan S. Rosenburg
-
Patent number: 10141955
Abstract: A method for providing selective memory error protection responsive to a predictable failure notification associated with at least one portion of a memory in a computing system includes: obtaining an active error correcting code (ECC) configuration corresponding to the portion of the memory; determining whether the active ECC configuration is sufficient to correct at least one error in the portion of the memory affected by the predictable failure notification; when the active ECC configuration is insufficient to correct the error, determining whether data corruption can be tolerated by an application running on the computing system; when data corruption cannot be tolerated by the application, determining whether a stronger ECC level is available and, if a stronger ECC level is available, increasing a strength of the active ECC configuration; and when data corruption can be tolerated, performing page reassignment and aggregation of non-critical data.
Type: Grant
Filed: April 11, 2015
Date of Patent: November 27, 2018
Assignee: International Business Machines Corporation
Inventors: Carlos H. Andrade Costa, Chen-Yong Cher, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu
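The decision sequence in this abstract reads naturally as a small branching function. The sketch below is only an illustration: modeling an ECC configuration as an integer "number of correctable errors", the function name `choose_action`, and the returned action labels are all assumptions, and the final branch (no stronger ECC available) is not covered by the abstract.

```python
def choose_action(active_ecc_level, errors_to_correct,
                  app_tolerates_corruption, max_ecc_level):
    """Pick a response to a predicted memory failure (hypothetical model)."""
    # Step 1: is the active ECC configuration already sufficient?
    if active_ecc_level >= errors_to_correct:
        return "keep"
    # Step 2: if the application tolerates corruption, move and aggregate
    # non-critical data instead of strengthening ECC.
    if app_tolerates_corruption:
        return "reassign_pages"
    # Step 3: otherwise strengthen ECC when a stronger level exists.
    if active_ecc_level < max_ecc_level:
        return "increase_ecc"
    # Not described in the abstract: no stronger ECC level is available.
    return "unresolved"
```

For example, with an active level of 1, two errors to correct, an intolerant application, and a maximum level of 2, this sketch selects `"increase_ecc"`.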
-
Patent number: 10073739
Abstract: A method for selective duplication of subtasks in a high-performance computing system includes: monitoring a health status of one or more nodes in a high-performance computing system, where one or more subtasks of a parallel task execute on the one or more nodes; identifying one or more nodes as having a likelihood of failure which exceeds a first prescribed threshold; selectively duplicating the one or more subtasks that execute on the one or more nodes having a likelihood of failure which exceeds the first prescribed threshold; and notifying a messaging library that one or more subtasks were duplicated.
Type: Grant
Filed: December 2, 2015
Date of Patent: September 11, 2018
Assignee: International Business Machines Corporation
Inventors: Carlos H. Andrade Costa, Chen-Yong Cher, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu
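The selection step in this abstract (find nodes above a failure-likelihood threshold, duplicate the subtasks placed on them, then notify the messaging library) can be sketched as follows. The data shapes are assumptions: node health is reduced to a single likelihood per node, placement is a subtask-to-node dictionary, and the messaging library is represented by a plain callback named `notify`.

```python
from typing import Callable, Dict, List


def select_for_duplication(failure_likelihood: Dict[str, float],
                           placement: Dict[str, str],
                           threshold: float) -> List[str]:
    """Return subtasks running on nodes whose failure likelihood exceeds threshold."""
    at_risk = {node for node, p in failure_likelihood.items() if p > threshold}
    return sorted(task for task, node in placement.items() if node in at_risk)


def duplicate_and_notify(failure_likelihood: Dict[str, float],
                         placement: Dict[str, str],
                         threshold: float,
                         notify: Callable[[List[str]], None]) -> List[str]:
    """Select subtasks to duplicate and tell the (hypothetical) messaging layer."""
    duplicated = select_for_duplication(failure_likelihood, placement, threshold)
    if duplicated:
        notify(duplicated)  # stand-in for notifying a messaging library
    return duplicated
```

Actually launching the duplicate subtasks is left out of this sketch; only the selection and notification steps are shown.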
-
Publication number: 20180232143
Abstract: A method, data processing system and program product utilize dynamic logical storage volume sizing for burst buffers or other local storage for computing nodes to optimize job stage in, execution and/or stage out.
Type: Application
Filed: February 15, 2017
Publication date: August 16, 2018
Inventors: Thomas M. Gooding, David L. Hermsmeier, Jin Ma, Gary J. Mincher, Bryan S. Rosenburg
-
Patent number: 10007242
Abstract: A computer detects a request by a process for access to a shadow control page, wherein the shadow control page allows the process access to one or more devices. The computer assigns the shadow control page and a key to the process associated with the request. The computer detects a request by the process via the assigned shadow control page for creation of a subset of devices from the one or more devices. The computer inputs information detailing an association between the subset of devices and the assigned key into a subset definition table, wherein the subset definition table includes one or more keys and one or more corresponding subsets.
Type: Grant
Filed: June 11, 2015
Date of Patent: June 26, 2018
Assignee: International Business Machines Corporation
Inventors: Thomas W. Fox, Hans M. Jacobson, Ravi Nair, Bryan S. Rosenburg
-
Publication number: 20180097712
Abstract: A method for managing a network queue memory includes receiving sensor information about the network queue memory, predicting a memory failure in the network queue memory based on the sensor information, and outputting a notification through a plurality of nodes forming a network and using the network queue memory, the notification configuring communications between the nodes.
Type: Application
Filed: November 19, 2017
Publication date: April 5, 2018
Inventors: Carlos H. Andrade Costa, Chen-Yong Cher, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu
-
Publication number: 20180081540
Abstract: Techniques for transferring files between machines include creating a zero-length target file on non-volatile storage, truncating the file to a desired size, and allocating storage on the non-volatile storage for each block of the target file. The technique also includes determining a logical block address (LBA) for each location in the target file. The technique further includes sending a request to an input/output (I/O) node to transfer a source file to the non-volatile storage, where the request includes a mapping between the LBAs and file offsets. The technique includes opening the source file and a block device at the I/O node. The technique further includes reading each block from the source file and writing each block to the target file on the non-volatile storage utilizing the block device, and then closing the source file and the block device.
Type: Application
Filed: September 16, 2016
Publication date: March 22, 2018
Inventors: Michael E. AHO, Thomas M. GOODING, Bryan S. ROSENBURG
-
Patent number: 9916678
Abstract: Data processing techniques are provided to increase a computational speed of iterative computations that are performed over a domain of data points, such as stencil computations. For example, a method includes loading a set of domain data points into a cache memory; obtaining an iteration count T, and a base stencil operator having a first set of coefficients; generating a convolved stencil operator having a second set of coefficients, wherein the convolved stencil operator is generated by convolving the base stencil operator with itself at least one time; and iteratively processing the set of domain data points in the cache memory using the convolved stencil operator no more than T/2 iterations to obtain final processing results. The final processing results are similar to processing results that would be obtained by iteratively processing the set of domain data points using the base stencil operator for the iteration count T.
Type: Grant
Filed: August 7, 2017
Date of Patent: March 13, 2018
Assignee: International Business Machines Corporation
Inventors: Guilherme C. Januario, Yoonho Park, Bryan S. Rosenburg
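The core idea, that one pass of a stencil convolved with itself does the work of two passes of the base stencil, can be checked with a one-dimensional sketch. The assumptions here are a periodic domain, an odd-length stencil, plain Python lists rather than cache-resident data, and the helper names `convolve`, `apply_stencil`, and `iterate`; none of these come from the patent.

```python
def convolve(a, b):
    """Full discrete convolution of two coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def apply_stencil(u, coeffs):
    """One stencil pass on a periodic 1D domain; coeffs must have odd length."""
    n = len(u)
    r = len(coeffs) // 2  # stencil radius
    return [sum(c * u[(i + k - r) % n] for k, c in enumerate(coeffs))
            for i in range(n)]


def iterate(u, coeffs, steps):
    """Apply the same stencil repeatedly."""
    for _ in range(steps):
        u = apply_stencil(u, coeffs)
    return u
```

With a base stencil such as `[0.25, 0.5, 0.25]`, `convolve(base, base)` yields a five-point stencil, and on a periodic domain `iterate(u, convolved, T // 2)` should agree with `iterate(u, base, T)` up to floating-point rounding for even `T`. With non-periodic boundaries the two would generally differ near the edges, which is consistent with the abstract describing the results as "similar".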
-
Publication number: 20170358124
Abstract: Data processing techniques are provided to increase a computational speed of iterative computations that are performed over a domain of data points, such as stencil computations. For example, a method includes loading a set of domain data points into a cache memory; obtaining an iteration count T, and a base stencil operator having a first set of coefficients; generating a convolved stencil operator having a second set of coefficients, wherein the convolved stencil operator is generated by convolving the base stencil operator with itself at least one time; and iteratively processing the set of domain data points in the cache memory using the convolved stencil operator no more than T/2 iterations to obtain final processing results. The final processing results are similar to processing results that would be obtained by iteratively processing the set of domain data points using the base stencil operator for the iteration count T.
Type: Application
Filed: August 7, 2017
Publication date: December 14, 2017
Inventors: Guilherme C. Januario, Yoonho Park, Bryan S. Rosenburg
-
Patent number: 9825827
Abstract: A method for managing a network queue memory includes receiving sensor information about the network queue memory, predicting a memory failure in the network queue memory based on the sensor information, and outputting a notification through a plurality of nodes forming a network and using the network queue memory, the notification configuring communications between the nodes.
Type: Grant
Filed: August 7, 2014
Date of Patent: November 21, 2017
Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Carlos H. Andrade Costa, Chen-Yong Cher, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu
-
Publication number: 20170192937
Abstract: Data processing techniques are provided to increase a computational speed of iterative computations that are performed over a domain of data points, such as stencil computations. For example, a method includes loading a set of domain data points into a cache memory; obtaining an iteration count T, and a base stencil operator having a first set of coefficients; generating a convolved stencil operator having a second set of coefficients, wherein the convolved stencil operator is generated by convolving the base stencil operator with itself at least one time; and iteratively processing the set of domain data points in the cache memory using the convolved stencil operator no more than T/2 iterations to obtain final processing results. The final processing results are similar to processing results that would be obtained by iteratively processing the set of domain data points using the base stencil operator for the iteration count T.
Type: Application
Filed: December 31, 2015
Publication date: July 6, 2017
Inventors: Guilherme C. Januario, Yoonho Park, Bryan S. Rosenburg
-
Patent number: 9535774
Abstract: A method for providing notification of a predictable memory failure includes the steps of: obtaining information regarding at least one condition associated with a memory; calculating a memory failure probability as a function of the obtained information; calculating a failure probability threshold; and generating a signal when the memory failure probability exceeds the failure probability threshold, the signal being indicative of a predicted future memory failure.
Type: Grant
Filed: September 9, 2013
Date of Patent: January 3, 2017
Assignee: International Business Machines Corporation
Inventors: Chen-Yong Cher, Carlos H. Andrade Costa, Yoonho Park, Bryan S. Rosenburg, Kyung D. Ryu
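The four steps in this abstract (obtain condition information, compute a failure probability, compute a threshold, signal when the probability exceeds it) map onto a very small sketch. Everything specific below is invented for illustration: the choice of correctable-error count and temperature as the observed conditions, the logistic model and its weights, and the `criticality` parameter used to derive the threshold.

```python
import math


def failure_probability(correctable_errors, temperature_c):
    """Hypothetical logistic model mapping observed conditions to a probability."""
    score = 0.05 * correctable_errors + 0.04 * (temperature_c - 60.0)
    return 1.0 / (1.0 + math.exp(-score))


def failure_threshold(criticality):
    """Hypothetical threshold: more critical workloads (0..1) get a lower threshold."""
    return 0.9 - 0.4 * criticality


def predict_failure(correctable_errors, temperature_c, criticality=0.5):
    """Return True (the 'signal') when the probability exceeds the threshold."""
    p = failure_probability(correctable_errors, temperature_c)
    return p > failure_threshold(criticality)
```

In this sketch a cool memory with no correctable errors stays below the threshold, while a hot memory with many correctable errors raises the signal; a real predictor would be fitted to measured failure data rather than hand-picked weights.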