Patents by Inventor Abdullah Kayi

Abdullah Kayi has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11886969
    Abstract: Embodiments of a method are disclosed. The method includes performing distributed deep learning training on a batch of training data. The method also includes determining training times representing an amount of time between a beginning batch time and an end batch time. Further, the method includes modifying a communication aspect of the communication straggler to reduce a future network communication time for the communication straggler to send a future result of the distributed deep learning training on a new batch of training data in response to the centralized parameter server determining that the learner is the communication straggler.
    Type: Grant
    Filed: July 9, 2020
    Date of Patent: January 30, 2024
    Assignee: International Business Machines Corporation
    Inventors: Wei Zhang, Xiaodong Cui, Abdullah Kayi, Alper Buyuktosunoglu
  • Patent number: 11875256
    Abstract: Embodiments of a method are disclosed. The method includes performing decentralized distributed deep learning training on a batch of training data. Additionally, the method includes determining a training time wherein the learner performs the decentralized distributed deep learning training on the batch of training data. Further, the method includes generating a table having the training time and other processing times for corresponding other learners performing the decentralized distributed deep learning training on corresponding other batches of other training data. The method also includes determining that the learner is a straggler based on the table and a threshold for the training time. Additionally, the method includes modifying a processing aspect of the straggler to reduce a future training time of the straggler for performing the decentralized distributed deep learning training on a new batch of training data in response to determining the learner is the straggler.
    Type: Grant
    Filed: July 9, 2020
    Date of Patent: January 16, 2024
    Assignee: International Business Machines Corporation
    Inventors: Wei Zhang, Xiaodong Cui, Abdullah Kayi, Alper Buyuktosunoglu
  • Patent number: 11836220
    Abstract: Systems, computer-implemented methods, and computer program products to facilitate updating, such as averaging and/or training, of one or more statistical sets are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can include a computing component that averages a statistical set, provided by the system, with an additional statistical set, that is compatible with the statistical set, to compute an averaged statistical set, where the additional statistical set is obtained from a selected additional system of a plurality of additional systems. The computer executable components also can include a selecting component that selects the selected additional system according to a randomization pattern.
    Type: Grant
    Filed: March 1, 2023
    Date of Patent: December 5, 2023
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Wei Zhang, Mingrui Liu, Abdullah Kayi, Youssef Mroueh, Alper Buyuktosunoglu
  • Publication number: 20230205843
    Abstract: Systems, computer-implemented methods, and computer program products to facilitate updating, such as averaging and/or training, of one or more statistical sets are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can include a computing component that averages a statistical set, provided by the system, with an additional statistical set, that is compatible with the statistical set, to compute an averaged statistical set, where the additional statistical set is obtained from a selected additional system of a plurality of additional systems. The computer executable components also can include a selecting component that selects the selected additional system according to a randomization pattern.
    Type: Application
    Filed: March 1, 2023
    Publication date: June 29, 2023
    Inventors: Xiaodong Cui, Wei Zhang, Mingrui Liu, Abdullah Kayi, Youssef Mroueh, Alper Buyuktosunoglu
  • Patent number: 11651293
    Abstract: Embodiments of a method are disclosed. The method includes performing a batch of decentralized deep learning training for a machine learning model in coordination with multiple local homogenous learners on a deep learning training compute node, and in coordination with multiple super learners on corresponding deep learning training compute nodes. The method also includes exchanging communications with the super learners in accordance with an asynchronous decentralized parallel stochastic gradient descent (ADPSGD) protocol. The communications are associated with the batch of deep learning training.
    Type: Grant
    Filed: July 22, 2020
    Date of Patent: May 16, 2023
    Assignee: International Business Machines Corporation
    Inventors: Wei Zhang, Xiaodong Cui, Abdullah Kayi, Alper Buyuktosunoglu
  • Patent number: 11636280
    Abstract: Systems, computer-implemented methods, and computer program products to facilitate updating, such as averaging and/or training, of one or more statistical sets are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can include a computing component that averages a statistical set, provided by the system, with an additional statistical set, that is compatible with the statistical set, to compute an averaged statistical set, where the additional statistical set is obtained from a selected additional system of a plurality of additional systems. The computer executable components also can include a selecting component that selects the selected additional system according to a randomization pattern.
    Type: Grant
    Filed: January 27, 2021
    Date of Patent: April 25, 2023
    Assignee: International Business Machines Corporation
    Inventors: Xiaodong Cui, Wei Zhang, Mingrui Liu, Abdullah Kayi, Youssef Mroueh, Alper Buyuktosunoglu
  • Publication number: 20230080480
    Abstract: A system comprises compute nodes distributed over a network and configured to perform a pipeline parallel process. The system also comprises an extended memory comprising a global virtual address space which is shared by the compute nodes. The extended memory is configured to enable the compute nodes to exchange data over the network when the compute nodes perform the pipeline parallel process.
    Type: Application
    Filed: September 13, 2021
    Publication date: March 16, 2023
    Inventors: Abdullah Kayi, Tayfun Gokmen
  • Patent number: 11561844
    Abstract: An approach is disclosed that configures a computer system node from components that are each connected to an intra-node network. The configuring is performed by selecting a set of components, including at least one processor, and assigning each of the components a different address range within the node. An operating system is run on the processor included in the node with the operating system accessing each of the assigned components.
    Type: Grant
    Filed: June 12, 2020
    Date of Patent: January 24, 2023
    Assignee: International Business Machines Corporation
    Inventors: James A. Kahle, Charles R. Johns, Constantinos Evangelinos, Abdullah Kayi
  • Publication number: 20220327374
    Abstract: Computer hardware and/or software that performs the following operations: (i) updating a machine learning model by synchronously applying, to the machine learning model, a first set of training results received from a set of trainers having respective training datasets; (ii) receiving, from one or more trainers of the set of trainers, a first set of metrics pertaining to at least some of the training results of the first set of training results; and (iii) based, at least in part, on the first set of metrics, determining to subsequently update the machine learning model via asynchronous application of subsequent training results received from respective trainers of the set of trainers.
    Type: Application
    Filed: April 9, 2021
    Publication date: October 13, 2022
    Inventors: Abdullah Kayi, Wei Zhang, Xiaodong Cui, Alper Buyuktosunoglu
  • Publication number: 20220245397
    Abstract: Systems, computer-implemented methods, and computer program products to facilitate updating, such as averaging and/or training, of one or more statistical sets are provided. According to an embodiment, a system can comprise a memory that stores computer executable components and a processor that executes the computer executable components stored in the memory. The computer executable components can include a computing component that averages a statistical set, provided by the system, with an additional statistical set, that is compatible with the statistical set, to compute an averaged statistical set, where the additional statistical set is obtained from a selected additional system of a plurality of additional systems. The computer executable components also can include a selecting component that selects the selected additional system according to a randomization pattern.
    Type: Application
    Filed: January 27, 2021
    Publication date: August 4, 2022
    Inventors: Xiaodong Cui, Wei Zhang, Mingrui Liu, Abdullah Kayi, Youssef Mroueh, Alper Buyuktosunoglu
  • Patent number: 11288194
    Abstract: An approach is disclosed that maintains a consistent view of a virtual address by a local node which writes a first value to the virtual address and, after writing the first value, establishes a snapshot consistency state of the virtual address. The virtual address is shared amongst any number of processes and the processes include a writing process and other processes that read from the virtual address. After writing the first value, the writing process writes a second value to the virtual address. Even after writing the second value, the first value is still visible to the other processes.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: March 29, 2022
    Assignee: International Business Machines Corporation
    Inventors: Charles R. Johns, James A. Kahle, Martin Ohmacht, Changhoan Kim, Jose R. Brunheroto, Constantinos Evangelinos, Abdullah Kayi, Alessandro Morari, James C. Sexton, Patrick D. Siegl
  • Publication number: 20220027796
    Abstract: Embodiments of a method are disclosed. The method includes performing a batch of decentralized deep learning training for a machine learning model in coordination with multiple local homogenous learners on a deep learning training compute node, and in coordination with multiple super learners on corresponding deep learning training compute nodes. The method also includes exchanging communications with the super learners in accordance with an asynchronous decentralized parallel stochastic gradient descent (ADPSGD) protocol. The communications are associated with the batch of deep learning training.
    Type: Application
    Filed: July 22, 2020
    Publication date: January 27, 2022
    Inventors: Wei Zhang, Xiaodong Cui, Abdullah Kayi, Alper Buyuktosunoglu
  • Publication number: 20220012629
    Abstract: Embodiments of a method are disclosed. The method includes performing distributed deep learning training on multiple batches of training data using corresponding learners. Additionally, the method includes determining training times wherein the learners perform the distributed deep learning training on the batches of training data. The method also includes modifying a processing aspect of the straggler to reduce a future training time of the straggler for performing the distributed deep learning training on a new batch of training data in response to identifying a straggler of the learners by a centralized control.
    Type: Application
    Filed: July 9, 2020
    Publication date: January 13, 2022
    Inventors: Wei Zhang, Xiaodong Cui, Abdullah Kayi, Alper Buyuktosunoglu
  • Publication number: 20220012642
    Abstract: Embodiments of a method are disclosed. The method includes performing distributed deep learning training on a batch of training data. The method also includes determining training times representing an amount of time between a beginning batch time and an end batch time. Further, the method includes modifying a communication aspect of the communication straggler to reduce a future network communication time for the communication straggler to send a future result of the distributed deep learning training on a new batch of training data in response to the centralized parameter server determining that the learner is the communication straggler.
    Type: Application
    Filed: July 9, 2020
    Publication date: January 13, 2022
    Inventors: Wei Zhang, Xiaodong Cui, Abdullah Kayi, Alper Buyuktosunoglu
  • Publication number: 20220012584
    Abstract: Embodiments of a method are disclosed. The method includes performing decentralized distributed deep learning training on a batch of training data. Additionally, the method includes determining a training time wherein the learner performs the decentralized distributed deep learning training on the batch of training data. Further, the method includes generating a table having the training time and other processing times for corresponding other learners performing the decentralized distributed deep learning training on corresponding other batches of other training data. The method also includes determining that the learner is a straggler based on the table and a threshold for the training time. Additionally, the method includes modifying a processing aspect of the straggler to reduce a future training time of the straggler for performing the decentralized distributed deep learning training on a new batch of training data in response to determining the learner is the straggler.
    Type: Application
    Filed: July 9, 2020
    Publication date: January 13, 2022
    Inventors: Wei Zhang, Xiaodong Cui, Abdullah Kayi, Alper Buyuktosunoglu
  • Publication number: 20210142153
    Abstract: Embodiments are directed to forming and training a resistive processing unit (RPU) system. The RPU system is formed from a plurality of RPU tiles, whereby the RPU tiles are the atomic building block of the RPU system. The plurality of RPU tiles is configured as a plurality of RPU chips. The plurality of RPU compute nodes is formed from the plurality of RPU chips. The plurality of RPU compute nodes can further be connected by a low latency, high speed network. The RPU system is trained for an artificial neural network model using the atomic matrix operations of a forward cycle, backward cycle, and matrix update.
    Type: Application
    Filed: November 7, 2019
    Publication date: May 13, 2021
    Inventors: Tayfun Gokmen, Abdullah Kayi
  • Patent number: 10956125
    Abstract: Methods and systems for shuffling data are described. A processor may generate pair data from source data. The processor may insert the pair data into local tuple spaces. In response to a request for a particular key, the processor may determine a presence of the requested key in a global tuple space. The processor may, in response to a presence of the requested key in the global tuple space, update the global tuple space. The update may be based on the pair data among the local tuple spaces including the existing key. The processor may, in response to an absence of the requested key in the global tuple space, insert pair data including the missing key from the local tuple spaces into the global tuple space. The processor may fetch the requested pair data, and may shuffle the fetched data to generate a dataset.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: March 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Carlos Henrique Andrade Costa, Abdullah Kayi, Yoonho Park, Charles Johns
  • Patent number: 10915460
    Abstract: An approach is described that accesses data in a shared memory that is shared amongst nodes that include a local node and remote nodes. The local node receives a name corresponding to a named data element in a Coordination Namespace, the Coordination Namespace having been created in a memory distributed amongst the nodes. A hash function is applied to at least a portion of the name with a result of the hash function being a natural node indicator. Data corresponding to the named data element is requested from a natural node identified by the indicator. Based on the request, a response is received from the natural node.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: February 9, 2021
    Assignee: International Business Machines Corporation
    Inventors: Ravi Nair, Charles R. Johns, James A. Kahle, Changhoan Kim, Jose R. Brunheroto, Constantinos Evangelinos, Abdullah Kayi, Patrick D. Siegl
  • Patent number: 10891274
    Abstract: Methods and systems for shuffling data to generate a dataset are described. A first map module may generate first pair data, and a second map module may generate second pair data, from source data. The first map module may insert the first pair data into a first local tuple space accessible to the first map module. The second map module may insert the second pair data into a second local tuple space accessible to the second map module. A shuffle module may request pair data that includes a particular key. The first and second pair data may be inserted into a global tuple space accessible by the first and second map modules. The shuffle module may identify the requested pair data in the global tuple space, and may fetch the identified pair data from a memory. The shuffle module may shuffle the fetched pair data to generate the dataset.
    Type: Grant
    Filed: December 21, 2017
    Date of Patent: January 12, 2021
    Assignee: International Business Machines Corporation
    Inventors: Abdullah Kayi, Carlos Henrique Andrade Costa, Yoonho Park, Charles Johns
  • Publication number: 20200364094
    Abstract: An approach is disclosed that configures a computer system node from components that are each connected to an intra-node network. The configuring is performed by selecting a set of components, including at least one processor, and assigning each of the components a different address range within the node. An operating system is run on the processor included in the node with the operating system accessing each of the assigned components.
    Type: Application
    Filed: June 12, 2020
    Publication date: November 19, 2020
    Inventors: James A. Kahle, Charles R. Johns, Constantinos Evangelinos, Abdullah Kayi
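
Several abstracts in this listing (for example, patent number 11875256) describe building a table of per-learner training times and flagging a learner as a straggler when its time crosses a threshold. The following is a minimal sketch of that idea only, not the patented method. The function name `find_stragglers`, the use of the median as a baseline, the `threshold_ratio` parameter, and the sample timings are all assumptions made for illustration; the abstracts do not specify how the threshold is defined.

```python
from statistics import median


def find_stragglers(training_times, threshold_ratio=1.5):
    """Return the ids of learners whose training time exceeds the threshold.

    training_times: dict mapping a learner id to its batch training time in seconds.
    threshold_ratio: hypothetical multiplier applied to the median time
                     (an assumption; the abstracts do not define the threshold).
    """
    baseline = median(training_times.values())
    return sorted(
        learner
        for learner, seconds in training_times.items()
        if seconds > threshold_ratio * baseline
    )


# Hypothetical table of per-batch training times, keyed by learner id.
table = {
    "learner-0": 1.0,
    "learner-1": 1.1,
    "learner-2": 2.4,
    "learner-3": 0.9,
}

stragglers = find_stragglers(table)
print(stragglers)
```

With these sample timings the median is 1.05 seconds, so only `learner-2` exceeds the assumed 1.5x threshold. The follow-up step the abstracts describe, modifying a processing or communication aspect of the straggler, is deliberately left out because the abstracts give no detail on it.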