Patents by Inventor Yufei Ren

Yufei Ren has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 10643147
    Abstract: Version vector-based rules are used to facilitate asynchronous execution of machine learning algorithms. The method uses version vector-based rules to generate aggregated parameters and to determine when to return them. The method also includes coordinating the versions of aggregated parameter sets among all of the parameter servers, which allows broadcasting to enforce version consistency and generating parameter sets on demand to facilitate version control. Furthermore, the method includes enhancing version consistency on the learner's side and resolving inconsistent versions when mismatching versions are detected.
    Type: Grant
    Filed: May 31, 2016
    Date of Patent: May 5, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michel H. T. Hack, Yufei Ren, Yandong Wang, Li Zhang
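The version-consistency idea in the abstract above can be sketched in a few lines (all names here are hypothetical; this illustrates version-mismatch detection and on-demand aggregation, not the patented implementation):

```python
# Hypothetical sketch: a parameter server tracks a global parameter
# version, rejects learner updates computed against a stale version,
# and advances the version each time it aggregates.

class ParameterServer:
    def __init__(self):
        self.version = 0    # current global parameter version
        self.pending = {}   # learner_id -> update awaiting aggregation

    def submit(self, learner_id, version, update):
        """Accept an update only if it was computed against the current
        version; a mismatch means the learner must refresh first."""
        if version != self.version:
            return False    # stale: mismatching version detected
        self.pending[learner_id] = update
        return True

    def aggregate(self):
        """Average pending updates on demand and advance the version."""
        if not self.pending:
            return None
        agg = sum(self.pending.values()) / len(self.pending)
        self.pending.clear()
        self.version += 1
        return agg
```

A learner that submits against an old version is turned away rather than silently aggregated, which is one simple way to enforce the consistency the abstract describes.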
  • Patent number: 10614356
    Abstract: A network interface controller of a machine receives a packet including at least one model parameter of a neural network model from a server. The packet includes a virtual address associated with the network interface controller, and the machine further includes a plurality of graphics processing units coupled to the network interface controller by a bus. The network interface controller translates the virtual address to a memory address associated with each of the plurality of graphics processing units. The network interface controller broadcasts the at least one model parameter to the memory address associated with each of the plurality of graphics processing units.
    Type: Grant
    Filed: April 24, 2017
    Date of Patent: April 7, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Minwei Feng, Yufei Ren, Yandong Wang, Li Zhang, Wei Zhang
  • Publication number: 20200051201
    Abstract: A computer-implemented topology-aware all-reduce method for an environment including a plurality of systems is provided. Each system of the systems includes a plurality of computing modules. The computer-implemented topology-aware all-reduce method according to aspects of the invention includes locally partitioning and scattering data slices among the computing modules of each system to produce local summation results. The local summation results are copied from the computing modules to corresponding host memories of the systems. A cross-system all-reduce operation is executed among the systems to cause an exchange of the local summation results across the host memories and a determination of final summation partitions from the local summation results. The final summation partitions are copied from the host memories to the corresponding computing modules of each system.
    Type: Application
    Filed: August 8, 2018
    Publication date: February 13, 2020
    Inventors: Li Zhang, Xingbo Wu, Wei Zhang, Yufei Ren
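The two-level scheme described in the abstract can be simulated with plain lists (a hypothetical sketch only: real implementations run the phases over NVLink/PCIe and the network, not in one process):

```python
# Hypothetical sketch of topology-aware all-reduce: reduce within each
# system first, then all-reduce the per-system sums across systems, so
# only one value per element crosses the slower inter-system link.

def hierarchical_allreduce(systems):
    """systems: list of systems; each system is a list of per-GPU
    value lists, all of the same length."""
    # Phase 1: local reduction inside each system (fast intra-node bus).
    local_sums = [
        [sum(gpu[i] for gpu in system) for i in range(len(system[0]))]
        for system in systems
    ]
    # Phase 2: cross-system all-reduce over the host-memory copies.
    final = [sum(s[i] for s in local_sums) for i in range(len(local_sums[0]))]
    # Phase 3: copy the final partitions back to every computing module.
    return [[final[:] for _ in system] for system in systems]
```

With two systems of two GPUs each, every GPU ends up holding the global sum while only the per-system partial sums ever cross the inter-system boundary.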
  • Patent number: 10423575
    Abstract: Computational storage techniques for distributed computing are disclosed. A computational storage server receives input from multiple clients, which is used by the server when executing one or more computation functions. The computational storage server can aggregate multiple client inputs before applying one or more computation functions. The computational storage server sets up: a first memory area for storing input received from multiple clients; a second memory area designated for storing the computation functions to be executed by the computational storage server using the input data received from the multiple clients; a client-specific memory management area for storing metadata related to computations performed by the computational storage server for specific clients; and a persistent storage area for storing checkpoints associated with aggregating computations performed by the computation functions.
    Type: Grant
    Filed: March 2, 2017
    Date of Patent: September 24, 2019
    Assignee: International Business Machines Corporation
    Inventors: Michel H. T. Hack, Yufei Ren, Wei Tan, Yandong Wang, Xingbo Wu, Li Zhang, Wei Zhang
  • Publication number: 20190205728
    Abstract: A method for providing a graphical visualization of a neural network to a user is provided. The method includes generating the graphical visualization of the neural network at least in part by: representing layers of the neural network as respective three-dimensional blocks, wherein at least a first dimension of a given block is proportional to a computational complexity of a layer of the neural network represented by the given block; and representing data flows between the layers of the neural network as respective three-dimensional structures connecting blocks representing the layers of the neural network, wherein a first dimension of a given structure is proportional to each of a first dimension and a second dimension of a data flow represented by the given structure. The method also includes displaying the graphical visualization of the neural network to the user.
    Type: Application
    Filed: December 28, 2017
    Publication date: July 4, 2019
    Inventors: Minwei Feng, Yufei Ren, Yandong Wang, Li Zhang, Wei Zhang
  • Patent number: 10324890
    Abstract: A cache management system performs cache management in a Remote Direct Memory Access (RDMA) key-value data store. The cache management system receives a request from at least one client configured to access a data item stored in a data location of a remote server, and determines a popularity of the data item based on a frequency at which the data location is accessed by the at least one client. The system is further configured to determine a lease period of the data item based on the frequency and to assign the lease period to the data location.
    Type: Grant
    Filed: June 25, 2018
    Date of Patent: June 18, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michel H. Hack, Yufei Ren, Yandong Wang, Li Zhang
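The popularity-to-lease mapping in the abstract above can be illustrated with a small sketch (all names and the linear scaling rule are hypothetical illustrations, not the patented method):

```python
# Hypothetical sketch: the lease granted on a cached data location
# grows with how frequently clients access it, so popular items stay
# leased longer before the server may reclaim or relocate them.

import time

class LeaseCache:
    def __init__(self, base_lease=1.0, max_lease=60.0):
        self.base_lease = base_lease  # lease (seconds) for a first access
        self.max_lease = max_lease    # cap on any lease period
        self.hits = {}                # location -> access count (popularity)
        self.leases = {}              # location -> lease expiry time

    def access(self, location, now=None):
        """Record a client access and return the new lease period."""
        now = time.monotonic() if now is None else now
        self.hits[location] = self.hits.get(location, 0) + 1
        # Lease period scales with observed frequency, capped at max_lease.
        period = min(self.base_lease * self.hits[location], self.max_lease)
        self.leases[location] = now + period
        return period
```

Each repeated access lengthens the lease, which is one simple way a frequency signal can drive lease assignment.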
  • Patent number: 10318199
    Abstract: A method for compressing a group of key-value pairs, the method including dividing the group of key-value pairs into a plurality of segments, creating a plurality of blocks, each block of the plurality of blocks corresponding to a segment of the plurality of segments, and compressing each block of the plurality of blocks.
    Type: Grant
    Filed: November 18, 2015
    Date of Patent: June 11, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michel H. T. Hack, Yufei Ren, Yandong Wang, Xingbo Wu, Li Zhang
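The segment-and-compress scheme in the abstract above is easy to sketch (function names and the use of `zlib`/JSON are illustrative assumptions, not the claimed method):

```python
# Hypothetical sketch: split a group of key-value pairs into fixed-size
# segments, compress each segment into its own block, and decompress a
# single block on demand without touching the others.

import json
import zlib

def compress_group(pairs, segment_size):
    """pairs: list of (key, value) tuples; returns one compressed
    block per segment of at most segment_size pairs."""
    blocks = []
    for start in range(0, len(pairs), segment_size):
        segment = pairs[start:start + segment_size]
        blocks.append(zlib.compress(json.dumps(segment).encode()))
    return blocks

def decompress_block(block):
    """Recover the key-value pairs of one block independently."""
    return [tuple(p) for p in json.loads(zlib.decompress(block))]
```

Because each block is compressed independently, a lookup that falls in one segment only pays for decompressing that block.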
  • Patent number: 10225361
    Abstract: A caching management method includes embedding a notification request tag in a dummy file, uploading the dummy file to a cache server, recording a timestamp indicating a first point in time that the dummy file is uploaded to the cache server, receiving an eviction notification indicating a second point in time that the dummy file is evicted from the cache server, and calculating an eviction time indicating an amount of time taken for the dummy file to be evicted from the cache server. Transmission of the eviction notification is triggered in response to processing the notification request tag, and the dummy file is not retrieved from the cache server between the first point in time and the second point in time. The eviction time is equal to a difference between the first point in time and the second point in time.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: March 5, 2019
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michel Hack, Yufei Ren, Yandong Wang, Li Zhang
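The measurement described in the abstract above reduces to two timestamps (a hypothetical sketch; the class and method names are illustrative, and the cache server's notification delivery is assumed, not shown):

```python
# Hypothetical sketch: record when a tagged dummy file is uploaded,
# then compute the eviction time when the cache server's eviction
# notification arrives. The file is never read in between, so the
# interval reflects the server's own eviction policy.

class EvictionProbe:
    def __init__(self):
        self.uploaded_at = {}  # file_id -> upload timestamp (first point)

    def upload_dummy(self, file_id, timestamp):
        """The dummy file carries a notification-request tag; the cache
        server will notify us when it evicts the file."""
        self.uploaded_at[file_id] = timestamp

    def on_eviction(self, file_id, timestamp):
        """Called when the eviction notification arrives (second point);
        returns the eviction time, the difference of the two points."""
        return timestamp - self.uploaded_at.pop(file_id)
```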
  • Publication number: 20180329861
    Abstract: A cache management system performs cache management in a Remote Direct Memory Access (RDMA) key-value data store. The cache management system receives a request from at least one client configured to access a data item stored in a data location of a remote server, and determines a popularity of the data item based on a frequency at which the data location is accessed by the at least one client. The system is further configured to determine a lease period of the data item based on the frequency and to assign the lease period to the data location.
    Type: Application
    Filed: July 23, 2018
    Publication date: November 15, 2018
    Inventors: Michel H. Hack, Yufei Ren, Yandong Wang, Li Zhang
  • Publication number: 20180322383
    Abstract: A storage controller of a machine receives training data associated with a neural network model. The neural network model includes a plurality of layers, and the machine further includes at least one graphics processing unit. The storage controller trains at least one layer of the plurality of layers of the neural network model using the training data to generate processed training data. A size of the processed training data is less than a size of the training data. Training of the at least one layer includes adjusting one or more weights of the at least one layer using the training data. The storage controller sends the processed training data to at least one graphics processing unit of the machine. The at least one graphics processing unit is configured to store the processed training data and train one or more remaining layers of the plurality of layers using the processed training data.
    Type: Application
    Filed: May 2, 2017
    Publication date: November 8, 2018
    Applicant: International Business Machines Corporation
    Inventors: Minwei Feng, Yufei Ren, Yandong Wang, Li Zhang, Wei Zhang
  • Publication number: 20180307972
    Abstract: A network interface controller of a machine receives a packet including at least one model parameter of a neural network model from a server. The packet includes a virtual address associated with the network interface controller, and the machine further includes a plurality of graphics processing units coupled to the network interface controller by a bus. The network interface controller translates the virtual address to a memory address associated with each of the plurality of graphics processing units. The network interface controller broadcasts the at least one model parameter to the memory address associated with each of the plurality of graphics processing units.
    Type: Application
    Filed: April 24, 2017
    Publication date: October 25, 2018
    Applicant: International Business Machines Corporation
    Inventors: Minwei Feng, Yufei Ren, Yandong Wang, Li Zhang, Wei Zhang
  • Publication number: 20180307651
    Abstract: A cache management system performs cache management in a Remote Direct Memory Access (RDMA) key-value data store. The cache management system receives a request from at least one client configured to access a data item stored in a data location of a remote server, and determines a popularity of the data item based on a frequency at which the data location is accessed by the at least one client. The system is further configured to determine a lease period of the data item based on the frequency and to assign the lease period to the data location.
    Type: Application
    Filed: June 25, 2018
    Publication date: October 25, 2018
    Inventors: Michel H. Hack, Yufei Ren, Yandong Wang, Li Zhang
  • Publication number: 20180253423
    Abstract: Computational storage techniques for distributed computing are disclosed. A computational storage server receives input from multiple clients, which is used by the server when executing one or more computation functions. The computational storage server can aggregate multiple client inputs before applying one or more computation functions. The computational storage server sets up: a first memory area for storing input received from multiple clients; a second memory area designated for storing the computation functions to be executed by the computational storage server using the input data received from the multiple clients; a client-specific memory management area for storing metadata related to computations performed by the computational storage server for specific clients; and a persistent storage area for storing checkpoints associated with aggregating computations performed by the computation functions.
    Type: Application
    Filed: March 2, 2017
    Publication date: September 6, 2018
    Inventors: Michel H. T. Hack, Yufei Ren, Wei Tan, Yandong Wang, Xingbo Wu, Li Zhang, Wei Zhang
  • Publication number: 20180253646
    Abstract: A processing unit topology of a neural network including a plurality of processing units is determined. The neural network includes at least one machine in which each machine includes a plurality of nodes, and wherein each node includes at least one of the plurality of processing units. One or more of the processing units are grouped into a first group according to a first affinity. The first group is configured, using a processor and a memory, to use a first aggregation procedure for exchanging model parameters of a model of the neural network between the processing units of the first group. One or more of the processing units are grouped into a second group according to a second affinity. The second group is configured to use a second aggregation procedure for exchanging the model parameters between the processing units of the second group.
    Type: Application
    Filed: March 5, 2017
    Publication date: September 6, 2018
    Applicant: International Business Machines Corporation
    Inventors: Minwei Feng, Yufei Ren, Yandong Wang, Li Zhang, Wei Zhang
  • Patent number: 10037302
    Abstract: A cache management system performs cache management in a Remote Direct Memory Access (RDMA) key-value data store. The cache management system receives a request from at least one client configured to access a data item stored in a data location of a remote server, and determines a popularity of the data item based on a frequency at which the data location is accessed by the at least one client. The system is further configured to determine a lease period of the data item based on the frequency and to assign the lease period to the data location.
    Type: Grant
    Filed: November 30, 2015
    Date of Patent: July 31, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michel H. Hack, Yufei Ren, Yandong Wang, Li Zhang
  • Patent number: 10031883
    Abstract: A cache management system performs cache management in a Remote Direct Memory Access (RDMA) key-value data store. The cache management system receives a request from at least one client configured to access a data item stored in a data location of a remote server, and determines a popularity of the data item based on a frequency at which the data location is accessed by the at least one client. The system is further configured to determine a lease period of the data item based on the frequency and to assign the lease period to the data location.
    Type: Grant
    Filed: October 16, 2015
    Date of Patent: July 24, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Michel H. Hack, Yufei Ren, Yandong Wang, Li Zhang
  • Publication number: 20180129969
    Abstract: A machine receives a first set of global parameters from a global parameter server. The first set of global parameters includes data that weights one or more operands used in an algorithm that models an entity type. Multiple learner processors in the machine execute the algorithm using the first set of global parameters and a mini-batch of data known to describe the entity type. The machine generates a consolidated set of gradients that describes a direction for the first set of global parameters in order to improve an accuracy of the algorithm in modeling the entity type when using the first set of global parameters and the mini-batch of data. The machine transmits the consolidated set of gradients to the global parameter server. The machine then receives a second set of global parameters from the global parameter server, where the second set of global parameters is a modification of the first set of global parameters based on the consolidated set of gradients.
    Type: Application
    Filed: November 10, 2016
    Publication date: May 10, 2018
    Inventors: Minwei Feng, Yufei Ren, Yandong Wang, Li Zhang, Wei Zhang
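The consolidate-then-update loop in the abstract above can be sketched as follows (function names and the plain-SGD update are illustrative assumptions, not the claimed protocol):

```python
# Hypothetical sketch: several learner processes on one machine each
# compute gradients on their share of a mini-batch; the machine sends
# a single consolidated gradient set to the global parameter server.

def consolidate(per_learner_grads):
    """Average the gradient vectors produced by the local learners."""
    n = len(per_learner_grads)
    return [sum(g[i] for g in per_learner_grads) / n
            for i in range(len(per_learner_grads[0]))]

def apply_update(params, grads, lr=0.1):
    """One way the parameter server might fold the consolidated
    gradients back into the global parameters (plain SGD here)."""
    return [p - lr * g for p, g in zip(params, grads)]
```

Sending one consolidated set per machine, rather than one per learner, reduces traffic to the global parameter server by a factor equal to the number of local learners.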
  • Publication number: 20180101790
    Abstract: A method includes storing parameter versions utilized by learner instances in each of two or more epochs in a parameter receiving buffer of a parameter server, the learner instances performing distributed execution of workload computations of a machine learning algorithm. The method also includes creating a parameter roster in the parameter server comprising parameter version vectors specifying the parameter versions used by each of the learner instances during each of the two or more epochs. The method further includes generating one or more aggregated parameter sets for storage in an aggregated parameters buffer by aggregating parameter versions from the parameter receiving buffer based on the parameter version vectors in the parameter roster and providing aggregated parameter sets from the aggregated parameters buffer to the learner instances for deterministic replay of the distributed execution of the workload computations of the machine learning algorithm.
    Type: Application
    Filed: October 11, 2016
    Publication date: April 12, 2018
    Inventors: Michel H. T. Hack, Yufei Ren, Yandong Wang, Li Zhang, Wei Zhang
  • Publication number: 20180007158
    Abstract: A caching management method includes embedding a notification request tag in a dummy file, uploading the dummy file to a cache server, recording a timestamp indicating a first point in time that the dummy file is uploaded to the cache server, receiving an eviction notification indicating a second point in time that the dummy file is evicted from the cache server, and calculating an eviction time indicating an amount of time taken for the dummy file to be evicted from the cache server. Transmission of the eviction notification is triggered in response to processing the notification request tag, and the dummy file is not retrieved from the cache server between the first point in time and the second point in time. The eviction time is equal to a difference between the first point in time and the second point in time.
    Type: Application
    Filed: June 29, 2016
    Publication date: January 4, 2018
    Inventors: Michel Hack, Yufei Ren, Yandong Wang, Li Zhang
  • Publication number: 20170344904
    Abstract: Version vector-based rules are used to facilitate asynchronous execution of machine learning algorithms. The method uses version vector-based rules to generate aggregated parameters and to determine when to return them. The method also includes coordinating the versions of aggregated parameter sets among all of the parameter servers, which allows broadcasting to enforce version consistency and generating parameter sets on demand to facilitate version control. Furthermore, the method includes enhancing version consistency on the learner's side and resolving inconsistent versions when mismatching versions are detected.
    Type: Application
    Filed: May 31, 2016
    Publication date: November 30, 2017
    Inventors: Michel H. T. Hack, Yufei Ren, Yandong Wang, Li Zhang