Patents by Inventor Shaosheng Cao

Shaosheng Cao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200151395
    Abstract: Embodiments of the present application disclose a cluster-based word vector processing method, apparatus, and device. A solution includes: in a cluster having a server cluster and a worker computer cluster, each worker computer in the worker computer cluster separately reads a portion of the corpuses in parallel, extracts a word and the context words of the word from the portion it read, obtains the corresponding word vectors from a server in the server cluster, and trains the corresponding word vectors; the server cluster then updates the word vectors of the same words that were stored before the training, according to the training results of one or more respective worker computers for those words.
    Type: Application
    Filed: January 15, 2020
    Publication date: May 14, 2020
    Inventors: Shaosheng CAO, Xinxing YANG, Jun ZHOU, Xiaolong LI
  • Publication number: 20200142875
    Abstract: Embodiments of the present specification disclose random walking and a cluster-based random walking method, apparatus and device. A solution includes: obtaining information about each node included in graph data, generating, according to the information about each node, an index vector reflecting a degree value of a respective node, then generating an element vector reflecting an identifier of an adjacent node of the node, and generating a random sequence according to the index vector and the element vector, to implement random walks in the graph data. The solution is applicable to clusters and individual machines.
    Type: Application
    Filed: January 7, 2020
    Publication date: May 7, 2020
    Inventors: Shaosheng CAO, Xinxing YANG, Jun ZHOU, Xiaolong LI
  • Publication number: 20200142877
    Abstract: Embodiments of the present specification disclose random walking and a cluster-based random walking method, apparatus and device. A solution includes: obtaining information about each node included in graph data, generating, according to the information about each node, a hash table reflecting a correspondence between the node and an adjacent node of the node, and generating a random sequence according to the hash table, to implement random walking in the graph data. The solution is applicable to clusters and single machines.
    Type: Application
    Filed: January 7, 2020
    Publication date: May 7, 2020
    Inventors: Shaosheng CAO, Xinxing YANG, Jun ZHOU
  • Publication number: 20200134262
    Abstract: A word vector processing method is provided. Word segmentation is performed on a corpus to obtain words, and n-gram strokes corresponding to the words are determined. Each n-gram stroke represents n successive strokes of a corresponding word. Word vectors of the words and stroke vectors of the n-gram strokes corresponding to the words are initialized. After the word segmentation is performed, the n-gram strokes are determined, and the word vectors and stroke vectors are initialized, the word vectors and the stroke vectors are trained.
    Type: Application
    Filed: September 30, 2019
    Publication date: April 30, 2020
    Applicant: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Xiaolong Li
  • Publication number: 20200034740
    Abstract: An N×M dimensional target matrix is generated based on N data samples and M dimensional data features respectively corresponding to the N data samples. Encryption calculation is performed on the N×M dimensional target matrix based on a Principal Component Analysis (PCA) algorithm to obtain an N×K dimensional encryption matrix, where K is less than M. The N×K dimensional encryption matrix is transmitted to a modeling server. The modeling server trains a machine learning model by using the N×K dimensional encryption matrix as a training sample.
    Type: Application
    Filed: September 30, 2019
    Publication date: January 30, 2020
    Applicant: Alibaba Group Holding Limited
    Inventors: Xinxing Yang, Shaosheng Cao, Jun Zhou, Xiaolong Li
  • Patent number: 10430518
    Abstract: A word vector processing method is provided. Word segmentation is performed on a corpus to obtain words, and n-gram strokes corresponding to the words are determined. Each n-gram stroke represents n successive strokes of a corresponding word. Word vectors of the words and stroke vectors of the n-gram strokes corresponding to the words are initialized. After the word segmentation is performed, the n-gram strokes are determined, and the word vectors and stroke vectors are initialized, the word vectors and the stroke vectors are trained.
    Type: Grant
    Filed: January 18, 2018
    Date of Patent: October 1, 2019
    Assignee: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Xiaolong Li
  • Publication number: 20180210876
    Abstract: A word vector processing method is provided. Word segmentation is performed on a corpus to obtain words, and n-gram strokes corresponding to the words are determined. Each n-gram stroke represents n successive strokes of a corresponding word. Word vectors of the words and stroke vectors of the n-gram strokes corresponding to the words are initialized. After the word segmentation is performed, the n-gram strokes are determined, and the word vectors and stroke vectors are initialized, the word vectors and the stroke vectors are trained.
    Type: Application
    Filed: January 18, 2018
    Publication date: July 26, 2018
    Applicant: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Xiaolong Li
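
The abstract of publication 20200142875 describes random walks driven by an index vector (reflecting node degrees) and an element vector (holding adjacent-node identifiers). The following is a minimal single-machine sketch of that idea, not the patented implementation. It assumes that the index vector stores cumulative degrees, so that the neighbors of node i occupy element[index[i]:index[i + 1]], and that each step picks a neighbor uniformly at random. The function names and the sample graph are hypothetical.

```python
import random


def build_vectors(adjacency):
    """Build an index vector and an element vector from an adjacency list.

    Assumption: index[i] is the cumulative degree of nodes 0..i-1, so the
    neighbors of node i are element[index[i]:index[i + 1]].
    """
    index = [0]
    element = []
    for neighbors in adjacency:
        element.extend(neighbors)
        index.append(len(element))
    return index, element


def random_walk(index, element, start, steps, rng):
    """Walk up to `steps` hops from `start`, choosing neighbors uniformly."""
    walk = [start]
    node = start
    for _ in range(steps):
        begin, end = index[node], index[node + 1]
        if begin == end:  # node has no neighbors, so the walk stops early
            break
        node = element[rng.randrange(begin, end)]
        walk.append(node)
    return walk


# Hypothetical undirected graph with four nodes, given as an adjacency list.
adjacency = [[1, 2], [0, 2], [0, 1, 3], [2]]
index, element = build_vectors(adjacency)
walk = random_walk(index, element, start=0, steps=5, rng=random.Random(0))
print(index)    # [0, 2, 4, 7, 8]
print(element)  # [1, 2, 0, 2, 0, 1, 3, 2]
print(walk)     # a sequence of 6 node identifiers starting at 0
```

Storing neighbors in two flat vectors means each step needs only two index lookups and one random draw, which is one plausible reason to prefer this layout over per-node lists.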
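
The word vector abstracts above (for example, patent 10430518) state that each n-gram stroke represents n successive strokes of a word. As a rough illustration only, the sketch below enumerates such n-grams from a stroke sequence. The integer encoding of strokes, the range of n, and the function name are assumptions made for this example and are not taken from the listing.

```python
def ngram_strokes(strokes, n_min=3, n_max=5):
    """Return every run of n successive strokes for n_min <= n <= n_max."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(strokes) - n + 1):
            grams.append(tuple(strokes[i:i + n]))
    return grams


# Hypothetical stroke sequence for one word, encoded as integer stroke classes.
word_strokes = [1, 2, 3, 4, 5]
grams = ngram_strokes(word_strokes, n_min=3, n_max=4)
print(grams)
# [(1, 2, 3), (2, 3, 4), (3, 4, 5), (1, 2, 3, 4), (2, 3, 4, 5)]
```

In a training setup of the kind the abstracts describe, each distinct n-gram would receive its own stroke vector, initialized alongside the word vectors before training begins.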