Patents by Inventor Shaosheng Cao

Shaosheng Cao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11288318
    Abstract: Implementations of this disclosure provide for obtaining dynamic embedding vectors of nodes in relationship graphs. An example method includes determining N neighboring nodes of a first node of a plurality of nodes; obtaining respective input embedding vectors of the first node and the N neighboring nodes, the input embedding vector of each node being determined based on a respective static embedding vector and a respective positional embedding vector of the node; inputting the respective input embedding vectors of the first node and the N neighboring nodes into a pre-trained embedding model that includes one or more sequentially connected computing blocks, each computing block including a corresponding self-attention layer that outputs N+1 output vectors corresponding to N+1 input vectors; and receiving respective dynamic embedding vectors of the first node and the N neighboring nodes output by the pre-trained embedding model.
    Type: Grant
    Filed: August 19, 2021
    Date of Patent: March 29, 2022
    Assignee: Advanced New Technologies Co., Ltd.
    Inventors: Shaosheng Cao, Qing Cui
  • Publication number: 20220076101
    Abstract: Object feature information acquisition methods, systems, and devices, including computer programs encoded on computer storage media are provided. One of the methods includes: obtaining N relation networks of N time instances, wherein each of the relation networks comprises a plurality of nodes and connection relationships between the nodes, and each of the relation networks comprises a first node representing a first user; determining (i) a spatial aggregation feature of the first node at a first time instance and (ii) a node feature of the first node; inputting N spatial aggregation features of the N time instances into a sequential neural network; determining, based on an output result of the sequential neural network, N spatio-temporal expressions of the first node at the N time instances; and aggregating the N spatio-temporal expressions to obtain a spatio-temporal aggregation feature of the first node as feature information of the first user.
    Type: Application
    Filed: June 29, 2021
    Publication date: March 10, 2022
    Inventors: Shuo Yang, Zhiqiang Zhang, Shaosheng Cao, Jun Zhou
  • Patent number: 11257007
    Abstract: An N×M dimensional target matrix is generated based on N data samples and M dimensional data features respectively corresponding to the N data samples. Encryption calculation is performed on the N×M dimensional target matrix based on a Principal Component Analysis (PCA) algorithm to obtain an N×K dimensional encryption matrix, where K is less than M. The N×K dimensional encryption matrix is transmitted to a modeling server. The modeling server trains a machine learning model by using the N×K dimensional encryption matrix as a training sample.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: February 22, 2022
    Assignee: Advanced New Technologies Co., Ltd.
    Inventors: Xinxing Yang, Shaosheng Cao, Jun Zhou, Xiaolong Li
  • Publication number: 20210382945
    Abstract: Implementations of this disclosure provide for obtaining dynamic embedding vectors of nodes in relationship graphs. An example method includes determining N neighboring nodes of a first node of a plurality of nodes; obtaining respective input embedding vectors of the first node and the N neighboring nodes, the input embedding vector of each node being determined based on a respective static embedding vector and a respective positional embedding vector of the node; inputting the respective input embedding vectors of the first node and the N neighboring nodes into a pre-trained embedding model that includes one or more sequentially connected computing blocks, each computing block including a corresponding self-attention layer that outputs N+1 output vectors corresponding to N+1 input vectors; and receiving respective dynamic embedding vectors of the first node and the N neighboring nodes output by the pre-trained embedding model.
    Type: Application
    Filed: August 19, 2021
    Publication date: December 9, 2021
    Applicant: Advanced New Technologies Co., Ltd.
    Inventors: Shaosheng Cao, Qing Cui
  • Patent number: 11100167
    Abstract: Implementations of this disclosure provide for obtaining dynamic embedding vectors of nodes in relationship graphs. An example method includes determining N neighboring nodes of a first node of a plurality of nodes; obtaining respective input embedding vectors of the first node and the N neighboring nodes, the input embedding vector of each node being determined based on a respective static embedding vector and a respective positional embedding vector of the node; inputting the respective input embedding vectors of the first node and the N neighboring nodes into a pre-trained embedding model that includes one or more sequentially connected computing blocks, each computing block including a corresponding self-attention layer that outputs N+1 output vectors corresponding to N+1 input vectors; and receiving respective dynamic embedding vectors of the first node and the N neighboring nodes output by the pre-trained embedding model.
    Type: Grant
    Filed: March 4, 2020
    Date of Patent: August 24, 2021
    Assignee: Advanced New Technologies Co., Ltd.
    Inventors: Shaosheng Cao, Qing Cui
  • Patent number: 11074246
    Abstract: Implementations of the present specification disclose a method, apparatus, and device for processing graph data using a random walk-based process. The process is applicable to either a cluster of machines, a stand-alone machine, or both. In one aspect, the method includes: obtaining, by a cluster, data describing a graph that has nodes and edges between the nodes, wherein the cluster comprises (i) a server cluster that includes a plurality of server machines and (ii) a working machine cluster that includes a plurality of working machines; generating a two-dimensional array based on the data, wherein generating the two-dimensional array comprises generating, for each node included in the graph, a row comprising respective identifiers of adjacent nodes of the node; and generating, based on the two-dimensional array, a random sequence that represents a random walk processing of the data by the cluster.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: July 27, 2021
    Assignee: Advanced New Technologies Co., Ltd.
    Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou
  • Patent number: 11030411
    Abstract: Implementations of the present specification disclose a method, apparatus, and device for generating word vectors. The method includes: obtaining words by segmenting a corpus; establishing a feature vector of each obtained word based on n-ary characters; training a convolutional neural network based on the feature vectors of the obtained words and the feature vectors of context words associated with each obtained word in the corpus; and generating a word vector for each obtained word based on the feature vector of the obtained word and the trained convolutional neural network.
    Type: Grant
    Filed: May 26, 2020
    Date of Patent: June 8, 2021
    Assignee: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Jun Zhou
  • Patent number: 10901971
    Abstract: Embodiments of the present specification disclose random walking and a cluster-based random walking method, apparatus and device. A solution includes: obtaining information about each node included in graph data, generating, according to the information about each node, a hash table reflecting a correspondence between the node and an adjacent node of the node, and generating a random sequence according to the hash table, to implement random walking in the graph data. The solution is applicable to clusters and single machines.
    Type: Grant
    Filed: January 7, 2020
    Date of Patent: January 26, 2021
    Assignee: Advanced New Technologies Co., Ltd.
    Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou
  • Publication number: 20210011788
    Abstract: Embodiments of the present specification disclose a vector-processing method, apparatus, and device for RPC information. The scheme comprises: acquiring an RPC-information sequence consisting of a plurality of RPC-information units of a user; establishing and initializing feature vectors of the RPC-information units; and training the feature vectors according to the RPC-information sequence and the feature vectors, so as to obtain feature vectors with accurate expression.
    Type: Application
    Filed: January 16, 2019
    Publication date: January 14, 2021
    Applicant: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Jun Zhou
  • Patent number: 10878199
    Abstract: A word vector processing method is provided. Word segmentation is performed on a corpus to obtain words, and n-gram strokes corresponding to the words are determined. Each n-gram stroke represents n successive strokes of a corresponding word. Word vectors of the words and stroke vectors of the n-gram strokes corresponding to the words are initialized. After the word segmentation is performed, the n-gram strokes are determined, and the word vectors and stroke vectors are initialized, the word vectors and the stroke vectors are trained.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: December 29, 2020
    Assignee: Advanced New Technologies Co., Ltd.
    Inventors: Shaosheng Cao, Xiaolong Li
  • Patent number: 10846483
    Abstract: A cluster includes a server cluster and a worker computer cluster. Each worker computer included in the worker computer cluster separately obtains a word and at least one context word of the word that are extracted from a corpus. The worker computer obtains word vectors for the word and the at least one context word. The worker computer calculates a gradient according to the word, the at least one context word, and the word vectors. The worker computer asynchronously updates the gradient to a server included in the server cluster. The server updates the word vectors for the word and the at least one context word of the word according to the gradient.
    Type: Grant
    Filed: January 29, 2020
    Date of Patent: November 24, 2020
    Assignee: Advanced New Technologies Co., Ltd.
    Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou
  • Publication number: 20200356598
    Abstract: Implementations of this disclosure provide for obtaining dynamic embedding vectors of nodes in relationship graphs. An example method includes determining N neighboring nodes of a first node of a plurality of nodes; obtaining respective input embedding vectors of the first node and the N neighboring nodes, the input embedding vector of each node being determined based on a respective static embedding vector and a respective positional embedding vector of the node; inputting the respective input embedding vectors of the first node and the N neighboring nodes into a pre-trained embedding model that includes one or more sequentially connected computing blocks, each computing block including a corresponding self-attention layer that outputs N+1 output vectors corresponding to N+1 input vectors; and receiving respective dynamic embedding vectors of the first node and the N neighboring nodes output by the pre-trained embedding model.
    Type: Application
    Filed: March 4, 2020
    Publication date: November 12, 2020
    Applicant: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Qing Cui
  • Patent number: 10824819
    Abstract: Implementations of the present specification disclose methods, apparatuses, and devices for generating word vectors. The method includes: obtaining individual words by segmenting a corpus; establishing a feature vector of each word based on n-ary characters; training a recurrent neural network based on the feature vectors of the obtained words and feature vectors of context words associated with the obtained words in the corpus; and generating a word vector for each obtained word based on the feature vector of the obtained word and the trained recurrent neural network.
    Type: Grant
    Filed: May 20, 2020
    Date of Patent: November 3, 2020
    Assignee: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Jun Zhou
  • Patent number: 10776334
    Abstract: Embodiments of the present specification disclose random walking and a cluster-based random walking method, apparatus and device. A solution includes: obtaining information about each node included in graph data, generating, according to the information about each node, an index vector reflecting a degree value of a respective node, then generating an element vector reflecting an identifier of an adjacent node of the node, and generating a random sequence according to the index vector and the element vector, to implement random walks in the graph data. The solution is applicable to clusters and individual machines.
    Type: Grant
    Filed: January 7, 2020
    Date of Patent: September 15, 2020
    Assignee: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou, Xiaolong Li
  • Publication number: 20200285811
    Abstract: Implementations of the present specification disclose a method, apparatus, and device for generating word vectors. The method includes: obtaining words by segmenting a corpus; establishing a feature vector of each obtained word based on n-ary characters; training a convolutional neural network based on the feature vectors of the obtained words and the feature vectors of context words associated with each obtained word in the corpus; and generating a word vector for each obtained word based on the feature vector of the obtained word and the trained convolutional neural network.
    Type: Application
    Filed: May 26, 2020
    Publication date: September 10, 2020
    Inventors: Shaosheng Cao, Jun Zhou
  • Patent number: 10769383
    Abstract: Embodiments of the present application disclose a cluster-based word vector processing method, apparatus, and device. In one solution, in a cluster having a server cluster and a worker computer cluster, each worker computer in the worker computer cluster separately reads some corpuses in parallel, extracts a word and context words of the word from the read corpuses, obtains corresponding word vectors from a server in the server cluster, and trains the corresponding word vectors; the server cluster then updates the word vectors of the same words that were stored before the training according to the training results of one or more respective worker computers with respect to the word vectors of the same words.
    Type: Grant
    Filed: January 15, 2020
    Date of Patent: September 8, 2020
    Assignee: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou, Xiaolong Li
  • Publication number: 20200279080
    Abstract: Implementations of the present specification disclose methods, apparatuses, and devices for generating word vectors. The method includes: obtaining individual words by segmenting a corpus; establishing a feature vector of each word based on n-ary characters; training a recurrent neural network based on the feature vectors of the obtained words and feature vectors of context words associated with the obtained words in the corpus; and generating a word vector for each obtained word based on the feature vector of the obtained word and the trained recurrent neural network.
    Type: Application
    Filed: May 20, 2020
    Publication date: September 3, 2020
    Inventors: Shaosheng Cao, Jun Zhou
  • Patent number: 10740561
    Abstract: Disclosed herein are methods, systems, and apparatus, including computer programs encoded on computer storage media, for entity prediction. One of the methods includes performing word segmentation on text to be predicted to obtain a plurality of words. For each word of the plurality of words, a determination is made whether the word has a pre-trained word vector. In response to determining that the word has a pre-trained word vector, the pre-trained word vector for the word is obtained. In response to determining that the word does not have a pre-trained word vector, a word vector for the word is determined based on a pre-trained stroke vector. The word vector and the pre-trained stroke vector are trained based on a text sample and a word vector model. An entity associated with the text is predicted by inputting word vectors of the plurality of words into an entity prediction model.
    Type: Grant
    Filed: October 31, 2019
    Date of Patent: August 11, 2020
    Assignee: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Jun Zhou
  • Publication number: 20200201844
    Abstract: Implementations of the present specification disclose a method, apparatus, and device for processing graph data using a random walk-based process. The process is applicable to either a cluster of machines, a stand-alone machine, or both. In one aspect, the method includes: obtaining, by a cluster, data describing a graph that has nodes and edges between the nodes, wherein the cluster comprises (i) a server cluster that includes a plurality of server machines and (ii) a working machine cluster that includes a plurality of working machines; generating a two-dimensional array based on the data, wherein generating the two-dimensional array comprises generating, for each node included in the graph, a row comprising respective identifiers of adjacent nodes of the node; and generating, based on the two-dimensional array, a random sequence that represents a random walk processing of the data by the cluster.
    Type: Application
    Filed: February 28, 2020
    Publication date: June 25, 2020
    Applicant: Alibaba Group Holding Limited
    Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou
  • Publication number: 20200167527
    Abstract: A cluster includes a server cluster and a worker computer cluster. Each worker computer included in the worker computer cluster separately obtains a word and at least one context word of the word that are extracted from a corpus. The worker computer obtains word vectors for the word and the at least one context word. The worker computer calculates a gradient according to the word, the at least one context word, and the word vectors. The worker computer asynchronously updates the gradient to a server included in the server cluster. The server updates the word vectors for the word and the at least one context word of the word according to the gradient.
    Type: Application
    Filed: January 29, 2020
    Publication date: May 28, 2020
    Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou
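
The index-vector and element-vector layout described in the abstract of patent 10776334 above resembles a compressed adjacency list. The following is a minimal sketch only, not the patented implementation. It assumes that the index vector stores cumulative degree values (prefix sums), so that the neighbors of node i occupy positions index[i] up to index[i + 1] in the element vector, and that each step picks a neighbor uniformly at random. The function names `build_vectors` and `random_walk` are hypothetical and do not come from the patent.

```python
import random


def build_vectors(adjacency):
    """Build an index vector and an element vector from adjacency lists.

    adjacency[i] is the list of neighbor ids of node i (an assumed input
    format). index[i + 1] - index[i] equals the degree of node i, and
    elements[index[i]:index[i + 1]] holds the neighbor ids of node i.
    """
    index = [0]
    elements = []
    for neighbors in adjacency:
        elements.extend(neighbors)
        index.append(len(elements))  # running total of degrees
    return index, elements


def random_walk(index, elements, start, steps, rng):
    """Return a random sequence of node ids starting at `start`.

    The walk stops early if it reaches a node with no neighbors.
    Uniform neighbor selection is an assumption of this sketch.
    """
    walk = [start]
    node = start
    for _ in range(steps):
        lo, hi = index[node], index[node + 1]
        if lo == hi:  # degree zero: nowhere to go
            break
        node = elements[rng.randrange(lo, hi)]
        walk.append(node)
    return walk


if __name__ == "__main__":
    graph = [[1, 2], [0], [0, 1], []]  # small illustrative graph
    index, elements = build_vectors(graph)
    print(random_walk(index, elements, start=0, steps=5, rng=random.Random()))
```

Storing all neighbor ids in one flat vector keeps lookups to two index reads per step, which is one plausible reason for such a layout on both single machines and clusters.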