Patents by Inventor Shaosheng Cao
Shaosheng Cao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11288318
Abstract: Implementations of this disclosure provide for obtaining dynamic embedding vectors of nodes in relationship graphs. An example method includes determining N neighboring nodes of a first node of a plurality of nodes; obtaining respective input embedding vectors of the first node and the N neighboring nodes, the input embedding vector of each node being determined based on a respective static embedding vector and a respective positional embedding vector of the node; inputting the respective input embedding vectors of the first node and the N neighboring nodes into a pre-trained embedding model that includes one or more sequentially connected computing blocks, each computing block including a corresponding self-attention layer that outputs N+1 output vectors corresponding to N+1 input vectors; and receiving respective dynamic embedding vectors of the first node and the N neighboring nodes output by the pre-trained embedding model.
Type: Grant
Filed: August 19, 2021
Date of Patent: March 29, 2022
Assignee: Advanced New Technologies Co., Ltd.
Inventors: Shaosheng Cao, Qing Cui
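The data flow in this abstract (N+1 input vectors in, N+1 dynamic vectors out) can be illustrated with a minimal sketch. It is not the patented model. It assumes that each input embedding is the element-wise sum of the static and positional vectors, that there is a single computing block, and that the self-attention layer uses identity query/key/value projections instead of learned weights. All names and the toy data are hypothetical.

```python
import math

def add_vectors(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

def self_attention(inputs):
    """Map len(inputs) vectors to the same number of output vectors.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections); a trained model would use learned weights.
    """
    dim = len(inputs[0])
    outputs = []
    for query in inputs:
        # Scaled dot-product score of this query against every key.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in inputs]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Each output is a weighted average of all input vectors.
        outputs.append([sum(w * value[j] for w, value in zip(weights, inputs))
                        for j in range(dim)])
    return outputs

# Hypothetical toy data: the first node plus N = 2 neighbors, dimension 2.
static_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
positional_vectors = [[0.0, 0.0], [0.1, 0.1], [0.2, 0.2]]
input_vectors = [add_vectors(s, p)
                 for s, p in zip(static_vectors, positional_vectors)]
dynamic_vectors = self_attention(input_vectors)
```

Because every output attends to all N+1 inputs, the vector produced for a node depends on which neighbors accompany it, which is the sense in which the embeddings are "dynamic".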
-
Publication number: 20220076101
Abstract: Object feature information acquisition methods, systems, and devices, including computer programs encoded on computer storage media are provided. One of the methods includes: obtaining N relation networks of N time instances, wherein each of the relation networks comprises a plurality of nodes and connection relationships between the nodes, and each of the relation networks comprises a first node representing a first user; determining (i) a spatial aggregation feature of the first node at a first time instance and (ii) a node feature of the first node; inputting N spatial aggregation features of the N time instances into a sequential neural network; determining, based on an output result of the sequential neural network, N spatio-temporal expressions of the first node at the N time instances; and aggregating the N spatio-temporal expressions to obtain a spatio-temporal aggregation feature of the first node as feature information of the first user.
Type: Application
Filed: June 29, 2021
Publication date: March 10, 2022
Inventors: Shuo YANG, Zhiqiang ZHANG, Shaosheng CAO, Jun ZHOU
-
Patent number: 11257007
Abstract: An N×M dimensional target matrix is generated based on N data samples and M dimensional data features respectively corresponding to the N data samples. Encryption calculation is performed on the N×M dimensional target matrix based on a Principal Component Analysis (PCA) algorithm to obtain an N×K dimensional encryption matrix, where K is less than M. The N×K dimensional encryption matrix is transmitted to a modeling server. The modeling server trains a machine learning model by using the N×K dimensional encryption matrix as a training sample.
Type: Grant
Filed: September 30, 2019
Date of Patent: February 22, 2022
Assignee: Advanced New Technologies Co., Ltd.
Inventors: Xinxing Yang, Shaosheng Cao, Jun Zhou, Xiaolong Li
-
Publication number: 20210382945
Abstract: Implementations of this disclosure provide for obtaining dynamic embedding vectors of nodes in relationship graphs. An example method includes determining N neighboring nodes of a first node of a plurality of nodes; obtaining respective input embedding vectors of the first node and the N neighboring nodes, the input embedding vector of each node being determined based on a respective static embedding vector and a respective positional embedding vector of the node; inputting the respective input embedding vectors of the first node and the N neighboring nodes into a pre-trained embedding model that includes one or more sequentially connected computing blocks, each computing block including a corresponding self-attention layer that outputs N+1 output vectors corresponding to N+1 input vectors; and receiving respective dynamic embedding vectors of the first node and the N neighboring nodes output by the pre-trained embedding model.
Type: Application
Filed: August 19, 2021
Publication date: December 9, 2021
Applicant: Advanced New Technologies Co., Ltd.
Inventors: Shaosheng Cao, Qing Cui
-
Patent number: 11100167
Abstract: Implementations of this disclosure provide for obtaining dynamic embedding vectors of nodes in relationship graphs. An example method includes determining N neighboring nodes of a first node of a plurality of nodes; obtaining respective input embedding vectors of the first node and the N neighboring nodes, the input embedding vector of each node being determined based on a respective static embedding vector and a respective positional embedding vector of the node; inputting the respective input embedding vectors of the first node and the N neighboring nodes into a pre-trained embedding model that includes one or more sequentially connected computing blocks, each computing block including a corresponding self-attention layer that outputs N+1 output vectors corresponding to N+1 input vectors; and receiving respective dynamic embedding vectors of the first node and the N neighboring nodes output by the pre-trained embedding model.
Type: Grant
Filed: March 4, 2020
Date of Patent: August 24, 2021
Assignee: Advanced New Technologies Co., Ltd.
Inventors: Shaosheng Cao, Qing Cui
-
Patent number: 11074246
Abstract: Implementations of the present specification disclose a method, apparatus, and device for processing graph data using a random walk-based process. The process is applicable to either a cluster of machines, a stand-alone machine, or both. In one aspect, the method includes: obtaining, by a cluster, data describing a graph that has nodes and edges between the nodes, wherein the cluster comprises (i) a server cluster that includes a plurality of server machines and (ii) a working machine cluster that includes a plurality of working machines; generating a two-dimensional array based on the data, wherein generating the two-dimensional array comprises generating, for each node included in the graph, a row comprising respective identifiers of adjacent nodes of the node; and generating, based on the two-dimensional array, a random sequence that represents a random walk processing of the data by the cluster.
Type: Grant
Filed: February 28, 2020
Date of Patent: July 27, 2021
Assignee: Advanced New Technologies Co., Ltd.
Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou
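The two-dimensional array described here can be sketched on a single machine as a list of rows, one per node, each holding the identifiers of that node's adjacent nodes. The sketch below is illustrative only: it assumes an undirected graph with nodes numbered from 0, uniform neighbor selection, and no cluster distribution. The function names and the toy graph are hypothetical.

```python
import random

def build_adjacency_rows(num_nodes, edges):
    """Return a two-dimensional array: row i lists the ids adjacent to node i."""
    rows = [[] for _ in range(num_nodes)]
    for u, v in edges:
        rows[u].append(v)
        rows[v].append(u)  # assumption: edges are undirected
    return rows

def random_walk(rows, start, length, rng):
    """Generate a random sequence of at most `length` node ids from `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = rows[walk[-1]]
        if not neighbors:  # isolated node: the walk cannot continue
            break
        walk.append(rng.choice(neighbors))
    return walk

# Hypothetical toy graph with four nodes.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
rows = build_adjacency_rows(4, edges)
walk = random_walk(rows, start=0, length=5, rng=random.Random(0))
```

Looking up a node's row and then picking a column index takes constant time per step, which is presumably why a row-per-node layout suits random walks.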
-
Patent number: 11030411
Abstract: Implementations of the present specification disclose a method, apparatus, and device for generating word vectors. The method includes: obtaining words by segmenting a corpus; establishing a feature vector of each obtained word based on n-ary characters; training a convolutional neural network based on the feature vectors of the obtained words and the feature vectors of context words associated with each obtained word in the corpus; and generating a word vector for each obtained word based on the feature vector of the obtained word and the trained convolutional neural network.
Type: Grant
Filed: May 26, 2020
Date of Patent: June 8, 2021
Assignee: ALIBABA GROUP HOLDING LIMITED
Inventors: Shaosheng Cao, Jun Zhou
-
Patent number: 10901971
Abstract: Embodiments of the present specification disclose random walking and a cluster-based random walking method, apparatus and device. A solution includes: obtaining information about each node included in graph data, generating, according to the information about each node, a hash table reflecting a correspondence between the node and an adjacent node of the node, and generating a random sequence according to the hash table, to implement random walking in the graph data. The solution is applicable to clusters and single machines.
Type: Grant
Filed: January 7, 2020
Date of Patent: January 26, 2021
Assignee: ADVANCED NEW TECHNOLOGIES CO., LTD.
Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou
-
Publication number: 20210011788
Abstract: Embodiments of the present specification disclose a vector-processing method, apparatus, and device for RPC information. The scheme comprises: acquiring an RPC-information sequence consisting of a plurality of RPC-information units of a user; establishing and initializing feature vectors of the RPC-information units; and training the feature vectors according to the RPC-information sequence and the feature vectors, so as to obtain feature vectors with accurate expression.
Type: Application
Filed: January 16, 2019
Publication date: January 14, 2021
Applicant: Alibaba Group Holding Limited
Inventors: Shaosheng Cao, Jun Zhou
-
Patent number: 10878199
Abstract: A word vector processing method is provided. Word segmentation is performed on a corpus to obtain words, and n-gram strokes corresponding to the words are determined. Each n-gram stroke represents n successive strokes of a corresponding word. Word vectors of the words and stroke vectors of the n-gram strokes corresponding to the words are initialized. After the word segmentation is performed, the n-gram strokes are determined, and the word vectors and stroke vectors are initialized, the word vectors and the stroke vectors are trained.
Type: Grant
Filed: September 30, 2019
Date of Patent: December 29, 2020
Assignee: Advanced New Technologies Co., Ltd.
Inventors: Shaosheng Cao, Xiaolong Li
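Extracting n-gram strokes amounts to sliding a window of n successive strokes over a word's stroke sequence. The following sketch shows only that extraction step, not the training. The integer stroke codes and the small stroke table are hypothetical stand-ins for a real stroke dictionary, and the range of n is an assumption.

```python
# Hypothetical stroke codes: 1 = horizontal, 2 = vertical,
# 3 = left-falling, 4 = right-falling.
STROKE_TABLE = {
    "大": [1, 3, 4],
    "人": [3, 4],
}

def stroke_ngrams(strokes, n_min, n_max):
    """Return every run of n successive strokes for n_min <= n <= n_max."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(strokes) - n + 1):
            grams.append(tuple(strokes[i:i + n]))
    return grams

grams_da = stroke_ngrams(STROKE_TABLE["大"], n_min=2, n_max=3)
grams_ren = stroke_ngrams(STROKE_TABLE["人"], n_min=2, n_max=3)
```

In this toy table the two words share the stroke 2-gram `(3, 4)`, so a stroke vector trained for that n-gram would be shared between them.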
-
Patent number: 10846483
Abstract: A cluster includes a server cluster and a worker computer cluster. Each worker computer included in the worker computer cluster separately obtains a word and at least one context word of the word that are extracted from a corpus. The worker computer obtains word vectors for the word and the at least one context word. The worker computer calculates a gradient according to the word, the at least one context word, and the word vectors. The worker computer asynchronously updates the gradient to a server included in the server cluster. The server updates the word vectors for the word and the at least one context word of the word according to the gradient.
Type: Grant
Filed: January 29, 2020
Date of Patent: November 24, 2020
Assignee: ADVANCED NEW TECHNOLOGIES CO., LTD.
Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou
-
Publication number: 20200356598
Abstract: Implementations of this disclosure provide for obtaining dynamic embedding vectors of nodes in relationship graphs. An example method includes determining N neighboring nodes of a first node of a plurality of nodes; obtaining respective input embedding vectors of the first node and the N neighboring nodes, the input embedding vector of each node being determined based on a respective static embedding vector and a respective positional embedding vector of the node; inputting the respective input embedding vectors of the first node and the N neighboring nodes into a pre-trained embedding model that includes one or more sequentially connected computing blocks, each computing block including a corresponding self-attention layer that outputs N+1 output vectors corresponding to N+1 input vectors; and receiving respective dynamic embedding vectors of the first node and the N neighboring nodes output by the pre-trained embedding model.
Type: Application
Filed: March 4, 2020
Publication date: November 12, 2020
Applicant: Alibaba Group Holding Limited
Inventors: Shaosheng Cao, Qing Cui
-
Patent number: 10824819
Abstract: Implementations of the present specification disclose methods, apparatuses, and devices for generating word vectors. The method includes: obtaining individual words by segmenting a corpus; establishing a feature vector of each word based on n-ary characters; training a recurrent neural network based on the feature vectors of the obtained words and feature vectors of context words associated with the obtained words in the corpus; and generating a word vector for each obtained word based on the feature vector of the obtained word and the trained recurrent neural network.
Type: Grant
Filed: May 20, 2020
Date of Patent: November 3, 2020
Assignee: ALIBABA GROUP HOLDING LIMITED
Inventors: Shaosheng Cao, Jun Zhou
-
Patent number: 10776334
Abstract: Embodiments of the present specification disclose random walking and a cluster-based random walking method, apparatus and device. A solution includes: obtaining information about each node included in graph data, generating, according to the information about each node, an index vector reflecting a degree value of a respective node, then generating an element vector reflecting an identifier of an adjacent node of the node, and generating a random sequence according to the index vector and the element vector, to implement random walks in the graph data. The solution is applicable to clusters and individual machines.
Type: Grant
Filed: January 7, 2020
Date of Patent: September 15, 2020
Assignee: Alibaba Group Holding Limited
Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou, Xiaolong Li
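One plausible reading of the index vector and element vector is a compressed adjacency layout: the index vector stores cumulative degree values, and the element vector stores adjacent-node identifiers back to back. The sketch below follows that reading, which is an assumption and not a detail stated in the abstract. It also assumes an undirected graph with nodes numbered from 0, and all names are hypothetical.

```python
import random

def build_vectors(num_nodes, edges):
    """Build (index, element) so that element[index[i]:index[i + 1]]
    holds the ids of the nodes adjacent to node i."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    index = [0]  # index[i] = sum of the degrees of nodes 0 .. i-1
    for d in degree:
        index.append(index[-1] + d)
    element = [0] * index[-1]
    cursor = index[:-1]  # next free slot for each node (slicing copies)
    for u, v in edges:
        element[cursor[u]] = v
        cursor[u] += 1
        element[cursor[v]] = u
        cursor[v] += 1
    return index, element

def random_walk(index, element, start, length, rng):
    """Generate a random sequence of at most `length` node ids."""
    walk = [start]
    while len(walk) < length:
        lo, hi = index[walk[-1]], index[walk[-1] + 1]
        if lo == hi:  # degree 0: no adjacent node to move to
            break
        walk.append(element[rng.randrange(lo, hi)])
    return walk

# Hypothetical toy graph with four nodes.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
index, element = build_vectors(4, edges)
walk = random_walk(index, element, start=0, length=5, rng=random.Random(1))
```

Two flat vectors avoid per-node containers, so the structure is compact and each step costs one range lookup plus one random draw.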
-
Publication number: 20200285811
Abstract: Implementations of the present specification disclose a method, apparatus, and device for generating word vectors. The method includes: obtaining words by segmenting a corpus; establishing a feature vector of each obtained word based on n-ary characters; training a convolutional neural network based on the feature vectors of the obtained words and the feature vectors of context words associated with each obtained word in the corpus; and generating a word vector for each obtained word based on the feature vector of the obtained word and the trained convolutional neural network.
Type: Application
Filed: May 26, 2020
Publication date: September 10, 2020
Inventors: Shaosheng CAO, Jun ZHOU
-
Patent number: 10769383
Abstract: Embodiments of the present application disclose a cluster-based word vector processing method, apparatus, and device. Solutions include: in a cluster having a server cluster and a worker computer cluster, each worker computer in the worker computer cluster separately reads some corpuses in parallel, extracts a word and context words of the word from the read corpuses, obtains corresponding word vectors from a server in the server cluster, and trains the corresponding word vectors; and the server cluster updates the word vectors of the same words that were stored before the training according to training results of one or more respective worker computers with respect to the word vectors of the same words.
Type: Grant
Filed: January 15, 2020
Date of Patent: September 8, 2020
Assignee: Alibaba Group Holding Limited
Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou, Xiaolong Li
-
Publication number: 20200279080
Abstract: Implementations of the present specification disclose methods, apparatuses, and devices for generating word vectors. The method includes: obtaining individual words by segmenting a corpus; establishing a feature vector of each word based on n-ary characters; training a recurrent neural network based on the feature vectors of the obtained words and feature vectors of context words associated with the obtained words in the corpus; and generating a word vector for each obtained word based on the feature vector of the obtained word and the trained recurrent neural network.
Type: Application
Filed: May 20, 2020
Publication date: September 3, 2020
Inventors: Shaosheng CAO, Jun ZHOU
-
Patent number: 10740561
Abstract: Disclosed herein are methods, systems, and apparatus, including computer programs encoded on computer storage media, for entity prediction. One of the methods includes performing word segmentation on text to be predicted to obtain a plurality of words. For each word of the plurality of words, a determination is made whether the word has a pre-trained word vector. In response to determining that the word has a pre-trained word vector, the pre-trained word vector for the word is obtained. In response to determining that the word does not have a pre-trained word vector, a word vector for the word is determined based on a pre-trained stroke vector. The word vector and the pre-trained stroke vector are trained based on a text sample and a word vector model. An entity associated with the text is predicted by inputting word vectors of the plurality of words into an entity prediction model.
Type: Grant
Filed: October 31, 2019
Date of Patent: August 11, 2020
Assignee: Alibaba Group Holding Limited
Inventors: Shaosheng Cao, Jun Zhou
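The lookup-with-fallback step in this abstract can be sketched as follows. The abstract says only that a missing word vector is "determined based on a pre-trained stroke vector"; averaging the stroke vectors of the word's n-gram strokes is an assumption made here for illustration. The function name, the stroke n-gram keys, and the toy vectors are all hypothetical.

```python
def lookup_word_vector(word, word_vectors, stroke_vectors, word_to_ngrams):
    """Return the pre-trained vector for `word`, or build one from strokes.

    Assumption: the fallback averages the pre-trained stroke vectors of the
    word's n-gram strokes. Returns None if no stroke vector is available.
    """
    if word in word_vectors:
        return word_vectors[word]
    grams = [g for g in word_to_ngrams.get(word, []) if g in stroke_vectors]
    if not grams:
        return None
    dim = len(stroke_vectors[grams[0]])
    return [sum(stroke_vectors[g][k] for g in grams) / len(grams)
            for k in range(dim)]

# Hypothetical toy data: one word with a pre-trained vector, one without.
word_vectors = {"人": [0.2, 0.4]}
stroke_vectors = {(1, 3): [0.0, 1.0], (3, 4): [1.0, 0.0]}
word_to_ngrams = {"大": [(1, 3), (3, 4)], "人": [(3, 4)]}

known = lookup_word_vector("人", word_vectors, stroke_vectors, word_to_ngrams)
unseen = lookup_word_vector("大", word_vectors, stroke_vectors, word_to_ngrams)
```

The resulting vectors, one per segmented word, would then be passed to the entity prediction model, which is outside the scope of this sketch.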
-
Publication number: 20200201844
Abstract: Implementations of the present specification disclose a method, apparatus, and device for processing graph data using a random walk-based process. The process is applicable to either a cluster of machines, a stand-alone machine, or both. In one aspect, the method includes: obtaining, by a cluster, data describing a graph that has nodes and edges between the nodes, wherein the cluster comprises (i) a server cluster that includes a plurality of server machines and (ii) a working machine cluster that includes a plurality of working machines; generating a two-dimensional array based on the data, wherein generating the two-dimensional array comprises generating, for each node included in the graph, a row comprising respective identifiers of adjacent nodes of the node; and generating, based on the two-dimensional array, a random sequence that represents a random walk processing of the data by the cluster.
Type: Application
Filed: February 28, 2020
Publication date: June 25, 2020
Applicant: Alibaba Group Holding Limited
Inventors: Shaosheng Cao, Xinxing Yang, Jun Zhou
-
Publication number: 20200167527
Abstract: A cluster includes a server cluster and a worker computer cluster. Each worker computer included in the worker computer cluster separately obtains a word and at least one context word of the word that are extracted from a corpus. The worker computer obtains word vectors for the word and the at least one context word. The worker computer calculates a gradient according to the word, the at least one context word, and the word vectors. The worker computer asynchronously updates the gradient to a server included in the server cluster. The server updates the word vectors for the word and the at least one context word of the word according to the gradient.
Type: Application
Filed: January 29, 2020
Publication date: May 28, 2020
Inventors: Shaosheng CAO, Xinxing YANG, Jun ZHOU