Patents by Inventor Shulong Tan

Shulong Tan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Transformation for fast inner product search on graph

Patent number: 11989233

Abstract: Presented herein are embodiments of a fast search on graph methodology for Maximum Inner Product Search (MIPS). This optimization problem is challenging since traditional Approximate Nearest Neighbor (ANN) search methods may not perform efficiently under a nonmetric similarity measure. Embodiments herein are based on the property that a Möbius/Möbius-like transformation introduces an isomorphism between a subgraph of the ℓ2-Delaunay graph and the Delaunay graph for inner product. Under this observation, embodiments of a novel graph indexing and searching methodology are presented to find the optimal solution with the largest inner product with the query. Experiments show significant improvements compared to existing methods.

Type: Grant

Filed: September 27, 2020

Date of Patent: May 21, 2024

Assignee: Baidu USA LLC

Inventors: Shulong Tan, Zhixin Zhou, Zhaozhuo Xu, Ping Li
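
Illustrative sketch for the abstract above, not taken from the patent text: MIPS asks for the database vector whose inner product with the query is largest. The snippet shows the exact brute-force baseline that graph methods approximate, plus one possible Möbius-like map. The abstract does not state a formula, so the inversion x / ||x||^2 used here is an assumption, and the names `exact_mips` and `mobius_like` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def inner(a: Vector, b: Vector) -> float:
    """Plain inner (dot) product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def exact_mips(data: List[Vector], query: Vector) -> int:
    """Brute-force baseline: index of the vector with the largest inner product."""
    return max(range(len(data)), key=lambda i: inner(data[i], query))


def mobius_like(x: Vector) -> List[float]:
    """Assumed transformation: inversion x -> x / ||x||^2 (undefined at the origin)."""
    squared_norm = inner(x, x)
    if squared_norm == 0.0:
        raise ValueError("the zero vector cannot be inverted")
    return [value / squared_norm for value in x]


data = [[1.0, 0.0], [0.0, 3.0], [2.0, 2.0]]
query = [1.0, 1.0]

best = exact_mips(data, query)                  # inner products are 1, 3 and 4
transformed = [mobius_like(x) for x in data]    # points an L2 graph could be built on
round_trip = mobius_like(mobius_like([3.0, 4.0]))  # the inversion undoes itself
```

The inversion maps a vector of norm r to one of norm 1/r in the same direction, which is why applying it twice returns the original vector.
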
Approximate nearest neighbor search for single instruction, multiple thread (SIMT) or single instruction, multiple data (SIMD) type processors

Patent number: 11914669

Abstract: Approximate nearest neighbor (ANN) searching is a fundamental problem in computer science with numerous applications in areas such as machine learning and data mining. For typical graph-based ANN methods, the searching method is executed iteratively, and the execution dependency prohibits graphics processing unit (GPU)/GPU-type processor adaptations. Presented herein are embodiments of a novel framework that decouples the searching on graph methodology into stages, in order to parallelize the performance-crucial distance computation. Furthermore, in one or more embodiments, to obtain better parallelism on GPU-type components, also disclosed are novel ANN-specific optimization methods that eliminate dynamic memory allocations and trade computations for less memory consumption. Embodiments were empirically compared against other methods, and the results confirm the effectiveness.

Type: Grant

Filed: November 11, 2020

Date of Patent: February 27, 2024

Assignee: Baidu USA LLC

Inventors: Weijie Zhao, Shulong Tan, Ping Li
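
Illustrative sketch for the abstract above, not the patented method: a CPU-only best-first graph search whose loop body is split into three stages so that all distance computations of one iteration are gathered into a single batch. On a GPU-type processor that batch is the part that could run in parallel. The stage boundaries, the function name `staged_search`, and the adjacency-dictionary layout are assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Sequence, Tuple


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def staged_search(
    data: List[Sequence[float]],
    graph: Dict[int, List[int]],
    query: Sequence[float],
    start: int,
    k: int,
) -> List[Tuple[float, int]]:
    """Return up to k (distance, vertex) pairs, closest first."""
    visited = {start}
    d0 = sq_dist(data[start], query)
    frontier = [(d0, start)]   # min-heap of vertices still to expand
    top_k = [(-d0, start)]     # max-heap (negated distances) of the best k so far
    while frontier:
        # Stage 1: candidate locating. Pick a vertex and collect unvisited neighbors.
        d, v = heapq.heappop(frontier)
        if len(top_k) == k and d > -top_k[0][0]:
            break  # nothing left in the frontier can improve the result
        batch = [u for u in graph[v] if u not in visited]
        visited.update(batch)
        # Stage 2: bulk distance computation. Entries are independent of each other.
        dists = [sq_dist(data[u], query) for u in batch]
        # Stage 3: data structure maintenance. Merge the batch into both heaps.
        for du, u in zip(dists, batch):
            if len(top_k) < k:
                heapq.heappush(top_k, (-du, u))
                heapq.heappush(frontier, (du, u))
            elif du < -top_k[0][0]:
                heapq.heapreplace(top_k, (-du, u))
                heapq.heappush(frontier, (du, u))
    return sorted((-neg, u) for neg, u in top_k)


# Five points on a line, connected as a chain.
points = [[0.0], [1.0], [2.0], [3.0], [4.0]]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = staged_search(points, chain, query=[3.2], start=0, k=2)
nearest_ids = [u for _, u in result]
```
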
FAST NEURAL RANKING ON BIPARTITE GRAPH INDICES

Publication number: 20230195733

Abstract: Presented are systems and methods that construct BipartitE Graph INdices (BEGIN) embodiments for fast neural ranking. BEGIN embodiments comprise two types of nodes: sampled queries and base or searching objects. In one or more embodiments, edges connecting these nodes are constructed by using a neural network ranking measure. These embodiments extend traditional search-on-graph methods and lend themselves to fast neural ranking. Experimental results demonstrate the effectiveness and efficiency of such embodiments.

Type: Application

Filed: December 17, 2021

Publication date: June 22, 2023

Applicant: Baidu USA LLC

Inventors: Shulong TAN, Weijie ZHAO, Ping LI

PROXIMITY GRAPH MAINTENANCE FOR FAST ONLINE NEAREST NEIGHBOR SEARCH

Publication number: 20230077267

Abstract: Incremental proximity graph maintenance (IPGM) systems and methods for online ANN search support both online vertex deletion and insertion of vertices on proximity graphs. In various embodiments, updating a proximity graph comprises receiving a workload that represents a set of vertices in the proximity graph, each vertex being associated with a type of operation such as a query, insertion, or deletion. For a query or an insertion, a search may be executed on the graph to obtain a set of top-K vertices for each vertex. In the case of a deletion, a vertex may be deleted from the proximity graph, and a local or global reconnection update method may be used to reconstruct at least a portion of the proximity graph.

Type: Application

Filed: August 20, 2021

Publication date: March 9, 2023

Applicant: Baidu USA LLC

Inventors: Shulong TAN, Zhaozhuo XU, Weijie ZHAO, Zhixin ZHOU, Ping LI
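
Illustrative sketch for the abstract above, not the patented method: one simple way a deletion with local reconnection could look. When a vertex is removed, each vertex that pointed to it chooses replacement edges from the deleted vertex's own out-neighbors and keeps its closest candidates. The function name `delete_with_local_reconnect`, the `max_degree` cap, and the dictionary-based graph are assumptions for illustration.

```python
from typing import Dict, List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def delete_with_local_reconnect(
    data: Dict[int, Sequence[float]],
    graph: Dict[int, List[int]],
    v: int,
    max_degree: int,
) -> None:
    """Remove vertex v in place and repair the edges of its in-neighbors."""
    out_neighbors = graph.pop(v)
    data.pop(v)
    for u, neighbors in graph.items():
        if v not in neighbors:
            continue
        # Candidates: u's old neighbors plus v's out-neighbors, minus u and v.
        candidates = (set(neighbors) | set(out_neighbors)) - {u, v}
        # Keep the closest ones; ties are broken by vertex id for determinism.
        ranked = sorted(candidates, key=lambda w: (sq_dist(data[u], data[w]), w))
        graph[u] = ranked[:max_degree]


vectors = {0: [0.0], 1: [1.0], 2: [2.0], 3: [3.0]}
edges = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
delete_with_local_reconnect(vectors, edges, v=1, max_degree=2)
```

After the call, vertex 0 links to vertex 2 and vertex 2 links to vertices 3 and 0, so the chain stays connected without rebuilding the whole graph.
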
Hierarchical multi-task term embedding learning for synonym prediction

Patent number: 11580415

Abstract: Due to the high variability of language use in real life, manual construction of semantic resources to cover all synonyms is prohibitively expensive and may result in limited coverage. Described herein are systems and methods that automate the process of synonymy resource development, including both formal entities and noisy descriptions from end-users. Embodiments of a multi-task model with hierarchical task relationship are presented that learn more representative entity/term embeddings and apply them to synonym prediction. In model embodiments, a skip-gram word embedding model is extended by introducing an auxiliary task, “neighboring word/term semantic type prediction,” and hierarchically organizing the tasks based on task complexity. In one or more embodiments, existing term-term synonymous knowledge is integrated into the word embedding learning framework.

Type: Grant

Filed: July 9, 2019

Date of Patent: February 14, 2023

Assignee: Baidu USA LLC

Inventors: Hongliang Fei, Shulong Tan, Ping Li

NORM ADJUSTED PROXIMITY GRAPH FOR FAST INNER PRODUCT RETRIEVAL

Publication number: 20230035337

Abstract: Efficient inner product search is important for many data ranking services, such as recommendation and Information Retrieval. Efficient retrieval via inner product dramatically influences the performance of such data searching and retrieval systems. To resolve deficiencies of prior approaches, embodiments of a new index graph construction approach, referred to generally as Norm Adjusted Proximity Graph (NAPG), for approximate Maximum Inner Product Search (MIPS) are presented. With adjusting factors estimated on sampled data, NAPG embodiments select more meaningful data points to connect with when constructing a graph-based index for inner product search. Extensive experiments verify that the improved graph-based index pushes the state-of-the-art of inner product search forward greatly, in the trade-off between search efficiency and effectiveness.

Type: Application

Filed: February 18, 2022

Publication date: February 2, 2023

Applicant: Baidu USA LLC

Inventors: Shulong TAN, Zhaozhuo XU, Weijie ZHAO, Hongliang FEI, Zhixin ZHOU, Ping LI

Systems and methods for estimating healthcare resource demand

Patent number: 11195128

Abstract: Presented are systems and methods that allow healthcare providers and governments to infer demand for healthcare resources to ensure effective and timely healthcare services to patients by reducing healthcare supply shortages, emergencies, and healthcare costs. In embodiments, this is accomplished by gathering data from a number of sources to generate labeled records from which entity features and relationships between entities are extracted, correlated, and/or combined with other external healthcare data. In embodiments, this information is used to train a model that predicts healthcare resource demands given a set of input conditions or factors.

Type: Grant

Filed: August 2, 2016

Date of Patent: December 7, 2021

Assignee: Baidu USA LLC

Inventors: Yi Zhen, Hongliang Fei, Shulong Tan, Wei Fan

APPROXIMATE NEAREST NEIGHBOR SEARCH FOR SINGLE INSTRUCTION, MULTIPLE THREAD (SIMT) OR SINGLE INSTRUCTION, MULTIPLE DATA (SIMD) TYPE PROCESSORS

Publication number: 20210157606

Abstract: Approximate nearest neighbor (ANN) searching is a fundamental problem in computer science with numerous applications in areas such as machine learning and data mining. For typical graph-based ANN methods, the searching method is executed iteratively, and the execution dependency prohibits graphics processing unit (GPU)/GPU-type processor adaptations. Presented herein are embodiments of a novel framework that decouples the searching on graph methodology into stages, in order to parallelize the performance-crucial distance computation. Furthermore, in one or more embodiments, to obtain better parallelism on GPU-type components, also disclosed are novel ANN-specific optimization methods that eliminate dynamic memory allocations and trade computations for less memory consumption. Embodiments were empirically compared against other methods, and the results confirm the effectiveness.

Type: Application

Filed: November 11, 2020

Publication date: May 27, 2021

Applicant: Baidu USA LLC

Inventors: Weijie ZHAO, Shulong TAN, Ping LI

TRANSFORMATION FOR FAST INNER PRODUCT SEARCH ON GRAPH

Publication number: 20210133246

Abstract: Presented herein are embodiments of a fast search on graph methodology for Maximum Inner Product Search (MIPS). This optimization problem is challenging since traditional Approximate Nearest Neighbor (ANN) search methods may not perform efficiently under a nonmetric similarity measure. Embodiments herein are based on the property that a Möbius/Möbius-like transformation introduces an isomorphism between a subgraph of the ℓ2-Delaunay graph and the Delaunay graph for inner product. Under this observation, embodiments of a novel graph indexing and searching methodology are presented to find the optimal solution with the largest inner product with the query. Experiments show significant improvements compared to existing methods.

Type: Application

Filed: September 27, 2020

Publication date: May 6, 2021

Applicant: Baidu USA LLC

Inventors: Shulong TAN, Zhixin ZHOU, Zhaozhuo XU, Ping LI

EFFICIENT RETRIEVAL OF TOP SIMILARITY REPRESENTATIONS

Publication number: 20210117459

Abstract: Retrieval of relevant vectors produced by representation learning can critically influence the efficiency in Natural Language Processing (NLP) tasks. Presented herein are systems and methods for searching vectors via a typical nonmetric matching function: inner product. Embodiments, which construct an approximate Inner Product Delaunay Graph (IPDG) for top-1 Maximum Inner Product Search (MIPS), transform retrieving the most suitable latent vectors into a graph search problem with great benefits of efficiency. Experiments on data representations learned for different machine learning tasks verify the outperforming effectiveness and efficiency of IPDG embodiments.

Type: Application

Filed: September 16, 2020

Publication date: April 22, 2021

Applicant: Baidu USA LLC

Inventors: Shulong TAN, Zhixin ZHOU, Zhaozhuo XU, Ping LI
HIERARCHICAL MULTI-TASK TERM EMBEDDING LEARNING FOR SYNONYM PREDICTION

Publication number: 20210012215

Abstract: Due to the high variability of language use in real life, manual construction of semantic resources to cover all synonyms is prohibitively expensive and may result in limited coverage. Described herein are systems and methods that automate the process of synonymy resource development, including both formal entities and noisy descriptions from end-users. Embodiments of a multi-task model with hierarchical task relationship are presented that learn more representative entity/term embeddings and apply them to synonym prediction. In model embodiments, a skip-gram word embedding model is extended by introducing an auxiliary task, “neighboring word/term semantic type prediction,” and hierarchically organizing the tasks based on task complexity. In one or more embodiments, existing term-term synonymous knowledge is integrated into the word embedding learning framework.

Type: Application

Filed: July 9, 2019

Publication date: January 14, 2021

Applicant: Baidu USA LLC

Inventors: Hongliang FEI, Shulong TAN, Ping LI

Systems and methods for relation inference

Patent number: 10650305

Abstract: Presented are relation inference methods and systems that use deep learning techniques for data mining documents to discover a relation between terms of interest in a given field covering a specific topic. For example, in the healthcare domain, various embodiments of the present disclosure provide for a relation inference system that mines large-scale medical documents in a free-text database to extract symptom and disease terms and generates relation information that aids in disease diagnosis. In embodiments, this is accomplished by training and using an RNN, such as an LSTM, a Gated Recurrent Unit (GRU), etc., that takes advantage of a term dictionary to examine co-occurrences of terms of interest within documents to discover correlations between the terms. The correlation may then be used to predict statistically most probable terms (e.g., a disease) related to a given search term (e.g., a symptom).

Type: Grant

Filed: July 8, 2016

Date of Patent: May 12, 2020

Assignee: Baidu USA LLC

Inventors: Chaochun Liu, Nan Du, Shulong Tan, Hongliang Fei, Wei Fan

Systems and methods for homogeneous entity grouping

Patent number: 10372743

Abstract: Systems and methods are disclosed to identify entities that have a similar meaning, and may, in embodiments, be grouped into entity groups for knowledge base construction. In embodiments, the entity relations of similarity or non-similarity for an entity pair are predicted as a binary relationship. In embodiments, the prediction may be based upon a similarity score between the entities and the entity features, which features are constructed using an entity feature or representation model. In embodiments, the prediction may be an iterative process involving minimal human checking and existing knowledge update. In embodiments, one or more entity groups are formed using graph search from the predicted entity pairs. In embodiments, a group centroid entity may be selected to represent each group based on one or more factors, such as its generality or popularity.

Type: Grant

Filed: July 20, 2016

Date of Patent: August 6, 2019

Assignee: Baidu USA LLC

Inventors: Shulong Tan, Hongliang Fei, Yi Zhen, Yu Cao, Bocong Liu, Chaochun Liu, Richard Chun Ching Wang, Dawen Zhou, Wei Fan
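
Illustrative sketch for the abstract above, not the patented method: given entity pairs already predicted as similar, a breadth-first graph search collects each connected group, and the most popular member is picked as the group's centroid. The example pairs, the popularity scores, and the name `group_entities` are invented for illustration; the similarity prediction step itself is not shown.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple


def group_entities(
    similar_pairs: Iterable[Tuple[str, str]],
    popularity: Dict[str, int],
) -> List[Tuple[str, List[str]]]:
    """Return (centroid, sorted members) for each connected group of entities."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for a, b in similar_pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen: Set[str] = set()
    groups: List[Tuple[str, List[str]]] = []
    for entity in sorted(adjacency):
        if entity in seen:
            continue
        # Breadth-first search over predicted-similar edges.
        seen.add(entity)
        queue = deque([entity])
        members: List[str] = []
        while queue:
            node = queue.popleft()
            members.append(node)
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        # Centroid choice: the member with the highest (hypothetical) popularity.
        centroid = max(members, key=lambda e: popularity.get(e, 0))
        groups.append((centroid, sorted(members)))
    return groups


pairs = [
    ("heart attack", "myocardial infarction"),
    ("myocardial infarction", "MI"),
    ("flu", "influenza"),
]
scores = {"heart attack": 90, "myocardial infarction": 40, "MI": 10, "flu": 70, "influenza": 60}
entity_groups = group_entities(pairs, scores)
```
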
SYSTEMS AND METHODS FOR ESTIMATING HEALTHCARE RESOURCE DEMAND

Publication number: 20180039735

Abstract: Presented are systems and methods that allow healthcare providers and governments to infer demand for healthcare resources to ensure effective and timely healthcare services to patients by reducing healthcare supply shortages, emergencies, and healthcare costs. In embodiments, this is accomplished by gathering data from a number of sources to generate labeled records from which entity features and relationships between entities are extracted, correlated, and/or combined with other external healthcare data. In embodiments, this information is used to train a model that predicts healthcare resource demands given a set of input conditions or factors.

Type: Application

Filed: August 2, 2016

Publication date: February 8, 2018

Applicant: Baidu USA LLC

Inventors: Yi Zhen, Hongliang Fei, Shulong Tan, Wei Fan

SYSTEMS AND METHODS FOR FINER-GRAINED MEDICAL ENTITY EXTRACTION

Publication number: 20180025121

Abstract: Systems and methods are disclosed that provide improved automated extraction of medical-related information. In embodiments, finer-grained medical-related data, such as medical entities, including symptoms, diseases, dimensions, and temporal information, can be extracted. In embodiments, by extracting finer-level medical-related information from an input statement and generating visual displays of that information, a medical professional can readily see relevant medical information that provides medical entities and associated dimension information, as well as evolving history.

Type: Application

Filed: July 20, 2016

Publication date: January 25, 2018

Applicant: Baidu USA LLC

Inventors: Hongliang Fei, Shulong Tan, Yi Zhen, Erheng Zhong, Chaochun Liu, Dawen Zhou, Wei Fan

SYSTEMS AND METHODS FOR HOMOGENEOUS ENTITY GROUPING

Publication number: 20180025008

Abstract: Systems and methods are disclosed to identify entities that have a similar meaning, and may, in embodiments, be grouped into entity groups for knowledge base construction. In embodiments, the entity relations of similarity or non-similarity for an entity pair are predicted as a binary relationship. In embodiments, the prediction may be based upon a similarity score between the entities and the entity features, which features are constructed using an entity feature or representation model. In embodiments, the prediction may be an iterative process involving minimal human checking and existing knowledge update. In embodiments, one or more entity groups are formed using graph search from the predicted entity pairs. In embodiments, a group centroid entity may be selected to represent each group based on one or more factors, such as its generality or popularity.

Type: Application

Filed: July 20, 2016

Publication date: January 25, 2018

Applicant: Baidu USA LLC

Inventors: Shulong Tan, Hongliang Fei, Yi Zhen, Yu Cao, Bocong Liu, Chaochun Liu, Richard Chun Ching Wang, Dawen Zhou, Wei Fan

SYSTEMS AND METHODS FOR RELATION INFERENCE

Publication number: 20180012121

Abstract: Presented are relation inference methods and systems that use deep learning techniques for data mining documents to discover a relation between terms of interest in a given field covering a specific topic. For example, in the healthcare domain, various embodiments of the present disclosure provide for a relation inference system that mines large-scale medical documents in a free-text database to extract symptom and disease terms and generates relation information that aids in disease diagnosis. In embodiments, this is accomplished by training and using an RNN, such as an LSTM, a Gated Recurrent Unit (GRU), etc., that takes advantage of a term dictionary to examine co-occurrences of terms of interest within documents to discover correlations between the terms. The correlation may then be used to predict statistically most probable terms (e.g., a disease) related to a given search term (e.g., a symptom).

Type: Application

Filed: July 8, 2016

Publication date: January 11, 2018

Applicant: Baidu USA LLC

Inventors: Chaochun Liu, Nan Du, Shulong Tan, Hongliang Fei, Wei Fan