Patents by Inventor Shulong Tan
Shulong Tan has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11989233
Abstract: Presented herein are embodiments of a fast search on graph methodology for Maximum Inner Product Search (MIPS). This optimization problem is challenging since traditional Approximate Nearest Neighbor (ANN) search methods may not perform efficiently in the nonmetric similarity measure. Embodiments herein are based on the property that a Möbius/Möbius-like transformation introduces an isomorphism between a subgraph of the ℓ2-Delaunay graph and the Delaunay graph for inner product. Under this observation, embodiments of a novel graph indexing and searching methodology are presented to find the optimal solution with the largest inner product with the query. Experiments show significant improvements compared to existing methods.
Type: Grant
Filed: September 27, 2020
Date of Patent: May 21, 2024
Assignee: Baidu USA LLC
Inventors: Shulong Tan, Zhixin Zhou, Zhaozhuo Xu, Ping Li
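As a rough illustration of the two ingredients this abstract names, the sketch below shows an exact (brute-force) MIPS baseline and a Möbius-style map. It assumes the transformation takes the form x -> x / ||x||^2, which the abstract does not spell out. The function names `exact_mips` and `mobius_transform` and the sample vectors are hypothetical, and this is not the patented indexing method.

```python
import numpy as np

def exact_mips(base, query):
    """Brute-force MIPS: index of the base vector with the largest inner product."""
    return int(np.argmax(base @ query))

def mobius_transform(base):
    """Assumed Möbius-style map x -> x / ||x||^2, applied row-wise to nonzero vectors."""
    sq_norms = np.sum(base * base, axis=1, keepdims=True)
    return base / sq_norms

# Hypothetical toy data, for illustration only.
base = np.array([[2.0, 0.0], [0.0, 0.5], [1.0, 1.0]])
query = np.array([1.0, 0.2])

best = exact_mips(base, query)        # inner products are 2.0, 0.1, 1.2
transformed = mobius_transform(base)  # large-norm points move toward the origin
```

The brute-force scan costs one inner product per base vector per query, which is the cost a graph index of the kind described here is meant to avoid.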
-
Patent number: 11914669
Abstract: Approximate nearest neighbor (ANN) searching is a fundamental problem in computer science with numerous applications in areas such as machine learning and data mining. For typical graph-based ANN methods, the searching method is executed iteratively, and the execution dependency prohibits graphics processor unit (GPU)/GPU-type processor adaptations. Presented herein are embodiments of a novel framework that decouples the searching on graph methodology into stages, in order to parallelize the performance-crucial distance computation. Furthermore, in one or more embodiments, to obtain better parallelism on GPU-type components, also disclosed are novel ANN-specific optimization methods that eliminate dynamic memory allocations and trade computations for less memory consumption. Embodiments were empirically compared against other methods, and the results confirm the effectiveness.
Type: Grant
Filed: November 11, 2020
Date of Patent: February 27, 2024
Assignee: Baidu USA LLC
Inventors: Weijie Zhao, Shulong Tan, Ping Li
-
Publication number: 20230195733
Abstract: Presented are systems and methods that construct BipartitE Graph INdices (BEGIN) embodiments for fast neural ranking. BEGIN embodiments comprise two types of nodes: sampled queries and base or searching objects. In one or more embodiments, edges connecting these nodes are constructed by using a neural network ranking measure. These embodiments extend traditional search-on-graph methods and lend themselves to fast neural ranking. Experimental results demonstrate the effectiveness and efficiency of such embodiments.
Type: Application
Filed: December 17, 2021
Publication date: June 22, 2023
Applicant: Baidu USA LLC
Inventors: Shulong TAN, Weijie ZHAO, Ping LI
-
Publication number: 20230077267
Abstract: Incremental proximity graph maintenance (IPGM) systems and methods for online ANN search support both online vertex deletion and insertion of vertices on proximity graphs. In various embodiments, updating a proximity graph comprises receiving a workload that represents a set of vertices in the proximity graph, each vertex being associated with a type of operation such as a query, insertion, or deletion. For a query or an insertion, a search may be executed on the graph to obtain a set of top-K vertices for each vertex. In the case of a deletion, a vertex may be deleted from the proximity graph, and a local or global reconnection update method may be used to reconstruct at least a portion of the proximity graph.
Type: Application
Filed: August 20, 2021
Publication date: March 9, 2023
Applicant: Baidu USA LLC
Inventors: Shulong TAN, Zhaozhuo XU, Weijie ZHAO, Zhixin ZHOU, Ping LI
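The deletion step can be pictured with a small sketch. The abstract does not detail the reconnection rule, so the version below is only one plausible reading of "local reconnection": each vertex that pointed at the deleted vertex inherits that vertex's out-neighbors as candidates and keeps the closest ones. The function name `delete_with_local_reconnect`, the `max_degree` cap, the Euclidean distance, and the toy graph are all assumptions.

```python
import numpy as np

def delete_with_local_reconnect(graph, vectors, victim, max_degree=2):
    """Remove `victim` from a directed proximity graph and patch its in-neighbors."""
    out_neighbors = graph.pop(victim)
    for node, neighbors in list(graph.items()):
        if victim not in neighbors:
            continue
        neighbors.discard(victim)
        # Candidates: the node's remaining edges plus the victim's out-neighbors.
        candidates = (neighbors | out_neighbors) - {node}
        # Keep only the closest candidates (Euclidean distance), up to max_degree.
        ranked = sorted(
            candidates,
            key=lambda c: np.linalg.norm(vectors[c] - vectors[node]),
        )
        graph[node] = set(ranked[:max_degree])
    return graph

# Hypothetical toy graph: four points on a line, adjacency as sets of out-neighbors.
vectors = {i: np.array([float(i), 0.0]) for i in range(4)}
graph = {0: {1}, 1: {2}, 2: {3}, 3: {2}}

updated = delete_with_local_reconnect(graph, vectors, victim=1)
```

Here vertex 0 loses its only edge when vertex 1 is deleted and is reconnected to vertex 2, so the remaining graph stays navigable from vertex 0.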
-
Patent number: 11580415
Abstract: Due to the high language use variability in real life, manual construction of semantic resources to cover all synonyms is prohibitively expensive and may result in limited coverage. Described herein are systems and methods that automate the process of synonymy resource development, including both formal entities and noisy descriptions from end-users. Embodiments of a multi-task model with hierarchical task relationship are presented that learn more representative entity/term embeddings and apply them to synonym prediction. In model embodiments, a skip-gram word embedding model is extended by introducing an auxiliary task “neighboring word/term semantic type prediction” and hierarchically organizing the tasks based on task complexity. In one or more embodiments, existing term-term synonymous knowledge is integrated into the word embedding learning framework.
Type: Grant
Filed: July 9, 2019
Date of Patent: February 14, 2023
Assignee: Baidu USA LLC
Inventors: Hongliang Fei, Shulong Tan, Ping Li
-
Publication number: 20230035337
Abstract: Efficient inner product search is important for many data ranking services, such as recommendation and Information Retrieval. Efficient retrieval via inner product dramatically influences the performance of such data searching and retrieval systems. To resolve deficiencies of prior approaches, embodiments of a new index graph construction approach, referred to generally as Norm Adjusted Proximity Graph (NAPG), for approximate Maximum Inner Product Search (MIPS) are presented. With adjusting factors estimated on sampled data, NAPG embodiments select more meaningful data points to connect with when constructing a graph-based index for inner product search. Extensive experiments verify that the improved graph-based index pushes the state-of-the-art of inner product search forward greatly, in the trade-off between search efficiency and effectiveness.
Type: Application
Filed: February 18, 2022
Publication date: February 2, 2023
Applicant: Baidu USA LLC
Inventors: Shulong TAN, Zhaozhuo XU, Weijie ZHAO, Hongliang FEI, Zhixin ZHOU, Ping LI
-
Patent number: 11195128
Abstract: Presented are systems and methods that allow healthcare providers and governments to infer demand for healthcare resources to ensure effective and timely healthcare services to patients by reducing healthcare supply shortages, emergencies, and healthcare costs. In embodiments, this is accomplished by gathering data from a number of sources to generate labeled records from which entity features and relationships between entities are extracted, correlated, and/or combined with other external healthcare data. In embodiments, this information is used to train a model that predicts healthcare resource demands given a set of input conditions or factors.
Type: Grant
Filed: August 2, 2016
Date of Patent: December 7, 2021
Assignee: Baidu USA LLC
Inventors: Yi Zhen, Hongliang Fei, Shulong Tan, Wei Fan
-
Publication number: 20210157606
Abstract: Approximate nearest neighbor (ANN) searching is a fundamental problem in computer science with numerous applications in areas such as machine learning and data mining. For typical graph-based ANN methods, the searching method is executed iteratively, and the execution dependency prohibits graphics processor unit (GPU)/GPU-type processor adaptations. Presented herein are embodiments of a novel framework that decouples the searching on graph methodology into stages, in order to parallelize the performance-crucial distance computation. Furthermore, in one or more embodiments, to obtain better parallelism on GPU-type components, also disclosed are novel ANN-specific optimization methods that eliminate dynamic memory allocations and trade computations for less memory consumption. Embodiments were empirically compared against other methods, and the results confirm the effectiveness.
Type: Application
Filed: November 11, 2020
Publication date: May 27, 2021
Applicant: Baidu USA LLC
Inventors: Weijie ZHAO, Shulong TAN, Ping LI
-
Publication number: 20210133246
Abstract: Presented herein are embodiments of a fast search on graph methodology for Maximum Inner Product Search (MIPS). This optimization problem is challenging since traditional Approximate Nearest Neighbor (ANN) search methods may not perform efficiently in the nonmetric similarity measure. Embodiments herein are based on the property that a Möbius/Möbius-like transformation introduces an isomorphism between a subgraph of the ℓ2-Delaunay graph and the Delaunay graph for inner product. Under this observation, embodiments of a novel graph indexing and searching methodology are presented to find the optimal solution with the largest inner product with the query. Experiments show significant improvements compared to existing methods.
Type: Application
Filed: September 27, 2020
Publication date: May 6, 2021
Applicant: Baidu USA LLC
Inventors: Shulong TAN, Zhixin ZHOU, Zhaozhuo XU, Ping LI
-
Publication number: 20210117459
Abstract: Retrieval of relevant vectors produced by representation learning can critically influence the efficiency in Natural Language Processing (NLP) tasks. Presented herein are systems and methods for searching vectors via a typical nonmetric matching function: inner product. Embodiments, which construct an approximate Inner Product Delaunay Graph (IPDG) for top-1 Maximum Inner Product Search (MIPS), transform retrieving the most suitable latent vectors into a graph search problem with great benefits of efficiency. Experiments on data representations learned for different machine learning tasks verify the outperforming effectiveness and efficiency of IPDG embodiments.
Type: Application
Filed: September 16, 2020
Publication date: April 22, 2021
Applicant: Baidu USA LLC
Inventors: Shulong TAN, Zhixin ZHOU, Zhaozhuo XU, Ping LI
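Treating top-1 MIPS as a graph search problem can be sketched with a plain greedy walk: start at some vertex and keep moving to the neighbor with the largest inner product with the query until no neighbor improves on the current vertex. This is a generic search-on-graph sketch, not the IPDG construction itself; the function name `greedy_mips_walk`, the hand-built adjacency lists, and the sample vectors are hypothetical.

```python
import numpy as np

def greedy_mips_walk(graph, vectors, query, start):
    """Greedy walk that climbs toward larger inner products with the query."""
    current = start
    current_score = float(vectors[current] @ query)
    while True:
        best, best_score = current, current_score
        for neighbor in graph[current]:
            score = float(vectors[neighbor] @ query)
            if score > best_score:
                best, best_score = neighbor, score
        if best == current:
            # No neighbor improves the score: a local (possibly global) maximum.
            return current, current_score
        current, current_score = best, best_score

# Hypothetical toy index: four vectors connected in a chain.
vectors = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [3.0, 1.0]])
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
query = np.array([1.0, 1.0])

node, score = greedy_mips_walk(graph, vectors, query, start=0)
```

A greedy walk can stop at a local maximum, so the quality of the result depends on how the graph was built, which is the part the abstract attributes to the index construction.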
-
Publication number: 20210012215
Abstract: Due to the high language use variability in real life, manual construction of semantic resources to cover all synonyms is prohibitively expensive and may result in limited coverage. Described herein are systems and methods that automate the process of synonymy resource development, including both formal entities and noisy descriptions from end-users. Embodiments of a multi-task model with hierarchical task relationship are presented that learn more representative entity/term embeddings and apply them to synonym prediction. In model embodiments, a skip-gram word embedding model is extended by introducing an auxiliary task “neighboring word/term semantic type prediction” and hierarchically organizing the tasks based on task complexity. In one or more embodiments, existing term-term synonymous knowledge is integrated into the word embedding learning framework.
Type: Application
Filed: July 9, 2019
Publication date: January 14, 2021
Applicant: Baidu USA LLC
Inventors: Hongliang FEI, Shulong TAN, Ping LI
-
Patent number: 10650305
Abstract: Presented are relation inference methods and systems that use deep learning techniques for data mining documents to discover a relation between terms of interest in a given field covering a specific topic. For example, in the healthcare domain, various embodiments of the present disclosure provide for a relation inference system that mines large-scale medical documents in a free-text database to extract symptom and disease terms and generates relation information that aids in disease diagnosis. In embodiments, this is accomplished by training and using an RNN, such as an LSTM, a Gated Recurrent Unit (GRU), etc., that takes advantage of a term dictionary to examine co-occurrences of terms of interest within documents to discover correlations between the terms. The correlation may then be used to predict statistically most probable terms (e.g., a disease) related to a given search term (e.g., a symptom).
Type: Grant
Filed: July 8, 2016
Date of Patent: May 12, 2020
Assignee: Baidu USA LLC
Inventors: Chaochun Liu, Nan Du, Shulong Tan, Hongliang Fei, Wei Fan
-
Patent number: 10372743
Abstract: Systems and methods are disclosed to identify entities that have a similar meaning, and may, in embodiments, be grouped into entity groups for knowledge base construction. In embodiments, the entity relations of similarity or non-similarity for an entity pair are predicted as a binary relationship. In embodiments, the prediction may be based upon a similarity score between the entities and the entity features, which features are constructed using an entity feature or representation model. In embodiments, the prediction may be an iterative process involving minimal human checking and existing knowledge update. In embodiments, one or more entity groups are formed using graph search from the predicted entity pairs. In embodiments, a group centroid entity may be selected to represent each group based on one or more factors, such as its generality or popularity.
Type: Grant
Filed: July 20, 2016
Date of Patent: August 6, 2019
Assignee: Baidu USA LLC
Inventors: Shulong Tan, Hongliang Fei, Yi Zhen, Yu Cao, Bocong Liu, Chaochun Liu, Richard Chun Ching Wang, Dawen Zhou, Wei Fan
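The grouping step described in this abstract, forming entity groups by graph search over predicted similar pairs and then choosing a representative, can be sketched as a breadth-first search for connected components. The abstract does not say which graph search is used, so BFS is an assumption here, as are the function names `group_entities` and `pick_centroid`, the example pairs, and the popularity counts.

```python
from collections import defaultdict, deque

def group_entities(similar_pairs):
    """Group entities into connected components of the 'predicted similar' graph."""
    adjacency = defaultdict(set)
    for a, b in similar_pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, groups = set(), []
    for entity in sorted(adjacency):
        if entity in seen:
            continue
        queue, group = deque([entity]), set()
        seen.add(entity)
        while queue:
            node = queue.popleft()
            group.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(group)
    return groups

def pick_centroid(group, popularity):
    """Choose the most popular member as the group's representative entity."""
    return max(sorted(group), key=lambda e: popularity.get(e, 0))

# Hypothetical predicted pairs and made-up popularity counts, for illustration only.
pairs = [
    ("heart attack", "myocardial infarction"),
    ("myocardial infarction", "MI"),
    ("flu", "influenza"),
]
popularity = {"heart attack": 90, "myocardial infarction": 40, "MI": 25,
              "flu": 80, "influenza": 60}

groups = group_entities(pairs)
centroids = [pick_centroid(g, popularity) for g in groups]
```

Note that "heart attack" and "MI" end up in the same group without a direct predicted pair between them, because both are linked through "myocardial infarction".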
-
Publication number: 20180039735
Abstract: Presented are systems and methods that allow healthcare providers and governments to infer demand for healthcare resources to ensure effective and timely healthcare services to patients by reducing healthcare supply shortages, emergencies, and healthcare costs. In embodiments, this is accomplished by gathering data from a number of sources to generate labeled records from which entity features and relationships between entities are extracted, correlated, and/or combined with other external healthcare data. In embodiments, this information is used to train a model that predicts healthcare resource demands given a set of input conditions or factors.
Type: Application
Filed: August 2, 2016
Publication date: February 8, 2018
Applicant: Baidu USA LLC
Inventors: Yi Zhen, Hongliang Fei, Shulong Tan, Wei Fan
-
Publication number: 20180025121
Abstract: Systems and methods are disclosed that provide improved automated extraction of medical-related information. In embodiments, finer-grained medical-related data, such as medical entities, including symptoms, diseases, dimensions, and temporal information, can be extracted. In embodiments, by extracting finer-level medical-related information from an input statement and generating visual displays of that information, a medical professional can readily see relevant medical information that provides medical entities and associated dimension information, as well as evolving history.
Type: Application
Filed: July 20, 2016
Publication date: January 25, 2018
Applicant: Baidu USA LLC
Inventors: Hongliang Fei, Shulong Tan, Yi Zhen, Erheng Zhong, Chaochun Liu, Dawen Zhou, Wei Fan
-
Publication number: 20180025008
Abstract: Systems and methods are disclosed to identify entities that have a similar meaning, and may, in embodiments, be grouped into entity groups for knowledge base construction. In embodiments, the entity relations of similarity or non-similarity for an entity pair are predicted as a binary relationship. In embodiments, the prediction may be based upon a similarity score between the entities and the entity features, which features are constructed using an entity feature or representation model. In embodiments, the prediction may be an iterative process involving minimal human checking and existing knowledge update. In embodiments, one or more entity groups are formed using graph search from the predicted entity pairs. In embodiments, a group centroid entity may be selected to represent each group based on one or more factors, such as its generality or popularity.
Type: Application
Filed: July 20, 2016
Publication date: January 25, 2018
Applicant: Baidu USA LLC
Inventors: Shulong Tan, Hongliang Fei, Yi Zhen, Yu Cao, Bocong Liu, Chaochun Liu, Richard Chun Ching Wang, Dawen Zhou, Wei Fan
-
Publication number: 20180012121
Abstract: Presented are relation inference methods and systems that use deep learning techniques for data mining documents to discover a relation between terms of interest in a given field covering a specific topic. For example, in the healthcare domain, various embodiments of the present disclosure provide for a relation inference system that mines large-scale medical documents in a free-text database to extract symptom and disease terms and generates relation information that aids in disease diagnosis. In embodiments, this is accomplished by training and using an RNN, such as an LSTM, a Gated Recurrent Unit (GRU), etc., that takes advantage of a term dictionary to examine co-occurrences of terms of interest within documents to discover correlations between the terms. The correlation may then be used to predict statistically most probable terms (e.g., a disease) related to a given search term (e.g., a symptom).
Type: Application
Filed: July 8, 2016
Publication date: January 11, 2018
Applicant: Baidu USA LLC
Inventors: Chaochun Liu, Nan Du, Shulong Tan, Hongliang Fei, Wei Fan