Patents by Inventor Hongliang Fei

Hongliang Fei has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11886446
    Abstract: Existing research on cross-lingual retrieval cannot take good advantage of large-scale pretrained language models, such as multilingual BERT and XLM. The absence of cross-lingual passage-level relevance data for finetuning and the lack of query-document style pretraining are some of the key factors of this issue. Accordingly, embodiments of two novel retrieval-oriented pretraining tasks are presented herein to further pretrain cross-lingual language models for downstream retrieval tasks, such as cross-lingual ad-hoc retrieval (CLIR) and cross-lingual question answering (CLQA). In one or more embodiments, distant supervision data was constructed from multilingual texts using section alignment to support retrieval-oriented language model pretraining. In one or more embodiments, directly finetuning language models on part of an evaluation collection was performed by making Transformers capable of accepting longer sequences.
    Type: Grant
    Filed: January 14, 2022
    Date of Patent: January 30, 2024
    Assignee: Baidu USA LLC
    Inventors: Hongliang Fei, Puxuan Yu, Ping Li
  • Publication number: 20230410155
    Abstract: Deep neural network (DNN) models have been widely used for user-relevance content prediction. Presented herein is a new user-relevance framework, embodiments of which may be referred to as Gating-Enhanced Multi-task Neural Networks (GemNN). In one or more embodiments, neural network-based multi-task learning models predict user engagement with content in a coarse-to-fine manner, which gradually reduces content candidates and allows parameter sharing from upstream tasks to downstream tasks to improve training efficiency. Also, in one or more embodiments, a gating mechanism was introduced between embedding layers and multi-layer perceptrons (MLPs) to learn feature interactions and control the information flow fed to the MLP layers. Tested embodiments demonstrated considerable improvements over prior approaches.
    Type: Application
    Filed: July 7, 2021
    Publication date: December 21, 2023
    Applicants: Baidu USA LLC, Baidu.com Times Technology (Beijing) Co., Ltd.
    Inventors: Hongliang FEI, Jingyuan ZHANG, Xingxuan ZHOU, Junhao ZHAO, Banghu YIN, Ping LI
  • Patent number: 11816533
    Abstract: Learning disentangled representations is an important topic in machine learning for a wide range of applications. Disentangled latent variables represent interpretable semantic information and reflect separate factors of variation in data. Although generative models may learn latent representations and generate data samples as well, existing models may ignore the structural information among latent representations. Described in the present disclosure are embodiments to learn disentangled latent structural representations from data using decomposable variational auto-encoders, which simultaneously learn component representations and encode component relationships. Embodiments of a novel structural prior for latent representations are disclosed to capture interactions among different data components. Embodiments are applied to data segmentation and latent relation discovery among different data components. Experiments on several datasets demonstrate the utility of the present model embodiments.
    Type: Grant
    Filed: November 18, 2020
    Date of Patent: November 14, 2023
    Assignee: Baidu USA LLC
    Inventors: Shaogang Ren, Hongliang Fei, Dingcheng Li, Ping Li
  • Patent number: 11694042
    Abstract: Presented herein are embodiments of an unsupervised cross-lingual sentiment classification model (which may be referred to as multi-view encoder-classifier (MVEC)) that leverages an unsupervised machine translation (UMT) system and a language discriminator. Unlike previous language model (LM)-based fine-tuning approaches that adjust parameters solely based on the classification error on training data, embodiments employ an encoder-decoder framework of a UMT as a regularization component on the shared network parameters. In one or more embodiments, the cross-lingual encoder of embodiments learns a shared representation, which is effective for both reconstructing input sentences of two languages and generating more representative views from the input for classification. Experiments on five language pairs verify that an MVEC embodiment significantly outperforms other models for 8/11 sentiment classification tasks.
    Type: Grant
    Filed: March 19, 2021
    Date of Patent: July 4, 2023
    Assignee: Baidu USA LLC
    Inventors: Hongliang Fei, Ping Li
  • Patent number: 11630953
    Abstract: Described herein are embodiments for end-to-end reinforcement learning based coreference resolution models to directly optimize coreference evaluation metrics. Embodiments of a reinforced policy gradient model are disclosed to incorporate reward associated with a sequence of coreference linking actions. Furthermore, maximum entropy regularization may be used for adequate exploration to prevent a model embodiment from prematurely converging to a bad local optimum. Experiments on datasets compared with state-of-the-art methods verified the effectiveness of embodiments.
    Type: Grant
    Filed: July 25, 2019
    Date of Patent: April 18, 2023
    Assignees: Baidu USA LLC, Baidu.com Times Technology (Beijing) Co., Ltd.
    Inventors: Hongliang Fei, Xu Li, Dingcheng Li, Ping Li
  • Patent number: 11580415
    Abstract: Due to the high variability of language use in real life, manual construction of semantic resources to cover all synonyms is prohibitively expensive and may result in limited coverage. Described herein are systems and methods that automate the process of synonymy resource development, including both formal entities and noisy descriptions from end-users. Embodiments of a multi-task model with hierarchical task relationship are presented that learn more representative entity/term embeddings and apply them to synonym prediction. In model embodiments, a skip-gram word embedding model is extended by introducing an auxiliary task, “neighboring word/term semantic type prediction,” and hierarchically organizing the tasks based on task complexity. In one or more embodiments, existing term-term synonymous knowledge is integrated into the word embedding learning framework.
    Type: Grant
    Filed: July 9, 2019
    Date of Patent: February 14, 2023
    Assignee: Baidu USA LLC
    Inventors: Hongliang Fei, Shulong Tan, Ping Li
  • Publication number: 20230035337
    Abstract: Efficient inner product search is important for many data ranking services, such as recommendation and Information Retrieval. Efficient retrieval via inner product dramatically influences the performance of such data searching and retrieval systems. To resolve deficiencies of prior approaches, embodiments of a new index graph construction approach, referred to generally as Norm Adjusted Proximity Graph (NAPG), for approximate Maximum Inner Product Search (MIPS) are presented. With adjusting factors estimated on sampled data, NAPG embodiments select more meaningful data points to connect with when constructing a graph-based index for inner product search. Extensive experiments verify that the improved graph-based index pushes the state-of-the-art of inner product search forward greatly, in the trade-off between search efficiency and effectiveness.
    Type: Application
    Filed: February 18, 2022
    Publication date: February 2, 2023
    Applicant: Baidu USA LLC
    Inventors: Shulong TAN, Zhaozhuo XU, Weijie ZHAO, Hongliang FEI, Zhixin ZHOU, Ping LI
  • Publication number: 20220417328
    Abstract: An Internet of Vehicles service activation method comprises: receiving first vehicle information of a vehicle to be activated, which is sent by a mobile terminal; when the received first vehicle information is comprised in a vehicle information database, sending an activation request message to the mobile terminal and starting a countdown of a preset time period; when a vehicle-started event sent by a vehicle-mounted terminal is received before the end of the countdown, extracting, from the vehicle-started event, second vehicle information corresponding to the vehicle-mounted terminal; and when the first vehicle information is the same as the second vehicle information, activating an Internet of Vehicles service of the vehicle to be activated.
    Type: Application
    Filed: May 27, 2021
    Publication date: December 29, 2022
    Inventors: XINGLONG QIU, HONGMING DU, JIJIE GU, ZIPING ZHENG, DONGBAO YOU, HONGLIANG FEI
  • Publication number: 20220383048
    Abstract: Current pretrained vision-language models for cross-modal retrieval tasks in English depend on the availability of many annotated image-caption datasets with English text for pretraining. However, the texts are not necessarily in English. Although machine translation (MT) tools may be used to translate text to English, the performance largely relies on MT's quality and may suffer from high latency problems in real-world applications. Embodiments herein address these problems by learning cross-lingual cross-modal representations for matching images and their relevant captions in multiple languages. Embodiments seamlessly combine cross-lingual pretraining objectives and cross-modal pretraining objectives in a unified framework to learn image and text in a joint embedding space from available English image-caption data, monolingual corpus, and parallel corpus. Embodiments are shown to achieve state-of-the-art performance in retrieval tasks on multimodal multilingual image caption datasets.
    Type: Application
    Filed: April 7, 2022
    Publication date: December 1, 2022
    Applicant: Baidu USA LLC
    Inventors: Hongliang FEI, Tan YU, Ping LI
  • Patent number: 11494615
    Abstract: Described herein are embodiments for systems and methods to incorporate skip-gram convolution to extract non-consecutive local n-gram patterns for comprehensive information for varying text expressions. In one or more embodiments, one or more recurrent neural networks are employed to extract long-range features from localized level to sequential and global level via a chain-like architecture. Comprehensive experiments on large-scale datasets widely used for the text classification task were conducted to demonstrate the effectiveness of the presented deep skip-gram network embodiments. Performance evaluation on various datasets demonstrates that embodiments of the skip-gram network are powerful for general text classification tasks. The skip-gram models are robust and may generalize well to different datasets, even without tuning the hyper-parameters for a specific dataset.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: November 8, 2022
    Assignee: Baidu USA LLC
    Inventors: Hongliang Fei, Chaochun Liu, Yaliang Li, Ping Li
  • Publication number: 20220318255
    Abstract: Existing research on cross-lingual retrieval cannot take good advantage of large-scale pretrained language models, such as multilingual BERT and XLM. The absence of cross-lingual passage-level relevance data for finetuning and the lack of query-document style pretraining are some of the key factors of this issue. Accordingly, embodiments of two novel retrieval-oriented pretraining tasks are presented herein to further pretrain cross-lingual language models for downstream retrieval tasks, such as cross-lingual ad-hoc retrieval (CLIR) and cross-lingual question answering (CLQA). In one or more embodiments, distant supervision data was constructed from multilingual texts using section alignment to support retrieval-oriented language model pretraining. In one or more embodiments, directly finetuning language models on part of an evaluation collection was performed by making Transformers capable of accepting longer sequences.
    Type: Application
    Filed: January 14, 2022
    Publication date: October 6, 2022
    Applicant: Baidu USA LLC
    Inventors: Hongliang FEI, Puxuan YU, Ping LI
  • Patent number: 11354506
    Abstract: Previous neural network models that perform named entity recognition (NER) typically treat the input sentences as a linear sequence of words but ignore rich structural information, such as the coreference relations among non-adjacent words, phrases, or entities. Presented herein are novel approaches to learn coreference-aware word representations for the NER task. In one or more embodiments, a “CNN-BiLSTM-CRF” neural architecture is modified to include a coreference layer component on top of the BiLSTM layer to incorporate coreferential relations. Also, in one or more embodiments, a coreference regularization is added during training to ensure that the coreferential entities share similar representations and consistent predictions within the same coreference cluster. A model embodiment achieved new state-of-the-art performance when tested.
    Type: Grant
    Filed: July 30, 2019
    Date of Patent: June 7, 2022
    Assignee: Baidu USA LLC
    Inventors: Hongliang Fei, Zeyu Dai, Ping Li
  • Publication number: 20220156612
    Abstract: Learning disentangled representations is an important topic in machine learning for a wide range of applications. Disentangled latent variables represent interpretable semantic information and reflect separate factors of variation in data. Although generative models may learn latent representations and generate data samples as well, existing models may ignore the structural information among latent representations. Described in the present disclosure are embodiments to learn disentangled latent structural representations from data using decomposable variational auto-encoders, which simultaneously learn component representations and encode component relationships. Embodiments of a novel structural prior for latent representations are disclosed to capture interactions among different data components. Embodiments are applied to data segmentation and latent relation discovery among different data components. Experiments on several datasets demonstrate the utility of the present model embodiments.
    Type: Application
    Filed: November 18, 2020
    Publication date: May 19, 2022
    Applicant: Baidu USA LLC
    Inventors: Shaogang REN, Hongliang FEI, Dingcheng LI, Ping LI
  • Publication number: 20210390270
    Abstract: Presented herein are embodiments of an unsupervised cross-lingual sentiment classification model (which may be referred to as multi-view encoder-classifier (MVEC)) that leverages an unsupervised machine translation (UMT) system and a language discriminator. Unlike previous language model (LM)-based fine-tuning approaches that adjust parameters solely based on the classification error on training data, embodiments employ an encoder-decoder framework of a UMT as a regularization component on the shared network parameters. In one or more embodiments, the cross-lingual encoder of embodiments learns a shared representation, which is effective for both reconstructing input sentences of two languages and generating more representative views from the input for classification. Experiments on five language pairs verify that an MVEC embodiment significantly outperforms other models for 8/11 sentiment classification tasks.
    Type: Application
    Filed: March 19, 2021
    Publication date: December 16, 2021
    Applicant: Baidu USA LLC
    Inventors: Hongliang FEI, Ping LI
  • Patent number: 11195128
    Abstract: Presented are systems and methods that allow healthcare providers and governments to infer demand for healthcare resources to ensure effective and timely healthcare services to patients by reducing healthcare supply shortages, emergencies, and healthcare costs. In embodiments, this is accomplished by gathering data from a number of sources to generate labeled records from which entity features and relationships between entities are extracted, correlated, and/or combined with other external healthcare data. In embodiments, this information is used to train a model that predicts healthcare resource demands given a set of input conditions or factors.
    Type: Grant
    Filed: August 2, 2016
    Date of Patent: December 7, 2021
    Assignee: Baidu USA LLC
    Inventors: Yi Zhen, Hongliang Fei, Shulong Tan, Wei Fan
  • Patent number: 11194860
    Abstract: Systems and methods are disclosed for question generation to obtain more related medical information based on observed symptoms from a patient. In embodiments, possible diseases associated with the observed symptoms are generated by querying a knowledge graph. In embodiments, candidate symptoms associated with the possible diseases are also identified and are combined with the observed symptoms to obtain combined symptom sets. In embodiments, discriminative scores for the candidate symptom sets are determined and candidate symptoms with top discriminative scores are selected. In embodiments, these selected candidate symptoms may be checked for conflicts with observed symptoms and removed from further consideration if a conflict exists. In embodiments, one or more questions may be generated based on the remaining selected candidate symptoms to aid in collecting information about the patient. In embodiments, the process may be repeated with the updated observed symptoms.
    Type: Grant
    Filed: July 11, 2016
    Date of Patent: December 7, 2021
    Assignee: Baidu USA LLC
    Inventors: Erheng Zhong, Chaochun Liu, Yusheng Xie, Nan Du, Hongliang Fei, Yi Zhen, Yu Cao, Richard Chun Ching Wang, Dawen Zhou, Wei Fan
  • Publication number: 20210240929
    Abstract: Described herein are embodiments for end-to-end reinforcement learning based coreference resolution models to directly optimize coreference evaluation metrics. Embodiments of a reinforced policy gradient model are disclosed to incorporate reward associated with a sequence of coreference linking actions. Furthermore, maximum entropy regularization may be used for adequate exploration to prevent a model embodiment from prematurely converging to a bad local optimum. Experiments on datasets compared with state-of-the-art methods verified the effectiveness of embodiments.
    Type: Application
    Filed: July 25, 2019
    Publication date: August 5, 2021
    Applicants: Baidu USA LLC, Baidu.com Times Technology (Beijing) Co., Ltd.
    Inventors: Hongliang FEI, Xu LI, Dingcheng LI, Ping LI
  • Patent number: 10956850
    Abstract: Causal performance analysis for store merchandising may be provided. A clustering technique may be performed based on target store location data and existing store data. Based on the clustering technique, a peer selection group is determined comprising a group of stores determined to have similar attributes to the target store location. Sales distortions for a plurality of divisions associated with the group of stores in the peer selection group may be determined. A distortion matrix may be generated comprising a ranked list of the plurality of divisions. A merchandise mix recommendation for the target store location may be presented via a user interface device.
    Type: Grant
    Filed: October 26, 2016
    Date of Patent: March 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Ajay A. Deshpande, Hongliang Fei, Arun Hampapur, Hongfei Li, Xuan Liu
  • Publication number: 20210034701
    Abstract: Previous neural network models that perform named entity recognition (NER) typically treat the input sentences as a linear sequence of words but ignore rich structural information, such as the coreference relations among non-adjacent words, phrases, or entities. Presented herein are novel approaches to learn coreference-aware word representations for the NER task. In one or more embodiments, a “CNN-BiLSTM-CRF” neural architecture is modified to include a coreference layer component on top of the BiLSTM layer to incorporate coreferential relations. Also, in one or more embodiments, a coreference regularization is added during training to ensure that the coreferential entities share similar representations and consistent predictions within the same coreference cluster. A model embodiment achieved new state-of-the-art performance when tested.
    Type: Application
    Filed: July 30, 2019
    Publication date: February 4, 2021
    Applicant: Baidu USA LLC
    Inventors: Hongliang FEI, Zeyu DAI, Ping LI
  • Publication number: 20210012215
    Abstract: Due to the high variability of language use in real life, manual construction of semantic resources to cover all synonyms is prohibitively expensive and may result in limited coverage. Described herein are systems and methods that automate the process of synonymy resource development, including both formal entities and noisy descriptions from end-users. Embodiments of a multi-task model with hierarchical task relationship are presented that learn more representative entity/term embeddings and apply them to synonym prediction. In model embodiments, a skip-gram word embedding model is extended by introducing an auxiliary task, “neighboring word/term semantic type prediction,” and hierarchically organizing the tasks based on task complexity. In one or more embodiments, existing term-term synonymous knowledge is integrated into the word embedding learning framework.
    Type: Application
    Filed: July 9, 2019
    Publication date: January 14, 2021
    Applicant: Baidu USA LLC
    Inventors: Hongliang FEI, Shulong TAN, Ping LI
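
Illustrative sketch (not taken from any filing): the Internet of Vehicles service activation method listed above under publication number 20220417328 describes a short sequence of checks, which might be modeled roughly as follows. Every name here (ActivationService, receive_from_mobile, receive_vehicle_started, the "vehicle_info" event key, the 60-second default) is a hypothetical choice made for illustration, and the clock is passed in explicitly so the countdown can be followed by hand. The abstract does not say what happens when the vehicle is unknown or the countdown expires, so those branches simply return without activating.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActivationService:
    """Hypothetical model of the activation flow; all names are illustrative."""

    known_vehicles: set  # stands in for the vehicle information database
    countdown_seconds: float = 60.0  # assumed value for the preset time period
    pending: dict = field(default_factory=dict)  # vehicle info -> countdown deadline
    activated: set = field(default_factory=set)

    def receive_from_mobile(self, first_vehicle_info: str, now: float) -> Optional[str]:
        """Handle the first vehicle information sent by the mobile terminal."""
        if first_vehicle_info not in self.known_vehicles:
            return None  # behavior for unknown vehicles is not given in the abstract
        # Start the countdown and reply with an activation request message.
        self.pending[first_vehicle_info] = now + self.countdown_seconds
        return "activation_request"

    def receive_vehicle_started(self, event: dict, now: float) -> bool:
        """Handle a vehicle-started event sent by the vehicle-mounted terminal."""
        second_vehicle_info = event["vehicle_info"]
        # Looking up the pending entry by the second vehicle information is
        # equivalent to checking that it equals the first vehicle information.
        deadline = self.pending.pop(second_vehicle_info, None)
        if deadline is None or now >= deadline:
            return False  # no matching request, or the countdown already ended
        self.activated.add(second_vehicle_info)
        return True
```

In this sketch a request for a known vehicle followed by a matching vehicle-started event inside the countdown window marks the vehicle as activated; an event that arrives after the deadline leaves it inactive and clears the pending request.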