Patents by Inventor Huiji Gao

Huiji Gao has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11966700
    Abstract: Embodiments of the described technologies are capable of reading a text sequence that includes at least one word; extracting model input data from the text sequence, where the model input data includes, for each word of the text sequence, segment data and non-segment data; using a first machine learning model and at least one second machine learning model, generating, for each word of the text sequence, a multi-level feature set; outputting, by a third machine learning model, in response to input to the third machine learning model of the multi-level feature set, a tagged version of the text sequence; executing a search based at least in part on the tagged version of the text sequence.
    Type: Grant
    Filed: March 5, 2021
    Date of Patent: April 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yuwei Qiu, Gonzalo Aniano Porcile, Yu Gan, Qin Iris Wang, Haichao Wei, Huiji Gao
  • Publication number: 20230135401
    Abstract: In an example, a particular type of deep learning model is used in the global model of the GDMix model: a Factorization Machine. A Factorization Machine combines a Support Vector Machine (SVM) and Matrix Factorizations. It has the advantage of modeling data with huge sparsity well, while maintaining a linear time complexity. A modification may be further made to the Factorization Machine by introducing L2 norm reduction. This acts to divide calculations made by the Factorization Machine into a portion that can be precomputed and a portion that cannot be precomputed. The portion that can be precomputed is then precomputed in an offline manner. As such, when the model is operated in an online manner, the Factorization Machine only needs to compute the portion that cannot be precomputed, reducing the number of operations that need to be performed at runtime and greatly improving processing speed over prior machine learned models.
    Type: Application
    Filed: October 28, 2021
    Publication date: May 4, 2023
    Inventors: Qiang Xiao, Haichao Wei, Jun Shi, Huiji Gao
  • Publication number: 20230124258
    Abstract: Methods, systems, and computer programs are presented for determining parameters of neural networks and selecting embedding dimensions for the feature fields. One method includes an operation for initializing parameters of a neural network and weights for embedding sizes for each feature associated with the neural network. The parameters of the neural network and the weights are iteratively optimized. Each optimization iteration comprises training the neural network with current parameters of the neural network to optimize a value of the weights, and training the neural network with current values of the weights to optimize the parameters of the neural network. Further, the method includes operations for selecting embedding sizes for the features based on the optimized values of the weights, and for training the neural network based on the selected embedding sizes for the features to obtain an estimator model. A prediction is generated utilizing the estimator model.
    Type: Application
    Filed: October 19, 2021
    Publication date: April 20, 2023
    Inventors: Xiangyu Zhao, Sida Wang, Huiji Gao, Bo Long, Bee-Chung Chen, Weiwei Guo, Jun Shi
  • Patent number: 11514249
    Abstract: Embodiments of the disclosed technologies use machine learning to produce thread level classification data and case level classification data.
    Type: Grant
    Filed: April 27, 2020
    Date of Patent: November 29, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zhiling Wan, Chih-Hui Wang, Haichao Wei, Lili Zhou, Huiji Gao
  • Patent number: 11481627
    Abstract: Computer-implemented techniques for learning composite machine learned models are disclosed. Benefits to implementors of the disclosed techniques include allowing non-machine learning experts to use the techniques for learning a composite machine learned model based on a learning dataset, reducing or eliminating the explorative trial and error process of manually tuning architectural parameters and hyperparameters, and reducing the computing resource requirements and model learning time for learning composite machine learned models. The techniques improve the operation of distributed learning computing systems by reducing or eliminating straggler effects and by reducing or minimizing synchronization latency when executing a composite model search algorithm for learning a composite machine learned model.
    Type: Grant
    Filed: October 30, 2019
    Date of Patent: October 25, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yuwei Qiu, Chengming Jiang, Huiji Gao, Bee-Chung Chen, Bo Long
  • Patent number: 11475085
    Abstract: A machine learning-based method for generating personalized query suggestions is described. Different users may have different search intent even when they are inputting the same search query. The technical problem of personalizing search query suggestions produced by a machine learning model is addressed by extending the sequence-to-sequence machine learning model framework to take into consideration additional, personalized features of the user, such as profile industry, language, and geographic location. This methodology includes an offline model training framework as well as an online serving framework.
    Type: Grant
    Filed: February 26, 2020
    Date of Patent: October 18, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jianling Zhong, Weiwei Guo, Lin Guo, Huiji Gao, Bo Long
  • Publication number: 20220318499
    Abstract: Computer-implemented machine learning-based techniques for assisted electronic message composition in a vertical messaging context. The vertical messaging context may be any electronic messaging context in which senders repetitively compose electronic messages to send to recipients where the messages are not identical but nonetheless have common tone, sentiment, content, and structure. The techniques assist users that compose electronic messages in a particular vertical messaging context in composing those messages quickly, with few or no grammatical errors, and with a likelihood of being positively received by the recipients of the messages.
    Type: Application
    Filed: March 31, 2021
    Publication date: October 6, 2022
    Inventors: Qiang Xiao, Haichao Wei, Praveen Kumar Bodigutla, Huiji Gao, Arya G. Choudhury
  • Publication number: 20220284191
    Abstract: Embodiments of the described technologies are capable of reading a text sequence that includes at least one word; extracting model input data from the text sequence, where the model input data includes, for each word of the text sequence, segment data and non-segment data; using a first machine learning model and at least one second machine learning model, generating, for each word of the text sequence, a multi-level feature set; outputting, by a third machine learning model, in response to input to the third machine learning model of the multi-level feature set, a tagged version of the text sequence; executing a search based at least in part on the tagged version of the text sequence.
    Type: Application
    Filed: March 5, 2021
    Publication date: September 8, 2022
    Inventors: Yuwei Qiu, Gonzalo Aniano Porcile, Yu Gan, Qin Iris Wang, Haichao Wei, Huiji Gao
  • Publication number: 20220180241
    Abstract: Embodiments of the disclosed technologies provide tree-based transfer learning of hyperparameters of a machine learning model or tunable parameters of a black box system. A similar reference task tree is selected from a set of reference task trees. Data is transferred from the similar reference task tree to a target task tree.
    Type: Application
    Filed: December 4, 2020
    Publication date: June 9, 2022
    Inventors: Qingquan Song, Chengming Jiang, Yunbo Ouyang, Jun Jia, Huiji Gao, Bo Long, Bee-Chung Chen, Xia Hu
  • Publication number: 20220172040
    Abstract: Techniques for training a machine-learned model based on feedback are provided. In one technique, reformulation data that comprises a plurality of sequence pairs is stored. Also, feedback data that comprises a plurality of sequence triples is stored. Based on the reformulation data and the feedback data, one or more machine learning techniques are used to train a sequence-to-sequence model. Training the sequence-to-sequence model involves using a loss function that comprises (1) a first term that takes, as input, sequence pairs from the reformulation data and (2) a second term that takes, as input, sequence triples from the feedback data. After training the sequence-to-sequence model, a search query is received from a computing device. In response to receiving the search query, a set of embeddings is retrieved, each corresponding to a token in the search query. The set of embeddings is input into the sequence-to-sequence model, which generates one or more query suggestions.
    Type: Application
    Filed: November 30, 2020
    Publication date: June 2, 2022
    Inventors: Michaeel M. Kazi, Weiwei Guo, Huiji Gao, Bo Long
  • Publication number: 20220172039
    Abstract: Techniques for using machine learning to predict document types for incomplete queries are provided. In one technique, one or more characters from input are identified. For each character, an embedding that corresponds to that character is retrieved. The embedding was machine-learned while training a neural network that outputs multiple classifications, each corresponding to a different document type. One or more embeddings, each corresponding to one of the characters, are input into the neural network. Based on the inputting, the neural network generates an output that comprises multiple values that include (1) a first value that reflects a first probability that the input is associated with a first document type and (2) a second value that reflects a second probability that the input is associated with a second document type. Based on the first and second probabilities, a set of query completions is identified and presented on the computing device.
    Type: Application
    Filed: November 30, 2020
    Publication date: June 2, 2022
    Inventors: Xiaowei Liu, Weiwei Guo, Huiji Gao, Bo Long
  • Patent number: 11232154
    Abstract: A neural related query generation approach in a search system uses a neural encoder that reads through a source query to build a query intent vector. The approach then processes the query intent vector through a neural decoder to emit a related query. By doing so, the approach gathers information from the entire source query before generating the related query. As a result, the neural encoder-decoder approach captures long-range dependencies in the source query such as, for example, structural ordering of query keywords. The approach can be used to generate related queries for long-tail source queries, including long-tail source queries never before or not recently submitted to the search system.
    Type: Grant
    Filed: March 28, 2019
    Date of Patent: January 25, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Weiwei Guo, Lin Guo, Jianling Zhong, Huiji Gao, Bo Long
  • Patent number: 11188937
    Abstract: Techniques for extracting features of entities and targets that can be applied in a set of applications, such as entity selection prediction, audience expansion, feed relevance, and job recommendation. In one technique, entity interaction data is stored that indicates, for each of multiple entities, one or more targets that are associated with items with which the entity interacted. Token association data is stored that indicates, for each of multiple tokens, one or more targets that are associated with the token. Then, using one or more machine learning techniques, entity embeddings and target embeddings are generated based on the entity interaction data and the token association data. Later, a request for content is received from a particular entity. Based on at least one entity embedding, a content item for the particular entity is identified. The content item is transferred over a computer network and presented to the particular entity.
    Type: Grant
    Filed: May 31, 2018
    Date of Patent: November 30, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Huiji Gao, Jianling Zhong, Haishan Liu
  • Patent number: 11182432
    Abstract: The disclosed embodiments provide a system for performing a natural language search. During operation, the system applies a first machine learning model to a natural language query to predict one or more search intentions associated with the natural language query. Next, the system applies a second machine learning model to the natural language query to produce one or more search parameters associated with a first intention in the search intention(s), wherein the search parameter(s) include a field and a value of the field. The system then performs a first search of a first vertical associated with the first intention using the search parameter(s). Finally, the system generates a ranking containing a first set of search results from the first search of the first vertical and outputs at least a portion of the ranking in a response to the natural language query.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: November 23, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jun Shi, Huiji Gao, Ying Xiong, Michaeel M. Kazi, Yu Gan, Yu Liu, Xiaowei Liu, Gonzalo Jorge Aniano Porcile, Bo Long, Abhimanyu Lad, Liang Zhang
  • Publication number: 20210334467
    Abstract: Embodiments of the disclosed technologies use machine learning to produce thread level classification data and case level classification data.
    Type: Application
    Filed: April 27, 2020
    Publication date: October 28, 2021
    Inventors: Zhiling Wan, Chih-Hui Wang, Haichao Wei, Lili Zhou, Huiji Gao
  • Patent number: 11106662
    Abstract: In an embodiment, the disclosed technologies include extracting, from at least one search log, session data including at least three semantically related queries and corresponding timestamp data; using the session data, creating a training sequence that includes source query data, context query data, and target query data, the source query data having both a temporal relationship and a lexical relationship to the target query data and the context query data having a temporal relationship to the source query data; creating a learned model by, using a machine learning-based modeling process, learning a mapping of a semantic representation of the context query data and the source query data to a semantic representation of the target query data; in response to a new query, using the learned model to generate at least one recommended query that is semantically related to the new query.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: August 31, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zhong Yi Wan, Weiwei Guo, Michaeel M. Kazi, Huiji Gao, Bo Long
  • Publication number: 20210263982
    Abstract: A machine learning-based method for generating personalized query suggestions is described. Different users may have different search intent even when they are inputting the same search query. The technical problem of personalizing search query suggestions produced by a machine learning model is addressed by extending the sequence-to-sequence machine learning model framework to take into consideration additional, personalized features of the user, such as profile industry, language, and geographic location. This methodology includes an offline model training framework as well as an online serving framework.
    Type: Application
    Filed: February 26, 2020
    Publication date: August 26, 2021
    Inventors: Jianling Zhong, Weiwei Guo, Lin Guo, Huiji Gao, Bo Long
  • Publication number: 20210133555
    Abstract: Computer-implemented techniques for learning composite machine learned models are disclosed. Benefits to implementors of the disclosed techniques include allowing non-machine learning experts to use the techniques for learning a composite machine learned model based on a learning dataset, reducing or eliminating the explorative trial and error process of manually tuning architectural parameters and hyperparameters, and reducing the computing resource requirements and model learning time for learning composite machine learned models. The techniques improve the operation of distributed learning computing systems by reducing or eliminating straggler effects and by reducing or minimizing synchronization latency when executing a composite model search algorithm for learning a composite machine learned model.
    Type: Application
    Filed: October 30, 2019
    Publication date: May 6, 2021
    Inventors: Yuwei Qiu, Chengming Jiang, Huiji Gao, Bee-Chung Chen, Bo Long
  • Publication number: 20210097374
    Abstract: The disclosed embodiments provide a system for processing a search query. During operation, the system generates, based on one or more embedding layers in a machine learning model, input embeddings of the search query from a user of an online system. Next, the system applies one or more convolution layers in the machine learning model to the input embeddings to generate convolutional output from combinations of the input embeddings. The system then processes the convolutional output using one or more prediction layers in the machine learning model to produce a set of intent scores representing predicted likelihoods of a set of search intentions in the search query. Finally, the system performs a search of one or more verticals in the online system based on the search query and the set of intent scores.
    Type: Application
    Filed: September 30, 2019
    Publication date: April 1, 2021
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Xiaowei Liu, Yu Gan, Huiji Gao, Bo Long
  • Publication number: 20210097063
    Abstract: In an embodiment, the disclosed technologies include extracting, from at least one search log, session data including at least three semantically related queries and corresponding timestamp data; using the session data, creating a training sequence that includes source query data, context query data, and target query data, the source query data having both a temporal relationship and a lexical relationship to the target query data and the context query data having a temporal relationship to the source query data; creating a learned model by, using a machine learning-based modeling process, learning a mapping of a semantic representation of the context query data and the source query data to a semantic representation of the target query data; in response to a new query, using the learned model to generate at least one recommended query that is semantically related to the new query.
    Type: Application
    Filed: September 26, 2019
    Publication date: April 1, 2021
    Inventors: Zhong Yi Wan, Weiwei Guo, Michaeel M. Kazi, Huiji Gao, Bo Long
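
Illustrative sketch for publication 20230135401 (Factorization Machine with a precomputed portion). The following Python is not taken from the application. It uses the standard linear-time rewriting of the Factorization Machine pairwise term and assumes, purely for illustration, that the input features divide into a group known offline and a group known only at request time. All function and variable names are hypothetical, and the L2 norm reduction mentioned in the abstract is not modeled here.

```python
from itertools import combinations


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def fm_pairwise_naive(x, V):
    """Quadratic-time reference: sum over i < j of <V[i], V[j]> * x[i] * x[j]."""
    return sum(
        dot(V[i], V[j]) * x[i] * x[j]
        for i, j in combinations(range(len(x)), 2)
    )


def fm_partial_sums(x, V):
    """Per-factor sums for one feature group (V must be non-empty).

    S[f] = sum_i V[i][f] * x[i]
    Q[f] = sum_i (V[i][f] * x[i]) ** 2
    """
    k = len(V[0])
    S = [0.0] * k
    Q = [0.0] * k
    for xi, vi in zip(x, V):
        for f in range(k):
            t = vi[f] * xi
            S[f] += t
            Q[f] += t * t
    return S, Q


def fm_pairwise_from_sums(S, Q):
    """Linear-time identity: 0.5 * sum_f (S[f]**2 - Q[f])."""
    return 0.5 * sum(s * s - q for s, q in zip(S, Q))


def add_sums(a, b):
    """Element-wise sum of two per-factor vectors."""
    return [p + q for p, q in zip(a, b)]


# Hypothetical split: the first group is known offline, the second at request time.
x_offline = [1.0, 0.0, 2.0]
V_offline = [[0.1, 0.2], [0.3, -0.1], [0.5, 0.4]]
x_online = [1.0, 3.0]
V_online = [[-0.2, 0.1], [0.05, 0.3]]

# Offline step: compute and store the sums for the offline group once.
S_off, Q_off = fm_partial_sums(x_offline, V_offline)

# Online step: compute sums only for request-time features, then combine.
S_on, Q_on = fm_partial_sums(x_online, V_online)
fast = fm_pairwise_from_sums(add_sums(S_off, S_on), add_sums(Q_off, Q_on))

# Cross-check against the naive computation over all features together.
naive = fm_pairwise_naive(x_offline + x_online, V_offline + V_online)
```

Under this assumed split, the online work scales with the number of request-time features and the factor dimension, because the offline group contributes only its stored per-factor sums.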