Patents by Inventor Xuedong David Huang

Xuedong David Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20230368782
    Abstract: Systems and methods are provided for training a machine learning model to learn speech representations. Labeled speech data, or both labeled and unlabeled data sets, are applied to a feature extractor of a machine learning model to generate latent speech representations. The latent speech representations are applied to a quantizer to generate quantized latent speech representations and to a transformer context network to generate contextual representations. Each contextual representation included in the contextual representations is aligned with a phoneme label to generate phonetically aware contextual representations. Quantized latent representations are aligned with phoneme labels to generate phonetically aware latent speech representations.
    Type: Application
    Filed: July 3, 2023
    Publication date: November 16, 2023
    Inventors: Yao Qian, Yu Wu, Kenichi Kumatani, Shujie Liu, Furu Wei, Nanshan Zeng, Xuedong David Huang, Chengyi Wang
  • Patent number: 11735171
    Abstract: Systems and methods are provided for training a machine learning model to learn speech representations. Labeled speech data, or both labeled and unlabeled data sets, are applied to a feature extractor of a machine learning model to generate latent speech representations. The latent speech representations are applied to a quantizer to generate quantized latent speech representations and to a transformer context network to generate contextual representations. Each contextual representation included in the contextual representations is aligned with a phoneme label to generate phonetically aware contextual representations. Quantized latent representations are aligned with phoneme labels to generate phonetically aware latent speech representations.
    Type: Grant
    Filed: May 14, 2021
    Date of Patent: August 22, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yao Qian, Yu Wu, Kenichi Kumatani, Shujie Liu, Furu Wei, Nanshan Zeng, Xuedong David Huang, Chengyi Wang
  • Publication number: 20230229960
    Abstract: Some disclosed systems are configured to obtain a knowledge module configured to receive one or more knowledge inputs corresponding to one or more different modalities and generate a set of knowledge embeddings to be integrated with a set of multi-modal embeddings generated by a multi-modal main model. The systems receive a knowledge input at the knowledge module, identify a knowledge type associated with the knowledge input, and extract a knowledge unit from the knowledge input. The systems select a representation model that corresponds to the knowledge type and select a grounding type configured to ground the at least one knowledge unit into the representation model. The systems then ground the knowledge unit into the representation model according to the grounding type.
    Type: Application
    Filed: January 19, 2022
    Publication date: July 20, 2023
    Inventors: Chenguang Zhu, Lu Yuan, Yao Qian, Yu Shi, Nanshan Zeng, Xuedong David Huang
  • Publication number: 20220366898
    Abstract: Systems and methods are provided for training a machine learning model to learn speech representations. Labeled speech data, or both labeled and unlabeled data sets, are applied to a feature extractor of a machine learning model to generate latent speech representations. The latent speech representations are applied to a quantizer to generate quantized latent speech representations and to a transformer context network to generate contextual representations. Each contextual representation included in the contextual representations is aligned with a phoneme label to generate phonetically aware contextual representations. Quantized latent representations are aligned with phoneme labels to generate phonetically aware latent speech representations.
    Type: Application
    Filed: May 14, 2021
    Publication date: November 17, 2022
    Inventors: Yao Qian, Yu Wu, Kenichi Kumatani, Shujie Liu, Furu Wei, Nanshan Zeng, Xuedong David Huang, Chengyi Wang
  • Patent number: 10984337
    Abstract: Searching is assisted by recognizing a selection of text from a document as an indication that a user wishes to initiate a search based on the selected text. The user is provided with query suggestions based on the selected text and the query suggestions are ranked based on a context provided by the document. The user may select the text by using a mouse, drawing a circle around the text on a touch screen, or by other input techniques. The query suggestions may be based on query reformulation or query expansion techniques applied to the selected text. Context provided by the document is used by a language model and/or an artificial intelligence system to rank the query suggestions in predicted order of relevance based on the selected text and the context.
    Type: Grant
    Filed: February 29, 2012
    Date of Patent: April 20, 2021
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Peng Bai, Zheng Chen, Xuedong David Huang, Xiaochuan Ni, Jian-Tao Sun, Zhimin Zhang
  • Patent number: 10444979
    Abstract: Computer-readable media, computer systems, and computing devices for initiating a search function, such as presentation of a search box or initiation of a search, are provided. In one embodiment, a method includes detecting movement of a selector from within a display area to an edge of the display area. Such a selector can be controlled by an input device coupled to a user device. In response to detecting movement of the selector from within the display area to the edge of the display area, a search-query input area associated with a search engine is presented within a display screen view.
    Type: Grant
    Filed: September 11, 2012
    Date of Patent: October 15, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xuedong David Huang, Samuel Y. Shen, Hongjiang Zhang, Yong Rui
  • Patent number: 10409851
    Abstract: A search of displayed content may be automatically performed in response to receipt of a search gesture that defines a scope of the search and initiates the search. The search gesture may define a region of content within the displayed content. A search query may be formulated based on the region of content defined by the search gesture. In response to completion of the search gesture, a search may be automatically initiated. In some examples, the search gesture comprises a generally circular gesture that substantially bounds the region of content.
    Type: Grant
    Filed: January 31, 2011
    Date of Patent: September 10, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xuedong David Huang, Qing (Alex) Lu, Zhaowei (Charlie) Jiang, Vikas Rajvanshy
  • Publication number: 20160267193
    Abstract: A communication-powered searching system provides real-time personalized search assistance to a user by integrating search functionality with real-time communication. Upon submitting a query and receiving search results from the communication-powered searching system, the user may select a communication link included in the search results to activate communication with an entity associated with the communication link. The communication-powered searching system may then refine search results displayed to the user based on information exchanged between the user and the entity. The refinements may be made in real time or substantially in real time.
    Type: Application
    Filed: May 20, 2016
    Publication date: September 15, 2016
    Inventors: Xuedong David Huang, Zheng Chen, Zhimin Zhang
  • Patent number: 9390140
    Abstract: A communication-powered searching system provides real-time personalized search assistance to a user by integrating search functionality with real-time communication. Upon submitting a query and receiving search results from the communication-powered searching system, the user may select a communication link included in the search results to activate communication with an entity associated with the communication link. The communication-powered searching system may then refine search results displayed to the user based on information exchanged between the user and the entity. The refinements may be made in real time or substantially in real time.
    Type: Grant
    Filed: February 22, 2013
    Date of Patent: July 12, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Xuedong David Huang, Zheng Chen, Zhimin Zhang
  • Patent number: 9043358
    Abstract: A unified search service may collect information related to an enterprise from at least one of publicly available data and private enterprise data. In some implementations, crowd sourcing may be used to determine a source list of one or more sources of information. Authored content can be generated, such as by combining one or more items of information from the public data with one or more items of information from the private enterprise data. Further, in some implementations, a public index may be generated from the public data, and one or more affiliation indexes may be generated from the private enterprise data. For example, a first affiliation index may contain confidential enterprise information, while a second affiliation index may contain non-confidential enterprise information. A user's affiliation to the enterprise may be taken into consideration when determining which indexes to use when responding to a search request from the user.
    Type: Grant
    Filed: March 9, 2011
    Date of Patent: May 26, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Lili Cheng, Xuedong David Huang, Heung-Yeung Shum, Eric J. Horvitz, James H. Lewallen, Todd D. Newman, David S. Taniguchi
  • Patent number: 8995626
    Abstract: Described is a technology by which a storage at a telephone device (e.g., a client telephone) is synchronized with information corresponding to actions performed at a computing device (e.g., a server) on behalf of the client. For example, the server may employ speech recognition to recognize a name or number spoken into the client telephone, and in response, dial out a corresponding telephone number for the client telephone. This action is synchronized back to the client storage so that the client's call history includes knowledge of the server's dialing action. Thereafter, an action at the telephone device that accesses the call history (e.g., for redialing or scrolling) obtains the full call history, independent of whether the telephone device or computing device performed the action. Changes made via telephone device may be similarly synchronized to the computing device, such as directly dialed calls, user-input speed dial information, and so forth.
    Type: Grant
    Filed: January 22, 2007
    Date of Patent: March 31, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventor: Xuedong David Huang
  • Publication number: 20140244629
    Abstract: A communication-powered searching system provides real-time personalized search assistance to a user by integrating search functionality with real-time communication. Upon submitting a query and receiving search results from the communication-powered searching system, the user may select a communication link included in the search results to activate communication with an entity associated with the communication link. The communication-powered searching system may then refine search results displayed to the user based on information exchanged between the user and the entity. The refinements may be made in real time or substantially in real time.
    Type: Application
    Filed: February 22, 2013
    Publication date: August 28, 2014
    Applicant: Microsoft Corporation
    Inventors: Xuedong David Huang, Zheng Chen, Zhimin Zhang
  • Publication number: 20130226935
    Abstract: Searching is assisted by recognizing a selection of text from a document as an indication that a user wishes to initiate a search based on the selected text. The user is provided with query suggestions based on the selected text and the query suggestions are ranked based on a context provided by the document. The user may select the text by using a mouse, drawing a circle around the text on a touch screen, or by other input techniques. The query suggestions may be based on query reformulation or query expansion techniques applied to the selected text. Context provided by the document is used by a language model and/or an artificial intelligence system to rank the query suggestions in predicted order of relevance based on the selected text and the context.
    Type: Application
    Filed: February 29, 2012
    Publication date: August 29, 2013
    Applicant: Microsoft Corporation
    Inventors: Peng Bai, Zheng Chen, Xuedong David Huang, Xiaochuan Ni, Jian-Tao Sun, Zhimin Zhang
  • Publication number: 20130006957
    Abstract: Computer-readable media, computer systems, and computing devices for initiating a search function, such as presentation of a search box or initiation of a search, are provided. In one embodiment, a method includes detecting movement of a selector from within a display area to an edge of the display area. Such a selector can be controlled by an input device coupled to a user device. In response to detecting movement of the selector from within the display area to the edge of the display area, a search-query input area associated with a search engine is presented within a display screen view.
    Type: Application
    Filed: September 11, 2012
    Publication date: January 3, 2013
    Applicant: Microsoft Corporation
    Inventors: Xuedong David Huang, Samuel Y. Shen, Hongjiang Zhang, Yong Rui
  • Publication number: 20120233209
    Abstract: A unified search service may collect information related to an enterprise from at least one of publicly available data and private enterprise data. In some implementations, crowd sourcing may be used to determine a source list of one or more sources of information. Authored content can be generated, such as by combining one or more items of information from the public data with one or more items of information from the private enterprise data. Further, in some implementations, a public index may be generated from the public data, and one or more affiliation indexes may be generated from the private enterprise data. For example, a first affiliation index may contain confidential enterprise information, while a second affiliation index may contain non-confidential enterprise information. A user's affiliation to the enterprise may be taken into consideration when determining which indexes to use when responding to a search request from the user.
    Type: Application
    Filed: March 9, 2011
    Publication date: September 13, 2012
    Applicant: Microsoft Corporation
    Inventors: Lili Cheng, Xuedong David Huang, Heung-Yeung Shum, Eric J. Horvitz, James H. Lewallen, Todd D. Newman, David S. Taniguchi
  • Publication number: 20120197857
    Abstract: A search of displayed content may be automatically performed in response to receipt of a search gesture that defines a scope of the search and initiates the search. The search gesture may define a region of content within the displayed content. A search query may be formulated based on the region of content defined by the search gesture. In response to completion of the search gesture, a search may be automatically initiated. In some examples, the search gesture comprises a generally circular gesture that substantially bounds the region of content.
    Type: Application
    Filed: January 31, 2011
    Publication date: August 2, 2012
    Applicant: Microsoft Corporation
    Inventors: Xuedong David Huang, Qing (Alex) Lu, Zhaowei (Charlie) Jiang, Vikas Rajvanshy
  • Patent number: 7840638
    Abstract: A multimedia conference technique is disclosed that allows physically remote users to participate in an immersive telecollaborative environment by synchronizing multiple data, images and sounds. The multimedia conference implementation provides users with the perception of being in the same room visually as well as acoustically according to an orientation plan which reflects each remote user's position within the multimedia conference environment.
    Type: Grant
    Filed: June 27, 2008
    Date of Patent: November 23, 2010
    Assignee: Microsoft Corporation
    Inventors: Zhengyou Zhang, Xuedong David Huang, Zicheng Liu, Cha Zhang, Philip A. Chou, Christian Huitema
  • Publication number: 20100228825
    Abstract: The claimed subject matter provides a system and/or a method that facilitates enhancing the employment of a telepresence session. An automatic telepresence engine can evaluate data associated with at least one of an attendee, a schedule for an attendee, or a portion of an electronic communication for an attendee. The automatic telepresence engine can identify at least one of the following for a telepresence session based upon the evaluated data: a participant to include for the telepresence session, a portion of data related to a presentation within the telepresence session, a portion of data related to a meeting topic within the telepresence session, or a device utilized by an attendee to communicate within the telepresence session. The automatic telepresence engine can initiate the telepresence session within a communication framework that includes two or more virtually represented users that communicate therein.
    Type: Application
    Filed: March 6, 2009
    Publication date: September 9, 2010
    Applicant: Microsoft Corporation
    Inventors: Rajesh Kutpadi Hegde, Xuedong David Huang, Sharon Kay Cunnington, Jin Li, Michel Pahud, Ryan M. Burkhardt, Kori Marie Quinn, Jayman Dalal, Zhengyou Zhang
  • Patent number: 7646755
    Abstract: Portable computing devices automatically interface with other computing devices to interact in a collaborative effort toward providing a single, seamless computing experience for a user. As a user walks into a room with a cellular telephone, certain functionality and data can be automatically unloaded to a desktop computer or other device based on a user or device identification or state. For example, a conversation on a cellular telephone can be automatically migrated to a desktop telephone as a user sits down. As a user is about to leave a room for a meeting, the desktop computer can update the telephone with the latest versions of certain files. Thus, devices can automatically aggregate and/or decouple to provide a user with a single computing experience. These portable devices can broadcast an extensible set of services to other devices as well as to a host computer or server.
    Type: Grant
    Filed: June 30, 2005
    Date of Patent: January 12, 2010
    Assignee: Microsoft Corporation
    Inventors: David Joshua Kurlander, Xuedong David Huang, Yuan Kong, Silviu-Petru Cucerzan
  • Publication number: 20090327418
    Abstract: A multimedia conference technique is disclosed that allows physically remote users to participate in an immersive telecollaborative environment by synchronizing multiple data, images and sounds. The multimedia conference implementation provides users with the perception of being in the same room visually as well as acoustically according to an orientation plan which reflects each remote user's position within the multimedia conference environment.
    Type: Application
    Filed: June 27, 2008
    Publication date: December 31, 2009
    Applicant: Microsoft Corporation
    Inventors: Zhengyou Zhang, Xuedong David Huang, Zicheng Liu, Cha Zhang, Philip A. Chou, Christian Huitema