Patents by Inventor Xuedong Huang

Xuedong Huang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20200351603
    Abstract: A computer implemented method includes receiving multiple channels of audio from three or more microphones detecting speech from a meeting of multiple users, localizing speech sources to determine an approximate direction of arrival of speech from a user, using a speech unmixing model to select two channels corresponding to a primary and a secondary microphone, and sending the two selected channels to a meeting server for generation of a speaker attributed meeting transcript.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: William Isaac Hinthorn, Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, Xuedong Huang
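
A rough sketch of the channel-selection idea in the entry above is shown below, assuming a simple per-channel energy heuristic in place of the trained speech unmixing model the patent describes; the function names, frame size, and channel layout are illustrative assumptions, not the patented method.

```python
# Hypothetical sketch: pick a primary and a secondary channel from a
# multi-microphone capture. The actual method uses a trained speech unmixing
# model; this stand-in simply ranks channels by mean short-term speech energy.
import numpy as np

def select_primary_secondary(channels: np.ndarray, frame: int = 1024):
    """channels: (num_mics, num_samples) array of time-domain audio."""
    num_mics, num_samples = channels.shape
    usable = (num_samples // frame) * frame
    frames = channels[:, :usable].reshape(num_mics, -1, frame)
    # Per-channel speech-activity proxy: mean frame RMS energy.
    energy = np.sqrt((frames ** 2).mean(axis=2)).mean(axis=1)
    order = np.argsort(energy)[::-1]
    primary, secondary = int(order[0]), int(order[1])
    # The two selected channels are what would be sent to the meeting server.
    return primary, secondary, channels[[primary, secondary]]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mics = rng.normal(scale=[[0.1], [0.5], [0.2], [0.05]], size=(4, 16000))
    p, s, payload = select_primary_secondary(mics)
    print(f"primary=mic{p}, secondary=mic{s}, payload shape={payload.shape}")
```
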
  • Publication number: 20200349954
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio streams transmitted from corresponding multiple distributed devices, performing, via a neural network model, continuous speech separation for one or more of the received audio signals having overlapped speech, and providing the separated speech on a fixed number of separate output audio channels.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
  • Publication number: 20200349949
    Abstract: A computer implemented method includes receiving audio streams at a meeting server from two distributed devices that are streaming audio captured during an ad-hoc meeting between at least two users, comparing the received audio streams to determine that the received audio streams are representative of sound from the ad-hoc meeting, generating a meeting instance to process the audio streams in response to the comparing determining that the audio streams are representative of sound from the ad-hoc meeting, and processing the received audio streams to generate a transcript of the ad-hoc meeting.
    Type: Application
    Filed: April 30, 2019
    Publication date: November 5, 2020
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
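
As a loose illustration of the stream-comparison step in the entry above, the sketch below correlates coarse energy envelopes from two device streams to decide whether they captured the same ad-hoc meeting; the envelope features and the 0.7 threshold are assumptions for illustration, not the patented comparison.

```python
# Hypothetical sketch: two streams that recorded the same room should have
# similar loudness envelopes; unrelated streams should not.
import numpy as np

def energy_envelope(audio: np.ndarray, hop: int = 800) -> np.ndarray:
    usable = (len(audio) // hop) * hop
    frames = audio[:usable].reshape(-1, hop)
    return np.sqrt((frames ** 2).mean(axis=1))

def same_meeting(stream_a: np.ndarray, stream_b: np.ndarray,
                 threshold: float = 0.7) -> bool:
    env_a, env_b = energy_envelope(stream_a), energy_envelope(stream_b)
    n = min(len(env_a), len(env_b))
    env_a = env_a[:n] - env_a[:n].mean()
    env_b = env_b[:n] - env_b[:n].mean()
    denom = float(np.linalg.norm(env_a) * np.linalg.norm(env_b))
    correlation = float(env_a @ env_b) / denom if denom else 0.0
    return correlation >= threshold

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    t = np.arange(80000)
    room = np.sin(2 * np.pi * t / 8000) * rng.normal(size=80000)   # shared scene
    a = room + 0.2 * rng.normal(size=80000)                        # device 1
    b = 0.8 * room + 0.2 * rng.normal(size=80000)                  # device 2
    other = rng.normal(size=80000)                                 # unrelated room
    print("same meeting (a, b):    ", same_meeting(a, b))
    print("same meeting (a, other):", same_meeting(a, other))
```
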
  • Patent number: 10817678
    Abstract: Systems and methods may be used to provide transcription and translation services. A method may include initializing a plurality of user devices with respective language output selections in a translation group by receiving a shared identifier from the plurality of user devices and transcribing the audio stream to transcribed text. The method may include translating the transcribed text to one or more of the respective language output selections when an original language of the transcribed text differs from the one or more of the respective language output selections. The method may include sending, to a user device in the translation group, the transcribed text including translated text in a language corresponding to the respective language output selection for the user device. In an example, the method may include customizing the transcription or the translation, such as to a particular topic, location, user, or the like.
    Type: Grant
    Filed: August 5, 2019
    Date of Patent: October 27, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: William D. Lewis, Ivo José Garcia Dos Santos, Tanvi Surti, Arul A. Menezes, Olivier Nano, Christian Wendt, Xuedong Huang
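
The group-translation flow described in the entry above can be sketched roughly as below: devices join a translation group under a shared identifier with a preferred output language, and transcribed text is translated only when its language differs from a device's selection. The TranslationGroup class and the translate stub are hypothetical stand-ins for whatever services implement the method.

```python
# Hypothetical sketch of the translation-group routing step.
from dataclasses import dataclass, field

def translate(text: str, source: str, target: str) -> str:
    # Placeholder for a real machine-translation call.
    return f"[{source}->{target}] {text}"

@dataclass
class TranslationGroup:
    shared_id: str
    device_languages: dict[str, str] = field(default_factory=dict)

    def join(self, device_id: str, language: str) -> None:
        self.device_languages[device_id] = language

    def deliver(self, transcribed_text: str, original_language: str) -> dict[str, str]:
        out = {}
        for device_id, language in self.device_languages.items():
            if language == original_language:
                out[device_id] = transcribed_text    # no translation needed
            else:
                out[device_id] = translate(transcribed_text, original_language, language)
        return out

if __name__ == "__main__":
    group = TranslationGroup(shared_id="meeting-1234")
    group.join("phone-a", "en")
    group.join("phone-b", "de")
    print(group.deliver("Good morning, everyone.", original_language="en"))
```
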
  • Patent number: 10812921
    Abstract: A computer implemented method includes receiving multiple channels of audio from three or more microphones detecting speech from a meeting of multiple users, localizing speech sources to determine an approximate direction of arrival of speech from a user, using a speech unmixing model to select two channels corresponding to a primary and a secondary microphone, and sending the two selected channels to a meeting server for generation of a speaker attributed meeting transcript.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: October 20, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: William Isaac Hinthorn, Lijuan Qin, Nanshan Zeng, Dimitrios Basile Dimitriadis, Zhuo Chen, Andreas Stolcke, Takuya Yoshioka, Xuedong Huang
  • Publication number: 20200327148
    Abstract: Systems and methods for enhanced content capture on a computing device are presented. In operation, a user interaction is detected on a computing device with the intent to capture content to a content store associated with the computer user operating the computing device. A content capture service is executed to capture content to the content store, comprising the following: applications executing on the computing device are notified to suspend output to display views corresponding to the applications; content to be captured to the content store is identified and obtained; the applications executing on the computing device are notified to resume output to display views; and the obtained content is automatically stored in the content store associated with the computer user.
    Type: Application
    Filed: June 26, 2020
    Publication date: October 15, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Madhur Dixit, Chinmay Vaishampayan, Justin Varacheril George, Nirav Ashwin Kamdar, Deepak Achuthan Menon, Srinivasa V. Thirumalai-Anandanpillai, Ramindar Singh Khatra, Xuedong Huang, Akshad Viswanathan
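
A minimal sketch of the suspend/capture/resume sequence from the entry above, assuming toy Application and ContentStore stand-ins for the platform services the abstract leaves unspecified:

```python
# Hypothetical sketch of the capture flow: notify apps to pause display output,
# obtain the content, resume the apps, then store the capture for the user.
class Application:
    """Toy stand-in for an application whose display output can be paused."""
    def __init__(self, name: str):
        self.name = name
        self.suspended = False

    def suspend_display(self) -> None:
        self.suspended = True

    def resume_display(self) -> None:
        self.suspended = False

class ContentStore:
    """Toy stand-in for the user's content store."""
    def __init__(self):
        self.items = []

    def save(self, content) -> None:
        self.items.append(content)

def capture_content(apps, store: ContentStore, identify_content) -> None:
    for app in apps:                  # 1. notify apps to suspend display views
        app.suspend_display()
    try:
        content = identify_content()  # 2. identify and obtain the content
    finally:
        for app in apps:              # 3. notify apps to resume display views
            app.resume_display()
    store.save(content)               # 4. automatically store for the user

if __name__ == "__main__":
    apps = [Application("editor"), Application("browser")]
    store = ContentStore()
    capture_content(apps, store, identify_content=lambda: b"captured-screen-bytes")
    print(store.items, [a.suspended for a in apps])
```
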
  • Patent number: 10743107
    Abstract: A computer implemented method includes receiving audio signals representative of speech via multiple audio channels transmitted from corresponding multiple distributed devices, designating one of the audio channels as a reference channel, and, for each of the remaining audio channels, determining a difference in time from the reference channel and correcting each remaining audio channel by compensating for the corresponding difference in time from the reference channel.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: August 11, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Takuya Yoshioka, Andreas Stolcke, Zhuo Chen, Dimitrios Basile Dimitriadis, Nanshan Zeng, Lijuan Qin, William Isaac Hinthorn, Xuedong Huang
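
One plausible reading of the time-compensation step in the entry above is a cross-correlation delay estimate against the reference channel, sketched below; the correlation approach, sample counts, and circular shift are illustrative assumptions rather than the patented method.

```python
# Hypothetical sketch: estimate each distributed channel's delay relative to a
# reference channel and shift it to compensate.
import numpy as np

def estimate_delay(reference: np.ndarray, channel: np.ndarray) -> int:
    """Return the lag (in samples) by which `channel` trails `reference`."""
    corr = np.correlate(channel, reference, mode="full")
    return int(np.argmax(corr) - (len(reference) - 1))

def align_to_reference(channels: list[np.ndarray], ref_index: int = 0) -> list[np.ndarray]:
    reference = channels[ref_index]
    aligned = []
    for i, channel in enumerate(channels):
        delay = 0 if i == ref_index else estimate_delay(reference, channel)
        aligned.append(np.roll(channel, -delay))  # compensate the estimated offset
    return aligned

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    source = rng.normal(size=8000)
    late = np.roll(source, 120)                   # device audio arriving late
    aligned = align_to_reference([source, late])
    print("estimated delay:", estimate_delay(source, late), "samples")
```
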
  • Patent number: 10606989
    Abstract: Methods, apparatuses, computer program products, devices and systems are described that carry out accessing at least one persona that includes a unique identifier that is at least partly based on a first user's device-identifier data and the first user's network-participation data; verifying the persona by comparing the first user's device-identifier data and the first user's network-participation data of the unique identifier to a second user's device-identifier data and the second user's network-participation data; and presenting the persona in response to a request for personal information.
    Type: Grant
    Filed: December 30, 2011
    Date of Patent: March 31, 2020
    Assignee: Elwha LLC
    Inventors: Marc E. Davis, Matthew G. Dyor, William Gates, Xuedong Huang, Roderick A. Hyde, Edward K. Y. Jung, Jordin T. Kare, Royce A. Levien, Richard T. Lord, Robert W. Lord, Qi Lu, Mark A. Malamud, Nathan P. Myhrvold, Satya Nadella, Daniel Reed, Harry Shum, Clarence T. Tegreene, Lowell L. Wood, Jr.
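
A toy sketch of the verification step in the entry above, assuming the unique identifier is a hash over the device-identifier and network-participation fields; the hashing scheme and field names are illustrative, not taken from the patent.

```python
# Hypothetical sketch: a persona's unique identifier is derived from the first
# user's device-identifier and network-participation data; verification compares
# that identifier against identifiers built from a second user's data.
import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Persona:
    device_id: str            # device-identifier data
    network_handle: str       # network-participation data

    @property
    def unique_identifier(self) -> str:
        material = f"{self.device_id}|{self.network_handle}".encode()
        return hashlib.sha256(material).hexdigest()

def verify_persona(persona: Persona, claimed_device_id: str,
                   claimed_network_handle: str) -> bool:
    claimed = Persona(claimed_device_id, claimed_network_handle)
    return persona.unique_identifier == claimed.unique_identifier

if __name__ == "__main__":
    persona = Persona(device_id="IMEI-0001", network_handle="@alice")
    print(verify_persona(persona, "IMEI-0001", "@alice"))   # True
    print(verify_persona(persona, "IMEI-9999", "@alice"))   # False
```
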
  • Publication number: 20200082824
    Abstract: Systems, methods, and computer-readable storage devices are disclosed for generating smart notes for a meeting based on participant actions and machine learning. One method including: receiving meeting data from a plurality of participant devices participating in an online meeting; continuously generating text data based on the received audio data from each participant device of the plurality of participant devices; iteratively performing the following steps until receipt of meeting data for the meeting has ended, the steps including: receiving an indication that a predefined action has occurred on a first participant device; generating a participant segment of the meeting data for at least the first participant device from a first predetermined time before when the predefined action occurred to when the predefined action occurred; determining whether receipt of meeting data for the meeting has ended; and generating a summary of the meeting.
    Type: Application
    Filed: November 13, 2019
    Publication date: March 12, 2020
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Heiko RAHMEL, Li-Juan QIN, Xuedong HUANG, Wei XIONG
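
The participant-segment step in the entry above can be illustrated with a small sketch that keeps the window of meeting data from a fixed lead time before a predefined action up to the action itself; the 30-second window and record format are assumptions for illustration.

```python
# Hypothetical sketch: extract the participant segment that precedes a
# predefined action (e.g. a "take note" tap) during an online meeting.
from dataclasses import dataclass

@dataclass
class MeetingEvent:
    timestamp: float   # seconds from meeting start
    text: str          # continuously generated transcript text

def participant_segment(events: list[MeetingEvent], action_time: float,
                        lead_seconds: float = 30.0) -> list[MeetingEvent]:
    start = action_time - lead_seconds
    return [e for e in events if start <= e.timestamp <= action_time]

if __name__ == "__main__":
    transcript = [MeetingEvent(t, f"utterance at {t:.0f}s") for t in range(0, 300, 10)]
    segment = participant_segment(transcript, action_time=125.0)
    print([e.text for e in segment])
```
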
  • Publication number: 20200034437
    Abstract: Systems and methods may be used to provide transcription and translation services. A method may include initializing a plurality of user devices with respective language output selections in a translation group by receiving a shared identifier from the plurality of user devices and transcribing the audio stream to transcribed text. The method may include translating the transcribed text to one or more of the respective language output selections when an original language of the transcribed text differs from the one or more of the respective language output selections. The method may include sending, to a user device in the translation group, the transcribed text including translated text in a language corresponding to the respective language output selection for the user device. In an example, the method may include customizing the transcription or the translation, such as to a particular topic, location, user, or the like.
    Type: Application
    Filed: August 5, 2019
    Publication date: January 30, 2020
    Inventors: William D. Lewis, Ivo José Garcia Dos Santos, Tanvi Surti, Arul A. Menezes, Olivier Nano, Christian Wendt, Xuedong Huang
  • Patent number: 10546306
    Abstract: Methods, apparatuses, computer program products, devices and systems are described that carry out accepting at least one persona from a party to a transaction; evaluating the transaction; and negotiating receipt of at least one different persona from the party to the transaction at least partly based on an evaluation of the transaction.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: January 28, 2020
    Assignee: Elwha LLC
    Inventors: Marc E. Davis, Matthew G. Dyor, William Gates, Xuedong Huang, Roderick A. Hyde, Edward K. Y. Jung, Jordin T. Kare, Royce A. Levien, Richard T. Lord, Robert W. Lord, Qi Lu, Mark A. Malamud, Nathan P. Myhrvold, Satya Nadella, Daniel Reed, Harry Shum, Clarence T. Tegreene, Lowell L. Wood, Jr.
  • Patent number: 10546295
    Abstract: Methods, apparatuses, computer program products, devices and systems are described that carry out accepting at least one request for personal information from a party to a transaction; evaluating the transaction; and negotiating presentation of at least one persona to the party to the transaction at least partly based on an evaluation of the transaction.
    Type: Grant
    Filed: December 29, 2011
    Date of Patent: January 28, 2020
    Assignee: Elwha LLC
    Inventors: Marc E. Davis, Matthew G. Dyor, William Gates, Xuedong Huang, Roderick A. Hyde, Edward K. Y. Jung, Jordin T. Kare, Royce A. Levien, Richard T. Lord, Robert W. Lord, Qi Lu, Mark A. Malamud, Nathan P. Myhrvold, Satya Nadella, Daniel Reed, Harry Shum, Clarence T. Tegreene, Lowell L. Wood, Jr.
  • Patent number: 10523618
    Abstract: Methods, apparatuses, computer program products, devices and systems are described that carry out accepting at least one email communication from at least one member of a network; disambiguating the at least one search term including associating the at least one search term with at least one of network-participation identifier data or device-identifier data; and presenting the sender profile in association with the at least one email communication.
    Type: Grant
    Filed: December 16, 2011
    Date of Patent: December 31, 2019
    Assignee: Elwha LLC
    Inventors: Marc E. Davis, Matthew G. Dyor, William Gates, Xuedong Huang, Roderick A. Hyde, Edward K. Y. Jung, Jordin T. Kare, Royce A. Levien, Richard T. Lord, Robert W. Lord, Qi Lu, Mark A. Malamud, Nathan P. Myhrvold, Satya Nadella, Daniel Reed, Harry Shum, Clarence T. Tegreene, Lowell L. Wood, Jr.
  • Patent number: 10510346
    Abstract: Systems, methods, and computer-readable storage devices are disclosed for generating smart notes for a meeting based on participant actions and machine learning. One method including: receiving meeting data from a plurality of participant devices participating in an online meeting; continuously generating text data based on the received audio data from each participant device of the plurality of participant devices; iteratively performing the following steps until receipt of meeting data for the meeting has ended, the steps including: receiving an indication that a predefined action has occurred on a first participant device; generating a participant segment of the meeting data for at least the first participant device from a first predetermined time before when the predefined action occurred to when the predefined action occurred; determining whether receipt of meeting data for the meeting has ended; and generating a summary of the meeting.
    Type: Grant
    Filed: November 9, 2017
    Date of Patent: December 17, 2019
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Heiko Rahmel, Li-Juan Qin, Xuedong Huang, Wei Xiong
  • Publication number: 20190377733
    Abstract: Systems, methods, and computer-readable storage media are provided for conducting searches utilizing search navigation patterns. Search queries are received that include search terms that are of a particular type. It is recognized that at least one prior search session has been conducted that included a search query having search terms of an equivalent or similar type and followed a particular navigation pattern. Such prior search(es) may have been conducted by the user or by a different user and/or may have a navigation pattern that was affirmatively recorded by the requesting user or that was recorded by the system without explicit contemporaneous user instruction to do so. Upon identifying the navigation pattern associated with the prior search, the system effectively conducts a search session following the navigation pattern.
    Type: Application
    Filed: June 24, 2019
    Publication date: December 12, 2019
    Inventors: Anoop GUPTA, Xuedong HUANG
  • Publication number: 20190341050
    Abstract: A method for facilitating a remote conference includes receiving a digital video and a computer-readable audio signal. A face recognition machine is operated to recognize a face of a first conference participant in the digital video, and a speech recognition machine is operated to translate the computer-readable audio signal into a first text. An attribution machine attributes the text to the first conference participant. A second computer-readable audio signal is processed similarly, to obtain a second text attributed to a second conference participant. A transcription machine automatically creates a transcript including the first text attributed to the first conference participant and the second text attributed to the second conference participant.
    Type: Application
    Filed: June 29, 2018
    Publication date: November 7, 2019
    Applicant: Microsoft Technology Licensing, LLC
    Inventors: Adi DIAMANT, Karen MASTER BEN-DOR, Eyal KRUPKA, Raz HALALY, Yoni SMOLIN, Ilya GURVICH, Aviv HURVITZ, Lijuan QIN, Wei XIONG, Shixiong ZHANG, Lingfeng WU, Xiong XIAO, Ido LEICHTER, Moshe DAVID, Xuedong HUANG, Amit Kumar AGARWAL
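
A simplified sketch of the attribution step in the entry above: the speech recognizer yields timestamped text, the face recognizer yields who was visible speaking at those times, and the transcript pairs them. The data structures and nearest-timestamp matching below are illustrative stand-ins for the face, speech, attribution, and transcription machines named in the abstract.

```python
# Hypothetical sketch: attribute recognized utterances to recognized participants
# and assemble a speaker-attributed transcript.
from dataclasses import dataclass

@dataclass
class RecognizedUtterance:
    start: float
    end: float
    text: str

def active_speaker(face_track: list[tuple[float, str]], t: float) -> str:
    """face_track: (timestamp, participant) samples from the face recognizer."""
    best = min(face_track, key=lambda sample: abs(sample[0] - t))
    return best[1]

def build_transcript(utterances: list[RecognizedUtterance],
                     face_track: list[tuple[float, str]]) -> list[str]:
    lines = []
    for u in utterances:
        speaker = active_speaker(face_track, (u.start + u.end) / 2)
        lines.append(f"{speaker}: {u.text}")
    return lines

if __name__ == "__main__":
    utterances = [RecognizedUtterance(0.0, 2.1, "Shall we start?"),
                  RecognizedUtterance(2.5, 5.0, "Yes, the agenda is ready.")]
    face_track = [(0.5, "Participant 1"), (3.0, "Participant 2")]
    print("\n".join(build_transcript(utterances, face_track)))
```
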
  • Patent number: 10460727
    Abstract: Various systems and methods for multi-talker speech separation and recognition are disclosed herein. In one example, a system includes a memory and a processor to process mixed speech audio received from a microphone. In an example, the processor can also separate the mixed speech audio using permutation invariant training, wherein a criterion of the permutation invariant training is defined on an utterance of the mixed speech audio. In an example, the processor can also generate a plurality of separated streams for submission to a speech decoder.
    Type: Grant
    Filed: May 23, 2017
    Date of Patent: October 29, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: James Droppo, Xuedong Huang, Dong Yu
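
An utterance-level permutation invariant training criterion, as described in the entry above, can be sketched as follows: the separated outputs are scored against the reference sources under every speaker permutation, and the lowest-error permutation defines the loss for the whole utterance. The mean-squared-error criterion and array shapes are illustrative assumptions.

```python
# Hypothetical sketch of an utterance-level permutation invariant training loss.
from itertools import permutations
import numpy as np

def pit_mse_loss(estimates: np.ndarray, references: np.ndarray) -> tuple[float, tuple]:
    """estimates, references: (num_speakers, num_samples) arrays."""
    num_speakers = estimates.shape[0]
    best_loss, best_perm = np.inf, None
    for perm in permutations(range(num_speakers)):
        # Utterance-level criterion: mean squared error under this assignment.
        loss = float(np.mean((estimates[list(perm)] - references) ** 2))
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    sources = rng.normal(size=(2, 16000))
    estimates = sources[::-1] + 0.05 * rng.normal(size=(2, 16000))  # swapped order
    loss, perm = pit_mse_loss(estimates, sources)
    print(f"best permutation {perm}, loss {loss:.4f}")
```
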
  • Patent number: 10417349
    Abstract: Systems and methods may be used to provide transcription and translation services. A method may include initializing a plurality of user devices with respective language output selections in a translation group by receiving a shared identifier from the plurality of user devices and transcribing the audio stream to transcribed text. The method may include translating the transcribed text to one or more of the respective language output selections when an original language of the transcribed text differs from the one or more of the respective language output selections. The method may include sending, to a user device in the translation group, the transcribed text including translated text in a language corresponding to the respective language output selection for the user device. In an example, the method may include customizing the transcription or the translation, such as to a particular topic, location, user, or the like.
    Type: Grant
    Filed: June 14, 2017
    Date of Patent: September 17, 2019
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: William D Lewis, Ivo José Garcia dos Santos, Tanvi Surti, Arul A Menezes, Olivier Nano, Christian Wendt, Xuedong Huang
  • Publication number: 20190236416
    Abstract: In some embodiments, the disclosed subject matter involves a system and method relating to using an ambient capture device including a fisheye camera and a microphone array to capture audio and video in an environment, for use in an artificial intelligence (AI) application. The device with fisheye camera may provide an approximately 360° audio and video view, at relatively low cost. An embodiment may utilize a speech and vision fusion model component. The speech and vision fusion model may be trained using deep learning to combine features from many different sources, including available sensor data from the capture device. A long short term memory (LSTM) model may infer or identify features such as, but not limited to: audio direction; vision detection and tracking; voice signature; facial signature; gesture recognition; and object identification. The fusion processing may be performed by a cloud server, enabling the capture device to remain less complex.
    Type: Application
    Filed: January 31, 2018
    Publication date: August 1, 2019
    Inventors: Zhenghao Wang, Xuedong Huang, Lijuan Qin, Kun Wu, Huaming Wang
  • Patent number: 10331686
    Abstract: Systems, methods, and computer-readable storage media are provided for conducting searches utilizing search navigation patterns. Search queries are received that include search terms that are of a particular type. It is recognized that at least one prior search session has been conducted that included a search query having search terms of an equivalent or similar type and followed a particular navigation pattern. Such prior search(es) may have been conducted by the user or by a different user and/or may have a navigation pattern that was affirmatively recorded by the requesting user or that was recorded by the system without explicit contemporaneous user instruction to do so. Upon identifying the navigation pattern associated with the prior search, the system effectively conducts a search session following the navigation pattern.
    Type: Grant
    Filed: March 14, 2013
    Date of Patent: June 25, 2019
    Assignee: MICROSOFT CORPORATION
    Inventors: Anoop Gupta, Xuedong Huang