Patents by Inventor Yinyi Guo

Yinyi Guo has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20250103888
    Abstract: A device includes one or more processors configured to receive sensor data from one or more sensor devices. The one or more processors are also configured to determine a context of the device based on the sensor data. The one or more processors are further configured to select a model based on the context. The one or more processors are also configured to process an input signal using the model to generate a context-specific output.
    Type: Application
    Filed: December 5, 2024
    Publication date: March 27, 2025
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER
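The flow this abstract describes (sensor data → context → model selection → context-specific output) could be sketched as follows. Everything here is illustrative: the context labels, thresholds, and stub "models" are assumptions, not details from the filing.

```python
# Hypothetical sketch of context-driven model selection: sensor data is
# mapped to a coarse context label, which picks the processing model.

def classify_context(sensor_data):
    """Derive a context label from (assumed) sensor readings."""
    if sensor_data.get("speed_mps", 0.0) > 5.0:
        return "driving"
    if sensor_data.get("ambient_db", 0.0) > 70.0:
        return "noisy"
    return "quiet"

# Each "model" is stubbed as a function from input signal to output.
MODELS = {
    "driving": lambda x: [s * 0.5 for s in x],  # e.g. heavy noise suppression
    "noisy":   lambda x: [s * 0.8 for s in x],
    "quiet":   lambda x: list(x),               # pass-through
}

def process(sensor_data, signal):
    """Select a model based on context, then process the input signal."""
    context = classify_context(sensor_data)
    return context, MODELS[context](signal)
```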
  • Publication number: 20250078818
    Abstract: Systems and techniques are described for generating and using unimodal/multimodal generative models that mitigate hallucinations. For example, a computing device can encode input data to generate encoded representations of the input data. The computing device can obtain intermediate data including a plurality of partial sentences associated with the input data and can generate, based on the intermediate data, at least one complete sentence associated with the input data. The computing device can encode the at least one complete sentence to generate at least one encoded representation of the at least one complete sentence. The computing device can generate a faithfulness score based on a comparison of the encoded representations of the input data and the at least one encoded representation of the at least one complete sentence. The computing device can re-rank the plurality of partial sentences of the intermediate data based on the faithfulness score to generate re-ranked data.
    Type: Application
    Filed: February 28, 2024
    Publication date: March 6, 2025
    Inventors: Arvind Krishna SRIDHAR, Rehana MAHFUZ, Erik VISSER, Yinyi GUO
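The core re-ranking step this abstract describes (encode input and candidate sentences, score candidates by similarity to the input encoding, re-rank by faithfulness) could be sketched as below. The bag-of-words "encoder" is a toy stand-in for the learned encoders the filing implies; all function names are assumptions.

```python
import math

def encode(text, vocab):
    """Toy encoder: bag-of-words count vector over a fixed vocabulary."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def rerank(input_text, candidates):
    """Re-rank candidate sentences by faithfulness to the input encoding."""
    vocab = sorted(set(input_text.lower().split()))
    ref = encode(input_text, vocab)
    scored = [(cosine(ref, encode(c, vocab)), c) for c in candidates]
    return [c for score, c in sorted(scored, key=lambda t: -t[0])]
```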
  • Publication number: 20250078828
    Abstract: Systems and techniques are provided for natural language processing. A system generates a plurality of tokens (e.g., words or portions thereof) based on input content (e.g., text and/or speech). The system searches through the plurality of tokens to generate a first ranking of the plurality of tokens based on probability. The system generates natural language inference (NLI) scores for the plurality of tokens to generate a second ranking of the plurality of tokens based on faithfulness to the input content (e.g., whether the tokens produce statements that are true based on the input content). The system generates output text that includes at least one token selected from the plurality of tokens based on the first ranking and the second ranking.
    Type: Application
    Filed: August 21, 2024
    Publication date: March 6, 2025
    Inventors: Rehana MAHFUZ, Yinyi GUO, Arvind Krishna SRIDHAR, Erik VISSER
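The two-signal selection this abstract describes (a probability ranking plus an NLI-based faithfulness ranking) can be reduced to a weighted combination of the two scores. The 0.5 weight and tuple layout below are illustrative assumptions:

```python
# Sketch of token selection from two rankings: each candidate carries a
# language-model probability and an NLI-style faithfulness score, and
# the final choice blends both.

def select_token(candidates, weight=0.5):
    """candidates: list of (token, probability, faithfulness) tuples."""
    def combined(item):
        _, prob, faith = item
        return weight * prob + (1.0 - weight) * faith
    return max(candidates, key=combined)[0]
```

With equal weights, a slightly less probable but far more faithful token wins.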
  • Patent number: 12244994
    Abstract: A first device includes a memory configured to store instructions and one or more processors configured to receive audio signals from multiple microphones. The one or more processors are configured to process the audio signals to generate direction-of-arrival information corresponding to one or more sources of sound represented in one or more of the audio signals. The one or more processors are also configured to send, to a second device, data based on the direction-of-arrival information and a class or embedding associated with the direction-of-arrival information.
    Type: Grant
    Filed: July 25, 2022
    Date of Patent: March 4, 2025
    Assignee: QUALCOMM Incorporated
    Inventors: Erik Visser, Fatemeh Saki, Yinyi Guo, Lae-Hoon Kim, Rogerio Guedes Alves, Hannes Pessentheiner
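A classic way to obtain direction-of-arrival information from a microphone pair, which the abstract leaves unspecified, is time-difference-of-arrival via cross-correlation. The sketch below is that textbook approach, not necessarily the patent's method; the sample rate and microphone spacing are illustrative constants.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s
SAMPLE_RATE = 16000      # Hz
MIC_SPACING = 0.1        # m between the two microphones

def best_delay(sig_a, sig_b, max_lag):
    """Lag (in samples) of sig_b relative to sig_a with peak correlation."""
    def corr(lag):
        return sum(sig_a[i] * sig_b[i + lag]
                   for i in range(len(sig_a))
                   if 0 <= i + lag < len(sig_b))
    return max(range(-max_lag, max_lag + 1), key=corr)

def doa_degrees(sig_a, sig_b):
    """Estimate the arrival angle from the inter-microphone delay."""
    max_lag = int(MIC_SPACING / SPEED_OF_SOUND * SAMPLE_RATE) + 1
    lag = best_delay(sig_a, sig_b, max_lag)
    # Delay -> path-length difference -> arrival angle (clamped for asin).
    frac = max(-1.0, min(1.0, lag * SPEED_OF_SOUND / (SAMPLE_RATE * MIC_SPACING)))
    return math.degrees(math.asin(frac))
```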
  • Patent number: 12198057
    Abstract: A device includes one or more processors configured to receive sensor data from one or more sensor devices. The one or more processors are also configured to determine a context of the device based on the sensor data. The one or more processors are further configured to select a model based on the context. The one or more processors are also configured to process an input signal using the model to generate a context-specific output.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: January 14, 2025
    Assignee: QUALCOMM Incorporated
    Inventors: Fatemeh Saki, Yinyi Guo, Erik Visser
  • Publication number: 20240419731
    Abstract: A device includes a processor configured to obtain a first audio embedding of a first audio segment and obtain a first text embedding of a first tag assigned to the first audio segment. The first audio segment corresponds to a first audio event of audio events. The processor is configured to obtain a first event representation based on a combination of the first audio embedding and the first text embedding. The processor is configured to obtain a second event representation of a second audio event of the audio events. The processor is also configured to determine, based on knowledge data, relations between the audio events. The processor is configured to construct an audio scene graph based on a temporal order of the audio events. The audio scene graph is constructed to include a first node corresponding to the first audio event and a second node corresponding to the second audio event.
    Type: Application
    Filed: June 10, 2024
    Publication date: December 19, 2024
    Inventors: Arvind Krishna SRIDHAR, Yinyi GUO, Erik VISSER
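The graph-construction step this abstract describes (events become nodes; edges follow temporal order, with relations drawn from knowledge data) could be sketched minimally as below. The relation table is an illustrative stand-in for the filing's "knowledge data":

```python
# Minimal sketch of building an audio scene graph: each detected audio
# event becomes a node, and edges link temporally adjacent events,
# labeled with a known relation when one exists.

def build_scene_graph(events, knowledge):
    """events: list of (tag, start_time); knowledge: {(tag_a, tag_b): relation}."""
    ordered = sorted(events, key=lambda e: e[1])   # temporal order
    nodes = [tag for tag, _ in ordered]
    edges = []
    for (a, _), (b, _) in zip(ordered, ordered[1:]):
        relation = knowledge.get((a, b), "precedes")
        edges.append((a, relation, b))
    return {"nodes": nodes, "edges": edges}
```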
  • Publication number: 20240232258
    Abstract: A device includes one or more processors configured to generate one or more query caption embeddings based on a query. The processor(s) are further configured to select one or more caption embeddings from among a set of embeddings associated with a set of media files of a file repository. Each caption embedding represents a corresponding sound caption, and each sound caption includes a natural-language text description of a sound. The caption embedding(s) are selected based on a similarity metric indicative of similarity between the caption embedding(s) and the query caption embedding(s). The processor(s) are further configured to generate search results identifying one or more first media files of the set of media files. Each of the first media file(s) is associated with at least one of the caption embedding(s).
    Type: Application
    Filed: May 31, 2023
    Publication date: July 11, 2024
    Inventors: Rehana MAHFUZ, Yinyi GUO, Erik VISSER
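The retrieval step this abstract describes (compare query caption embeddings against stored caption embeddings via a similarity metric, return the matching media files) could be sketched as a cosine-similarity top-k search. In practice the embeddings would come from a learned text encoder; here they are plain vectors, an assumption on our part:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_emb, caption_index, top_k=2):
    """caption_index: {file_name: caption_embedding}. Returns the top-k files."""
    ranked = sorted(caption_index.items(),
                    key=lambda kv: cosine(query_emb, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]
```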
  • Publication number: 20240135959
    Abstract: A device to perform target sound detection includes a memory including a buffer configured to store audio data. The device includes one or more processors coupled to the memory. The one or more processors are configured to receive the audio data from the buffer. The one or more processors are configured to detect the presence or absence of one or more target non-speech sounds in the audio data. The one or more processors are further configured to generate a user interface signal to indicate that one of the one or more target non-speech sounds has been detected, and to provide the user interface signal to an output device.
    Type: Application
    Filed: December 18, 2023
    Publication date: April 25, 2024
    Inventors: Prajakt KULKARNI, Yinyi GUO, Erik VISSER
  • Publication number: 20240134908
    Abstract: A device includes one or more processors configured to generate one or more query caption embeddings based on a query. The processor(s) are further configured to select one or more caption embeddings from among a set of embeddings associated with a set of media files of a file repository. Each caption embedding represents a corresponding sound caption, and each sound caption includes a natural-language text description of a sound. The caption embedding(s) are selected based on a similarity metric indicative of similarity between the caption embedding(s) and the query caption embedding(s). The processor(s) are further configured to generate search results identifying one or more first media files of the set of media files. Each of the first media file(s) is associated with at least one of the caption embedding(s).
    Type: Application
    Filed: May 30, 2023
    Publication date: April 25, 2024
    Inventors: Rehana MAHFUZ, Yinyi GUO, Erik VISSER
  • Patent number: 11862189
    Abstract: A device to perform target sound detection includes one or more processors. The one or more processors include a buffer configured to store audio data and a target sound detector. The target sound detector includes a first stage and a second stage. The first stage includes a binary target sound classifier configured to process the audio data. The first stage is configured to activate the second stage in response to detection of a target sound. The second stage is configured to receive the audio data from the buffer in response to the detection of the target sound.
    Type: Grant
    Filed: April 1, 2020
    Date of Patent: January 2, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Prajakt Kulkarni, Yinyi Guo, Erik Visser
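The cascade this abstract describes (a cheap always-on binary classifier that gates a more expensive second stage, which re-reads buffered audio only when the first stage fires) could be sketched as follows. The stand-in classifiers, thresholds, and buffer size are illustrative assumptions:

```python
# Sketch of a two-stage target sound detector with a shared audio buffer.

class TwoStageDetector:
    def __init__(self, coarse, fine, buffer_size=64):
        self.coarse = coarse          # fast binary classifier (stage 1)
        self.fine = fine              # expensive verifier (stage 2)
        self.buffer = []
        self.buffer_size = buffer_size

    def push(self, frame):
        """Buffer a frame; run stage 2 only if stage 1 detects the target."""
        self.buffer.append(frame)
        if len(self.buffer) > self.buffer_size:
            self.buffer.pop(0)
        if not self.coarse(frame):
            return False              # second stage stays inactive
        # Stage 1 fired: stage 2 receives the buffered audio.
        return self.fine(list(self.buffer))
```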
  • Patent number: 11664044
    Abstract: A device includes a processor configured to receive audio data samples and provide the audio data samples to a first neural network to generate a first output corresponding to a first set of sound classes. The processor is further configured to provide the audio data samples to a second neural network to generate a second output corresponding to a second set of sound classes. A second count of classes of the second set of sound classes is greater than a first count of classes of the first set of sound classes. The processor is also configured to provide the first output to a neural adapter to generate a third output corresponding to the second set of sound classes. The processor is further configured to provide the second output and the third output to a merger adapter to generate sound event identification data based on the audio data samples.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: May 30, 2023
    Assignee: Qualcomm Incorporated
    Inventors: Fatemeh Saki, Yinyi Guo, Erik Visser, Eunjeong Koh
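The adapter/merger idea this abstract describes (project the first network's scores over a small class set onto the larger class set, then merge with the second network's scores) could be sketched as below. The fixed index mapping and averaging merger are simplifying assumptions; the filing's neural adapter and merger adapter would be learned.

```python
# Sketch of adapting a small-class-set output to a larger class set and
# merging it with the larger network's output.

def adapt(old_scores, class_map, num_new_classes):
    """class_map[i] = index of old class i within the new class set."""
    out = [0.0] * num_new_classes
    for i, score in enumerate(old_scores):
        out[class_map[i]] = score
    return out

def merge(adapted, new_scores):
    """Average the two score vectors over the larger class set."""
    return [(a + b) / 2.0 for a, b in zip(adapted, new_scores)]
```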
  • Publication number: 20230035531
    Abstract: A second device includes a memory configured to store instructions and one or more processors configured to receive, from a first device, an indication of an audio class corresponding to an audio event.
    Type: Application
    Filed: July 25, 2022
    Publication date: February 2, 2023
    Inventors: Erik Visser, Fatemeh Saki, Yinyi Guo, Lae-Hoon Kim, Rogerio Guedes Alves, Hannes Pessentheiner
  • Publication number: 20230036986
    Abstract: A first device includes a memory configured to store instructions and one or more processors configured to receive audio signals from multiple microphones. The one or more processors are configured to process the audio signals to generate direction-of-arrival information corresponding to one or more sources of sound represented in one or more of the audio signals. The one or more processors are also configured to send, to a second device, data based on the direction-of-arrival information and a class or embedding associated with the direction-of-arrival information.
    Type: Application
    Filed: July 25, 2022
    Publication date: February 2, 2023
    Inventors: Erik VISSER, Fatemeh SAKI, Yinyi GUO, Lae-Hoon KIM, Rogerio Guedes ALVES, Hannes PESSENTHEINER
  • Patent number: 11410677
    Abstract: A device includes one or more processors configured to provide audio data samples to a sound event classification model. The one or more processors are also configured to determine, based on an output of the sound event classification model responsive to the audio data samples, whether a sound class associated with the audio data samples was recognized by the sound event classification model. The one or more processors are further configured to, based on a determination that the sound class was not recognized, determine whether the sound event classification model corresponds to an audio scene associated with the audio data samples. The one or more processors are also configured to, based on a determination that the sound event classification model corresponds to the audio scene associated with the audio data samples, store model update data based on the audio data samples.
    Type: Grant
    Filed: November 24, 2020
    Date of Patent: August 9, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Fatemeh Saki, Yinyi Guo, Erik Visser
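The gating logic this abstract describes (keep unrecognized samples as model update data only when the model matches the current audio scene) could be sketched as a short predicate. The confidence threshold and scene comparison are illustrative assumptions:

```python
# Sketch of the update-data gating step for a sound event classifier.

def maybe_store(sample, scores, model_scene, current_scene,
                store, threshold=0.5):
    """Return whether the sample was recognized; if not, and the model
    corresponds to the current scene, keep it as model update data."""
    recognized = max(scores) >= threshold
    if not recognized and model_scene == current_scene:
        store.append(sample)
    return recognized
```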
  • Patent number: 11348581
    Abstract: A device for multi-modal user input includes a processor configured to process first data received from a first input device. The first data indicates a first input from a user based on a first input mode. The first input corresponds to a command. The processor is configured to send a feedback message to an output device based on processing the first data. The feedback message instructs the user to provide, based on a second input mode that is different from the first input mode, a second input that identifies a command associated with the first input. The processor is configured to receive second data from a second input device, the second data indicating the second input, and to update a mapping to associate the first input to the command identified by the second input.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: May 31, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Ravi Choudhary, Lae-Hoon Kim, Sunkuk Moon, Yinyi Guo, Fatemeh Saki, Erik Visser
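The interaction loop this abstract describes (an unrecognized first-mode input triggers a feedback prompt, the user names the command in a second mode, and the mapping is updated for next time) could be sketched as below. The gesture/voice pairing and all names are illustrative assumptions:

```python
# Sketch of multi-modal command learning: unknown gestures are bound to
# spoken commands supplied via a second input mode.

class MultiModalMapper:
    def __init__(self):
        self.mapping = {}   # gesture -> command

    def handle_gesture(self, gesture):
        if gesture in self.mapping:
            return ("execute", self.mapping[gesture])
        # Unknown gesture: prompt the user to identify the command by voice.
        return ("prompt", "Please say the command for this gesture")

    def learn(self, gesture, spoken_command):
        """Update the mapping to associate the gesture with the command."""
        self.mapping[gesture] = spoken_command
```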
  • Publication number: 20220164667
    Abstract: A method includes initializing a second neural network based on a first neural network that is trained to detect a first set of sound classes and linking an output of the first neural network and an output of the second neural network to one or more coupling networks. The method also includes, after training the second neural network and the one or more coupling networks, determining whether to discard the first neural network based on an accuracy of sound classes assigned by the second neural network and an accuracy of sound classes assigned by the first neural network.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER
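The final decision this abstract describes (discard the first network once the second network matches its accuracy on the old sound classes) could be sketched as an accuracy comparison on held-out data for the first class set. The tolerance value is an assumption:

```python
# Sketch of the keep-or-discard decision after continual training.

def accuracy(network, dataset):
    """dataset: list of (features, label); network: features -> label."""
    correct = sum(1 for x, y in dataset if network(x) == y)
    return correct / len(dataset)

def discard_old_network(old_net, new_net, old_class_data, tolerance=0.01):
    """True if the new network is at least as accurate (within tolerance)
    as the old network on the old sound classes."""
    return (accuracy(new_net, old_class_data)
            >= accuracy(old_net, old_class_data) - tolerance)
```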
  • Publication number: 20220165292
    Abstract: A device includes one or more processors configured to provide audio data samples to a sound event classification model. The one or more processors are also configured to determine, based on an output of the sound event classification model responsive to the audio data samples, whether a sound class associated with the audio data samples was recognized by the sound event classification model. The one or more processors are further configured to, based on a determination that the sound class was not recognized, determine whether the sound event classification model corresponds to an audio scene associated with the audio data samples. The one or more processors are also configured to, based on a determination that the sound event classification model corresponds to the audio scene associated with the audio data samples, store model update data based on the audio data samples.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER
  • Publication number: 20220164662
    Abstract: A device includes one or more processors configured to receive sensor data from one or more sensor devices. The one or more processors are also configured to determine a context of the device based on the sensor data. The one or more processors are further configured to select a model based on the context. The one or more processors are also configured to process an input signal using the model to generate a context-specific output.
    Type: Application
    Filed: November 24, 2020
    Publication date: May 26, 2022
    Inventors: Fatemeh SAKI, Yinyi GUO, Erik VISSER
  • Patent number: 11290518
    Abstract: Various embodiments provide systems and methods involving a command device that can be used to establish a wireless connection, through one or more wireless channels, between the command device and a remote device. An intention code may be generated, prior to, or after, the establishment of the wireless connection, and the remote device may be selected based on the intention code. The command device may initiate a wireless transfer, through one or more wireless channels of the established wireless connection, of an intention code, and receive acknowledgement that the intention code was successfully transferred to the remote device. The command device may then control the remote device, based on the intention code sent to the remote device, through the one or more wireless channels of the established wireless connection between the command device and the remote device.
    Type: Grant
    Filed: September 27, 2017
    Date of Patent: March 29, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Lae-Hoon Kim, Erik Visser, Yinyi Guo
  • Patent number: 11240058
    Abstract: A device to provide information to a visual interface that is mountable to a vehicle dashboard includes a memory configured to store device information indicative of controllable devices of a building and occupant data indicative of one or more occupants of the building. The device includes a processor configured to receive, in real-time, status information associated with the one or more occupants of the building. The status information includes at least one of dynamic location information or dynamic activity information. The processor is configured to generate an output to provide, at the visual interface device, a visual representation of at least a portion of the building and the status information associated with the one or more occupants. The processor is also configured to generate an instruction to adjust an operation of one or more devices of the controllable devices based on user input.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: February 1, 2022
    Assignee: Qualcomm Incorporated
    Inventors: Ravi Choudhary, Yinyi Guo, Fatemeh Saki, Erik Visser