Patents Examined by Brian L. Albertalli

Generative language model for few-shot aspect-based sentiment analysis

Patent number: 11853706

Abstract: Sentiment analysis is a task in natural language processing. The embodiments are directed to using a generative language model to extract an aspect term, aspect category and their corresponding polarities. The generative language model may be trained as a single, joint, and multi-task model. The single-task generative language model determines a term polarity from the aspect term in the sentence or a category polarity from an aspect category in the sentence. The joint-task generative language model determines both the aspect term and the term polarity or the aspect category and the category polarity. The multi-task generative language model determines the aspect term, term polarity, aspect category and category polarity of the sentence.

Type: Grant

Filed: September 8, 2021

Date of Patent: December 26, 2023

Assignee: salesforce.com, inc.

Inventors: Ehsan Hosseini-Asl, Wenhao Liu

Voice command scrubbing

Patent number: 11848015

Abstract: The invention is directed towards an audio scrubbing system that allows for scrubbing recognized voice commands from audio data and replacing the recognized voice commands with environment audio data. Specifically, as a user captures video and audio data via an HMD, audio data captured by the HMD may be processed by an audio scrubbing module to identify voice commands in the audio data that are used for controlling the HMD. When a voice command is identified in the audio data, timestamps corresponding to the voice command may be determined. Filler audio data may then be generated to imitate the environment by processing at least a portion of the audio data by a neural network of a machine learning model. The filler audio data may then be used to replace the audio data corresponding to the identified voice commands, thereby scrubbing the voice command from the audio data.

Type: Grant

Filed: October 1, 2020

Date of Patent: December 19, 2023

Assignee: RealWear, Inc.

Inventor: Christopher Iain Parkinson

System and method for cluster-based audio event detection

Patent number: 11842748

Abstract: Methods, systems, and apparatuses for audio event detection, where the determination of a type of sound data is made at the cluster level rather than at the frame level. The techniques provided are thus more robust to the local behavior of features of an audio signal or audio recording. The audio event detection is performed by using Gaussian mixture models (GMMs) to classify each cluster or by extracting an i-vector from each cluster. Each cluster may be classified based on an i-vector classification using a support vector machine or probabilistic linear discriminant analysis. The audio event detection significantly reduces potential smoothing error and avoids any dependency on accurate window-size tuning. Segmentation may be performed using a generalized likelihood ratio and a Bayesian information criterion, and the segments may be clustered using hierarchical agglomerative clustering. Audio frames may be clustered using K-means and GMMs.
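As a rough illustration of deciding the sound type per cluster rather than per frame, the sketch below sums per-frame class scores within each cluster and gives every frame its cluster's winning label. The function names, the summed log-likelihood rule, and the sample scores are assumptions made for illustration; they are not taken from the patent.

```python
from collections import defaultdict


def classify_clusters(cluster_ids, frame_scores):
    """Label each cluster by summing per-frame class scores within it.

    cluster_ids: one cluster id per frame.
    frame_scores: one dict per frame mapping class name -> log-likelihood
                  (hypothetical scores, e.g. from per-class models).
    Returns a dict mapping cluster id -> winning class name.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for cid, scores in zip(cluster_ids, frame_scores):
        for label, score in scores.items():
            totals[cid][label] += score
    # The decision is taken once per cluster, not once per frame.
    return {cid: max(sums, key=sums.get) for cid, sums in totals.items()}


def label_frames(cluster_ids, frame_scores):
    """Give every frame the label of the cluster it belongs to."""
    cluster_labels = classify_clusters(cluster_ids, frame_scores)
    return [cluster_labels[cid] for cid in cluster_ids]


# Illustrative data: the second frame alone would be labeled "music",
# but its cluster as a whole scores higher for "speech".
cluster_ids = [0, 0, 0, 1, 1]
frame_scores = [
    {"speech": -1.0, "music": -3.0},
    {"speech": -2.5, "music": -2.0},
    {"speech": -1.2, "music": -2.8},
    {"speech": -4.0, "music": -1.0},
    {"speech": -3.5, "music": -1.5},
]
frame_labels = label_frames(cluster_ids, frame_scores)
```

Because the second frame inherits its cluster's label, a single locally ambiguous frame does not flip the decision, which is the kind of robustness to local feature behavior the abstract describes.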

Type: Grant

Filed: December 14, 2020

Date of Patent: December 12, 2023

Assignee: Pindrop Security, Inc.

Inventors: Elie Khoury, Matthew Garland

Using semantic grammar extensibility for collective artificial intelligence

Patent number: 11829724

Abstract: Support for natural language expressions is provided by the use of semantic grammars that describe the structure of expressions in that grammar and that construct the meaning of a corresponding natural language expression. A semantic grammar extension mechanism is provided, which allows one semantic grammar to be used in the place of another semantic grammar. This enriches the expressivity of semantic grammars in a simple, natural, and decoupled manner.

Type: Grant

Filed: July 16, 2021

Date of Patent: November 28, 2023

Assignee: SOUNDHOUND AI IP, LLC

Inventors: Bernard Mont-Reynaud, Christopher S. Wilson, Keyvan Mohajer

Methods, apparatus and systems for encoding and decoding of multi-channel ambisonics audio data

Patent number: 11798568

Abstract: Conventional audio compression technologies perform a standardized signal transformation, independent of the type of the content. Multi-channel signals are decomposed into their signal components, subsequently quantized and encoded. This is disadvantageous due to lack of knowledge on the characteristics of scene composition, especially for e.g. multi-channel audio or Higher-Order Ambisonics (HOA) content. A method for decoding an encoded bitstream of multi-channel audio data and associated metadata is provided, including transforming the first Ambisonics format of the multi-channel audio data to a second Ambisonics format representation of the multi-channel audio data, wherein the transforming maps the first Ambisonics format of the multi-channel audio data into the second Ambisonics format representation of the multi-channel audio data.

Type: Grant

Filed: August 2, 2021

Date of Patent: October 24, 2023

Assignee: DOLBY LABORATORIES LICENSING CORPORATION

Inventors: Oliver Wuebbolt, Johannes Boehm, Peter Jax

Voice detection optimization using sound metadata

Patent number: 11790937

Abstract: Systems and methods for optimizing voice detection via a network microphone device (NMD) are disclosed herein. In one example, individual microphones of a network microphone device detect sound. The sound data is captured in a first buffer and analyzed to detect a trigger event. Metadata associated with the sound data is captured in a second buffer and provided to at least one network device to determine at least one characteristic of the detected sound based on the metadata. The network device provides a response that includes an instruction, based on the determined characteristic, to modify at least one performance parameter of the NMD. The NMD then modifies the at least one performance parameter based on the instruction.

Type: Grant

Filed: May 18, 2021

Date of Patent: October 17, 2023

Assignee: Sonos, Inc.

Inventors: Connor Kristopher Smith, Kurt Thomas Soto, Charles Conor Sleith

Information processing device and method, and program

Patent number: 11790925

Abstract: The present technology relates to an information processing device and method, and a program capable of reducing a code amount. The information processing device includes: an acquisition unit that acquires space information regarding a position and a size of a child space within a parent space and position information in the child space indicating a position of an object within the child space, the child space being included in the parent space, and the object being included in the child space; and a calculation unit that calculates position information in the parent space indicating a position of the object within the parent space on the basis of the space information and the position information in the child space. The present technology can be applied to a signal processing device.
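A minimal sketch of the calculation step described above, under two assumptions not stated in the abstract: the space information gives the child space's minimum corner and per-axis size in parent coordinates, and the object's position in the child space is normalized to the range 0 to 1 on each axis. The function and parameter names are hypothetical.

```python
def child_to_parent(child_origin, child_size, pos_in_child):
    """Map a position inside a child space to parent-space coordinates.

    child_origin: minimum corner of the child space, in parent coordinates
                  (assumed representation of the "position" in the space information).
    child_size:   per-axis extent of the child space, in parent units.
    pos_in_child: object position within the child space, assumed normalized
                  to [0, 1] on each axis.
    """
    return tuple(
        origin + size * pos
        for origin, size, pos in zip(child_origin, child_size, pos_in_child)
    )


# Illustrative values only.
parent_pos = child_to_parent(
    child_origin=(10.0, 20.0, 0.0),
    child_size=(4.0, 2.0, 8.0),
    pos_in_child=(0.5, 0.25, 1.0),
)
# parent_pos == (12.0, 20.5, 8.0)
```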

Type: Grant

Filed: June 20, 2019

Date of Patent: October 17, 2023

Assignee: Sony Corporation

Inventors: Mitsuyuki Hatanaka, Toru Chinen, Minoru Tsuji, Hiroyuki Honma, Yuki Yamamoto

Data protection in a multi-assistant system

Patent number: 11783831

Abstract: A user may access multiple virtual assistants via a voice-enabled device. The device may receive a command from the user, detect a wakeword corresponding to one of the assistants, and send audio data to a command processing system corresponding to the selected assistant. The device transmits encrypted audio data to one or more systems and, upon detecting a wakeword or wake command corresponding to one of the systems, the device may provide an encryption key to that particular system. The system may decrypt and process the audio data without additional latency introduced by having to wait for the audio data to arrive.

Type: Grant

Filed: June 29, 2021

Date of Patent: October 10, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Philippe Andre Lantin, Ori Neidich, David Berol

Methods and systems for reducing latency in automated assistant interactions

Patent number: 11763813

Abstract: Implementations described herein relate to reducing latency in automated assistant interactions. In some implementations, a client device can receive audio data that captures a spoken utterance of a user. The audio data can be processed to determine an assistant command to be performed by an automated assistant. The assistant command can be processed, using a latency prediction model, to generate a predicted latency to fulfill the assistant command. Further, the client device (or the automated assistant) can determine, based on the predicted latency, whether to audibly render pre-cached content for presentation to the user prior to audibly rendering content that is responsive to the spoken utterance. The pre-cached content can be tailored to the assistant command and audibly rendered for presentation to the user while the content is being obtained, and the content can be audibly rendered for presentation to the user subsequent to the pre-cached content.

Type: Grant

Filed: April 28, 2021

Date of Patent: September 19, 2023

Assignee: GOOGLE LLC

Inventors: Lior Alon, Rafael Goldfarb, Dekel Auster, Dan Rasin, Michael Andrew Goodman, Trevor Strohman, Nino Tasca, Valerie Nygaard, Jaclyn Konzelmann

Automatic summarization of content in electronic messages

Patent number: 11755844

Abstract: Servers configured to perform automatic summarization of content in electronic messages are disclosed herein. In one embodiment, upon receiving an email, a server determines whether the incoming email is a templated message. In response to determining that the incoming email is not a templated message, the server classifies one or more sentences in the email as a statement of decision, judgement, inference, or fact, clusters the classified statements into clusters, and selects one or more of the clusters to automatically generate summaries of the incoming email. The server can then insert data representing the generated summaries into the email before transmitting the email to a destination via a computer network.

Type: Grant

Filed: May 24, 2021

Date of Patent: September 12, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Kausik Ghatak, Ganessh Kumar R P, Priyanka Goel, Neeraj Singh, Swathi Karri

Automated assistants that accommodate multiple age groups and/or vocabulary levels

Patent number: 11756537

Abstract: Techniques are described herein for enabling an automated assistant to adjust its behavior depending on a detected age range and/or “vocabulary level” of a user who is engaging with the automated assistant. In various implementations, data indicative of a user's utterance may be used to estimate one or more of the user's age range and/or vocabulary level. The estimated age range/vocabulary level may be used to influence various aspects of a data processing pipeline employed by an automated assistant. In various implementations, aspects of the data processing pipeline that may be influenced by the user's age range/vocabulary level may include one or more of automated assistant invocation, speech-to-text (“STT”) processing, intent matching, intent resolution (or fulfillment), natural language generation, and/or text-to-speech (“TTS”) processing. In some implementations, one or more tolerance thresholds associated with one or more of these aspects, such as grammatical tolerances, vocabularic tolerances, etc.

Type: Grant

Filed: October 10, 2022

Date of Patent: September 12, 2023

Assignee: GOOGLE LLC

Inventors: Pedro Gonnet Anders, Victor Carbune, Daniel Keysers, Thomas Deselaers, Sandro Feuz

Apparatus and method for stereo filling in multichannel coding

Patent number: 11727944

Abstract: An apparatus for decoding an encoded multichannel signal of a current frame to obtain three or more current audio output channels is provided. A multichannel processor is adapted to select two decoded channels from three or more decoded channels depending on first multichannel parameters. Moreover, the multichannel processor is adapted to generate a first group of two or more processed channels based on the selected channels. A noise filling module is adapted to identify for at least one of the selected channels, one or more frequency bands, within which all spectral lines are quantized to zero, and to generate a mixing channel using, depending on side information, a proper subset of three or more previous audio output channels that have been decoded, and to fill the spectral lines of frequency bands, within which all spectral lines are quantized to zero, with noise generated using spectral lines of the mixing channel.

Type: Grant

Filed: July 1, 2020

Date of Patent: August 15, 2023

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Sascha Dick, Christian Helmrich, Nikolaus Rettelbach, Florian Schuh, Richard Fueg, Frederik Nagel

Intermediate data for inter-device speech processing

Patent number: 11721347

Abstract: Some speech processing systems may handle some commands on-device rather than sending the audio data to a second device or system for processing. The first device may have limited speech processing capabilities sufficient for handling common language and/or commands, while the second device (e.g., an edge device and/or a remote system) may call on additional language models, entity libraries, skill components, etc. to perform additional tasks. An intermediate data generator may facilitate dividing speech processing operations between devices by generating a stream of data that includes a first-pass ASR output (e.g., a word or sub-word lattice) and other characteristics of the audio data such as whisper detection, speaker identification, media signatures, etc. The second device can perform the additional processing using the data stream; e.g., without using the audio data. Thus, privacy may be enhanced by processing the audio data locally without sending it to other devices/systems.

Type: Grant

Filed: June 29, 2021

Date of Patent: August 8, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Stanislaw Ignacy Pasko, Pawel Zelazko, Cagdas Bak, Eli Joshua Fidler, Michal Kowalczuk, Andrew Oberlin, Ariya Rastrow

Digital encapsulation of audio signals

Patent number: 11710493

Abstract: Encoding and decoding systems are described for the provision of high quality digital representations of audio signals with particular attention to the correct perceptual rendering of fast transients at modest sample rates. This is achieved by optimising downsampling and upsampling filters to minimise the length of the impulse response while adequately attenuating alias products that have been found perceptually harmful.

Type: Grant

Filed: December 14, 2020

Date of Patent: July 25, 2023

Assignee: MQA LIMITED

Inventors: Peter Graham Craven, John Robert Stuart

Health management system

Patent number: 11688509

Abstract: In general, this disclosure describes techniques for a health management system that schedules medical appointments based on a dialog with a user (e.g., a patient), clinical guideline information, and/or other information. The health management system may engage in a dialog with the user, the dialog including requests from the health management system for audio input to the user device and audio input from the user in response to each request. The health management system may extract information from the audio input and compare the extracted information to clinical guideline information to determine one or more probable health conditions of the user. The health management system may determine a time allotment, identify a health care provider type and a platform for a medical appointment based on the one or more probable health conditions.

Type: Grant

Filed: January 16, 2020

Date of Patent: June 27, 2023

Assignee: SRI INTERNATIONAL

Inventors: Mark Hanson, Bhaskar Ramamurthy, Manish Kothari, Brecken Hu Uhl

Fully supervised speaker diarization

Patent number: 11688404

Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.

Type: Grant

Filed: May 26, 2021

Date of Patent: June 27, 2023

Assignee: Google LLC

Inventors: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu

Distributed identification in networked system

Patent number: 11683320

Abstract: The present disclosure is generally directed to a data processing system for customizing content in a voice activated computer network environment. With user consent, the data processing system can improve the efficiency and effectiveness of auditory data packet transmission over one or more computer networks by, for example, increasing the accuracy of the voice identification process used in the generation of customized content. The present solution can make accurate identifications while generating fewer audio identification models, which are computationally intensive to generate.

Type: Grant

Filed: April 22, 2021

Date of Patent: June 20, 2023

Assignee: GOOGLE LLC

Inventors: Victor Carbune, Thomas Deselaers, Sandro Feuz

Reducing latency and improving accuracy of work estimates utilizing natural language processing

Patent number: 11681870

Abstract: Disclosed are devices, systems, apparatuses, methods, products, and other implementations for improving the accuracy and latency in work estimation systems and methods through the invocation of serverless applications and/or servers and the interfacing of natural language processing endpoint devices.

Type: Grant

Filed: January 28, 2019

Date of Patent: June 20, 2023

Assignee: ENSONO, LP

Inventors: Jeremy Bowers, David Pearson

Automatic video tagging

Patent number: 11682415

Abstract: In an approach, a processor extracts an audio signal from a video clip. A processor converts the audio signal into a text sequence. A processor selects a first set of keywords from the text sequence, the first set of keywords corresponding to a first audio segment of the audio signal. A processor tags a target video segment of the video clip with the first set of keywords, the target video segment corresponding to the first audio segment.

Type: Grant

Filed: March 19, 2021

Date of Patent: June 20, 2023

Assignee: International Business Machines Corporation

Inventors: Li Cao, Jing Xu, Ze Ming Zhao, Xue Ying Zhang

Method for processing user input of voice assistant

Patent number: 11664022

Abstract: Provided is a method of processing a user input to deliver the user input to at least one of a plurality of assistants, the method including: converting a user input including a voice signal based on a predetermined rule to generate an instruction; splitting a complex instruction into partial instructions when the generated instruction is a complex instruction requesting two or more events; and determining a domain of each of the partial instructions and distributing the partial instructions to at least one of a plurality of voice assistants based on the domain. According to an embodiment, the washer may be related to artificial intelligence (AI) modules, unmanned aerial vehicles (UAVs), robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.

Type: Grant

Filed: June 24, 2020

Date of Patent: May 30, 2023

Assignee: LG ELECTRONICS INC.

Inventors: Hoolim Kim, Euihyeok Lee, Kihyeon Kim
