Patents Examined by Brian L. Albertalli
  • Patent number: 11853706
    Abstract: Sentiment analysis is a task in natural language processing. The embodiments are directed to using a generative language model to extract an aspect term, aspect category and their corresponding polarities. The generative language model may be trained as a single, joint, and multi-task model. The single-task generative language model determines a term polarity from the aspect term in the sentence or a category polarity from an aspect category in the sentence. The joint-task generative language model determines both the aspect term and the term polarity or the aspect category and the category polarity. The multi-task generative language model determines the aspect term, term polarity, aspect category and category polarity of the sentence.
    Type: Grant
    Filed: September 8, 2021
    Date of Patent: December 26, 2023
    Assignee: salesforce.com, inc.
    Inventors: Ehsan Hosseini-Asl, Wenhao Liu
  • Patent number: 11848015
    Abstract: The invention is directed towards a an audio scrubbing system that allows for scrubbing recognized voice commands from audio data and replacing the recognized voice commands with environment audio data. Specifically, as a user captures video and audio data via a HMD, audio data captured by the HMD may be processed by an audio scrubbing module to identify voice commands in the audio data that are used for controlling the HMD. When a voice command is identified in the audio data, timestamps corresponding to the voice command may be determined. Filler audio data may then be generated to imitate the environment by processing at least a portion of the audio data by a neural network of a machine learning model. The filler audio data may then be used to replace the audio data corresponding to the identified voice commands, thereby scrubbing the voice command from the audio data.
    Type: Grant
    Filed: October 1, 2020
    Date of Patent: December 19, 2023
    Assignee: RealWear, Inc.
    Inventor: Christopher Iain Parkinson
  • Patent number: 11842748
    Abstract: Methods, systems, and apparatuses for audio event detection, where the determination of a type of sound data is made at the cluster level rather than at the frame level. The techniques provided are thus more robust to the local behavior of features of an audio signal or audio recording. The audio event detection is performed by using Gaussian mixture models (GMMs) to classify each cluster or by extracting an i-vector from each cluster. Each cluster may be classified based on an i-vector classification using a support vector machine or probabilistic linear discriminant analysis. The audio event detection significantly reduces potential smoothing error and avoids any dependency on accurate window-size tuning. Segmentation may be performed using a generalized likelihood ratio and a Bayesian information criterion, and the segments may be clustered using hierarchical agglomerative clustering. Audio frames may be clustered using K-means and GMMs.
    Type: Grant
    Filed: December 14, 2020
    Date of Patent: December 12, 2023
    Assignee: Pindrop Security, Inc.
    Inventors: Elie Khoury, Matthew Garland
  • Patent number: 11829724
    Abstract: Support for natural language expressions is provided by the use of semantic grammars that describe the structure of expressions in that grammar and that construct the meaning of a corresponding natural language expression. A semantic grammar extension mechanism is provided, which allows one semantic grammar to be used in the place of another semantic grammar. This enriches the expressivity of semantic grammars in a simple, natural, and decoupled manner.
    Type: Grant
    Filed: July 16, 2021
    Date of Patent: November 28, 2023
    Assignee: SOUNDHOUND AI IP, LLC
    Inventors: Bernard Mont-Reynaud, Christopher S. Wilson, Keyvan Mohajer
  • Patent number: 11798568
    Abstract: Conventional audio compression technologies perform a standardized signal transformation, independent of the type of the content. Multi-channel signals are decomposed into their signal components, subsequently quantized and encoded. This is disadvantageous due to lack of knowledge on the characteristics of scene composition, especially for e.g. multi-channel audio or Higher-Order Ambisonics (HOA) content. A method for decoding an encoded bitstream of multi-channel audio data and associated metadata is provided, including transforming the first Ambisonics format of the multi-channel audio data to a second Ambisonics format representation of the multi-channel audio data, wherein the transforming maps the first Ambisonics format of the multi-channel audio data into the second Ambisonics format representation of the multi-channel audio data.
    Type: Grant
    Filed: August 2, 2021
    Date of Patent: October 24, 2023
    Assignee: DOLBY LABORATORIES LICENSING CORPORATION
    Inventors: Oliver Wuebbolt, Johannes Boehm, Peter Jax
  • Patent number: 11790937
    Abstract: Systems and methods for optimizing voice detection via a network microphone device are disclosed herein. In one example, individual microphones of a network microphone device detect sound. The sound data is captured in a first buffer and analyzed to detect a trigger event. Metadata associated with the sound data is captured in a second buffer and provided to at least one network device to determine at least one characteristic of the detected sound based on the metadata. The network device provides a response that includes an instruction, based on the determined characteristic, to modify at least one performance parameter of the NMD. The NMD then modifies the at least one performance parameter based on the instruction.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: October 17, 2023
    Assignee: Sonos, Inc.
    Inventors: Connor Kristopher Smith, Kurt Thomas Soto, Charles Conor Sleith
  • Patent number: 11790925
    Abstract: The present technology relates to an information processing device and method, and a program capable of reducing a code amount. The information processing device includes: an acquisition unit that acquires space information regarding a position and a size of a child space within a parent space and position information in the child space indicating a position of an object within the child space, the child space being included in the parent space, and the object being included in the child space; and a calculation unit that calculates position information in the parent space indicating a position of the object within the parent space on the basis of the space information and the position information in the child space. The present technology can be applied to a signal processing device.
    Type: Grant
    Filed: June 20, 2019
    Date of Patent: October 17, 2023
    Assignee: Sony Corporation
    Inventors: Mitsuyuki Hatanaka, Toru Chinen, Minoru Tsuji, Hiroyuki Honma, Yuki Yamamoto
  • Patent number: 11783831
    Abstract: A user may access multiple virtual assistants via a voice-enabled device. The device may receive a command from the user, detect a wakeword corresponding to one of the assistants, and send audio data to a command processing system corresponding to the selected assistant. The device transmits encrypted audio data to one or more systems and, upon detecting a wakeword or wake command corresponding to one of the systems, the device may provide an encryption key to that particular system. The system may decrypt and process the audio data without additional latency introduced by having to wait for the audio data to arrive.
    Type: Grant
    Filed: June 29, 2021
    Date of Patent: October 10, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Philippe Andre Lantin, Ori Neidich, David Berol
  • Patent number: 11763813
    Abstract: Implementations described herein relate to reducing latency in automated assistant interactions. In some implementations, a client device can receive audio data that captures a spoken utterance of a user. The audio data can be processed to determine an assistant command to be performed by an automated assistant. The assistant command can be processed, using a latency prediction model, to generate a predicted latency to fulfill the assistant command. Further, the client device (or the automated assistant) can determine, based on the predicted latency, whether to audibly render pre-cached content for presentation to the user prior to audibly rendering content that is responsive to the spoken utterance. The pre-cached content can be tailored to the assistant command and audibly rendered for presentation to the user while the content is being obtained, and the content can be audibly rendered for presentation to the user subsequent to the pre-cached content.
    Type: Grant
    Filed: April 28, 2021
    Date of Patent: September 19, 2023
    Assignee: GOOGLE LLC
    Inventors: Lior Alon, Rafael Goldfarb, Dekel Auster, Dan Rasin, Michael Andrew Goodman, Trevor Strohman, Nino Tasca, Valerie Nygaard, Jaclyn Konzelmann
  • Patent number: 11755844
    Abstract: Servers configured to perform automatic summarization of content in electronic messages are discloses herein. In one embodiment, upon receiving an email, an server determines whether the incoming email is a templated message. In response to determining that the incoming email is not a templated message, the server classifies one or more sentences in the email as a statement of decision, judgement, inference, or fact, cluster the classified statements into clusters, and select one or more of the clusters to automatically generate summaries of the incoming email. The server can then insert data representing the generated summaries into the email before transmitting the email to a destination via a computer network.
    Type: Grant
    Filed: May 24, 2021
    Date of Patent: September 12, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Kausik Ghatak, Ganessh Kumar R P, Priyanka Goel, Neeraj Singh, Swathi Karri
  • Patent number: 11756537
    Abstract: Techniques are described herein for enabling an automated assistant to adjust its behavior depending on a detected age range and/or “vocabulary level” of a user who is engaging with the automated assistant. In various implementations, data indicative of a user's utterance may be used to estimate one or more of the user's age range and/or vocabulary level. The estimated age range/vocabulary level may be used to influence various aspects of a data processing pipeline employed by an automated assistant. In various implementations, aspects of the data processing pipeline that may be influenced by the user's age range/vocabulary level may include one or more of automated assistant invocation, speech-to-text (“STT”) processing, intent matching, intent resolution (or fulfillment), natural language generation, and/or text-to-speech (“TTS”) processing. In some implementations, one or more tolerance thresholds associated with one or more of these aspects, such as grammatical tolerances, vocabularic tolerances, etc.
    Type: Grant
    Filed: October 10, 2022
    Date of Patent: September 12, 2023
    Assignee: GOOGLE LLC
    Inventors: Pedro Gonnet Anders, Victor Carbune, Daniel Keysers, Thomas Deselaers, Sandro Feuz
  • Patent number: 11727944
    Abstract: An apparatus for decoding an encoded multichannel signal of a current frame to obtain three or more current audio output channels is provided. A multichannel processor is adapted to select two decoded channels from three or more decoded channels depending on first multichannel parameters. Moreover, the multichannel processor is adapted to generate a first group of two or more processed channels based on the selected channels. A noise filling module is adapted to identify for at least one of the selected channels, one or more frequency bands, within which all spectral lines are quantized to zero, and to generate a mixing channel using, depending on side information, a proper subset of three or more previous audio output channels that have been decoded, and to fill the spectral lines of frequency bands, within which all spectral lines are quantized to zero, with noise generated using spectral lines of the mixing channel.
    Type: Grant
    Filed: July 1, 2020
    Date of Patent: August 15, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Sascha Dick, Christian Helmrich, Nikolaus Rettelbach, Florian Schuh, Richard Fueg, Frederik Nagel
  • Patent number: 11721347
    Abstract: Some speech processing systems may handle some commands on-device rather than sending the audio data to a second device or system for processing. The first device may have limited speech processing capabilities sufficient for handling common language and/or commands, while the second device (e.g., an edge device and/or a remote system) may call on additional language models, entity libraries, skill components, etc. to perform additional tasks. An intermediate data generator may facilitate dividing speech processing operations between devices by generating a stream of data that includes a first-pass ASR output (e.g., a word or sub-word lattice) and other characteristics of the audio data such as whisper detection, speaker identification, media signatures, etc. The second device can perform the additional processing using the data stream; e.g., without using the audio data. Thus, privacy may be enhanced by processing the audio data locally without sending it to other devices/systems.
    Type: Grant
    Filed: June 29, 2021
    Date of Patent: August 8, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Stanislaw Ignacy Pasko, Pawel Zelazko, Cagdas Bak, Eli Joshua Fidler, Michal Kowalczuk, Andrew Oberlin, Ariya Rastrow
  • Patent number: 11710493
    Abstract: Encoding and decoding systems are described for the provision of high quality digital representations of audio signals with particular attention to the correct perceptual rendering of fast transients at modest sample rates. This is achieved by optimising downsampling and upsampling filters to minimise the length of the impulse response while adequately attenuating alias products that have been found perceptually harmful.
    Type: Grant
    Filed: December 14, 2020
    Date of Patent: July 25, 2023
    Assignee: MQA LIMITED
    Inventors: Peter Graham Craven, John Robert Stuart
  • Patent number: 11688509
    Abstract: In general, this disclosure describes techniques for a health management system that schedules medical appointments based on a dialog with a user (e.g., a patient), clinical guideline information, and/or other information. The health management system may engage in a dialog with the user, the dialog including requests from the health management system for audio input to the user device and audio input from the user in response to each request. The health management system may extract information from the audio input and compare the extracted information to clinical guideline information to determine one or more probable health conditions of the user. The health management system may determine a time allotment, identify a health care provider type and a platform for a medical appointment based on the one or more probable health conditions.
    Type: Grant
    Filed: January 16, 2020
    Date of Patent: June 27, 2023
    Assignee: SRI INTERNATIONAL
    Inventors: Mark Hanson, Bhaskar Ramamurthy, Manish Kothari, Brecken Hu Uhl
  • Patent number: 11688404
    Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker=discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model trained on a corpus of training speech utterances each segmented into a plurality of training segments. Each training segment including a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.
    Type: Grant
    Filed: May 26, 2021
    Date of Patent: June 27, 2023
    Assignee: Google LLC
    Inventors: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu
  • Patent number: 11683320
    Abstract: The present disclosure is generally directed to a data processing system for customizing content in a voice activated computer network environment. With user consent, the data processing system can improve the efficiency and effectiveness of auditory data packet transmission over one or more computer networks by, for example, increasing the accuracy of the voice identification process used in the generation of customized content. The present solution can make accurate identifications while generating fewer audio identification models, which are computationally intensive to generate.
    Type: Grant
    Filed: April 22, 2021
    Date of Patent: June 20, 2023
    Assignee: GOOGLE LLC
    Inventors: Victor Carbune, Thomas Deselaers, Sandro Feuz
  • Patent number: 11681870
    Abstract: Disclosed are devices, systems, apparatuses, methods, products, and other implementations for improving the accuracy and latency in work estimation systems and methods through the invocation of serverless applications and/or servers and the interfacing of natural language processing endpoint devices.
    Type: Grant
    Filed: January 28, 2019
    Date of Patent: June 20, 2023
    Assignee: ENSONO, LP
    Inventors: Jeremy Bowers, David Pearson
  • Patent number: 11682415
    Abstract: In an approach, a processor extracts an audio signal from a video clip. A processor converts the audio signal into a text sequence. A processor selects a first set of keywords from the text sequence, the first set of keywords corresponding to a first audio segment of the audio signal. A processor tags a target video segment of the video clip with the first set of keywords, the target video segment corresponding to the first audio segment.
    Type: Grant
    Filed: March 19, 2021
    Date of Patent: June 20, 2023
    Assignee: International Business Machines Corporation
    Inventors: Li Cao, Jing Xu, Ze Ming Zhao, Xue Ying Zhang
  • Patent number: 11664022
    Abstract: Provided is a method of processing a user input to deliver the user input to at least one of a plurality of assistants, includes: converting a user input including a voice signal based on a predetermined rule to generate an instruction; splitting a complex instruction into partial instructions based on that the generated instruction is the complex instruction requesting two or more events; and determining a domain of each of the partial instructions and distributing the partial instructions to at least one of a plurality of voice assistants based on the domain. According to an embodiment, the washer may be related to artificial intelligence (AI) modules, unmanned aerial vehicles (UAVs), robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.
    Type: Grant
    Filed: June 24, 2020
    Date of Patent: May 30, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Hoolim Kim, Euihyeok Lee, Kihyeon Kim