Patents Examined by Brian L. Albertalli
-
Patent number: 11853706
Abstract: Sentiment analysis is a task in natural language processing. The embodiments are directed to using a generative language model to extract an aspect term, aspect category and their corresponding polarities. The generative language model may be trained as a single, joint, and multi-task model. The single-task generative language model determines a term polarity from the aspect term in the sentence or a category polarity from an aspect category in the sentence. The joint-task generative language model determines both the aspect term and the term polarity or the aspect category and the category polarity. The multi-task generative language model determines the aspect term, term polarity, aspect category and category polarity of the sentence.
Type: Grant
Filed: September 8, 2021
Date of Patent: December 26, 2023
Assignee: salesforce.com, inc.
Inventors: Ehsan Hosseini-Asl, Wenhao Liu
-
Patent number: 11848015
Abstract: The invention is directed towards an audio scrubbing system that allows for scrubbing recognized voice commands from audio data and replacing the recognized voice commands with environment audio data. Specifically, as a user captures video and audio data via a HMD, audio data captured by the HMD may be processed by an audio scrubbing module to identify voice commands in the audio data that are used for controlling the HMD. When a voice command is identified in the audio data, timestamps corresponding to the voice command may be determined. Filler audio data may then be generated to imitate the environment by processing at least a portion of the audio data by a neural network of a machine learning model. The filler audio data may then be used to replace the audio data corresponding to the identified voice commands, thereby scrubbing the voice command from the audio data.
Type: Grant
Filed: October 1, 2020
Date of Patent: December 19, 2023
Assignee: RealWear, Inc.
Inventor: Christopher Iain Parkinson
-
Patent number: 11842748
Abstract: Methods, systems, and apparatuses for audio event detection, where the determination of a type of sound data is made at the cluster level rather than at the frame level. The techniques provided are thus more robust to the local behavior of features of an audio signal or audio recording. The audio event detection is performed by using Gaussian mixture models (GMMs) to classify each cluster or by extracting an i-vector from each cluster. Each cluster may be classified based on an i-vector classification using a support vector machine or probabilistic linear discriminant analysis. The audio event detection significantly reduces potential smoothing error and avoids any dependency on accurate window-size tuning. Segmentation may be performed using a generalized likelihood ratio and a Bayesian information criterion, and the segments may be clustered using hierarchical agglomerative clustering. Audio frames may be clustered using K-means and GMMs.
Type: Grant
Filed: December 14, 2020
Date of Patent: December 12, 2023
Assignee: Pindrop Security, Inc.
Inventors: Elie Khoury, Matthew Garland
-
Patent number: 11829724
Abstract: Support for natural language expressions is provided by the use of semantic grammars that describe the structure of expressions in that grammar and that construct the meaning of a corresponding natural language expression. A semantic grammar extension mechanism is provided, which allows one semantic grammar to be used in the place of another semantic grammar. This enriches the expressivity of semantic grammars in a simple, natural, and decoupled manner.
Type: Grant
Filed: July 16, 2021
Date of Patent: November 28, 2023
Assignee: SOUNDHOUND AI IP, LLC
Inventors: Bernard Mont-Reynaud, Christopher S. Wilson, Keyvan Mohajer
-
Patent number: 11798568
Abstract: Conventional audio compression technologies perform a standardized signal transformation, independent of the type of the content. Multi-channel signals are decomposed into their signal components, subsequently quantized and encoded. This is disadvantageous due to lack of knowledge on the characteristics of scene composition, especially for e.g. multi-channel audio or Higher-Order Ambisonics (HOA) content. A method for decoding an encoded bitstream of multi-channel audio data and associated metadata is provided, including transforming the first Ambisonics format of the multi-channel audio data to a second Ambisonics format representation of the multi-channel audio data, wherein the transforming maps the first Ambisonics format of the multi-channel audio data into the second Ambisonics format representation of the multi-channel audio data.
Type: Grant
Filed: August 2, 2021
Date of Patent: October 24, 2023
Assignee: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Oliver Wuebbolt, Johannes Boehm, Peter Jax
-
Patent number: 11790937
Abstract: Systems and methods for optimizing voice detection via a network microphone device are disclosed herein. In one example, individual microphones of a network microphone device detect sound. The sound data is captured in a first buffer and analyzed to detect a trigger event. Metadata associated with the sound data is captured in a second buffer and provided to at least one network device to determine at least one characteristic of the detected sound based on the metadata. The network device provides a response that includes an instruction, based on the determined characteristic, to modify at least one performance parameter of the NMD. The NMD then modifies the at least one performance parameter based on the instruction.
Type: Grant
Filed: May 18, 2021
Date of Patent: October 17, 2023
Assignee: Sonos, Inc.
Inventors: Connor Kristopher Smith, Kurt Thomas Soto, Charles Conor Sleith
-
Patent number: 11790925
Abstract: The present technology relates to an information processing device and method, and a program capable of reducing a code amount. The information processing device includes: an acquisition unit that acquires space information regarding a position and a size of a child space within a parent space and position information in the child space indicating a position of an object within the child space, the child space being included in the parent space, and the object being included in the child space; and a calculation unit that calculates position information in the parent space indicating a position of the object within the parent space on the basis of the space information and the position information in the child space. The present technology can be applied to a signal processing device.
Type: Grant
Filed: June 20, 2019
Date of Patent: October 17, 2023
Assignee: Sony Corporation
Inventors: Mitsuyuki Hatanaka, Toru Chinen, Minoru Tsuji, Hiroyuki Honma, Yuki Yamamoto
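
The parent-space calculation described in the abstract of patent 11790925 can be illustrated with a minimal sketch. It is not the patented method: it assumes an axis-aligned child space and object coordinates normalized to the range 0 to 1 within that child space, neither of which the abstract states. The names `ChildSpace` and `to_parent_position` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChildSpace:
    """Hypothetical 'space information': where the child space sits in the parent."""
    origin: tuple[float, float, float]  # position of the child space's corner in the parent
    size: tuple[float, float, float]    # extent of the child space along each axis


def to_parent_position(space: ChildSpace,
                       pos_in_child: tuple[float, float, float]) -> tuple[float, ...]:
    """Map a normalized (assumed 0..1) child-space position to parent coordinates."""
    return tuple(o + s * p for o, s, p in zip(space.origin, space.size, pos_in_child))


space = ChildSpace(origin=(10.0, 0.0, -5.0), size=(2.0, 4.0, 10.0))
parent_pos = to_parent_position(space, (0.5, 0.25, 1.0))
```

Under these assumptions, positions only need enough precision to cover the smaller child space, which is one plausible reading of how the code amount could shrink.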
-
Patent number: 11783831
Abstract: A user may access multiple virtual assistants via a voice-enabled device. The device may receive a command from the user, detect a wakeword corresponding to one of the assistants, and send audio data to a command processing system corresponding to the selected assistant. The device transmits encrypted audio data to one or more systems and, upon detecting a wakeword or wake command corresponding to one of the systems, the device may provide an encryption key to that particular system. The system may decrypt and process the audio data without additional latency introduced by having to wait for the audio data to arrive.
Type: Grant
Filed: June 29, 2021
Date of Patent: October 10, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Philippe Andre Lantin, Ori Neidich, David Berol
-
Patent number: 11763813
Abstract: Implementations described herein relate to reducing latency in automated assistant interactions. In some implementations, a client device can receive audio data that captures a spoken utterance of a user. The audio data can be processed to determine an assistant command to be performed by an automated assistant. The assistant command can be processed, using a latency prediction model, to generate a predicted latency to fulfill the assistant command. Further, the client device (or the automated assistant) can determine, based on the predicted latency, whether to audibly render pre-cached content for presentation to the user prior to audibly rendering content that is responsive to the spoken utterance. The pre-cached content can be tailored to the assistant command and audibly rendered for presentation to the user while the content is being obtained, and the content can be audibly rendered for presentation to the user subsequent to the pre-cached content.
Type: Grant
Filed: April 28, 2021
Date of Patent: September 19, 2023
Assignee: GOOGLE LLC
Inventors: Lior Alon, Rafael Goldfarb, Dekel Auster, Dan Rasin, Michael Andrew Goodman, Trevor Strohman, Nino Tasca, Valerie Nygaard, Jaclyn Konzelmann
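
The decision step in the abstract of patent 11763813 (whether to play pre-cached content first, given a predicted latency) can be sketched as follows. This is an illustration only: the fixed threshold, its default value, and the function name `choose_rendering_plan` are assumptions, and the latency prediction model itself is represented by a plain number passed in by the caller.

```python
def choose_rendering_plan(predicted_latency_s: float,
                          threshold_s: float = 1.0) -> list[str]:
    """Return the order in which content is audibly rendered.

    predicted_latency_s stands in for the output of the latency prediction
    model; threshold_s is a hypothetical cutoff, not taken from the patent.
    """
    if predicted_latency_s > threshold_s:
        # Slow fulfilment expected: play tailored pre-cached content while waiting.
        return ["pre_cached", "responsive"]
    # Fast fulfilment expected: render the responsive content directly.
    return ["responsive"]


slow_plan = choose_rendering_plan(2.5)
fast_plan = choose_rendering_plan(0.3)
```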
-
Patent number: 11755844
Abstract: Servers configured to perform automatic summarization of content in electronic messages are disclosed herein. In one embodiment, upon receiving an email, a server determines whether the incoming email is a templated message. In response to determining that the incoming email is not a templated message, the server classifies one or more sentences in the email as a statement of decision, judgement, inference, or fact, clusters the classified statements into clusters, and selects one or more of the clusters to automatically generate summaries of the incoming email. The server can then insert data representing the generated summaries into the email before transmitting the email to a destination via a computer network.
Type: Grant
Filed: May 24, 2021
Date of Patent: September 12, 2023
Assignee: Microsoft Technology Licensing, LLC
Inventors: Kausik Ghatak, Ganessh Kumar R P, Priyanka Goel, Neeraj Singh, Swathi Karri
-
Patent number: 11756537
Abstract: Techniques are described herein for enabling an automated assistant to adjust its behavior depending on a detected age range and/or “vocabulary level” of a user who is engaging with the automated assistant. In various implementations, data indicative of a user's utterance may be used to estimate one or more of the user's age range and/or vocabulary level. The estimated age range/vocabulary level may be used to influence various aspects of a data processing pipeline employed by an automated assistant. In various implementations, aspects of the data processing pipeline that may be influenced by the user's age range/vocabulary level may include one or more of automated assistant invocation, speech-to-text (“STT”) processing, intent matching, intent resolution (or fulfillment), natural language generation, and/or text-to-speech (“TTS”) processing. In some implementations, one or more tolerance thresholds associated with one or more of these aspects, such as grammatical tolerances, vocabularic tolerances, etc.
Type: Grant
Filed: October 10, 2022
Date of Patent: September 12, 2023
Assignee: GOOGLE LLC
Inventors: Pedro Gonnet Anders, Victor Carbune, Daniel Keysers, Thomas Deselaers, Sandro Feuz
-
Patent number: 11727944
Abstract: An apparatus for decoding an encoded multichannel signal of a current frame to obtain three or more current audio output channels is provided. A multichannel processor is adapted to select two decoded channels from three or more decoded channels depending on first multichannel parameters. Moreover, the multichannel processor is adapted to generate a first group of two or more processed channels based on the selected channels. A noise filling module is adapted to identify for at least one of the selected channels, one or more frequency bands, within which all spectral lines are quantized to zero, and to generate a mixing channel using, depending on side information, a proper subset of three or more previous audio output channels that have been decoded, and to fill the spectral lines of frequency bands, within which all spectral lines are quantized to zero, with noise generated using spectral lines of the mixing channel.
Type: Grant
Filed: July 1, 2020
Date of Patent: August 15, 2023
Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
Inventors: Sascha Dick, Christian Helmrich, Nikolaus Rettelbach, Florian Schuh, Richard Fueg, Frederik Nagel
-
Patent number: 11721347
Abstract: Some speech processing systems may handle some commands on-device rather than sending the audio data to a second device or system for processing. The first device may have limited speech processing capabilities sufficient for handling common language and/or commands, while the second device (e.g., an edge device and/or a remote system) may call on additional language models, entity libraries, skill components, etc. to perform additional tasks. An intermediate data generator may facilitate dividing speech processing operations between devices by generating a stream of data that includes a first-pass ASR output (e.g., a word or sub-word lattice) and other characteristics of the audio data such as whisper detection, speaker identification, media signatures, etc. The second device can perform the additional processing using the data stream; e.g., without using the audio data. Thus, privacy may be enhanced by processing the audio data locally without sending it to other devices/systems.
Type: Grant
Filed: June 29, 2021
Date of Patent: August 8, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Stanislaw Ignacy Pasko, Pawel Zelazko, Cagdas Bak, Eli Joshua Fidler, Michal Kowalczuk, Andrew Oberlin, Ariya Rastrow
-
Patent number: 11710493
Abstract: Encoding and decoding systems are described for the provision of high quality digital representations of audio signals with particular attention to the correct perceptual rendering of fast transients at modest sample rates. This is achieved by optimising downsampling and upsampling filters to minimise the length of the impulse response while adequately attenuating alias products that have been found perceptually harmful.
Type: Grant
Filed: December 14, 2020
Date of Patent: July 25, 2023
Assignee: MQA LIMITED
Inventors: Peter Graham Craven, John Robert Stuart
-
Patent number: 11688509
Abstract: In general, this disclosure describes techniques for a health management system that schedules medical appointments based on a dialog with a user (e.g., a patient), clinical guideline information, and/or other information. The health management system may engage in a dialog with the user, the dialog including requests from the health management system for audio input to the user device and audio input from the user in response to each request. The health management system may extract information from the audio input and compare the extracted information to clinical guideline information to determine one or more probable health conditions of the user. The health management system may determine a time allotment, identify a health care provider type and a platform for a medical appointment based on the one or more probable health conditions.
Type: Grant
Filed: January 16, 2020
Date of Patent: June 27, 2023
Assignee: SRI INTERNATIONAL
Inventors: Mark Hanson, Bhaskar Ramamurthy, Manish Kothari, Brecken Hu Uhl
-
Patent number: 11688404
Abstract: A method includes receiving an utterance of speech and segmenting the utterance of speech into a plurality of segments. For each segment of the utterance of speech, the method also includes extracting a speaker-discriminative embedding from the segment and predicting a probability distribution over possible speakers for the segment using a probabilistic generative model configured to receive the extracted speaker-discriminative embedding as a feature input. The probabilistic generative model is trained on a corpus of training speech utterances each segmented into a plurality of training segments. Each training segment includes a corresponding speaker-discriminative embedding and a corresponding speaker label. The method also includes assigning a speaker label to each segment of the utterance of speech based on the probability distribution over possible speakers for the corresponding segment.
Type: Grant
Filed: May 26, 2021
Date of Patent: June 27, 2023
Assignee: Google LLC
Inventors: Chong Wang, Aonan Zhang, Quan Wang, Zhenyao Zhu
-
Patent number: 11683320
Abstract: The present disclosure is generally directed to a data processing system for customizing content in a voice activated computer network environment. With user consent, the data processing system can improve the efficiency and effectiveness of auditory data packet transmission over one or more computer networks by, for example, increasing the accuracy of the voice identification process used in the generation of customized content. The present solution can make accurate identifications while generating fewer audio identification models, which are computationally intensive to generate.
Type: Grant
Filed: April 22, 2021
Date of Patent: June 20, 2023
Assignee: GOOGLE LLC
Inventors: Victor Carbune, Thomas Deselaers, Sandro Feuz
-
Patent number: 11681870
Abstract: Disclosed are devices, systems, apparatuses, methods, products, and other implementations for improving the accuracy and latency in work estimation systems and methods through the invocation of serverless applications and/or servers and the interfacing of natural language processing endpoint devices.
Type: Grant
Filed: January 28, 2019
Date of Patent: June 20, 2023
Assignee: ENSONO, LP
Inventors: Jeremy Bowers, David Pearson
-
Patent number: 11682415
Abstract: In an approach, a processor extracts an audio signal from a video clip. A processor converts the audio signal into a text sequence. A processor selects a first set of keywords from the text sequence, the first set of keywords corresponding to a first audio segment of the audio signal. A processor tags a target video segment of the video clip with the first set of keywords, the target video segment corresponding to the first audio segment.
Type: Grant
Filed: March 19, 2021
Date of Patent: June 20, 2023
Assignee: International Business Machines Corporation
Inventors: Li Cao, Jing Xu, Ze Ming Zhao, Xue Ying Zhang
-
Patent number: 11664022
Abstract: Provided is a method of processing a user input to deliver the user input to at least one of a plurality of assistants, the method including: converting a user input including a voice signal based on a predetermined rule to generate an instruction; splitting a complex instruction into partial instructions based on the generated instruction being a complex instruction requesting two or more events; and determining a domain of each of the partial instructions and distributing the partial instructions to at least one of a plurality of voice assistants based on the domain. According to an embodiment, the washer may be related to artificial intelligence (AI) modules, unmanned aerial vehicles (UAVs), robots, augmented reality (AR) devices, virtual reality (VR) devices, and 5G service-related devices.
Type: Grant
Filed: June 24, 2020
Date of Patent: May 30, 2023
Assignee: LG ELECTRONICS INC.
Inventors: Hoolim Kim, Euihyeok Lee, Kihyeon Kim
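
The split-and-distribute steps in the abstract of patent 11664022 can be illustrated with a toy sketch. Everything specific here is an assumption rather than the patented method: the instruction is taken as already-converted text, splitting is done on the conjunctions "and" and "then", domains are chosen by keyword lookup, and the names `DOMAIN_KEYWORDS`, `split_instruction`, `determine_domain`, and `distribute` are hypothetical.

```python
import re

# Hypothetical keyword table mapping a domain to words that suggest it.
DOMAIN_KEYWORDS = {
    "appliance": ("washer", "dryer"),
    "media": ("play", "music"),
}


def split_instruction(instruction: str) -> list[str]:
    """Split a complex instruction into partial instructions on 'and' / 'then'."""
    parts = re.split(r"\s+(?:and|then)\s+", instruction.strip())
    return [part for part in parts if part]


def determine_domain(partial: str) -> str:
    """Pick the first domain whose keywords appear in the partial instruction."""
    words = partial.lower().split()
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if any(keyword in words for keyword in keywords):
            return domain
    return "general"  # fallback when no keyword matches


def distribute(instruction: str) -> list[tuple[str, str]]:
    """Pair each partial instruction with the domain (assistant) that should receive it."""
    return [(determine_domain(part), part) for part in split_instruction(instruction)]


routed = distribute("turn on the washer and play some music")
```

A single-event instruction passes through unsplit and falls back to the `"general"` domain when no keyword matches.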