Patents Examined by Jakieda R Jackson
  • Patent number: 11817083
    Abstract: A first playback device is configured to perform functions comprising: detecting sound; identifying a wake word based on the sound as detected by the first device; receiving an indication that a second playback device has also detected the sound and identified the wake word based on the sound as detected by the second device; after receiving the indication, evaluating which of the first and second devices is to extract sound data representing the sound, and thereby determining that the extraction of the sound data is to be performed by the second device rather than the first device; in response to the determining, forgoing extraction of the sound data; receiving VAS response data that is indicative of a given VAS response corresponding to a given voice input identified in the sound data extracted by the second device; and, based on the VAS response data, outputting the given VAS response.
    Type: Grant
    Filed: December 22, 2022
    Date of Patent: November 14, 2023
    Assignee: Sonos, Inc.
    Inventors: John Tolomei, Klaus Hartung
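The arbitration described in 11817083 can be illustrated with a minimal sketch (not Sonos's implementation): two playback devices both detect the wake word, each decides locally whether its peer should extract the sound data, and the losing device forgoes extraction and simply outputs the VAS response it later receives. The confidence-based criterion and the device names are assumptions.

```python
# Illustrative sketch only: each device compares its own detection against the
# peer's and forgoes extraction if the peer should perform it instead.

from dataclasses import dataclass

@dataclass
class Detection:
    device_id: str
    confidence: float  # hypothetical arbitration criterion

def should_forgo_extraction(mine: Detection, peer: Detection) -> bool:
    """Return True if the peer device should extract the sound data instead."""
    # Higher confidence wins; ties broken by device id so both sides agree.
    if peer.confidence != mine.confidence:
        return peer.confidence > mine.confidence
    return peer.device_id < mine.device_id

local = Detection("kitchen", 0.72)
remote = Detection("living-room", 0.91)
if should_forgo_extraction(local, remote):
    # Forgo extraction; wait for VAS response data derived from the peer's
    # extracted sound data, then output the VAS response locally.
    print("kitchen: forgoing extraction, will output the VAS response")
```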
  • Patent number: 11810132
    Abstract: The present invention provides a system and method for presenting global issues to users and followers of a social media platform, allowing the users and followers to provide viewpoints on the global issues, ensuring that the users providing the viewpoints are authentic, and analyzing the various viewpoints to develop statistical data including the location of those providing viewpoints. The present invention also allows a user to present a global issue for consideration by users of the platform, for example, a social media internet-based website, and allows followers of the user to provide their viewpoints on such global issue. Simultaneously, the location of said followers will be collected and collated along with their responses.
    Type: Grant
    Filed: March 11, 2023
    Date of Patent: November 7, 2023
    Assignee: WORLD ANSWER ZONE LLC
    Inventors: Abdulrhman Khald Mohammedkhalil, Talead Saaty
  • Patent number: 11810546
    Abstract: Provided are a sample generation method and apparatus. The sample generation method comprises: acquiring a plurality of text-audio pairs, wherein each text-audio pair contains a text segment and an audio segment; calculating an audio feature of the audio segment of each of the plurality of text-audio pairs, and screening the plurality of text-audio pairs according to the audio feature to select a target text-audio pair and a splicing text-audio pair corresponding to the target text-audio pair; splicing the target text-audio pair and the splicing text-audio pair into a text-audio pair to be tested, and testing the text-audio pair to be tested; and, when the text-audio pair to be tested meets a preset test condition, writing the text-audio pair to be tested into a training database.
    Type: Grant
    Filed: November 12, 2021
    Date of Patent: November 7, 2023
    Assignee: Beijing Yuanli Weilai Science and Technology Co., Ltd.
    Inventors: Dongxiao Wang, Mingqi Yang, Nan Ma, Long Xia, Changzhen Guo
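A minimal sketch of the screen-splice-test loop in 11810546, assuming a mean-amplitude audio feature, nearest-feature pairing, and a length-based test condition; none of these are the patented criteria.

```python
# Sketch: compute a feature per pair, pick the closest splicing pair for each
# target, splice, and keep the result only if it passes a simple test.

def audio_feature(audio):
    return sum(abs(x) for x in audio) / max(len(audio), 1)

def generate_samples(pairs, training_db, max_len=16000 * 10):
    feats = [(text, audio, audio_feature(audio)) for text, audio in pairs]
    for i, (text_a, audio_a, feat_a) in enumerate(feats):
        # Screen for a splicing pair whose feature is closest to the target's.
        candidates = [f for j, f in enumerate(feats) if j != i]
        if not candidates:
            continue
        text_b, audio_b, _ = min(candidates, key=lambda f: abs(f[2] - feat_a))
        spliced = (text_a + " " + text_b, audio_a + audio_b)
        # Preset test condition (assumed): spliced audio within a length budget.
        if len(spliced[1]) <= max_len:
            training_db.append(spliced)

db = []
generate_samples([("hello", [0.1, 0.2]), ("world", [0.2, 0.1])], db)
print(len(db), "spliced samples written")
```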
  • Patent number: 11810574
    Abstract: A system and method of capturing a voicer's voice data to illuminate features of their own physiology and produce voice-driven internal imaging. Data encoded within the voice is captured, decoded, modeled and simulated. Using vibrations of a voicer's voice as the input, features of their own physiology are outputted in the format of an internal image.
    Type: Grant
    Filed: April 30, 2023
    Date of Patent: November 7, 2023
    Inventor: Leslie Helpert
  • Patent number: 11790896
    Abstract: Various embodiments of the invention provide methods, systems, and computer-program products for analyzing audio to capture semantic and non-semantic characteristics of the audio and the corresponding relationships between them. In particular embodiments, the audio is segmented into a set of utterance segments in which a party is speaking and a set of noise segments in which the party is not speaking. The semantic and non-semantic characteristics are then captured for each of the utterance segments. Specifically, speech analytics is performed on each segment to identify the words spoken by the party in the segment as semantic characteristics. Further, laughter, emotion, and sentence-boundary detection are performed on each segment to identify occurrences of these as non-semantic characteristics.
    Type: Grant
    Filed: September 27, 2021
    Date of Patent: October 17, 2023
    Assignee: Noble Systems Corporation
    Inventors: Patrick M. McDaniel, Christopher S. Haggerty
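The per-segment analysis in 11790896 can be sketched as below, with an energy threshold standing in for the utterance/noise segmentation and stub callables standing in for real speech analytics, laughter, and emotion detection.

```python
# Sketch: split frame energies into utterance/noise runs, then attach semantic
# (words) and non-semantic (laughter, emotion) annotations per utterance.

def segment(frames, threshold=0.1):
    """Split frame energies into ('utterance' | 'noise', start, end) runs."""
    segments, start, kind = [], 0, None
    for i, e in enumerate(frames + [None]):
        k = None if e is None else ("utterance" if e >= threshold else "noise")
        if k != kind:
            if kind is not None:
                segments.append((kind, start, i))
            start, kind = i, k
    return segments

def analyze(frames, transcribe, detect_laughter, detect_emotion):
    results = []
    for kind, start, end in segment(frames):
        if kind != "utterance":
            continue
        results.append({
            "span": (start, end),
            "words": transcribe(start, end),          # semantic
            "laughter": detect_laughter(start, end),  # non-semantic
            "emotion": detect_emotion(start, end),    # non-semantic
        })
    return results

print(analyze([0.0, 0.3, 0.4, 0.0], lambda s, e: ["hi"], lambda s, e: False,
              lambda s, e: "neutral"))
```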
  • Patent number: 11776561
    Abstract: A method includes obtaining one or more speech models, each model including one or more acoustic states and, provided that the model includes multiple acoustic states, allowed transitions therebetween. The method further includes receiving a speech sample produced by a subject while a physiological state of the subject was unknown. The method further includes mapping at least one sample portion of the speech sample to a respective one of the speech models, by computing a plurality of feature vectors quantifying acoustic features of different respective portions of the sample portion, and mapping the feature vectors to respective acoustic states included in the speech model such that a total distance between the feature vectors and the respective acoustic states is minimized. The method further includes, in response to mapping the sample portion to the speech model, communicating an output indicating the physiological state of the subject. Other embodiments are also described.
    Type: Grant
    Filed: November 21, 2022
    Date of Patent: October 3, 2023
    Assignee: CORDIO MEDICAL LTD.
    Inventor: Ilan D. Shallom
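The mapping step in 11776561 resembles a dynamic-programming alignment. The sketch below assumes a left-to-right model whose acoustic states are mean vectors and minimizes the total Euclidean distance; this topology and distance are illustrative stand-ins, not the patented model.

```python
# Illustrative DP alignment: map feature vectors onto a left-to-right sequence
# of acoustic states (each a mean vector) minimizing total distance.

import math

def distance(v, state_mean):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v, state_mean)))

def align(features, state_means):
    n, m = len(features), len(state_means)
    INF = float("inf")
    cost = [[INF] * m for _ in range(n)]
    cost[0][0] = distance(features[0], state_means[0])
    for t in range(1, n):
        for s in range(m):
            # Allowed transitions: stay in the same state or advance by one.
            best_prev = min(cost[t - 1][s], cost[t - 1][s - 1] if s else INF)
            if best_prev < INF:
                cost[t][s] = best_prev + distance(features[t], state_means[s])
    return cost[n - 1][m - 1]  # total (minimized) distance to the model

feats = [[0.1, 0.0], [0.9, 1.0], [1.1, 0.9]]
model = [[0.0, 0.0], [1.0, 1.0]]
print(round(align(feats, model), 3))
```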
  • Patent number: 11776535
    Abstract: A semantic understanding method and apparatus, and a device and a storage medium, are provided. The method includes: acquiring a recognition character string that matches speech information; acquiring, from an entity vocabulary library, at least one entity vocabulary corresponding to each recognition character in the recognition character string; and determining a matching entity vocabulary as the semantic understanding result of the speech information according to how well each entity vocabulary hits the recognition character string. With this method, even when a completely matching entity vocabulary is not found, a matching entity vocabulary can still be determined from the entity vocabulary library, so the semantic information of the speech is understood accurately; the method is also relatively fault-tolerant to wrong, added, and omitted words, which improves the semantic understanding accuracy for speech information.
    Type: Grant
    Filed: August 11, 2022
    Date of Patent: October 3, 2023
    Assignee: BEIJING BYTEDANCE NETWORK TECHNOLOGY CO., LTD.
    Inventors: He Zhang, Hang Li, Yang Wang
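A hedged sketch of the fault-tolerant matching in 11776535, using difflib similarity as a stand-in for the patented "hit" criterion between entity vocabulary entries and the recognition character string.

```python
# Sketch: pick the vocabulary entry that best matches the recognized string
# even with wrong, added, or omitted words.

import difflib

def best_entity(recognized, entity_vocabulary):
    """Return the entry most similar to the recognized string."""
    # difflib's ratio tolerates substitutions, insertions, and deletions.
    return max(entity_vocabulary,
               key=lambda e: difflib.SequenceMatcher(None, recognized, e).ratio())

vocab = ["play some jazz music", "pause the music", "next track"]
print(best_entity("play sum jaz music", vocab))  # closest: "play some jazz music"
```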
  • Patent number: 11758346
    Abstract: Methods for simulating a source of sound are provided. One method includes determining, by a computer, a location in physical space of a head of a user. The location is determined by capturing images, by a camera, of the physical space in which the user is located. The method further includes determining a sound for delivery to two speakers worn by the user and determining, by the computer, an emanating location in the physical space for the sound. The method further includes establishing, by the computer, acoustic signals for each speaker based on the location in the physical space of the head, the sound, the emanating location in the physical space, and selected auditory characteristics of the user. The auditory characteristics of the user are identified based on a calibration process. The method further includes transmitting, by the computer, the acoustic signals to the two speakers. The acoustic signals simulate that the sound originated at the emanating location in space.
    Type: Grant
    Filed: March 26, 2021
    Date of Patent: September 12, 2023
    Assignee: Sony Interactive Entertainment Inc.
    Inventor: Steven Osman
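The per-speaker signal computation in 11758346 can be sketched with simple distance attenuation and arrival delay; real systems would use calibrated HRTFs, and the single per-user gain below merely stands in for the user's calibrated auditory characteristics.

```python
# Toy spatialization sketch: per-ear gain and delay from head position and the
# emanating location. EAR_OFFSET and the inverse-square falloff are assumptions.

import math

SPEED_OF_SOUND = 343.0  # m/s
EAR_OFFSET = 0.09       # m, roughly half a head width (assumed)

def spatialize(head, source, user_gain=1.0):
    signals = {}
    for ear, offset in (("left", -EAR_OFFSET), ("right", EAR_OFFSET)):
        ear_pos = (head[0] + offset, head[1], head[2])
        dist = math.dist(ear_pos, source)
        signals[ear] = {
            "gain": user_gain / max(dist, 0.1) ** 2,   # inverse-square falloff
            "delay_s": dist / SPEED_OF_SOUND,          # arrival delay
        }
    return signals

print(spatialize(head=(0.0, 1.6, 0.0), source=(2.0, 1.6, 1.0)))
```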
  • Patent number: 11755836
    Abstract: An improved speech-based/natural language point-of-sale customer order system that is useful for any business that interacts with customers through speech or sound. Despite advances in speech recognition, currently available voice ordering interfaces have proven unintuitive and unreliable; voice recognition has so far been inefficient in retail contexts and has therefore achieved little usage penetration in the retail sector. The present invention facilitates automated operation of the ordering function of a drive-through restaurant, fast food restaurant, or other business establishment by replacing an employee or other means of capturing order data with an ordering system employing a highly accurate speech recognition component that can be trained to recognize a wide vocabulary of words and to associate tones and other metadata in a manner not previously achieved in speech-to-text systems.
    Type: Grant
    Filed: February 18, 2020
    Date of Patent: September 12, 2023
    Assignee: Valyant AI, Inc.
    Inventors: Robley Theron Carpenter, II, Jacob Daniel Poore, Benjamin William Thielker
  • Patent number: 11749274
    Abstract: A method includes receiving an utterance at a computerized automated assistant system, and detecting, via a date/time constraint module of the computerized automated assistant system, one or more constraints in the utterance associated with a date or time. The utterance is associated with a domain. The method further comprises generating, via the date/time constraint module, a periodic set for each of the one or more constraints associated with the date or time, and combining, via the date/time constraint module, the one or more periodic sets. The method further comprises processing, via a dialogue manager module of the computerized automated assistant system, the combined periodic sets to determine an action, and executing the action at the computerized automated assistant system.
    Type: Grant
    Filed: December 1, 2020
    Date of Patent: September 5, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Jordan Rian Cohen, David Leo Wright Hall, Jason Andrew Wolfe, Daniel Lawrence Roth, Daniel Klein
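One way to picture the "periodic sets" of 11749274 is as sets of minutes-of-week, with constraint combination as set intersection; the representation below is an illustrative assumption, not the patented encoding.

```python
# Sketch: each date/time constraint is the set of minutes-of-week it allows,
# and combining constraints is intersection of those sets.

MIN_PER_DAY, DAYS = 24 * 60, 7

def weekly(day, start_min, end_min):
    """Minutes-of-week covered every week on `day` between start and end."""
    base = day * MIN_PER_DAY
    return set(range(base + start_min, base + end_min))

def combine(*periodic_sets):
    out = periodic_sets[0]
    for s in periodic_sets[1:]:
        out = out & s
    return out

# "on Tuesdays" AND "in the afternoon" (afternoon applied to every day).
tuesdays = weekly(1, 0, MIN_PER_DAY)
afternoons = set().union(*(weekly(d, 12 * 60, 17 * 60) for d in range(DAYS)))
slot = combine(tuesdays, afternoons)
print(min(slot), max(slot))  # Tuesday 12:00 .. 16:59, in minutes-of-week
```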
  • Patent number: 11748713
    Abstract: Techniques for improving a natural language processing (NLP) system by ingesting information from various sources and generating personalized results in response to a user query. With user permission, the system may ingest personal data from a variety of data sources, allowing the system to access existing personal data in order to identify correlated information, make inferences and predictions, prompt the user when appropriate, and generate improved results. The ingestion step may process both textual and non-textual data (e.g., images, audio, video). Using this consolidated data storage, the system may search amongst multiple different sources of data and identify correlations to improve processing. In addition, the system may proactively provide suggestions to a customer (e.g., detect upcoming event and suggest an action corresponding to the event) or may provide personalized results (e.g., filter results based on user preferences determined by the inferences and predictions).
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: September 5, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Luu Tran, Omer Luzzatti, Nathanael Joe Hayashi, Abhinav Athreya
  • Patent number: 11749257
    Abstract: A method for evaluating a speech forced alignment model, an electronic device, and a storage medium are provided. The method includes: according to each audio segment in a test set and a text corresponding to each audio segment, acquiring, by using a speech forced alignment model to be evaluated, a phoneme sequence corresponding to each audio segment and a predicted start time and a predicted end time of each phoneme in the phoneme sequence; for each phoneme, obtaining a time accuracy score of the phoneme according to the predicted start time and the predicted end time of the phoneme and a predetermined reference start time and a predetermined reference end time of the phoneme; and determining a time accuracy score of said speech forced alignment model according to the time accuracy score of each phoneme.
    Type: Grant
    Filed: March 6, 2023
    Date of Patent: September 5, 2023
    Assignee: BEIJING CENTURY TAL EDUCATION TECHNOLOGY CO., LTD.
    Inventors: Lizhao Guo, Song Yang, Junfeng Yuan
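A sketch of the scoring in 11749257: each phoneme is scored from the gap between its predicted and reference boundaries, and the model score is the average over phonemes. The 20 ms full-credit tolerance is an assumed rule, not the patented one.

```python
# Sketch: per-phoneme time accuracy from boundary errors, averaged over the set.

def phoneme_score(pred_start, pred_end, ref_start, ref_end, tolerance=0.02):
    error = abs(pred_start - ref_start) + abs(pred_end - ref_end)
    # Full credit at zero error, linearly decaying to zero past the tolerance.
    return max(0.0, 1.0 - error / (2 * tolerance))

def model_score(alignments):
    """alignments: iterable of (pred_start, pred_end, ref_start, ref_end)."""
    scores = [phoneme_score(*a) for a in alignments]
    return sum(scores) / len(scores)

print(model_score([(0.00, 0.11, 0.00, 0.10),    # 10 ms end error
                   (0.11, 0.30, 0.10, 0.30)]))  # 10 ms start error
```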
  • Patent number: 11741965
    Abstract: A system is provided for determining a natural language output, responsive to a user input, using different speech personality profiles. The system may determine to use a particular language generation profile based at least in part on data relating to the user input and data corresponding to the response to the user input. The language generation profile may include different attributes that are used to determine the natural language output, such as prosody, replacement words, injected words, and sentence structure.
    Type: Grant
    Filed: June 26, 2020
    Date of Patent: August 29, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Ramsey Abou-Zaki Opp, Anantdeep Gill, Angela Liu, Anisha Jain, Justin Maxwell Bollag, Nathan Yeazel, Sara Renee Bilich, Spencer B Baker
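A minimal sketch of applying a language generation profile as described in 11741965, with replacement words and an injected prefix as the only attributes; the schema is illustrative, not the patented one.

```python
# Sketch: a profile's attributes rewrite a base response into the final output.

from dataclasses import dataclass, field

@dataclass
class LanguageGenerationProfile:
    replacements: dict = field(default_factory=dict)  # word substitutions
    injected_prefix: str = ""                         # injected words

    def render(self, base_response):
        words = [self.replacements.get(w, w) for w in base_response.split()]
        return f"{self.injected_prefix} {' '.join(words)}".strip()

cheerful = LanguageGenerationProfile(
    replacements={"okay": "sure thing"}, injected_prefix="Happy to help!")
print(cheerful.render("okay the timer is set"))
```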
  • Patent number: 11735161
    Abstract: A technique improves training and speech quality of a text-to-speech (TTS) system having an artificial intelligence, such as a neural network. The TTS system is organized as a front-end subsystem and a back-end subsystem. The front-end subsystem is configured to provide analysis and conversion of text into input vectors, each having at least a base frequency, f0, a phoneme duration, and a phoneme sequence that is processed by a signal generation unit of the back-end subsystem. The signal generation unit includes the neural network interacting with a pre-existing knowledgebase of phonemes to generate audible speech from the input vectors. The technique applies an error signal from the neural network to correct imperfections of the pre-existing knowledgebase of phonemes to generate audible speech signals. A back-end training system is configured to train the signal generation unit by applying psychoacoustic principles to improve quality of the generated audible speech signals.
    Type: Grant
    Filed: January 31, 2022
    Date of Patent: August 22, 2023
    Assignee: Telepathy Labs, Inc.
    Inventors: Martin Reber, Vijeta Avijeet
  • Patent number: 11727933
    Abstract: A first network microphone device (NMD) is configured to receive, from a second NMD, a first arbitration message including (i) a first measure of confidence associated with a voice input as detected by the second NMD and (ii) the voice input as detected by the second NMD, and to receive, from a third NMD, a second arbitration message including (i) a second measure of confidence associated with the voice input as detected by the third NMD and (ii) the voice input as detected by the third NMD. The first NMD is configured to determine that the second measure of confidence is greater than the first measure of confidence and, based on the determination, perform voice recognition on the voice input as detected by the third NMD, where the voice input includes a command to control audio playback by the first, second, and/or third NMD, and, after performing voice recognition, execute the command.
    Type: Grant
    Filed: April 18, 2022
    Date of Patent: August 15, 2023
    Assignee: Sonos, Inc.
    Inventors: Steven Beckhardt, Ted Lin
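The arbitration in 11727933 reduces to selecting the highest-confidence detection and recognizing against that voice input; the message fields below are assumptions for illustration.

```python
# Sketch: among arbitration messages pairing a confidence measure with the
# detected voice input, recognize using the highest-confidence detection.

def arbitrate(messages):
    """messages: list of dicts with 'nmd', 'confidence', 'voice_input'."""
    return max(messages, key=lambda m: m["confidence"])

messages = [
    {"nmd": "second", "confidence": 0.61, "voice_input": b"...pcm..."},
    {"nmd": "third", "confidence": 0.87, "voice_input": b"...pcm..."},
]
winner = arbitrate(messages)
print(f"run voice recognition on input from the {winner['nmd']} NMD")
```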
  • Patent number: 11727922
    Abstract: A computerized system for deriving expression of intent from recorded speech includes: a text classification module comparing a transcription of recorded speech against a text classifier to generate a first set of representations of potential intents; a phonetics classification module comparing a phonetic transcription of the recorded speech against a phonetics classifier to generate a second set of representations; an audio classification module comparing an audio version of the recorded speech with an audio classifier to generate a third set of representations; and a discriminator module for receiving the first, second and third sets of the representations of potential intents and generating one derived expression of intent by processing the first, second and third sets together; where at least two of the text classification module, the phonetics classification module, and the audio classification module are asynchronous processes from one another.
    Type: Grant
    Filed: May 11, 2021
    Date of Patent: August 15, 2023
    Assignee: Verint Americas Inc.
    Inventor: Moshe Villaizan
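A sketch of the discriminator step in 11727922: the three classifier outputs (text, phonetics, audio), each a set of candidate intents with scores, are merged into one derived expression of intent. Weighted score summation is an assumed combination rule.

```python
# Sketch: merge three candidate-intent sets into one derived intent by
# weighted score summation; weights and scores are illustrative.

from collections import defaultdict

def discriminate(text_hits, phonetic_hits, audio_hits, weights=(0.5, 0.3, 0.2)):
    totals = defaultdict(float)
    for hits, w in zip((text_hits, phonetic_hits, audio_hits), weights):
        for intent, score in hits:
            totals[intent] += w * score
    return max(totals, key=totals.get)

print(discriminate(
    text_hits=[("cancel_order", 0.7), ("check_balance", 0.2)],
    phonetic_hits=[("cancel_order", 0.6)],
    audio_hits=[("check_balance", 0.9)],
))
```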
  • Patent number: 11720759
    Abstract: An electronic apparatus includes an input unit comprising input circuitry configured to receive a natural language input, a communicator comprising communication circuitry configured to perform communication with a plurality of external chatting servers, and a processor configured to analyze a characteristic of the natural language input and a characteristic of the user, to identify a chatting server corresponding to the natural language input from among the plurality of chatting servers, and to control the communicator to transmit the natural language input to the identified chatting server in order to receive a response to the natural language input.
    Type: Grant
    Filed: July 16, 2021
    Date of Patent: August 8, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Chang-hwan Choi, Ji-hwan Yun, Man-un Jeong
  • Patent number: 11714968
    Abstract: Systems and methods for identifying data of interest are disclosed. The system may retrieve unstructured data from an internet data source via an alert system or RSS feed. The system may input the unstructured data into various models and scoring systems to determine whether the data is of interest. The models and scoring systems may be executed in order or in parallel. For example, the system may input the unstructured data into a Naïve Bayes machine learning model, a long short-term memory (LSTM) machine learning model, a named entity recognition (NER) model, a semantic role labeling (SRL) model, a sentiment scoring algorithm, and/or a gradient boosted regression tree (GBRT) machine learning model. Based on determining that the unstructured data is of interest, a data alert may be generated and transmitted for manual review or as part of an automated decisioning process.
    Type: Grant
    Filed: February 24, 2022
    Date of Patent: August 1, 2023
    Assignee: American Express Travel Related Services Company, Inc.
    Inventors: Ravi Batra, Sandeep Bose, Mario Fragoso, Ravneet Ghuman, Madhu Sudan Reddy Gudur, Suraj Madnani, Curtis T. Merryweather, Ravi Varma, Vinod Yadav
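The decisioning flow of 11714968 can be sketched as a chain of scorers over unstructured text that raises a data alert past a threshold; the stub scorers and the 0.5 threshold stand in for the trained models named in the abstract.

```python
# Sketch: run scoring functions over unstructured text and alert on the
# combined score. The scorers are placeholders, not the listed ML models.

def is_of_interest(text, scorers, threshold=0.5):
    score = sum(fn(text) for fn in scorers) / len(scorers)
    return score >= threshold, score

scorers = [
    lambda t: 1.0 if "breach" in t.lower() else 0.0,   # stand-in for NER/LSTM
    lambda t: 0.8 if "urgent" in t.lower() else 0.1,   # stand-in for sentiment
]
alert, score = is_of_interest("Urgent: data breach reported", scorers)
print(alert, round(score, 2))  # -> True 0.9
```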
  • Patent number: 11710477
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech endpointing are described. In one aspect, a method includes the action of accessing voice query log data that includes voice queries spoken by a particular user. The actions further include based on the voice query log data that includes voice queries spoken by a particular user, determining a pause threshold from the voice query log data that includes voice queries spoken by the particular user. The actions further include receiving, from the particular user, an utterance. The actions further include determining that the particular user has stopped speaking for at least a period of time equal to the pause threshold. The actions further include based on determining that the particular user has stopped speaking for at least a period of time equal to the pause threshold, processing the utterance as a voice query.
    Type: Grant
    Filed: June 21, 2021
    Date of Patent: July 25, 2023
    Assignee: Google LLC
    Inventors: Siddhi Tadpatrikar, Michael Buchanan, Pravir Kumar Gupta
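A sketch of the personalized endpointing in 11710477: derive the pause threshold from the user's logged intra-query pauses (the 95th-percentile rule is an assumption) and treat silence longer than that threshold as the end of the voice query.

```python
# Sketch: per-user pause threshold from query-log pauses, then endpointing.

def pause_threshold(logged_pauses_s, percentile=0.95):
    ordered = sorted(logged_pauses_s)
    return ordered[int(percentile * (len(ordered) - 1))]

def query_ended(silence_s, threshold_s):
    return silence_s >= threshold_s

log = [0.2, 0.3, 0.25, 0.6, 0.4, 0.35]   # pauses observed in past queries
threshold = pause_threshold(log)
print(threshold, query_ended(0.5, threshold))
```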
  • Patent number: 11705107
    Abstract: Embodiments of a production-quality text-to-speech (TTS) system constructed from deep neural networks are described. System embodiments comprise five major building blocks: a segmentation model for locating phoneme boundaries, a grapheme-to-phoneme conversion model, a phoneme duration prediction model, a fundamental frequency prediction model, and an audio synthesis model. For embodiments of the segmentation model, phoneme boundary detection was performed with deep neural networks using Connectionist Temporal Classification (CTC) loss. For embodiments of the audio synthesis model, a variant of WaveNet was created that requires fewer parameters and trains faster than the original. By using a neural network for each component, system embodiments are simpler and more flexible than traditional TTS systems, where each component requires laborious feature engineering and extensive domain expertise. Inference with system embodiments may be performed faster than real time.
    Type: Grant
    Filed: October 1, 2020
    Date of Patent: July 18, 2023
    Assignee: Baidu USA LLC
    Inventors: Sercan O. Arik, Mike Chrzanowski, Adam Coates, Gregory Diamos, Andrew Gibiansky, John Miller, Andrew Ng, Jonathan Raiman, Shubhahrata Sengupta, Mohammad Shoeybi
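The building blocks of 11705107 can be pictured as the pipeline below, with every model stubbed; only the data flow (graphemes to phonemes to durations/f0 to audio) is illustrated, and the segmentation model, a training-time component for locating phoneme boundaries, is omitted from this inference path.

```python
# Sketch of the inference pipeline: none of these stubs are the described
# neural networks; they only show what each stage hands to the next.

def grapheme_to_phoneme(text):
    return list(text.replace(" ", ""))            # stub: one "phoneme" per letter

def predict_durations(phonemes):
    return [0.08 for _ in phonemes]               # stub: 80 ms per phoneme

def predict_f0(phonemes):
    return [120.0 for _ in phonemes]              # stub: flat 120 Hz contour

def synthesize(phonemes, durations, f0):
    return b"\x00" * int(sum(durations) * 16000)  # stub: silence at 16 kHz

def tts(text):
    phonemes = grapheme_to_phoneme(text)
    durations = predict_durations(phonemes)
    f0 = predict_f0(phonemes)
    return synthesize(phonemes, durations, f0)

print(len(tts("hello world")), "samples")
```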