Patents Examined by Angela Armstrong
  • Patent number: 11727934
    Abstract: Implementations set forth herein relate to phasing-out of vehicle computing device versions while ensuring useful responsiveness of any vehicle computing device versions that are still in operation. Certain features of updated computing devices may not be available to prior versions of computing devices because of hardware limitations. The implementations set forth herein eliminate crashes and wasteful data transmissions caused by prior versions of computing devices that have not been, or cannot be, upgraded. A server device can be responsive to a particular intent request provided to a vehicle computing device, despite the intent request being associated with an action that a particular version of the vehicle computing device cannot execute. In response, the server device can elect to provide speech to text data, and/or natural language understanding data, in furtherance of allowing the vehicle computing device to continue leveraging resources at the server device.
    Type: Grant
    Filed: April 25, 2022
    Date of Patent: August 15, 2023
    Assignee: GOOGLE LLC
    Inventors: Vikram Aggarwal, Vinod Krishnan
  • Patent number: 11720748
    Abstract: A system for automatically labeling data using conceptual descriptions. In one example, the system includes an electronic processor configured to generate unlabeled training data examples from one or more natural language documents and, for each of a plurality of categories, determine one or more concepts associated with a conceptual description of the category and generate a weak annotator for each of the one or more concepts. The electronic processor is also configured to apply each weak annotator to each training data example and, when a training data example satisfies a weak annotator, output a category associated with the weak annotator. For each training data example, the electronic processor determines a probabilistic distribution of the plurality of categories. For each training data example, the electronic processor labels the training data example with a category having the highest value in the probabilistic distribution determined for the training data example.
    Type: Grant
    Filed: April 27, 2020
    Date of Patent: August 8, 2023
    Assignee: Robert Bosch GmbH
    Inventors: Haibo Ding, Zhe Feng
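    Illustrative sketch (not taken from the patent): the weak-annotator labeling flow described in this abstract could look roughly like the following. The concept lists, the whitespace keyword matching, and the vote-normalized distribution are all assumptions made for illustration; the abstract does not say how concepts are derived or how the distribution is computed.

```python
from collections import Counter

# Hypothetical concepts per category (assumed; not from the patent).
CONCEPTS = {
    "sports": ["match", "goal", "team"],
    "finance": ["stock", "bank", "goal"],
}

def make_annotator(category, concept):
    """Build a weak annotator that outputs `category` when `concept` appears."""
    def annotate(text):
        return category if concept in text.lower().split() else None
    return annotate

# One weak annotator per (category, concept) pair.
ANNOTATORS = [make_annotator(c, k) for c, ks in CONCEPTS.items() for k in ks]

def label_example(text):
    """Apply every annotator, normalize the votes, and pick the top category."""
    votes = Counter(v for a in ANNOTATORS if (v := a(text)) is not None)
    total = sum(votes.values())
    if total == 0:
        return None, {}  # no annotator was satisfied
    dist = {c: n / total for c, n in votes.items()}
    return max(dist, key=dist.get), dist
```

    Calling `label_example("The team scored a goal in the match")` gives three votes for "sports" and one for "finance", so the example is labeled "sports". Matching on whitespace-split words ignores punctuation handling, which a real system would need.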
  • Patent number: 11720635
    Abstract: Generating and/or recommending command bundles for a user of an automated assistant. A command bundle comprises a plurality of discrete actions that can be performed by an automated assistant. One or more of the actions of a command bundle can cause transmission of a corresponding command and/or other data to one or more devices and/or agents that are distinct from devices and/or agents to which data is transmitted based on other action(s) of the bundle. Implementations determine command bundles that are likely relevant to a user, and present those command bundles as suggestions to the user. In some of those implementations, a machine learning model is utilized to generate a user action embedding for the user, and a command bundle embedding for each of a plurality of command bundles. Command bundle(s) can be selected for suggestion based on comparison of the user action embedding and the command bundle embeddings.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: August 8, 2023
    Assignee: GOOGLE LLC
    Inventor: Yuzhao Ni
  • Patent number: 11715466
    Abstract: Systems and methods are described herein for locally interpreting a voice query and for managing a storage size of data stored locally to support such local interpretation of voice queries. A voice query is received and compared with a plurality of stored voice queries having similar audio characteristics. If a match is identified, text corresponding to the matching stored voice query is retrieved, and an action corresponding to the retrieved text is performed. If the locally stored table does not contain a stored voice query that matches the voice query, the voice query is transmitted to a remote server for transcription. Once the transcription is received from the remote server, the voice query and the transcription are stored in the table in association with one another.
    Type: Grant
    Filed: November 21, 2019
    Date of Patent: August 1, 2023
    Assignee: Rovi Guides, Inc.
    Inventors: Ankur Anil Aher, Kiran Das B, Jyothi Ekambaram, Nishchit Mahajan
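    Illustrative sketch (not taken from the patent): the local-lookup-then-remote-fallback flow in this abstract can be outlined as below. The feature-vector representation of a query, the cosine-similarity match, the 0.95 threshold, and the `transcribe_remote` callable are assumptions; the abstract only says that queries with similar audio characteristics are compared. Storage-size management is omitted.

```python
import math

class LocalVoiceCache:
    """Table of (audio features, transcription) pairs with a remote fallback."""

    def __init__(self, transcribe_remote, threshold=0.95):
        self.transcribe_remote = transcribe_remote  # hypothetical remote call
        self.threshold = threshold                  # assumed similarity cutoff
        self.table = []                             # list of (features, text)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def interpret(self, features):
        """Return (text, source), where source is 'local' or 'remote'."""
        best = max(self.table,
                   key=lambda entry: self._cosine(features, entry[0]),
                   default=None)
        if best is not None and self._cosine(features, best[0]) >= self.threshold:
            return best[1], "local"
        # No local match: ask the server, then store the pair for next time.
        text = self.transcribe_remote(features)
        self.table.append((list(features), text))
        return text, "remote"
```

    The first time a query is seen it goes to the remote transcriber; a later query with nearly identical features is answered from the local table without a network round trip.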
  • Patent number: 11714964
    Abstract: An apparatus comprises processing circuitry configured to pre-process text data for inputting to a trained model, the pre-processing comprising: receiving a set of text data including numerical information, the set of text data comprising a plurality of tokens, wherein a first subset of the plurality of tokens comprises tokens that do not comprise numerical information, and a second subset of the plurality of tokens comprises tokens that each comprise respective numerical information; transforming each of the plurality of tokens into a respective encoding vector, each of the plurality of tokens in the second subset having a common encoding vector; assigning a respective numerical vector to each of the plurality of tokens, wherein each token in the second subset is assigned a respective numerical vector in dependence on the numerical information in said token; and combining the encoding vectors and numerical vectors to obtain a vector representation of the text data.
    Type: Grant
    Filed: March 13, 2020
    Date of Patent: August 1, 2023
    Assignee: Canon Medical Systems Corporation
    Inventor: Maciej Pajak
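    Illustrative sketch (not taken from the patent): the pre-processing in this abstract gives every number-bearing token the same encoding vector and a separate numerical vector that depends on its value. In the sketch below, the tiny `EMBED` table, the one-element numerical vector, the zero vector for non-numeric tokens, and combining by concatenation are all assumptions chosen for brevity.

```python
import re

NUM_RE = re.compile(r"[-+]?\d+(?:\.\d+)?")

# Hypothetical embedding table; "<num>" is the common vector for numeric tokens.
EMBED = {
    "<num>": [0.0, 1.0],
    "<unk>": [0.0, 0.0],
    "dose": [0.3, 0.1],
    "mg": [0.7, 0.2],
}

def encode(tokens):
    """Return one combined vector (encoding + numerical part) per token."""
    vectors = []
    for tok in tokens:
        if NUM_RE.fullmatch(tok):
            enc = EMBED["<num>"]   # same encoding for every numeric token
            num = [float(tok)]     # numerical vector carries the value
        else:
            enc = EMBED.get(tok, EMBED["<unk>"])
            num = [0.0]            # assumed placeholder for non-numeric tokens
        vectors.append(enc + num)  # assumed combination: concatenation
    return vectors
```

    With this scheme, `encode(["dose", "50", "mg"])` and `encode(["dose", "75", "mg"])` differ only in the last component of the middle vector, so the model sees the magnitude directly rather than an arbitrary token identity.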
  • Patent number: 11710481
    Abstract: A method, performed by an electronic device, of providing a conversational service includes: receiving an utterance input; identifying a temporal expression representing a time in a text obtained from the utterance input; determining a time point related to the utterance input based on the temporal expression; selecting a database corresponding to the determined time point from among a plurality of databases storing information about a conversation history of a user using the conversational service; interpreting the text based on information about the conversation history of the user, the conversation history information being acquired from the selected database; generating a response message to the utterance input based on a result of the interpreting; and outputting the generated response message.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: July 25, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jina Ham, Kangwook Lee, Soofeel Kim, Yewon Park, Wonjong Choi
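    Illustrative sketch (not taken from the patent): selecting a conversation-history database from a temporal expression might be outlined as follows. The expression table, the per-day keying of the databases, and the fallback to the current date are assumptions; the abstract does not specify how databases are partitioned.

```python
from datetime import date, timedelta

# Hypothetical temporal expressions and their offsets in days.
OFFSETS = {"today": 0, "yesterday": 1, "last week": 7}

def resolve_time_point(text, now):
    """Map the first recognized temporal expression to a date."""
    lowered = text.lower()
    for expr, days in OFFSETS.items():
        if expr in lowered:
            return now - timedelta(days=days)
    return now  # assumed default when no expression is found

def select_history(text, databases, now):
    """Pick the history store that corresponds to the resolved time point."""
    point = resolve_time_point(text, now)
    return point, databases.get(point, [])
```

    For example, with `now = date(2023, 7, 25)`, the utterance "What did I order yesterday?" resolves to July 24, 2023, and the history stored under that date is what the interpreter would consult.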
  • Patent number: 11675844
    Abstract: Generating and/or recommending command bundles for a user of an automated assistant. A command bundle comprises a plurality of discrete actions that can be performed by an automated assistant. One or more of the actions of a command bundle can cause transmission of a corresponding command and/or other data to one or more devices and/or agents that are distinct from devices and/or agents to which data is transmitted based on other action(s) of the bundle. Implementations determine command bundles that are likely relevant to a user, and present those command bundles as suggestions to the user. In some of those implementations, a machine learning model is utilized to generate a user action embedding for the user, and a command bundle embedding for each of a plurality of command bundles. Command bundle(s) can be selected for suggestion based on comparison of the user action embedding and the command bundle embeddings.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: June 13, 2023
    Assignee: GOOGLE LLC
    Inventor: Yuzhao Ni
  • Patent number: 11640505
    Abstract: Embodiments described herein provide systems and methods for an Explicit Memory Tracker (EMT) that tracks each rule sentence to perform decision making and to generate follow-up clarifying questions. Specifically, the EMT first segments the regulation text into several rule sentences and allocates the segmented rule sentences into memory modules, and then feeds information regarding the user scenario and dialogue history into the EMT sequentially to update each memory module separately. At each dialogue turn, the EMT decides, based on the current memory status of the memory modules, whether further clarification is needed to come up with an answer to a user question. The EMT determines that further clarification is needed by identifying an underspecified rule sentence span by modulating token-level span distributions with sentence-level selection scores. The EMT extracts the underspecified rule sentence span and rephrases the underspecified rule sentence span to generate a follow-up question.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: May 2, 2023
    Assignee: Salesforce.com, Inc.
    Inventors: Yifan Gao, Chu Hong Hoi, Shafiq Rayhan Joty, Chien-Sheng Wu
  • Patent number: 11587563
    Abstract: A method of presenting a signal to a speech processing engine is disclosed. According to an example of the method, an audio signal is received via a microphone. A portion of the audio signal is identified, and a probability is determined that the portion comprises speech directed by a user of the speech processing engine as input to the speech processing engine. In accordance with a determination that the probability exceeds a threshold, the portion of the audio signal is presented as input to the speech processing engine. In accordance with a determination that the probability does not exceed the threshold, the portion of the audio signal is not presented as input to the speech processing engine.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: February 21, 2023
    Assignee: Magic Leap, Inc.
    Inventors: Anthony Robert Sheeder, Colby Nelson Leider
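    Illustrative sketch (not taken from the patent): the threshold gate in this abstract forwards a portion of audio to the speech engine only when the estimated probability of directed speech exceeds a threshold. The `probability_fn` callable and the 0.5 default are assumptions; how the probability is estimated is outside this sketch.

```python
def gate_segments(segments, probability_fn, threshold=0.5):
    """Keep only the segments whose directed-speech probability exceeds the threshold."""
    presented = []
    for segment in segments:
        # "Exceeds" is read strictly: a probability equal to the threshold is dropped.
        if probability_fn(segment) > threshold:
            presented.append(segment)
    return presented
```

    Only the returned segments would be handed to the speech processing engine; everything else is discarded before recognition runs.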
  • Patent number: 11568864
    Abstract: A computing system for generating image data representing a speaker's face includes a detection device configured to route data representing a voice signal to one or more processors and a data processing device comprising the one or more processors configured to generate a representation of a speaker that generated the voice signal in response to receiving the voice signal. The data processing device executes a voice embedding function to generate a feature vector from the voice signal representing one or more signal features of the voice signal, maps a signal feature of the feature vector to a visual feature of the speaker by a modality transfer function specifying a relationship between the visual feature of the speaker and the signal feature of the feature vector; and generates a visual representation of at least a portion of the speaker based on the mapping, the visual representation comprising the visual feature.
    Type: Grant
    Filed: August 13, 2019
    Date of Patent: January 31, 2023
    Assignee: Carnegie Mellon University
    Inventor: Rita Singh
  • Patent number: 11557284
    Abstract: A method, system and computer program product for speech recognition using multiple languages includes receiving, by one or more processors, an input from a user, the input includes a sentence in a first language. The one or more processors translate the sentence to a plurality of languages different than the first language, and create vectors associated with the plurality of languages, each vector includes a representation of the sentence in each of the plurality of languages. The one or more processors calculate eigenvectors for each vector associated with a language in the plurality of languages, and based on the calculated eigenvectors, a score is assigned to each of the plurality of languages according to a relevance for determining a meaning of the sentence.
    Type: Grant
    Filed: January 3, 2020
    Date of Patent: January 17, 2023
    Assignee: International Business Machines Corporation
    Inventors: Zhong Fang Yuan, Kun Yan Yin, He Li, Tong Liu, Hai Ji
  • Patent number: 11545163
    Abstract: A loss function of a signal including an audio signal is determined. A loss function determining system for an audio signal is provided. A loss function is determined by: determining a reference quantization index by quantizing an original input signal; inputting the original input signal to a neural network classifier and applying an activation function to an output layer of the neural network classifier; and determining a total loss function for the neural network classifier using an output of the activation function and the reference quantization index.
    Type: Grant
    Filed: December 27, 2019
    Date of Patent: January 3, 2023
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Seung Kwon Beack, Woo-taek Lim, Tae Jin Lee
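    Illustrative sketch (not taken from the patent): one way to read this abstract is that the quantization index of each input sample serves as the target class for a classifier. The uniform quantizer, the softmax activation, the cross-entropy form of the loss, and the `classifier` callable (returning one logit per quantization level) are all assumptions; the abstract names none of them.

```python
import math

def quantize_index(x, lo=-1.0, hi=1.0, levels=8):
    """Uniformly quantize x in [lo, hi] and return the reference index."""
    x = min(max(x, lo), hi)
    idx = int((x - lo) / (hi - lo) * levels)
    return min(idx, levels - 1)  # keep x == hi inside the top bin

def softmax(logits):
    """Numerically stable softmax used as the output-layer activation."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def total_loss(samples, classifier, levels=8):
    """Sum the cross-entropy between classifier outputs and reference indices."""
    loss = 0.0
    for x in samples:
        ref = quantize_index(x, levels=levels)
        probs = softmax(classifier(x))
        loss += -math.log(probs[ref])
    return loss
```

    A classifier that outputs equal logits for all 8 levels yields a loss of log(8) per sample, which is the baseline a trained model would need to beat.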
  • Patent number: 11532181
    Abstract: An electronic device and method are disclosed herein. The electronic device includes a microphone, a camera, an output device, a memory, and a processor. The processor implements the method, including receiving a voice input and/or capturing an image, analyzing the voice input or the image to determine at least one of a user's intent, emotion, and situation based on predefined keywords and expressions, identifying a category based on the input, selecting first information based on the category, selecting and outputting a first query prompting confirmation of output of the first information, detecting a first responsive input to the first query, outputting a second query when a condition to output the first information is satisfied, detecting a second input responsive to the second query, and selectively outputting the first information based on the second input.
    Type: Grant
    Filed: March 30, 2018
    Date of Patent: December 20, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yong Ju Yu, Ja Min Goo, Seong Hoon You, Ki Young Kwon, Ki Won Kim, Eun Young Kim, Ji Min Kim, Chul Kwi Kim, Hyung Woo Kim, Joo Namkung, Ji Hyun Park, Sae Gee Oh, Dong Kyu Lee, Im Sung Lee, Chan Won Lee, Si Hak Jang
  • Patent number: 11531819
    Abstract: Machine learned models take in vectors representing desired behaviors and generate voice vectors that provide the parameters for text-to-speech (TTS) synthesis. Models may be trained on behavior vectors that include user profile attributes, situational attributes, or semantic attributes. Situational attributes may include age of people present, music that is playing, location, noise, and mood. Semantic attributes may include presence of proper nouns, number of modifiers, emotional charge, and domain of discourse. TTS voice parameters may apply per utterance and per word so as to enable contrastive emphasis.
    Type: Grant
    Filed: January 14, 2020
    Date of Patent: December 20, 2022
    Assignee: SoundHound, Inc.
    Inventors: Bernard Mont-Reynaud, Monika Almudafar-Depeyrot
  • Patent number: 11527241
    Abstract: A display device and a method for controlling the same are provided. The display device includes a rollable display screen, a voice acquisition unit, an identification control unit, a drive control unit and a display control unit. The voice acquisition unit is configured to acquire a first voice command. The identification control unit is configured to identify the first voice command acquired by the voice acquisition unit as a voice process command, and the voice process command includes a rolling operation command and a display drive command. The drive control unit is configured to perform an operation corresponding to the rolling operation command on the rollable display screen according to the rolling operation command. The display control unit is configured to control a display state of the rollable display screen according to the display drive command.
    Type: Grant
    Filed: October 11, 2019
    Date of Patent: December 13, 2022
    Assignees: BEIJING BOE OPTOELECTRONICS TECHNOLOGY CO., LTD., Beijing BOE Technology Development Co., Ltd.
    Inventors: Jiyang Shao, Yuxin Bi, Jian Sun, Hao Zhang
  • Patent number: 11495236
    Abstract: An apparatus for processing an input audio signal relies on a cascade of filterbanks, the cascade having a synthesis filterbank for synthesizing an audio intermediate signal from the input audio signal, the input audio signal being represented by a plurality of first subband signals generated by an analysis filterbank, wherein a number of filterbank channels of the synthesis filterbank is smaller than a number of channels of the analysis filterbank. The apparatus furthermore has a further analysis filterbank for generating a plurality of second subband signals from the audio intermediate signal, wherein the further analysis filterbank has a number of channels being different from the number of channels of the synthesis filterbank, so that a sampling rate of a subband signal of the plurality of second subband signals is different from a sampling rate of a first subband signal of the plurality of first subband signals.
    Type: Grant
    Filed: May 19, 2020
    Date of Patent: November 8, 2022
    Assignees: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V., Dolby International AB
    Inventors: Lars Villemoes, Per Ekstrand, Sascha Disch, Frederik Nagel, Stephan Wilde
  • Patent number: 11488596
    Abstract: A method for recording audio content in a group conversation among a plurality of members includes: controlling an image capturing device to continuously capture images of the members; executing an image processing procedure on the images of the members to determine whether a specific gesture is detected; when the determination is affirmative, controlling an audio recording device to activate and perform directional audio collection with respect to a direction that is associated with the specific gesture to record audio data; and controlling a data storage to store the audio data and a time stamp associated with the audio data as an entry of conversation record.
    Type: Grant
    Filed: April 27, 2020
    Date of Patent: November 1, 2022
    Inventor: Hsiao-Han Chen
  • Patent number: 11488577
    Abstract: The present application discloses a training method and an apparatus for a speech synthesis model, an electronic device, and a storage medium. The method includes: taking a syllable input sequence, a phoneme input sequence and a Chinese character input sequence of a current sample as inputs of an encoder of a model to be trained, to obtain encoded representations of these three sequences at an output end of the encoder; fusing the encoded representations of these three sequences, to obtain a weighted combination of these three sequences; taking the weighted combination as an input of an attention module, to obtain a weighted average of the weighted combination at each moment at an output end of the attention module; and taking the weighted average as an input of a decoder of the model to be trained, to obtain a speech Mel spectrum of the current sample at an output end of the decoder.
    Type: Grant
    Filed: June 19, 2020
    Date of Patent: November 1, 2022
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Zhipeng Chen, Jinfeng Bai, Lei Jia
  • Patent number: 11461779
    Abstract: Techniques for transferring control of a system-user dialog session are described. A first speechlet component may interact with a user until the first speechlet component receives user input that the first speechlet component cannot handle. The first speechlet component may output an action representing the user input. A system may determine a second speechlet component configured to execute the action. The system may send the second speechlet component a navigator object that results in the second speechlet component handling the user interaction that the first speechlet component could not handle. Once the second speechlet component is finished processing, the second speechlet component may output an updated navigator object, which causes the first speechlet component to either further interact with a user or cause a current dialog session to be closed.
    Type: Grant
    Filed: March 23, 2018
    Date of Patent: October 4, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Rohin Dabas, Troy Dean Schuring, Xu Zhang, Maksym Kolodeznyi, Andres Felipe Borja Jaramillo, Nnenna Eleanya Okwara, Alberto Milan Gutierrez, Rashmi Tonge
  • Patent number: 11449682
    Abstract: Systems, devices, and methods provide improved autonomous agents that are configured to respond to a user's query based on an emotion with which the query was expressed and a personality trait of the user. The agent may identify candidate answers to the query that are each associated with an emotion and/or a personality trait. The autonomous agent may utilize a predefined protocol set that indicates transitions between emotional states. A transition may correspond to an action associated with an emotion and/or a personality trait that, if performed, is likely to maintain a user in or transition the user to a preferred emotional state. The responses may be scored based at least in part on their corresponding emotions and/or personality traits and in light of the transitions identified in the protocol set. A particular scored response may be selected and provided to the user in response to their query.
    Type: Grant
    Filed: May 7, 2020
    Date of Patent: September 20, 2022
    Assignee: ORACLE INTERNATIONAL CORPORATION
    Inventor: Boris Galitsky