Patents Examined by Paras D Shah
  • Patent number: 11170763
    Abstract: A voice interaction system performs a voice interaction with a user. The voice interaction system includes: ask-again detection means for detecting ask-again by the user; response-sentence generation means for generating, when the ask-again has been detected by the ask-again detection means, a response sentence for the ask-again in response to the ask-again based on a response sentence responding to the user before the ask-again; and storage means for storing a history of the voice interaction with the user. The response-sentence generation means generates, when the response sentence includes a word whose frequency of appearance in the history of the voice interaction in the storage means is equal to or smaller than a first predetermined value, a response sentence for the ask-again formed of only this word or a response sentence for the ask-again in which this word is emphasized in the response sentence.
    Type: Grant
    Filed: May 29, 2019
    Date of Patent: November 9, 2021
    Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
    Inventors: Narimasa Watanabe, Sawa Higuchi, Wataru Kaku
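The selection rule in this abstract — repeat only the word whose frequency in the interaction history falls at or below a predetermined value — can be sketched in a few lines. This is an illustrative reading, not the patented implementation; the function name, the history representation (a word-count dict), and the threshold value are all assumptions.

```python
def ask_again_response(response, history_counts, threshold=2):
    """Pick the rarest word of the prior response; if its frequency in the
    interaction history is at or below the threshold (the 'first predetermined
    value'), answer the ask-again with only that word, since it is the likely
    source of the user's confusion. Otherwise repeat the full response."""
    words = response.split()
    rare = min(words, key=lambda w: history_counts.get(w, 0))
    if history_counts.get(rare, 0) <= threshold:
        return rare  # response formed of only the low-frequency word
    return response

# A word absent from the history has frequency 0, so it is repeated alone:
print(ask_again_response("the shinkansen departs at nine",
                         {"the": 40, "departs": 5, "at": 30, "nine": 12}))
```

The claim also covers emphasizing the word within the full sentence instead of isolating it; that variant would return the whole response with prosodic emphasis markup on `rare`.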
  • Patent number: 11151325
    Abstract: Systems and methods are provided to compare a target sample of text to a set of textual records, each textual record including a sample of text and an indication of one or more segments of text within the sample of text. Semantic similarity values between the target sample of text and each of the textual records are determined. Determining a particular semantic similarity value between the target sample of text and a particular textual record of the corpus includes: (i) determining individual semantic similarity values between the target sample of text and each of the segments of text indicated by the particular textual record, and (ii) generating the particular semantic similarity value between the target sample of text and the particular textual record based on the individual semantic similarity values. A textual record is then selected based on the semantic similarities.
    Type: Grant
    Filed: March 22, 2019
    Date of Patent: October 19, 2021
    Assignee: ServiceNow, Inc.
    Inventors: Omer Anil Turkkan, Firat Karakusoglu, Sriram Palapudi
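The two-step scoring described here — per-segment similarities first, then a record-level score derived from them — can be sketched as follows. Cosine similarity and `max()` aggregation are stand-ins; the claim only requires that the record score be generated from the individual segment scores, not how.

```python
import math

def cosine(u, v):
    """Toy semantic similarity between two embedding vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def record_similarity(target_vec, segment_vecs):
    """Step (i): similarity against each indicated segment;
    step (ii): aggregate the per-segment scores into one record score."""
    return max(cosine(target_vec, s) for s in segment_vecs)

def select_record(target_vec, records):
    """Select the textual record (by id) with the highest aggregate score.
    records maps record_id -> list of segment embedding vectors."""
    return max(records, key=lambda rid: record_similarity(target_vec, records[rid]))
```

Scoring against segments rather than whole records lets a long record match on one highly relevant passage instead of being diluted by its full text.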
  • Patent number: 11152016
    Abstract: Embodiments of the disclosed technologies include finding content of interest in an RF spectrum by automatically scanning the RF spectrum; detecting, in a range of frequencies of the RF spectrum that includes one or more undefined channels, a candidate RF segment; where the candidate RF segment includes a frequency-bound time segment of electromagnetic energy; executing a machine learning-based process to determine, for the candidate RF segment, signal characterization data indicative of one or more of: a frequency range, a modulation type, a timestamp; using the signal characterization data to determine whether audio contained in the candidate RF segment corresponds to a search criterion; in response to determining that the candidate RF segment corresponds to the search criterion, outputting, through an electronic device, data indicative of the candidate RF segment; where the data indicative of the candidate RF segment is output in a real-time time interval after the candidate RF segment is detected.
    Type: Grant
    Filed: May 8, 2019
    Date of Patent: October 19, 2021
    Assignee: SRI INTERNATIONAL
    Inventors: Aaron D. Lawson, Harry Bratt, Mitchell L. McLaren, Martin Graciarena
  • Patent number: 11139079
    Abstract: Embodiments of the invention include methods, systems, and computer program products for determining stroke onset. Aspects of the invention include determining a baseline behavioral model for a user and receiving real-time user data from a personal portable device. Aspects of the invention also include analyzing the real-time user data to determine the presence of an abnormal event. Aspects of the invention also include, based at least in part on a determination that the abnormal event is present, conducting a plurality of stroke analyses to generate a plurality of impairment characteristics. Aspects of the invention also include integrating the plurality of impairment characteristics, comparing the integrated plurality of impairment characteristics to the baseline behavioral model and outputting a stroke onset determination.
    Type: Grant
    Filed: March 6, 2017
    Date of Patent: October 5, 2021
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Itzhack Goldberg, Raquel Norel, Kahn Rhrissorrakrai
  • Patent number: 11138963
    Abstract: A processor-implemented text-to-speech method includes determining, using a sub-encoder, a first feature vector indicating an utterance characteristic of a speaker from feature vectors of a plurality of frames extracted from a partial section of a first speech signal of the speaker, and determining, using an autoregressive decoder, into which the first feature vector is input as an initial value, from context information of the text, a second feature vector of a second speech signal in which a text is uttered according to the utterance characteristic.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: October 5, 2021
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Hoshik Lee
  • Patent number: 11132998
    Abstract: A voice recognition device includes: a first feature vector calculating unit (2) for calculating a first feature vector from voice data input; an acoustic likelihood calculating unit (4) for calculating an acoustic likelihood of the first feature vector by using an acoustic model used for calculating an acoustic likelihood of a feature vector; a second feature vector calculating unit (3) for calculating a second feature vector from the voice data; a noise degree calculating unit (6) for calculating a noise degree of the second feature vector by using a discriminant model used for calculating a noise degree indicating whether a feature vector is noise or voice; a noise likelihood recalculating unit (8) for recalculating an acoustic likelihood of noise on the basis of the acoustic likelihood of the first feature vector and the noise degree of the second feature vector; and a collation unit (9) for performing collation with a pattern of a vocabulary word to be recognized, by using the acoustic likelihood calculated…
    Type: Grant
    Filed: March 24, 2017
    Date of Patent: September 28, 2021
    Assignee: MITSUBISHI ELECTRIC CORPORATION
    Inventors: Toshiyuki Hanazawa, Tomohiro Narita
  • Patent number: 11126797
    Abstract: Methods, systems, and devices for language mapping are described. Some machine learning models may be trained to support multiple languages. However, word embedding alignments may be too general to accurately capture the meaning of certain words when mapping different languages into a single reference vector space. To improve the accuracy of vector mapping, a system may implement a supervised learning layer to refine the cross-lingual alignment of particular vectors corresponding to a vocabulary of interest (e.g., toxic language). This supervised learning layer may be trained using a dictionary of toxic words or phrases across the different supported languages in order to learn how to weight an initial vector alignment to more accurately map the meanings behind insults, threats, or other toxic words or phrases between languages. The vector output from this weighted mapping can be sent to supervised models, trained on the reference vector space, to determine toxicity scores.
    Type: Grant
    Filed: July 2, 2019
    Date of Patent: September 21, 2021
    Assignee: Spectrum Labs, Inc.
    Inventors: Josh Newman, Yacov Salomon, Jonathan Thomas Purnell, Indrajit Haridas, Alexander Greene
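The refinement layer described above — a general cross-lingual alignment followed by supervised weights learned on a toxic-phrase dictionary — can be sketched as two successive linear maps. The shapes and values here are illustrative; the actual learned layer and its training are not specified beyond the abstract.

```python
def align(vec, base_map, learned_weights):
    """Map a source-language embedding into the reference vector space:
    first apply the general cross-lingual alignment matrix (base_map),
    then the supervised per-dimension weights learned from the toxic-word
    dictionary, which sharpen the mapping for the vocabulary of interest."""
    mapped = [sum(m * v for m, v in zip(row, vec)) for row in base_map]
    return [w * x for w, x in zip(learned_weights, mapped)]
```

The output vector can then be fed to the downstream supervised models trained on the reference space to produce toxicity scores.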
  • Patent number: 11107463
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses N-best lists of decoded hypotheses, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: August 31, 2021
    Assignee: Google LLC
    Inventors: Rohit Prakash Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick An Phu Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Patricia Kannan
  • Patent number: 11086596
    Abstract: Provided are a display apparatus, a control method thereof, a server, and a control method thereof. The display apparatus includes: a processor which processes a signal; a display which displays an image based on the processed signal; a first command receiver which receives a voice command; a storage which stores a plurality of voice commands said by a user; a second command receiver which receives a user's manipulation command; and a controller which, upon receiving the voice command, displays a list of the stored plurality of voice commands, selects one of the plurality of voice commands of the list according to the received user's manipulation command and controls the processor to process based on the selected voice command.
    Type: Grant
    Filed: September 11, 2018
    Date of Patent: August 10, 2021
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Do-wan Kim, Oh-yun Kwon, Tae-hwan Cha
  • Patent number: 11062693
    Abstract: To provide a more natural sounding set of voice prompts of an interactive voice response (IVR) script, the voice recordings of the prompts may be modified to have a predetermined amount of silence at the end of the recording. The amount of silence required can be determined from the context in which the voice prompt appears in the IVR script. Different contexts may include mid-sentence, terminating in a comma, or a sentence ending context. These contexts may require silence periods of 100 ms, 250 ms and 500 ms respectively. Voice files may be trimmed to remove any existing silence and then the required silence period may be added.
    Type: Grant
    Filed: June 20, 2019
    Date of Patent: July 13, 2021
    Assignee: West Corporation
    Inventors: Terry Olson, Mark Sempek, Roger Wehrle
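This abstract gives concrete numbers (100 ms mid-sentence, 250 ms after a comma, 500 ms at sentence end), so the trim-then-pad step is easy to sketch. The amplitude threshold for "silence" and the mono float-sample representation are assumptions.

```python
# Silence required after a prompt, by its context in the IVR script.
SILENCE_MS = {"mid_sentence": 100, "comma": 250, "sentence_end": 500}

def normalize_prompt(samples, sample_rate, context, threshold=0.01):
    """Trim trailing near-silence from a mono prompt recording, then append
    exactly the silence period the prompt's script context calls for."""
    end = len(samples)
    while end > 0 and abs(samples[end - 1]) < threshold:
        end -= 1
    pad = int(sample_rate * SILENCE_MS[context] / 1000)
    return samples[:end] + [0.0] * pad
```

Trimming first means every prompt ends with a known, uniform pause regardless of how much trailing silence the original recording happened to contain.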
  • Patent number: 11056127
    Abstract: Aspects of the subject disclosure may include, for example, a device that includes a processing system having a processor and a memory that stores executable instructions that, when executed by the processing system, facilitate performance of operations, where the operations include determining parameters for adapting audio in the content to the device, wherein the device renders the content, and wherein the parameters are based on semantic metadata embedded in the content, adapting the audio in the content based on the parameters, and rendering the content, as adapted by the parameters, to represent a semantic in the semantic metadata. Other embodiments are disclosed.
    Type: Grant
    Filed: April 30, 2019
    Date of Patent: July 6, 2021
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Eric Zavesky, Jason Decuir, Robert Gratz
  • Patent number: 11043215
    Abstract: A method and a system for generating textual representation of user spoken utterance is disclosed. The method comprises receiving an indication of the user spoken utterance; generating, at least two hypotheses; generating, by the electronic device, from the at least two hypotheses a set of paired hypotheses, a given one of the set of paired hypotheses including a first hypothesis paired with a second hypothesis; determining, for the given one of the set of paired hypotheses, a pair score; generating a set of utterance features, the set of utterance features being indicative of one or more characteristics associated with the user spoken utterance; ranking, the first hypothesis and the second hypothesis based at least on the pair score and the set of utterance features; and in response to the first hypothesis being a highest ranked hypothesis, selecting the first hypothesis as the textual representation of user spoken utterance.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: June 22, 2021
    Assignee: YANDEX EUROPE AG
    Inventors: Sergey Surenovich Galustyan, Fedor Aleksandrovich Minkin
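The pairwise-ranking idea — score every hypothesis pair, then pick the highest-ranked hypothesis — can be sketched as a round-robin tournament. The real system scores pairs with a learned model over the pair and the utterance features; `pair_score` below is a caller-supplied stand-in for that model.

```python
def rank_hypotheses(hyps, pair_score):
    """Each hypothesis accumulates the scores it earns against every other
    hypothesis in the set of paired hypotheses; the hypothesis with the
    highest total is selected as the textual representation."""
    totals = {h: sum(pair_score(h, other) for other in hyps if other != h)
              for h in hyps}
    return max(hyps, key=totals.get)

# Toy scorer: prefer the longer hypothesis of each pair (illustrative only).
longer_wins = lambda a, b: 1.0 if len(a) > len(b) else 0.0
```

Comparing hypotheses pairwise lets the ranker use relative evidence (which of two transcripts better fits the utterance) rather than scoring each transcript in isolation.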
  • Patent number: 11024294
    Abstract: The present teaching relates to method, system, medium, and implementations for user machine dialogue. An instruction is received by an agent device for rendering a communication directed to a user involved in a dialogue in a dialogue scene and is used to render the communication. A first representation of a mindset of the agent is updated accordingly after the rendering. Input data are received in one or more media types capturing a response from the user and information surrounding the dialogue scene and a second representation of a mindset of the user is updated based on the response from the user and the information surrounding of the dialogue scene. A next communication to the user is then determined based on the first representation of the mindset of the agent and the second representation of the mindset of the user.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: June 1, 2021
    Assignee: DMAI, INC.
    Inventors: Rui Fang, Changsong Liu
  • Patent number: 11024330
    Abstract: A signal processing apparatus includes a detection unit configured to perform a voice detection process on each of a plurality of audio signals captured by a plurality of microphones arranged at mutually different positions, a determination unit configured to determine a degree of similarity between two or more of the plurality of audio signals in which voice is detected by the detection unit, and a suppression unit configured to perform a process of suppressing the voice contained in at least one of the two or more audio signals, in response to a determination that the degree of similarity between the two or more audio signals is less than a threshold by the determination unit.
    Type: Grant
    Filed: April 16, 2019
    Date of Patent: June 1, 2021
    Assignee: CANON KABUSHIKI KAISHA
    Inventor: Masanobu Funakoshi
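Notably, the claim suppresses a channel when the similarity between two voice-bearing captures is *less than* a threshold. Following that text, a sketch using normalized correlation as a toy similarity measure (the apparatus's actual determination unit is not specified) might look like:

```python
def normalized_correlation(x, y):
    """Toy similarity between two equal-length audio captures."""
    num = sum(a * b for a, b in zip(x, y))
    den = (sum(a * a for a in x) * sum(b * b for b in y)) ** 0.5
    return num / den if den else 0.0

def channels_to_suppress(signals, threshold=0.9):
    """Compare every pair of microphone channels in which voice was detected;
    per the claim, when a pair's similarity falls below the threshold, mark
    one channel of the pair (here, the later one) for voice suppression."""
    suppressed = set()
    for i in range(len(signals)):
        for j in range(i + 1, len(signals)):
            if normalized_correlation(signals[i], signals[j]) < threshold:
                suppressed.add(j)
    return suppressed
```

Which channel of a dissimilar pair to suppress is a policy choice; suppressing the later index is just one option for the sketch.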
  • Patent number: 11017180
    Abstract: Systems, apparatuses, and methods for the interpretation and routing of short text messages, such as those that might be received as part of a “chat” between a customer and a customer service representative. In some embodiments, this is achieved by constructing word “vectors” based on the text in a message, with a token corresponding to each word. The word vectors are then compared to a set of mutually orthogonal unit vectors representing the “classes” or “categories” of messages that are received and are intended to be acted upon by a person or automated process. The orthogonal class unit vectors are generated by training a machine learning model using a set of previously classified text or messages.
    Type: Grant
    Filed: April 9, 2019
    Date of Patent: May 25, 2021
    Assignee: Helpshift, Inc.
    Inventors: Yashkumar Gandhi, Srinivas Nagamalla, Christian Leipski
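Once the classes are represented as mutually orthogonal unit vectors, routing a message reduces to projecting its word vector onto each class axis and taking the largest projection. The class names and vectors below are invented for illustration; the real class vectors are learned from previously classified messages.

```python
def classify(word_vector, class_units):
    """Route a short message by projecting its word vector onto each
    mutually orthogonal class unit vector and picking the class with the
    largest projection (dot product)."""
    scores = {name: sum(a * b for a, b in zip(word_vector, u))
              for name, u in class_units.items()}
    return max(scores, key=scores.get)

# Hypothetical 3-class routing basis (standard basis vectors are trivially
# orthonormal; learned class vectors would be dense).
CLASSES = {"billing": [1, 0, 0], "bug_report": [0, 1, 0], "other": [0, 0, 1]}
```

Orthogonality keeps the class scores independent: a message's projection onto one class axis contributes nothing to any other class's score.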
  • Patent number: 11010552
    Abstract: Embodiments relate to a type of expression based on a particular theme. An aspect includes acquiring, by an electronic apparatus, from text data for learning, a subset of the text data associated with the particular theme and with particular time period information. Another aspect includes extracting text data containing negative information from the acquired subset of the text data. Another aspect includes extracting a word or phrase having a high correlation with the extracted text data or a word or phrase having a high appearance frequency in the extracted text data from the extracted text data. Yet another aspect includes determining that the extracted word or phrase is the type of expression based on the particular theme.
    Type: Grant
    Filed: January 16, 2019
    Date of Patent: May 18, 2021
    Assignee: International Business Machines Corporation
    Inventors: Emiko Takeuchi, Daisuke Takuma, Hirobumi Toyoshima
  • Patent number: 11003860
    Abstract: The present teaching relates to method, system, medium, and implementations for user machine dialogue. Historic dialogue data related to past dialogues are accessed and used to learn, via machine learning, expected utilities. During a dialogue involving a user and a machine agent, a representation of a shared mindset between the user and the agent is obtained to characterize the current state of the dialogue, which is then used to update the expected utilities. Continuous expected utility functions are then generated based on the updated expected utilities, wherein the continuous expected utility functions are to be used in determining how to conduct a dialogue with a user.
    Type: Grant
    Filed: December 27, 2018
    Date of Patent: May 11, 2021
    Assignee: DMAI, INC.
    Inventors: Rui Fang, Changsong Liu
  • Patent number: 10997965
    Abstract: An automated testing system and method for evaluating voice processing systems is provided. In one embodiment, a method includes receiving a plurality of voice command inputs and a plurality of expected responses associated with the voice command inputs. A text-to-speech engine is applied to the voice command inputs to generate test command audio files. The test command audio files are provided to a testing apparatus in communication with a voice processing system. A generated response output from the voice processing system is obtained for each of the test command audio files. The generated response is captured from the testing apparatus using a sensor to detect audio and/or visual information. The obtained generated response is compared to an expected response from the plurality of expected responses for each of the test command audio files. Based on the comparison, a test result is provided for each of the voice command inputs.
    Type: Grant
    Filed: April 2, 2019
    Date of Patent: May 4, 2021
    Assignee: Accenture Global Solutions Limited
    Inventors: Gregory Spata, Paul M. Barsamian
  • Patent number: 10978042
    Abstract: The present disclosure discloses a method and apparatus for generating a speech synthesis model. A specific embodiment of the method comprises: acquiring a text characteristic of a text and an acoustic characteristic of a speech corresponding to the text used for training a neural network corresponding to a speech synthesis model, fundamental frequency data in the acoustic characteristic of the speech corresponding to the text used for the training being extracted through a fundamental frequency data extraction model, and the fundamental frequency data extraction model being generated based on pre-training a neural network corresponding to the fundamental frequency data extraction model using the speech including each frame of speech having corresponding fundamental frequency data; and training the neural network corresponding to the speech synthesis model using the text characteristic of the text and the acoustic characteristic of the speech corresponding to the text.
    Type: Grant
    Filed: August 3, 2018
    Date of Patent: April 13, 2021
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventor: Hao Li
  • Patent number: 10971170
    Abstract: Methods, systems, and computer program products for generating, from an input character sequence, an output sequence of audio data representing the input character sequence. The output sequence of audio data includes a respective audio output sample for each of a number of time steps. One example method includes, for each of the time steps: generating a mel-frequency spectrogram for the time step by processing a representation of a respective portion of the input character sequence using a decoder neural network; generating a probability distribution over a plurality of possible audio output samples for the time step by processing the mel-frequency spectrogram for the time step using a vocoder neural network; and selecting the audio output sample for the time step from the possible audio output samples in accordance with the probability distribution.
    Type: Grant
    Filed: August 8, 2018
    Date of Patent: April 6, 2021
    Assignee: Google LLC
    Inventors: Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Michael Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, Russell John Wyatt Skerry-Ryan, Ryan M. Rifkin, Ioannis Agiomyrgiannakis
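The per-time-step pipeline in this abstract (decoder network → mel-spectrogram frame → vocoder network → distribution over samples → sampled output) can be sketched with the two networks as placeholder callables; everything about their internals is elided here, and the distribution is represented as a simple value-to-probability dict.

```python
import random

def synthesize(char_reprs, decoder, vocoder, seed=0):
    """For each time step: the decoder network maps a representation of the
    input character sequence to a mel-frequency spectrogram frame; the
    vocoder network maps that frame to a probability distribution over
    possible audio sample values; one sample is drawn per step."""
    rng = random.Random(seed)
    audio = []
    for rep in char_reprs:
        mel = decoder(rep)                 # mel-spectrogram frame
        dist = vocoder(mel)                # {sample_value: probability}
        values, probs = zip(*dist.items())
        audio.append(rng.choices(values, weights=probs)[0])
    return audio
```

With degenerate one-point distributions the draw is deterministic, which makes the control flow easy to check; a trained vocoder would emit a full distribution over quantized sample values at each step.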