Patents Examined by James S. Wozniak
  • Patent number: 10929617
    Abstract: Text analysis includes determining one or more global analysis parameters based on backtranslation of a first corpus between supported languages. A new text analysis model is determined for an unsupported language based on the one or more global analysis parameters and a text analysis model for a first supported language. An input text is analyzed in the unsupported language with the new text analysis model.
    Type: Grant
    Filed: July 20, 2018
    Date of Patent: February 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kohichi Kamijoh, Tetsuya Nasukawa, Yohei Ikawa, Masaki Ono
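The three steps in this abstract can be sketched, very loosely, in Python. The round-trip agreement measure and the weight-scaling transfer below are illustrative assumptions, not IBM's actual method:

```python
def global_params_from_backtranslation(corpus, translate, backtranslate):
    """Hypothetical global analysis parameter: how well text survives a
    round trip (backtranslation) through another supported language."""
    ratios = []
    for text in corpus:
        round_trip = backtranslate(translate(text))
        ratios.append(1.0 if round_trip == text else 0.8)  # toy agreement score
    return sum(ratios) / len(ratios)

def model_for_unsupported(base_model, global_param):
    """Derive a new model for the unsupported language by scaling the
    supported-language model's feature weights by the global parameter."""
    return {feat: w * global_param for feat, w in base_model.items()}
```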
  • Patent number: 10896681
    Abstract: This document describes, among other things, a computer-implemented method for transcribing an utterance. The method can include receiving, at a computing system, speech data that characterizes an utterance of a user. A first set of candidate transcriptions of the utterance can be generated using a static class-based language model that includes a plurality of classes that are each populated with class-based terms selected independently of the utterance or the user. The computing system can then determine whether the first set of candidate transcriptions includes class-based terms. Based on whether the first set of candidate transcriptions includes class-based terms, the computing system can determine whether to generate a dynamic class-based language model that includes at least one class that is populated with class-based terms selected based on a context associated with at least one of the utterance and the user.
    Type: Grant
    Filed: December 29, 2015
    Date of Patent: January 19, 2021
    Assignee: Google LLC
    Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
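The static-then-dynamic decision described in this abstract can be sketched in a few lines; the class names, contact-list context, and thresholding below are illustrative assumptions, not Google's implementation:

```python
# Static classes are populated independently of any particular utterance/user.
STATIC_CLASSES = {
    "$CONTACT": {"alice", "bob"},
    "$SONG": {"yesterday", "imagine"},
}

def contains_class_terms(candidates, classes=STATIC_CLASSES):
    """True if any candidate transcription contains a class-based term."""
    all_terms = set().union(*classes.values())
    return any(word in all_terms for cand in candidates for word in cand.split())

def maybe_build_dynamic_lm(candidates, user_context):
    """Only build the (more expensive) dynamic class-based LM when the
    static first pass suggests class-based terms are actually present."""
    if not contains_class_terms(candidates):
        return None  # the static result suffices
    # Populate classes from per-user context, e.g. the user's own contacts.
    return {"$CONTACT": set(user_context.get("contacts", []))}
```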
  • Patent number: 10896673
    Abstract: Systems and techniques for authentication of impaired voices are described herein. A voice profile may be generated for a user. An audio stream may be received including an authentication voice of the user. It may be determined that the authentication voice does not match a first set of authentication criteria. The audio stream may be compared to a second set of authentication criteria. The user may be authenticated based on the comparison.
    Type: Grant
    Filed: September 21, 2017
    Date of Patent: January 19, 2021
    Assignee: Wells Fargo Bank, N.A.
    Inventors: Andrew J. Garner, IV, Tyua Larsen Fraser, Kimberly Ann MacInnis, Paul R. McMahon, Darrell Lee Suen, Zhong Wan
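The two-tier fallback in this abstract can be sketched as follows; the score ranges, thresholds, and averaging of secondary signals are assumptions for illustration, not the patented criteria:

```python
def authenticate(voice_score, secondary_signals,
                 primary_threshold=0.9, secondary_threshold=0.75):
    """If the voiceprint match fails the first set of criteria (e.g. the
    speaker's voice is impaired), fall back to a second set of criteria
    derived from other signals in the audio stream."""
    if voice_score >= primary_threshold:
        return True
    # Secondary criteria: mean of auxiliary scores (cadence, phrasing, etc.)
    fallback = sum(secondary_signals) / len(secondary_signals)
    return fallback >= secondary_threshold
```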
  • Patent number: 10878830
    Abstract: Described are an encoder for coding speech-like content and/or general audio content, wherein the encoder is configured to embed, at least in some frames, parameters in a bitstream, which parameters enhance a concealment in case an original frame is lost, corrupted or delayed, and a decoder for decoding speech-like content and/or general audio content, wherein the decoder is configured to use parameters which are sent later in time to enhance a concealment in case an original frame is lost, corrupted or delayed, as well as a method for encoding and a method for decoding.
    Type: Grant
    Filed: February 27, 2017
    Date of Patent: December 29, 2020
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jérémie Lecomte, Benjamin Schubert, Michael Schnabel, Martin Dietz
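The decoder-side idea, using parameters sent later in time to conceal an earlier lost frame, can be sketched like this; the frame dictionaries and single-frame lookahead are simplifying assumptions:

```python
def decode_stream(frames):
    """Each received frame may carry 'redundant' parameters describing an
    EARLIER frame. When frame n is lost (None), conceal it using the
    redundancy in frame n+1 rather than plain repetition."""
    out = {}
    for n, frame in enumerate(frames):
        if frame is not None:
            out[n] = frame["payload"]
    for n, frame in enumerate(frames):
        if frame is None:  # lost, corrupted, or delayed
            nxt = frames[n + 1] if n + 1 < len(frames) else None
            if nxt is not None and "redundant" in nxt:
                out[n] = nxt["redundant"]   # enhanced concealment
            else:
                out[n] = out.get(n - 1, 0)  # fallback: repeat last good frame
    return [out[i] for i in range(len(frames))]
```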
  • Patent number: 10878801
    Abstract: A speech synthesis device of an embodiment includes a memory unit, a creating unit, a deciding unit, a generating unit and a waveform generating unit. The memory unit stores, as statistical model information of a statistical model, an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution. The creating unit creates a statistical model sequence from context information and the statistical model information. The deciding unit decides a pitch-cycle waveform count of each state using a duration based on the duration distribution of each state of each statistical model in the statistical model sequence, and pitch information based on the output distribution of the pitch feature parameters. The generating unit generates an output distribution sequence based on the pitch-cycle waveform count, and acoustic feature parameters based on the output distribution sequence.
    Type: Grant
    Filed: February 14, 2018
    Date of Patent: December 29, 2020
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune Tamura, Masahiro Morita
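The deciding unit's computation reduces to a simple relationship: the number of pitch-cycle waveforms in a state is its duration times the fundamental frequency. A minimal sketch, with the state representation assumed for illustration:

```python
def pitch_cycle_waveform_count(duration_sec, f0_hz):
    """Waveform count for one state: how many pitch periods fit in the
    duration drawn from the duration distribution, given pitch information
    from the output distribution of the pitch feature parameters."""
    return max(1, round(duration_sec * f0_hz))

def counts_for_states(states):
    # states: list of (mean_duration_sec, mean_f0_hz) pairs per model state.
    return [pitch_cycle_waveform_count(d, f0) for d, f0 in states]
```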
  • Patent number: 10839153
    Abstract: The discussion relates to unconscious bias detection. One example can detect potential bias words in a text sample and suggest alternative phrasing to eliminate the potential bias words from the text sample.
    Type: Grant
    Filed: May 24, 2017
    Date of Patent: November 17, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Naveen Gaur, Priyanka Subhash Kulkarni, Erika Debra Menezes, Wenlei Zhang
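The detect-and-suggest loop in this abstract can be sketched with a lookup table; the lexicon below is a hypothetical stand-in for whatever learned or curated resource the system actually uses:

```python
# Hypothetical bias lexicon; a real system would learn these associations.
BIAS_ALTERNATIVES = {
    "chairman": "chairperson",
    "manpower": "workforce",
}

def suggest_unbiased(text):
    """Detect potential bias words in a text sample and suggest
    alternative phrasing for each, as (position, word, suggestion)."""
    suggestions = []
    for i, word in enumerate(text.lower().split()):
        if word in BIAS_ALTERNATIVES:
            suggestions.append((i, word, BIAS_ALTERNATIVES[word]))
    return suggestions
```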
  • Patent number: 10824818
    Abstract: Systems and methods for synthesizing training data for multi-intent utterance segmentation include identifying a first corpus of utterances comprising a plurality of distinct single-intent in-domain utterances; identifying a second corpus of utterances comprising a plurality of distinct single-intent out-of-domain utterances; identifying a third corpus comprising a plurality of distinct conjunction terms; forming a multi-intent training corpus comprising synthetic multi-intent utterances, wherein forming each distinct multi-intent utterance includes: selecting a first distinct in-domain utterance from the first corpus of utterances; probabilistically selecting one of a first out-of-domain utterance from the second corpus and a second in-domain utterance from the first corpus; probabilistically selecting or not selecting a distinct conjunction term from the third corpus; and forming a synthetic multi-intent utterance including appending the first in-domain utterance with one of the first out-of-domain utterance and the second in-domain utterance.
    Type: Grant
    Filed: April 21, 2020
    Date of Patent: November 3, 2020
    Assignee: Clinc, Inc.
    Inventors: Joseph Peper, Parker Hill, Kevin Leach, Sean Stapleton, Jonathan K. Kummerfeld, Johann Hauswald, Michael Laurenzano, Lingjia Tang, Jason Mars
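The synthesis recipe above maps directly to code; the 50/50 selection probabilities below are assumptions, since the abstract says only "probabilistically":

```python
import random

def synthesize(in_domain, out_of_domain, conjunctions, n, seed=0):
    """Form synthetic multi-intent utterances per the recipe above."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(n):
        first = rng.choice(in_domain)
        # Probabilistically pair with an out-of-domain or a second in-domain utterance.
        second = rng.choice(out_of_domain if rng.random() < 0.5 else in_domain)
        # Probabilistically select, or not select, a conjunction term.
        conj = rng.choice(conjunctions) if rng.random() < 0.5 else ""
        corpus.append(" ".join(t for t in (first, conj, second) if t))
    return corpus
```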
  • Patent number: 10825470
    Abstract: The present disclosure provides a method and apparatus for detecting a starting point and a finishing point of a speech, a computer device and a storage medium, wherein the method comprises: obtaining speech data to be detected; segmenting the speech data into speech segments, the number of speech segments being greater than one; respectively determining speech states of respective speech segments based on a Voice Activity Detection model obtained by pre-training; determining a starting point and a finishing point of the speech data according to the speech states. The solution of the present disclosure can be employed to improve the accuracy of the detection results.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: November 3, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Chao Li, Weixin Zhu
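The segment-classify-endpoint pipeline can be sketched as below, with a caller-supplied predicate standing in for the pre-trained Voice Activity Detection model:

```python
def find_endpoints(samples, frame_len, is_speech):
    """Segment audio into frames, classify each frame's speech state with
    the VAD stand-in, and return (start, end) sample indices of speech."""
    states = [is_speech(samples[i:i + frame_len])
              for i in range(0, len(samples), frame_len)]
    speech_frames = [i for i, s in enumerate(states) if s]
    if not speech_frames:
        return None  # no speech detected
    start = speech_frames[0] * frame_len
    end = min((speech_frames[-1] + 1) * frame_len, len(samples))
    return start, end
```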
  • Patent number: 10826862
    Abstract: A system configured to generate and present hierarchical notifications is provided. The system may generate a notification structure comprising a plurality of notifications associated with a user, and output a top-level composite notification based on the notification structure. The system may receive a user command in response to the composite notification, and further output a lower-level composite notification based on the user-selected portion of the top-level composite notification identified by the user command. Further, the system may modify the notification structure based on the user command.
    Type: Grant
    Filed: February 27, 2018
    Date of Patent: November 3, 2020
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Syama Prasad Suprasadachandran Pillai, Jerry Bullard, Vivek Narayanan, Vishal Mangesh Shanbhag
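The hierarchy can be sketched with a plain dict as the notification structure; the category-keyed layout and summary format are illustrative assumptions:

```python
def top_level_summary(structure):
    """Compose the top-level notification: one entry per category."""
    return [f"{cat} ({len(items)})" for cat, items in structure.items()]

def drill_down(structure, category):
    """A user command selects a portion of the top-level composite;
    return the lower-level composite notification for that portion."""
    return structure.get(category, [])
```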
  • Patent number: 10825462
    Abstract: Methods and systems are provided for optimizing call quality and improving network efficiency by reducing bandwidth requirements at the individual-voice-call level. Embodiments provided herein build vocal fingerprints that correspond to the frequency range of the human voice, as well as the frequency range of the voice of individual users. The vocal fingerprints are used to minimize the transmission of background noise and ambient sound captured using HD voice while retaining the frequency range of a user's voice in HD voice. This filtered HD voice frequency range is then transmitted to recipients over the network. The reduced frequency range lowers bandwidth usage and conserves network resources, all while optimizing the call quality for individual users.
    Type: Grant
    Filed: September 4, 2018
    Date of Patent: November 3, 2020
    Assignee: Sprint Communications Company L.P.
    Inventors: Gregory Anderson Drews, Brian Dale Farst, Young Suk Lee, Raymond Reeves
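The fingerprint-then-filter idea can be sketched on a toy spectrum; the margin value and the dict-of-bins spectrum representation are assumptions for illustration:

```python
def build_fingerprint(observed_freqs, margin_hz=50):
    """A user's vocal fingerprint, reduced here to the retained
    frequency range of that user's voice (in Hz)."""
    return (min(observed_freqs) - margin_hz, max(observed_freqs) + margin_hz)

def filter_spectrum(spectrum, fingerprint):
    """Drop spectral bins outside the user's range before transmission,
    shrinking bandwidth while keeping the voice band intact."""
    lo, hi = fingerprint
    return {f: m for f, m in spectrum.items() if lo <= f <= hi}
```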
  • Patent number: 10824664
    Abstract: Embodiments of the present disclosure disclose a method and apparatus for processing information. A specific implementation of the method includes: parsing a received voice query request sent by a user to obtain text query information corresponding to the voice query request; obtaining text push information obtained by searching using the text query information; processing the text push information to obtain to-be-pushed information corresponding to the text push information; and playing the to-be-pushed information. The implementation can play information aloud when it is inconvenient for the user to browse it, so that the user can still obtain the information in a timely manner.
    Type: Grant
    Filed: December 27, 2017
    Date of Patent: November 3, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Hualong Zu, Haiguang Yuan, Ran Xu, Chen Chen, Lei Shi, Xin Li, Liou Chen
  • Patent number: 10810383
    Abstract: An image processing apparatus includes a control unit. The control unit acts as a searcher, a translator, a comparator, and an outputter. The searcher searches a first file and a second file, each including a text that is different in language from each other. The translator generates a translated text of the second file, by translating the text in the second file into a same language as a language of the text in the first file. The comparator compares between the text in the first file and the translated text generated from the second file, and detects a difference between the text of the first file and the text of the second file, through the comparison. The outputter outputs the difference detected by the comparator.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: October 20, 2020
    Assignee: KYOCERA Document Solutions Inc.
    Inventor: Shinobu Yoshioku
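The translate-then-compare flow can be sketched with `difflib`; the `translate` callable is a stand-in for whatever machine-translation backend the apparatus uses:

```python
import difflib

def compare_files(text_a, text_b, translate):
    """Translate file B's lines into file A's language, then detect and
    output the differences between the two texts."""
    translated = [translate(line) for line in text_b]
    diff = difflib.unified_diff(text_a, translated, lineterm="")
    # Keep only added/removed lines, dropping the "---"/"+++" file headers.
    return [d for d in diff
            if d.startswith(("+", "-")) and not d.startswith(("+++", "---"))]
```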
  • Patent number: 10811032
    Abstract: A method and apparatus to determine a direction of arrival (DOA) of a talker in the presence of a source of spatially-coherent noise. A time sequence of audio samples that include the spatially-coherent noise is received and buffered. Aided by previously known data, a trigger point is detected in the time sequence of audio samples when the talker begins to talk. The buffered time sequence of audio samples is separated into a noise segment and a signal-plus-noise segment based on the trigger point. For each direction of a plurality of distinct directions: an energy difference is computed for the direction between the noise segment and the signal-plus-noise segment, and the DOA of the talker is selected as the direction of the plurality of distinct directions having a largest of the computed energy differences.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: October 20, 2020
    Assignee: Cirrus Logic, Inc.
    Inventors: Ghassan Maalouli, Seth Suppappola
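The per-direction energy comparison can be sketched as below; the per-direction beam buffers are assumed already formed, and the trigger point already detected:

```python
def energy(seg):
    return sum(x * x for x in seg)

def estimate_doa(beams, trigger):
    """beams: {direction: buffered samples} for each candidate direction.
    Split each buffer at the trigger point into a noise segment and a
    signal-plus-noise segment, and pick the direction whose energy
    difference between the two segments is largest."""
    def rise(samples):
        return energy(samples[trigger:]) - energy(samples[:trigger])
    return max(beams, key=lambda d: rise(beams[d]))
```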
  • Patent number: 10800043
    Abstract: Disclosed herein are an interaction apparatus and method. The interaction apparatus includes an input unit for receiving multimodal information including an image and a voice of a target to allow the interaction apparatus to interact with the target, a recognition unit for recognizing turn-taking behavior of the target using the multimodal information, and an execution unit for taking an activity for interacting with the target based on results of recognition of the turn-taking behavior.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: October 13, 2020
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Cheon-Shu Park, Jae-Hong Kim, Jae-Yeon Lee, Min-Su Jang
  • Patent number: 10789962
    Abstract: A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
    Type: Grant
    Filed: November 12, 2018
    Date of Patent: September 29, 2020
    Inventors: Srinath Cheluvaraja, Ananth Nagaraja Iyer, Aravind Ganapathiraju, Felix Immanuel Wyss
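The best-estimate replacement at recognition time, leaving the acoustic models untouched, can be sketched as below; holding the last good feature frame is the simplest stand-in for the patent's estimate from previous frames and acoustic models:

```python
def repair_frames(frames):
    """Replace lost (None) feature frames with a best estimate from
    previous frames; recognition scores for repaired frames could then
    be normalized rather than retraining any acoustic model."""
    repaired, last = [], None
    for f in frames:
        if f is None and last is not None:
            f = last  # simplest best-estimate: hold the last good frame
        repaired.append(f)
        if f is not None:
            last = f
    return repaired
```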
  • Patent number: 10789948
    Abstract: This disclosure describes techniques and systems for encoding instructions in audio data that, when output on a speaker of a first device in an environment, cause a second device to output content in the environment. In some instances, the audio data has a frequency that is inaudible to users in the environment. Thus, the first device is able to cause the second device to output the content without users in the environment hearing the instructions. In some instances, the first device also outputs content, and the content output by the second device is played at an offset relative to a position of the content output by the first device.
    Type: Grant
    Filed: March 29, 2017
    Date of Patent: September 29, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Pete Klein, Derick Deller, Zoe Adams, David Ignatius Murai, Scott Edward Van Vliet, Apoorv Naik, Aslan Appleman, Link Cornelius
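Encoding instructions at an inaudible frequency can be sketched with simple on-off keying; the 19 kHz carrier, sample rate, and symbol length are assumptions, not values from the patent:

```python
import math

CARRIER_HZ = 19000  # near-ultrasonic: inaudible to most adult listeners

def encode_bits(bits, rate=48000, symbol_len=480):
    """Encode instruction bits as bursts of an inaudible carrier:
    carrier on for a 1 bit, silence for a 0 bit (on-off keying)."""
    samples = []
    for bit in bits:
        for n in range(symbol_len):
            samples.append(math.sin(2 * math.pi * CARRIER_HZ * n / rate)
                           if bit else 0.0)
    return samples
```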
  • Patent number: 10777192
    Abstract: A method and apparatus for recognizing the field of semantic parsing information, a device and a readable medium are provided. The method includes: obtaining at least one preset keyword extracting pattern which is in a preset field and used to parse user-input speech data to generate semantic parsing information, each of the at least one preset keyword extracting pattern including keywords and importance degree identifiers of the keywords; obtaining subject weights of the keywords according to the importance degree identifiers of the keywords in the preset keyword extracting patterns; calculating a subject score of the semantic parsing information according to the subject weights of the keywords; and recognizing whether the semantic parsing information belongs to the preset field according to the subject score. The method recognizes the field to which the semantic parsing information belongs to ensure correctness of the recognized field, and thereby ensures correctness of the operations performed by the App according to the semantic parsing information.
    Type: Grant
    Filed: May 15, 2018
    Date of Patent: September 15, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Qin Qu, Zejin Hu
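The subject-score computation reduces to a weighted sum plus a threshold test; the weights and threshold below are hypothetical:

```python
def subject_score(parsed_keywords, weights):
    """Sum the subject weights of the keywords found in the parse,
    where each weight derives from an importance degree identifier."""
    return sum(weights.get(k, 0.0) for k in parsed_keywords)

def belongs_to_field(parsed_keywords, weights, threshold=1.0):
    """Recognize whether the parse belongs to the preset field."""
    return subject_score(parsed_keywords, weights) >= threshold
```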
  • Patent number: 10748525
    Abstract: A multi-modal dialog apparatus includes a memory embodying computer executable instructions; and at least one processor, coupled to the memory, and operative by the computer executable instructions. More specifically, the processor is operative by the computer executable instructions to facilitate receiving a remark from a user; passing the remark to an intelligent system; receiving a response and a level of confidence from the intelligent system; portraying the response to the user via an equivocal persona in case the level of confidence is less than a pre-determined threshold value; and portraying the response to the user via an authoritative persona in case the level of confidence equals or exceeds the pre-determined threshold value.
    Type: Grant
    Filed: December 11, 2017
    Date of Patent: August 18, 2020
    Assignee: International Business Machines Corporation
    Inventors: Tamer Abuelsaad, Ravindranath Kokku
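The persona switch is a single threshold comparison; the threshold value and the rendered phrasings below are illustrative assumptions:

```python
def choose_persona(confidence, threshold=0.7):
    """Equivocal persona below the threshold, authoritative at or above it."""
    return "authoritative" if confidence >= threshold else "equivocal"

def render(response, confidence):
    """Portray the same intelligent-system response through the
    persona selected by its level of confidence."""
    persona = choose_persona(confidence)
    prefix = "I believe" if persona == "equivocal" else "The answer is"
    return f"{prefix}: {response}"
```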
  • Patent number: 10748546
    Abstract: Systems and processes for operating an intelligent automated assistant to provide extension of digital assistant services are provided. An example method includes, at an electronic device having one or more processors, receiving, from a first user, a first speech input representing a user request. The method further includes obtaining an identity of the first user; and in accordance with the user identity, providing a representation of the user request to at least one of a second electronic device or a third electronic device. The method further includes receiving, based on a determination of whether the second electronic device or the third electronic device, or both, is to provide the response to the first electronic device, the response to the user request from the second electronic device or the third electronic device. The method further includes providing a representation of the response to the first user.
    Type: Grant
    Filed: February 4, 2019
    Date of Patent: August 18, 2020
    Assignee: Apple Inc.
    Inventors: Yoon Kim, Charles Srisuwananukorn, David A. Carson, Thomas R. Gruber, Justin G. Binder
  • Patent number: 10734009
    Abstract: An envelope sequence is provided that can improve approximation accuracy near peaks caused by the pitch period of an audio signal. A periodic-combined-envelope-sequence generation device according to the present invention takes, as an input audio signal, a time-domain audio digital signal in each frame, which is a predetermined time segment, and generates a periodic combined envelope sequence as an envelope sequence. The periodic-combined-envelope-sequence generation device according to the present invention comprises at least a spectral-envelope-sequence calculating part and a periodic-combined-envelope generating part. The spectral-envelope-sequence calculating part calculates a spectral envelope sequence of the input audio signal on the basis of time-domain linear prediction of the input audio signal.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: August 4, 2020
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
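The spectral-envelope-sequence calculating part rests on time-domain linear prediction, which can be sketched with the standard autocorrelation method and Levinson-Durbin recursion (this is the textbook technique, not necessarily NTT's exact formulation):

```python
import math

def autocorr(x, order):
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(order + 1)]

def lpc(x, order):
    """Linear-prediction coefficients a[0..order] (a[0] = 1) via the
    Levinson-Durbin recursion; returns (coefficients, residual energy).
    The spectral envelope is then proportional to 1/|A(e^jw)|^2."""
    r = autocorr(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        a = [a[j] + k * a[i - j] for j in range(i + 1)] + a[i + 1:]
        err *= 1 - k * k
    return a, err

# A first-order decaying signal x[n] = 0.5 * x[n-1] is perfectly
# predicted, so an order-2 fit should give a ~ [1, -0.5, 0].
signal = [0.5 ** n for n in range(50)]
coeffs, residual = lpc(signal, 2)
```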