Patents Examined by James S. Wozniak
  • Patent number: 10929617
    Abstract: Text analysis includes determining one or more global analysis parameters based on backtranslation of a first corpus between supported languages. A new text analysis model is determined for an unsupported language based on the one or more global analysis parameters and a text analysis model for a first supported language. An input text is analyzed in the unsupported language with the new text analysis model.
    Type: Grant
    Filed: July 20, 2018
    Date of Patent: February 23, 2021
    Assignee: International Business Machines Corporation
    Inventors: Kohichi Kamijoh, Tetsuya Nasukawa, Yohei Ikawa, Masaki Ono
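The three steps in this abstract can be sketched, very loosely, in Python. The round-trip agreement measure and the weight-scaling transfer below are illustrative assumptions, not IBM's actual method:

```python
def global_params_from_backtranslation(corpus, translate, backtranslate):
    """Hypothetical global analysis parameter: how well text survives a
    round trip (backtranslation) through another supported language."""
    ratios = []
    for text in corpus:
        round_trip = backtranslate(translate(text))
        ratios.append(1.0 if round_trip == text else 0.8)  # toy agreement score
    return sum(ratios) / len(ratios)

def model_for_unsupported(base_model, global_param):
    """Derive a new model for the unsupported language by scaling the
    supported-language model's feature weights by the global parameter."""
    return {feat: w * global_param for feat, w in base_model.items()}
```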
  • Patent number: 10896681
    Abstract: This document describes, among other things, a computer-implemented method for transcribing an utterance. The method can include receiving, at a computing system, speech data that characterizes an utterance of a user. A first set of candidate transcriptions of the utterance can be generated using a static class-based language model that includes a plurality of classes that are each populated with class-based terms selected independently of the utterance or the user. The computing system can then determine whether the first set of candidate transcriptions includes class-based terms. Based on whether the first set of candidate transcriptions includes class-based terms, the computing system can determine whether to generate a dynamic class-based language model that includes at least one class that is populated with class-based terms selected based on a context associated with at least one of the utterance and the user.
    Type: Grant
    Filed: December 29, 2015
    Date of Patent: January 19, 2021
    Assignee: Google LLC
    Inventors: Petar Aleksic, Pedro J. Moreno Mengibar
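The static-then-dynamic decision described in this abstract can be sketched in a few lines; the class names, contact-list context, and thresholding below are illustrative assumptions, not Google's implementation:

```python
# Static classes are populated independently of any particular utterance/user.
STATIC_CLASSES = {
    "$CONTACT": {"alice", "bob"},
    "$SONG": {"yesterday", "imagine"},
}

def contains_class_terms(candidates, classes=STATIC_CLASSES):
    """True if any candidate transcription contains a class-based term."""
    all_terms = set().union(*classes.values())
    return any(word in all_terms for cand in candidates for word in cand.split())

def maybe_build_dynamic_lm(candidates, user_context):
    """Only build the (more expensive) dynamic class-based LM when the
    static first pass suggests class-based terms are actually present."""
    if not contains_class_terms(candidates):
        return None  # the static result suffices
    # Populate classes from per-user context, e.g. the user's own contacts.
    return {"$CONTACT": set(user_context.get("contacts", []))}
```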
  • Patent number: 10896673
    Abstract: Systems and techniques for authentication of impaired voices are described herein. A voice profile may be generated for a user. An audio stream may be received including an authentication voice of the user. It may be determined that the authentication voice does not match a first set of authentication criteria. The audio stream may be compared to a second set of authentication criteria. The user may be authenticated based on the comparison.
    Type: Grant
    Filed: September 21, 2017
    Date of Patent: January 19, 2021
    Assignee: Wells Fargo Bank, N.A.
    Inventors: Andrew J. Garner, IV, Tyua Larsen Fraser, Kimberly Ann MacInnis, Paul R. McMahon, Darrell Lee Suen, Zhong Wan
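The two-tier fallback in this abstract can be sketched as follows; the score ranges, thresholds, and averaging of secondary signals are assumptions for illustration, not the patented criteria:

```python
def authenticate(voice_score, secondary_signals,
                 primary_threshold=0.9, secondary_threshold=0.75):
    """If the voiceprint match fails the first set of criteria (e.g. the
    speaker's voice is impaired), fall back to a second set of criteria
    derived from other signals in the audio stream."""
    if voice_score >= primary_threshold:
        return True
    # Secondary criteria: mean of auxiliary scores (cadence, phrasing, etc.)
    fallback = sum(secondary_signals) / len(secondary_signals)
    return fallback >= secondary_threshold
```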
  • Patent number: 10878830
    Abstract: Described are an encoder for coding speech-like content and/or general audio content, wherein the encoder is configured to embed, at least in some frames, parameters in a bitstream, which parameters enhance a concealment in case an original frame is lost, corrupted or delayed, and a decoder for decoding speech-like content and/or general audio content, wherein the decoder is configured to use parameters which are sent later in time to enhance a concealment in case an original frame is lost, corrupted or delayed, as well as a method for encoding and a method for decoding.
    Type: Grant
    Filed: February 27, 2017
    Date of Patent: December 29, 2020
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jérémie Lecomte, Benjamin Schubert, Michael Schnabel, Martin Dietz
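The decoder-side idea, using parameters sent later in time to conceal an earlier lost frame, can be sketched like this; the frame dictionaries and single-frame lookahead are simplifying assumptions:

```python
def decode_stream(frames):
    """Each received frame may carry 'redundant' parameters describing an
    EARLIER frame. When frame n is lost (None), conceal it using the
    redundancy in frame n+1 rather than plain repetition."""
    out = {}
    for n, frame in enumerate(frames):
        if frame is not None:
            out[n] = frame["payload"]
    for n, frame in enumerate(frames):
        if frame is None:  # lost, corrupted, or delayed
            nxt = frames[n + 1] if n + 1 < len(frames) else None
            if nxt is not None and "redundant" in nxt:
                out[n] = nxt["redundant"]   # enhanced concealment
            else:
                out[n] = out.get(n - 1, 0)  # fallback: repeat last good frame
    return [out[i] for i in range(len(frames))]
```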
  • Patent number: 10878801
    Abstract: A speech synthesis device of an embodiment includes a memory unit, a creating unit, a deciding unit, a generating unit and a waveform generating unit. The memory unit stores, as statistical model information of a statistical model, an output distribution of acoustic feature parameters including pitch feature parameters and a duration distribution. The creating unit creates a statistical model sequence from context information and the statistical model information. The deciding unit decides a pitch-cycle waveform count of each state using a duration based on the duration distribution of each state of each statistical model in the statistical model sequence, and pitch information based on the output distribution of the pitch feature parameters. The generating unit generates an output distribution sequence based on the pitch-cycle waveform count, and acoustic feature parameters based on the output distribution sequence.
    Type: Grant
    Filed: February 14, 2018
    Date of Patent: December 29, 2020
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Masatsune Tamura, Masahiro Morita
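The deciding unit's computation reduces to a simple relationship: the number of pitch-cycle waveforms in a state is its duration times the fundamental frequency. A minimal sketch, with the state representation assumed for illustration:

```python
def pitch_cycle_waveform_count(duration_sec, f0_hz):
    """Waveform count for one state: how many pitch periods fit in the
    duration drawn from the duration distribution, given pitch information
    from the output distribution of the pitch feature parameters."""
    return max(1, round(duration_sec * f0_hz))

def counts_for_states(states):
    # states: list of (mean_duration_sec, mean_f0_hz) pairs per model state.
    return [pitch_cycle_waveform_count(d, f0) for d, f0 in states]
```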
  • Patent number: 10839153
    Abstract: The discussion relates to unconscious bias detection. One example can detect potential bias words in a text sample and suggest alternative phrasing to eliminate the potential bias words from the text sample.
    Type: Grant
    Filed: May 24, 2017
    Date of Patent: November 17, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Naveen Gaur, Priyanka Subhash Kulkarni, Erika Debra Menezes, Wenlei Zhang
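The detect-and-suggest loop in this abstract can be sketched with a lookup table; the lexicon below is a hypothetical stand-in for whatever learned or curated resource the system actually uses:

```python
# Hypothetical bias lexicon; a real system would learn these associations.
BIAS_ALTERNATIVES = {
    "chairman": "chairperson",
    "manpower": "workforce",
}

def suggest_unbiased(text):
    """Detect potential bias words in a text sample and suggest
    alternative phrasing for each, as (position, word, suggestion)."""
    suggestions = []
    for i, word in enumerate(text.lower().split()):
        if word in BIAS_ALTERNATIVES:
            suggestions.append((i, word, BIAS_ALTERNATIVES[word]))
    return suggestions
```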
  • Patent number: 10824818
    Abstract: Systems and methods for synthesizing training data for multi-intent utterance segmentation include identifying a first corpus of utterances comprising a plurality of distinct single-intent in-domain utterances; identifying a second corpus of utterances comprising a plurality of distinct single-intent out-of-domain utterances; identifying a third corpus comprising a plurality of distinct conjunction terms; forming a multi-intent training corpus comprising synthetic multi-intent utterances, wherein forming each distinct multi-intent utterance includes: selecting a first distinct in-domain utterance from the first corpus of utterances; probabilistically selecting one of a first out-of-domain utterance from the second corpus and a second in-domain utterance from the first corpus; probabilistically selecting or not selecting a distinct conjunction term from the third corpus; and forming a synthetic multi-intent utterance including appending the first in-domain utterance with one of the first out-of-domain utterance and the second in-domain utterance.
    Type: Grant
    Filed: April 21, 2020
    Date of Patent: November 3, 2020
    Assignee: Clinc, Inc.
    Inventors: Joseph Peper, Parker Hill, Kevin Leach, Sean Stapleton, Jonathan K. Kummerfeld, Johann Hauswald, Michael Laurenzano, Lingjia Tang, Jason Mars
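The synthesis recipe above maps directly to code; the 50/50 selection probabilities below are assumptions, since the abstract says only "probabilistically":

```python
import random

def synthesize(in_domain, out_of_domain, conjunctions, n, seed=0):
    """Form synthetic multi-intent utterances per the recipe above."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(n):
        first = rng.choice(in_domain)
        # Probabilistically pair with an out-of-domain or a second in-domain utterance.
        second = rng.choice(out_of_domain if rng.random() < 0.5 else in_domain)
        # Probabilistically select, or not select, a conjunction term.
        conj = rng.choice(conjunctions) if rng.random() < 0.5 else ""
        corpus.append(" ".join(t for t in (first, conj, second) if t))
    return corpus
```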
  • Patent number: 10825470
    Abstract: The present disclosure provides a method and apparatus for detecting a starting point and a finishing point of a speech, a computer device and a storage medium, wherein the method comprises: obtaining speech data to be detected; segmenting the speech data into speech segments, the number of speech segments being greater than one; respectively determining speech states of respective speech segments based on a Voice Activity Detection model obtained by pre-training; determining a starting point and a finishing point of the speech data according to the speech states. The solution of the present disclosure can be employed to improve the accuracy of the detection results.
    Type: Grant
    Filed: December 12, 2018
    Date of Patent: November 3, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Chao Li, Weixin Zhu
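The segment-classify-endpoint pipeline can be sketched as below, with a caller-supplied predicate standing in for the pre-trained Voice Activity Detection model:

```python
def find_endpoints(samples, frame_len, is_speech):
    """Segment audio into frames, classify each frame's speech state with
    the VAD stand-in, and return (start, end) sample indices of speech."""
    states = [is_speech(samples[i:i + frame_len])
              for i in range(0, len(samples), frame_len)]
    speech_frames = [i for i, s in enumerate(states) if s]
    if not speech_frames:
        return None  # no speech detected
    start = speech_frames[0] * frame_len
    end = min((speech_frames[-1] + 1) * frame_len, len(samples))
    return start, end
```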
  • Patent number: 10826862
    Abstract: A system configured to generate and present hierarchical notifications is provided. The system may generate a notification structure comprising a plurality of notifications associated with a user, and output a top-level composite notification based on the notification structure. The system may receive a user command in response to the composite notification, and further output a lower-level composite notification based on the user-selected portion of the top-level composite notification identified by the user command. Further, the system may modify the notification structure based on the user command.
    Type: Grant
    Filed: February 27, 2018
    Date of Patent: November 3, 2020
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Syama Prasad Suprasadachandran Pillai, Jerry Bullard, Vivek Narayanan, Vishal Mangesh Shanbhag
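The hierarchy can be sketched with a plain dict as the notification structure; the category-keyed layout and summary format are illustrative assumptions:

```python
def top_level_summary(structure):
    """Compose the top-level notification: one entry per category."""
    return [f"{cat} ({len(items)})" for cat, items in structure.items()]

def drill_down(structure, category):
    """A user command selects a portion of the top-level composite;
    return the lower-level composite notification for that portion."""
    return structure.get(category, [])
```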
  • Patent number: 10825462
    Abstract: Methods and systems are provided for optimizing call quality and improving network efficiency by reducing bandwidth requirements at the individual-voice-call level. Embodiments provided herein build vocal fingerprints that correspond to the frequency range of the human voice, as well as the frequency range of the voice of individual users. The vocal fingerprints are used to minimize the transmission of background noise and ambient sound captured using HD voice while retaining the frequency range of a user's voice in HD voice. This filtered HD voice frequency range is then transmitted to recipients over the network. The reduced frequency range lowers bandwidth usage and conserves network resources, all while optimizing the call quality for individual users.
    Type: Grant
    Filed: September 4, 2018
    Date of Patent: November 3, 2020
    Assignee: Sprint Communications Company L.P.
    Inventors: Gregory Anderson Drews, Brian Dale Farst, Young Suk Lee, Raymond Reeves
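The fingerprint-then-filter idea can be sketched on a toy spectrum; the margin value and the dict-of-bins spectrum representation are assumptions for illustration:

```python
def build_fingerprint(observed_freqs, margin_hz=50):
    """A user's vocal fingerprint, reduced here to the retained
    frequency range of that user's voice (in Hz)."""
    return (min(observed_freqs) - margin_hz, max(observed_freqs) + margin_hz)

def filter_spectrum(spectrum, fingerprint):
    """Drop spectral bins outside the user's range before transmission,
    shrinking bandwidth while keeping the voice band intact."""
    lo, hi = fingerprint
    return {f: m for f, m in spectrum.items() if lo <= f <= hi}
```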
  • Patent number: 10824664
    Abstract: Embodiments of the present disclosure disclose a method and apparatus for processing information. A specific implementation of the method includes: parsing a received voice query request sent by a user to obtain text query information corresponding to the voice query request; obtaining text push information obtained by searching using the text query information; processing the text push information to obtain to-be-pushed information corresponding to the text push information; and playing the to-be-pushed information. The implementation can play information aloud when it is inconvenient for the user to browse it, so that the user can still obtain the information in a timely manner.
    Type: Grant
    Filed: December 27, 2017
    Date of Patent: November 3, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Hualong Zu, Haiguang Yuan, Ran Xu, Chen Chen, Lei Shi, Xin Li, Liou Chen
  • Patent number: 10810383
    Abstract: An image processing apparatus includes a control unit. The control unit acts as a searcher, a translator, a comparator, and an outputter. The searcher searches a first file and a second file, each including a text that is different in language from each other. The translator generates a translated text of the second file, by translating the text in the second file into a same language as a language of the text in the first file. The comparator compares between the text in the first file and the translated text generated from the second file, and detects a difference between the text of the first file and the text of the second file, through the comparison. The outputter outputs the difference detected by the comparator.
    Type: Grant
    Filed: December 11, 2018
    Date of Patent: October 20, 2020
    Assignee: KYOCERA Document Solutions Inc.
    Inventor: Shinobu Yoshioku
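The translate-then-compare flow can be sketched with `difflib`; the `translate` callable is a stand-in for whatever machine-translation backend the apparatus uses:

```python
import difflib

def compare_files(text_a, text_b, translate):
    """Translate file B's lines into file A's language, then detect and
    output the differences between the two texts."""
    translated = [translate(line) for line in text_b]
    diff = difflib.unified_diff(text_a, translated, lineterm="")
    # Keep only added/removed lines, dropping the "---"/"+++" file headers.
    return [d for d in diff
            if d.startswith(("+", "-")) and not d.startswith(("+++", "---"))]
```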
  • Patent number: 10811032
    Abstract: A method and apparatus to determine a direction of arrival (DOA) of a talker in the presence of a source of spatially-coherent noise. A time sequence of audio samples that include the spatially-coherent noise is received and buffered. Aided by previously known data, a trigger point is detected in the time sequence of audio samples when the talker begins to talk. The buffered time sequence of audio samples is separated into a noise segment and a signal-plus-noise segment based on the trigger point. For each direction of a plurality of distinct directions: an energy difference is computed for the direction between the noise segment and the signal-plus-noise segment, and the DOA of the talker is selected as the direction of the plurality of distinct directions having a largest of the computed energy differences.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: October 20, 2020
    Assignee: Cirrus Logic, Inc.
    Inventors: Ghassan Maalouli, Seth Suppappola
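The per-direction energy comparison can be sketched as below; the per-direction beam buffers are assumed already formed, and the trigger point already detected:

```python
def energy(seg):
    return sum(x * x for x in seg)

def estimate_doa(beams, trigger):
    """beams: {direction: buffered samples} for each candidate direction.
    Split each buffer at the trigger point into a noise segment and a
    signal-plus-noise segment, and pick the direction whose energy
    difference between the two segments is largest."""
    def rise(samples):
        return energy(samples[trigger:]) - energy(samples[:trigger])
    return max(beams, key=lambda d: rise(beams[d]))
```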
  • Patent number: 10800043
    Abstract: Disclosed herein are an interaction apparatus and method. The interaction apparatus includes an input unit for receiving multimodal information including an image and a voice of a target to allow the interaction apparatus to interact with the target, a recognition unit for recognizing turn-taking behavior of the target using the multimodal information, and an execution unit for taking an activity for interacting with the target based on results of recognition of the turn-taking behavior.
    Type: Grant
    Filed: November 30, 2018
    Date of Patent: October 13, 2020
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Cheon-Shu Park, Jae-Hong Kim, Jae-Yeon Lee, Min-Su Jang
  • Patent number: 10789962
    Abstract: A system and method are presented for the correction of packet loss in audio in automatic speech recognition (ASR) systems. Packet loss correction, as presented herein, occurs at the recognition stage without modifying any of the acoustic models generated during training. The behavior of the ASR engine in the absence of packet loss is thus not altered. To accomplish this, the actual input signal may be rectified, the recognition scores may be normalized to account for signal errors, and a best-estimate method using information from previous frames and acoustic models may be used to replace the noisy signal.
    Type: Grant
    Filed: November 12, 2018
    Date of Patent: September 29, 2020
    Inventors: Srinath Cheluvaraja, Ananth Nagaraja Iyer, Aravind Ganapathiraju, Felix Immanuel Wyss
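The best-estimate replacement at recognition time, leaving the acoustic models untouched, can be sketched as below; holding the last good feature frame is the simplest stand-in for the patent's estimate from previous frames and acoustic models:

```python
def repair_frames(frames):
    """Replace lost (None) feature frames with a best estimate from
    previous frames; recognition scores for repaired frames could then
    be normalized rather than retraining any acoustic model."""
    repaired, last = [], None
    for f in frames:
        if f is None and last is not None:
            f = last  # simplest best-estimate: hold the last good frame
        repaired.append(f)
        if f is not None:
            last = f
    return repaired
```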
  • Patent number: 10789948
    Abstract: This disclosure describes techniques and systems for encoding instructions in audio data that, when output on a speaker of a first device in an environment, cause a second device to output content in the environment. In some instances, the audio data has a frequency that is inaudible to users in the environment. Thus, the first device is able to cause the second device to output the content without users in the environment hearing the instructions. In some instances, the first device also outputs content, and the content output by the second device is played at an offset relative to a position of the content output by the first device.
    Type: Grant
    Filed: March 29, 2017
    Date of Patent: September 29, 2020
    Assignee: Amazon Technologies, Inc.
    Inventors: Pete Klein, Derick Deller, Zoe Adams, David Ignatius Murai, Scott Edward Van Vliet, Apoorv Naik, Aslan Appleman, Link Cornelius
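Encoding instructions at an inaudible frequency can be sketched with simple on-off keying; the 19 kHz carrier, sample rate, and symbol length are assumptions, not values from the patent:

```python
import math

CARRIER_HZ = 19000  # near-ultrasonic: inaudible to most adult listeners

def encode_bits(bits, rate=48000, symbol_len=480):
    """Encode instruction bits as bursts of an inaudible carrier:
    carrier on for a 1 bit, silence for a 0 bit (on-off keying)."""
    samples = []
    for bit in bits:
        for n in range(symbol_len):
            samples.append(math.sin(2 * math.pi * CARRIER_HZ * n / rate)
                           if bit else 0.0)
    return samples
```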
  • Patent number: 10777192
    Abstract: A method and apparatus for recognizing the field of semantic parsing information, a device and a readable medium are provided. The method includes: obtaining at least one preset keyword extracting pattern which is in a preset field and used to parse user-input speech data to generate semantic parsing information, each of the at least one preset keyword extracting pattern including keywords and importance degree identifiers of the keywords; obtaining subject weights of the keywords according to the importance degree identifiers of the keywords in the preset keyword extracting patterns; calculating a subject score of the semantic parsing information according to the subject weights of the keywords; and recognizing whether the semantic parsing information belongs to the preset field according to the subject score. The method recognizes the field to which the semantic parsing information belongs to ensure correctness of the recognized field, and thereby ensures correctness of the operations performed by the App according to the semantic parsing information.
    Type: Grant
    Filed: May 15, 2018
    Date of Patent: September 15, 2020
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Qin Qu, Zejin Hu
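The subject-score computation reduces to a weighted sum plus a threshold test; the weights and threshold below are hypothetical:

```python
def subject_score(parsed_keywords, weights):
    """Sum the subject weights of the keywords found in the parse,
    where each weight derives from an importance degree identifier."""
    return sum(weights.get(k, 0.0) for k in parsed_keywords)

def belongs_to_field(parsed_keywords, weights, threshold=1.0):
    """Recognize whether the parse belongs to the preset field."""
    return subject_score(parsed_keywords, weights) >= threshold
```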
  • Patent number: 10748525
    Abstract: A multi-modal dialog apparatus includes a memory embodying computer executable instructions; and at least one processor, coupled to the memory, and operative by the computer executable instructions. More specifically, the processor is operative by the computer executable instructions to facilitate receiving a remark from a user; passing the remark to an intelligent system; receiving a response and a level of confidence from the intelligent system; portraying the response to the user via an equivocal persona in case the level of confidence is less than a pre-determined threshold value; and portraying the response to the user via an authoritative persona in case the level of confidence equals or exceeds the pre-determined threshold value.
    Type: Grant
    Filed: December 11, 2017
    Date of Patent: August 18, 2020
    Assignee: International Business Machines Corporation
    Inventors: Tamer Abuelsaad, Ravindranath Kokku
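The persona switch is a single threshold comparison; the threshold value and the rendered phrasings below are illustrative assumptions:

```python
def choose_persona(confidence, threshold=0.7):
    """Equivocal persona below the threshold, authoritative at or above it."""
    return "authoritative" if confidence >= threshold else "equivocal"

def render(response, confidence):
    """Portray the same intelligent-system response through the
    persona selected by its level of confidence."""
    persona = choose_persona(confidence)
    prefix = "I believe" if persona == "equivocal" else "The answer is"
    return f"{prefix}: {response}"
```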
  • Patent number: 10748546
    Abstract: Systems and processes for operating an intelligent automated assistant to provide extension of digital assistant services are provided. An example method includes, at an electronic device having one or more processors, receiving, from a first user, a first speech input representing a user request. The method further includes obtaining an identity of the first user; and in accordance with the user identity, providing a representation of the user request to at least one of a second electronic device or a third electronic device. The method further includes receiving, based on a determination of whether the second electronic device or the third electronic device, or both, is to provide the response to the first electronic device, the response to the user request from the second electronic device or the third electronic device. The method further includes providing a representation of the response to the first user.
    Type: Grant
    Filed: February 4, 2019
    Date of Patent: August 18, 2020
    Assignee: Apple Inc.
    Inventors: Yoon Kim, Charles Srisuwananukorn, David A. Carson, Thomas R. Gruber, Justin G. Binder
  • Patent number: 10734009
    Abstract: An envelope sequence is provided that can improve approximation accuracy near peaks caused by the pitch period of an audio signal. A periodic-combined-envelope-sequence generation device according to the present invention takes, as an input audio signal, a time-domain audio digital signal in each frame, which is a predetermined time segment, and generates a periodic combined envelope sequence as an envelope sequence. The periodic-combined-envelope-sequence generation device according to the present invention comprises at least a spectral-envelope-sequence calculating part and a periodic-combined-envelope generating part. The spectral-envelope-sequence calculating part calculates a spectral envelope sequence of the input audio signal on the basis of time-domain linear prediction of the input audio signal.
    Type: Grant
    Filed: December 21, 2018
    Date of Patent: August 4, 2020
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
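The spectral-envelope-sequence calculating part rests on time-domain linear prediction, which can be sketched with the standard autocorrelation method and Levinson-Durbin recursion (this is the textbook technique, not necessarily NTT's exact formulation):

```python
import math

def autocorr(x, order):
    return [sum(x[i] * x[i + k] for i in range(len(x) - k))
            for k in range(order + 1)]

def lpc(x, order):
    """Linear-prediction coefficients a[0..order] (a[0] = 1) via the
    Levinson-Durbin recursion; returns (coefficients, residual energy).
    The spectral envelope is then proportional to 1/|A(e^jw)|^2."""
    r = autocorr(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        a = [a[j] + k * a[i - j] for j in range(i + 1)] + a[i + 1:]
        err *= 1 - k * k
    return a, err

# A first-order decaying signal x[n] = 0.5 * x[n-1] is perfectly
# predicted, so an order-2 fit should give a ~ [1, -0.5, 0].
signal = [0.5 ** n for n in range(50)]
coeffs, residual = lpc(signal, 2)
```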