Patents Examined by Michael C Colucci

Decoder for generating a frequency enhanced audio signal, method of decoding, encoder for generating an encoded signal and method of encoding using compact selection side information

Patent number: 10657979

Abstract: A decoder for generating a frequency enhanced audio signal, includes: a feature extractor for extracting a feature from a core signal; a side information extractor for extracting a selection side information associated with the core signal; a parameter generator for generating a parametric representation for estimating a spectral range of the frequency enhanced audio signal not defined by the core signal, wherein the parameter generator is configured to provide a number of parametric representation alternatives in response to the feature, and wherein the parameter generator is configured to select one of the parametric representation alternatives as the parametric representation in response to the selection side information; and a signal estimator for estimating the frequency enhanced audio signal using the parametric representation selected.

Type: Grant

Filed: July 28, 2015

Date of Patent: May 19, 2020

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.

Inventors: Frederik Nagel, Sascha Disch, Andreas Niedermeier

Identifying an accurate transcription from probabilistic inputs

Patent number: 10629205

Abstract: An approach is provided for identifying an accurate transcription of a sentence. Options for transcriptions of each word in the sentence are determined. Probabilistic scores of the options are determined. Variations of a transcription of the sentence are generated by randomly selecting from the options with the probabilistic scores weighting the selections. Plausibility scores for the variations are generated by performing syntactic, semantic, and redundancy analyses of the variations. Based on the plausibility scores, the probabilistic scores, and the variations, tentative transcriptions of the sentence are determined and refined repeatedly by employing a genetic evolution technique until a final refined tentative transcription is the accurate transcription of the sentence.

Type: Grant

Filed: June 12, 2018

Date of Patent: April 21, 2020

Assignee: International Business Machines Corporation

Inventors: Giulia Carnevale, Marco Gianfico, Ciro Ragusa, Roberto Ragusa
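
The weighted random selection step described in the abstract of patent 10629205 can be illustrated with a minimal Python sketch. The function name `sample_variations`, the example word options, and their probabilities are hypothetical and do not come from the patent. The plausibility scoring and the genetic refinement steps are not shown.

```python
import random


def sample_variations(word_options, n, seed=0):
    """Draw n sentence variations from per-word candidate lists.

    word_options holds one list per word position; each list contains
    (candidate, probability) pairs. All names and data are hypothetical.
    """
    rng = random.Random(seed)
    variations = []
    for _ in range(n):
        words = []
        for options in word_options:
            candidates = [word for word, _ in options]
            weights = [prob for _, prob in options]
            # The probabilistic scores weight the random selection.
            words.append(rng.choices(candidates, weights=weights, k=1)[0])
        variations.append(" ".join(words))
    return variations


# Hypothetical recognizer output for a two-word utterance.
options = [
    [("I", 0.9), ("eye", 0.1)],
    [("scream", 0.5), ("ice cream", 0.5)],
]
variations = sample_variations(options, n=5)
```

Each variation produced this way would then be scored for plausibility before any refinement takes place.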

Cognitive print speaker modeler

Patent number: 10621990

Abstract: Aspects of the present invention provide devices that subtitle streaming video with audio and identify a speaker in a streaming video with audio according to words spoken by the speaker matched to a cognitive print. The cognitive print includes traits classified according to a hierarchical long short term model (LSTM). The hierarchical LSTM includes layers of LSTMs and each layer corresponds to the classification of one trait. A processor annotates a subtitle of the words spoken by the speaker, which decorates the subtitle with a label representative of the identified speaker, and streams the decorated subtitle with the streaming video with audio.

Type: Grant

Filed: April 30, 2018

Date of Patent: April 14, 2020

Assignee: International Business Machines Corporation

Inventors: Jeff Amsterdam, Aaron K. Baughman, Stephen C. Hammer, David A. Provan

Auto-translation for multi user audio and video

Patent number: 10614173

Abstract: The disclosed subject matter provides a system, computer readable storage medium, and a method providing an audio and textual transcript of a communication. A conferencing service may receive audio or audio visual signals from a plurality of different devices that receive voice communications from participants in a communication, such as a chat or teleconference. The audio signals represent voice (speech) communications input into respective different devices by the participants. A translation services server may receive over a separate communication channel the audio signals for translation into a second language. As managed by the translation services server, the audio signals may be converted into textual data. The textual data may be translated into text of different languages based on the language preferences of the end user devices in the teleconference. The translated text may be further translated into audio signals.

Type: Grant

Filed: July 9, 2019

Date of Patent: April 7, 2020

Assignee: Google LLC

Inventors: Trausti Kristjansson, John Huang, Yu-Kuan Lin, Hung-ying Tyan, Jakob David Uszkoreit, Joshua James Estelle, Chung-yi Wang, Kirill Buryak, Yusuke Konishi

Systems and methods for providing automated natural language dialogue with customers

Patent number: 10614793

Abstract: A system includes one or more memory devices storing instructions, and one or more processors configured to execute the instructions to perform steps of providing automated natural dialogue with a customer. The system may generate one or more events and commands temporarily stored in queues to be processed by one or more of a dialogue management device, an API server, and an NLP device. The dialogue management device may create adaptive responses to customer communications using a customer context, a rules-based platform, and a trained machine learning model.

Type: Grant

Filed: May 31, 2019

Date of Patent: April 7, 2020

Assignee: CAPITAL ONE SERVICES, LLC

Inventors: Gregory W. Zoller, Scott Karp, Sujay Eliphaz Jacob, Erik Mueller, Stephanie Hay, Adam Roy Paynter

Control device, display device, method, and program

Patent number: 10607607

Abstract: There is provided a control device to improve convenience for a user by resolving or alleviating a disadvantage of a known voice interaction, the control device including: a device control unit configured to control one or more controlled devices; a voice notification unit configured to output user-oriented voice notification regarding at least the one controlled device; and a display control unit configured to cause a display device to display a message corresponding to the voice notification output by the voice notification unit.

Type: Grant

Filed: October 4, 2016

Date of Patent: March 31, 2020

Assignee: SONY CORPORATION

Inventor: Hideo Nagasaka

Automatically suggesting a temporal opportunity for and assisting a writer in writing one or more sequel articles via artificial intelligence

Patent number: 10599783

Abstract: Methods, systems, and computer program products for automatically suggesting a temporal opportunity for writing one or more sequel articles via artificial intelligence are provided herein. A computer-implemented method includes extracting one or more types of information from a prior written document; automatically determining, based on the extracted information, at least one temporal opportunity for generating a follow-up written document to the prior written document; automatically generating a follow-up written document to the prior written document, the follow-up written document being written in a style that indicates that it is in response to the prior written document, in accordance with the at least one determined temporal opportunity, and based on (i) one or more items of information, related to the extracted information, derived from one or more web sources, and (ii) a writing model attributed to a user.

Type: Grant

Filed: December 26, 2017

Date of Patent: March 24, 2020

Assignee: International Business Machines Corporation

Inventors: Pranay Lohia, Saket Gurukar, Rishabh Gupta, Himanshu Gupta

Method and apparatus for sinusoidal encoding and decoding

Patent number: 10593342

Abstract: An audio signal encoding method is provided. The method comprises: collecting audio signal samples, determining sinusoidal components in subsequent frames, estimation of amplitudes and frequencies of the components for each frame, merging thus obtained pairs into sinusoidal trajectories, splitting particular trajectories into segments, transforming particular trajectories to the frequency domain by means of a digital transform performed on segments longer than the frame duration, quantization and selection of transform coefficients in the segments, entropy encoding, outputting the quantized coefficients as output data, wherein segments of different trajectories starting within a particular time are grouped into Groups of Segments (GOS), and the partitioning of trajectories into segments is synchronized with the endpoints of a Group of Segments.

Type: Grant

Filed: March 22, 2018

Date of Patent: March 17, 2020

Assignees: Huawei Technologies Co., Ltd., ZYLIA SP. Z O.O.

Inventors: Tomasz Żernicki, Łukasz Januszkiewicz, Panji Setiawan

System for automatic extraction of structure from spoken conversation using lexical and acoustic features

Patent number: 10592611

Abstract: Embodiments of the present invention provide a system for automatically extracting conversational structure from a voice record based on lexical and acoustic features. The system also aggregates business-relevant statistics and entities from a collection of spoken conversations. The system may infer a coarse-level conversational structure based on fine-level activities identified from extracted acoustic features. The system improves significantly over previous systems by extracting structure based on lexical and acoustic features. This enables extracting conversational structure on a larger scale and finer level of detail than previous systems, and can feed an analytics and business intelligence platform, e.g. for customer service phone calls. During operation, the system obtains a voice record. The system then extracts a lexical feature using automatic speech recognition (ASR). The system extracts an acoustic feature.

Type: Grant

Filed: October 24, 2016

Date of Patent: March 17, 2020

Assignee: Conduent Business Services, LLC

Inventors: Jesse Vig, Harish Arsikere, Margaret H. Szymanski, Luke R. Plurkowski, Kyle D. Dent, Daniel G. Bobrow, Daniel Davies, Eric Saund

Method and apparatus of processing user data of a multi-speaker conference call

Patent number: 10574827

Abstract: A method and apparatus of sharing documents during a conference call is disclosed. One example method may include initiating a document sharing operation during a conference call conducted between at least two participants communicating during the conference call. The method may also include transferring the document from one of the two participants to another of the two participants, and recording at least one action performed to the document by the participants during the conference call.

Type: Grant

Filed: April 9, 2019

Date of Patent: February 25, 2020

Assignee: West Corporation

Inventors: Mark J. Pettay, Hendryanto Rilantono, Myron P. Sojka

Inter-channel bandwidth extension

Patent number: 10573326

Abstract: A method includes decoding a low-band mid channel bitstream to generate a low-band mid signal and a low-band mid excitation signal. The method further includes decoding a high-band mid channel bandwidth extension bitstream to generate a synthesized high-band mid signal. The method also includes determining an inter-channel bandwidth extension (ICBWE) gain mapping parameter corresponding to the synthesized high-band mid signal. The ICBWE gain mapping parameter is based on a selected frequency-domain gain parameter that is extracted from a stereo downmix/upmix parameter bitstream. The method further includes performing a gain scaling operation on the synthesized high-band mid signal based on the ICBWE gain mapping parameter to generate a reference high-band channel and a target high-band channel. The method includes outputting a first audio channel and a second audio channel. The first audio channel is based on the reference high-band channel, and the second audio channel is based on the target high-band channel.

Type: Grant

Filed: March 26, 2018

Date of Patent: February 25, 2020

Assignee: Qualcomm Incorporated

Inventors: Venkata Subrahmanyam Chandra Sekhar Chebiyyam, Venkatraman Atti

Coding device, decoding device, and method and program thereof

Patent number: 10553229

Abstract: A technology of accurately coding and decoding coefficients which are convertible into linear prediction coefficients even for a frame in which the spectrum variation is great while suppressing an increase in the code amount as a whole is provided. A coding device includes: a first coding unit that obtains a first code by coding coefficients which are convertible into linear prediction coefficients of more than one order; and a second coding unit that obtains a second code by coding at least quantization errors of the first coding unit if (A-1) an index Q commensurate with how high the peak-to-valley height of a spectral envelope is, the spectral envelope corresponding to the coefficients which are convertible into the linear prediction coefficients of more than one order, is larger than or equal to a predetermined threshold value Th1 and/or (B-1) an index Q′ commensurate with how short the peak-to-valley height of the spectral envelope is, is smaller than or equal to a predetermined threshold value Th1′.

Type: Grant

Filed: June 3, 2019

Date of Patent: February 4, 2020

Assignee: Nippon Telegraph and Telephone Corporation

Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada

System and methods for correcting text-to-speech pronunciation

Patent number: 10553200

Abstract: A text-to-speech (TTS) computing device includes a processor and a memory. The TTS computing device is configured to generate a machine pronunciation of a text data according to at least one phonetic rule, and provide the machine pronunciation to a user interface of the TTS computing device such that the machine pronunciation is audibly communicated to a user of the TTS computing device. The TTS computing device is also configured to receive a pronunciation correction of the machine pronunciation from the user via the user interface, and store the pronunciation correction in a TTS data source. The TTS computing device is further configured to assign the pronunciation correction provided by the user to a user profile that corresponds to the text data.

Type: Grant

Filed: April 27, 2018

Date of Patent: February 4, 2020

Assignee: Mastercard International Incorporated

Inventor: Jason Jay Lacoss-Arnold

Decoding apparatus, decoding method, and program

Patent number: 10553230

Abstract: The present disclosure relates to a decoding apparatus, a decoding method, and a program that can switch, as quickly as possible, a plurality of audio encoded bit streams with synchronized reproduction timing to thereby decode and output the plurality of audio encoded bit streams. An aspect of the present disclosure provides a decoding apparatus including: an acquisition unit that acquires a plurality of audio encoded bit streams; a selection unit that determines a boundary position for switching output of the plurality of audio encoded bit streams and that selectively supplies one of the plurality of acquired audio encoded bit streams to a decoding processing unit according to the boundary position; and the decoding processing unit that applies a decoding process including IMDCT processing to the one input through the selection unit, in which the decoding processing unit skips overlap-and-add in the IMDCT processing corresponding to each frame before and after the boundary position.

Type: Grant

Filed: October 26, 2016

Date of Patent: February 4, 2020

Assignee: Sony Corporation

Inventors: Mitsuyuki Hatanaka, Toru Chinen, Minoru Tsuji, Hiroyuki Honma

Pronoun mapping for sub-context rendering

Patent number: 10546060

Abstract: An approach is provided to detect pronouns that are included in textual posts that are found in an online discussion. The textual posts are analyzed using a natural language processing speech classification technique, that results in an identification of a noun to which the detected pronoun refers. The system then displays, on a display device, the noun to which the pronoun refers.

Type: Grant

Filed: December 19, 2017

Date of Patent: January 28, 2020

Assignee: International Business Machines Corporation

Inventors: Robert H. Grant, Trudy L. Hewitt, Fang Lu

Systems and methods for identifying speech based on spectral features

Patent number: 10546598

Abstract: Audio information defining audio content may be accessed. The audio content may have a duration. The audio content may be segmented into audio segments. Individual audio segments may correspond to a portion of the duration. The audio segments may include a first audio segment corresponding to a first portion of the duration. Energy features, entropy features, frequency features, and/or other features of the audio segments may be determined. Energy features may characterize energy of the audio segments. Entropy features may characterize spectral flatness of the audio segments. Frequency features may characterize highest frequencies of the audio segments. One or more of the audio segments may be identified as containing speech based on the energy features, the entropy features, the frequency features, and/or other information. Storage of the identification of the one or more of the audio segments as containing speech in one or more storage media may be effectuated.

Type: Grant

Filed: August 16, 2019

Date of Patent: January 28, 2020

Assignee: GoPro, Inc.

Inventor: Tom Médioni
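
Two of the per-segment features named in the abstract of patent 10546598 can be sketched as follows. This is only an illustration under stated assumptions: energy is taken as the mean squared sample value, and the entropy feature is approximated by the common spectral flatness measure (geometric mean of the power spectrum divided by its arithmetic mean). The function names, the plain DFT, and the test signals are hypothetical; the patent's actual feature definitions and decision rule are not reproduced here.

```python
import cmath
import math


def power_spectrum(segment):
    """Power of each non-negative frequency bin, computed with a plain DFT."""
    n = len(segment)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(segment)
        )
        spectrum.append(abs(acc) ** 2)
    return spectrum


def energy(segment):
    """Mean squared sample value of the segment (assumed energy feature)."""
    return sum(x * x for x in segment) / len(segment)


def spectral_flatness(segment, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum.

    Values near 1 indicate a flat, noise-like spectrum; values near 0
    indicate a peaky, tonal spectrum. eps avoids taking log(0).
    """
    power = [p + eps for p in power_spectrum(segment)]
    geometric_mean = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic_mean = sum(power) / len(power)
    return geometric_mean / arithmetic_mean


# Hypothetical test segments: a pure tone and a single impulse.
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63
```

A pure tone concentrates its power in one bin and yields a flatness close to 0, while an impulse has equal power in every bin and yields a flatness close to 1.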

Individualized hotword detection models

Patent number: 10535354

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method includes actions of obtaining enrollment acoustic data representing an enrollment utterance spoken by a user, obtaining a set of candidate acoustic data representing utterances spoken by other users, determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, selecting a subset of candidate acoustic data from the set of candidate acoustic data based at least on the similarity scores, generating a detection model based on the subset of candidate acoustic data, and providing the detection model for use in detecting an utterance spoken by the user.

Type: Grant

Filed: June 29, 2016

Date of Patent: January 14, 2020

Assignee: Google LLC

Inventor: Raziel Alvarez Guevara
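
The candidate selection step described in the abstract of patent 10535354 can be illustrated with a short Python sketch. It assumes that each utterance has already been reduced to a fixed-length feature vector and uses cosine similarity as the similarity score; both choices, along with the function names and the example vectors, are hypothetical and are not stated in the abstract. Generating the detection model from the selected subset is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_candidates(enrollment, candidates, k):
    """Return the names of the k candidates most similar to the enrollment."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine_similarity(enrollment, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical feature vectors for one enrollment and three other users.
enrollment = [1.0, 0.0]
candidates = {"a": [0.9, 0.1], "b": [0.0, 1.0], "c": [1.0, 0.05]}
selected = select_candidates(enrollment, candidates, k=2)
```

With these example vectors, the two candidates pointing most nearly in the same direction as the enrollment vector are kept and the orthogonal one is dropped.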

Bot-based data collection for detecting phone solicitations

Patent number: 10535019

Abstract: One embodiment provides a method comprising answering one or more incoming phone calls received at one or more pre-specified phone numbers utilizing a bot. The bot is configured to engage in a conversation with a caller initiating an incoming phone call utilizing a voice recording that impersonates a human being. The method further comprises recording each conversation the bot engages in, and classifying each recorded conversation as one of poison data or truthful training data based on content of the recorded conversation and one or more learned detection models for detecting poisoned data.

Type: Grant

Filed: July 31, 2018

Date of Patent: January 14, 2020

Assignee: International Business Machines Corporation

Inventors: Nathalie Baracaldo Angel, Pawan R. Chowdhary, Heiko H. Ludwig, Robert J. Moore, Taiga Nakamura

Audio signal processing

Patent number: 10529352

Abstract: An audio signal processing device comprises: an audio input configured to receive an audio signal to be coded; an audio codec configured to apply audio coding to the audio signal, thereby generating coded audio data, having an audio bandwidth, for transmission to a remote device; a network interface configured to receive from the remote device an indication of at least one characteristic of an audio output device of the remote device; and an audio bandwidth selector configured to set an audio bandwidth parameter of the audio codec based on the indication received from the remote device, thereby setting the audio bandwidth of the coded audio data in dependence on the at least one characteristic of the audio output device.

Type: Grant

Filed: February 20, 2017

Date of Patent: January 7, 2020

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Karsten V. Sørensen, Karlheinz Wurm

Coding device, decoding device, and method and program thereof

Patent number: 10529350

Abstract: A technology of accurately coding and decoding coefficients which are convertible into linear prediction coefficients even for a frame in which the spectrum variation is great while suppressing an increase in the code amount as a whole is provided. A coding device includes: a first coding unit that obtains a first code by coding coefficients which are convertible into linear prediction coefficients of more than one order; and a second coding unit that obtains a second code by coding at least quantization errors of the first coding unit if (A-1) an index Q commensurate with how high the peak-to-valley height of a spectral envelope is, the spectral envelope corresponding to the coefficients which are convertible into the linear prediction coefficients of more than one order, is larger than or equal to a predetermined threshold value Th1 and/or (B-1) an index Q′ commensurate with how short the peak-to-valley height of the spectral envelope is, is smaller than or equal to a predetermined threshold value Th1′.

Type: Grant

Filed: June 3, 2019

Date of Patent: January 7, 2020

Assignee: Nippon Telegraph and Telephone Corporation

Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
