Patents Examined by Michael C Colucci
  • Patent number: 10657979
    Abstract: A decoder for generating a frequency enhanced audio signal, includes: a feature extractor for extracting a feature from a core signal; a side information extractor for extracting a selection side information associated with the core signal; a parameter generator for generating a parametric representation for estimating a spectral range of the frequency enhanced audio signal not defined by the core signal, wherein the parameter generator is configured to provide a number of parametric representation alternatives in response to the feature, and wherein the parameter generator is configured to select one of the parametric representation alternatives as the parametric representation in response to the selection side information; and a signal estimator for estimating the frequency enhanced audio signal using the parametric representation selected.
    Type: Grant
    Filed: July 28, 2015
    Date of Patent: May 19, 2020
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Frederik Nagel, Sascha Disch, Andreas Niedermeier
  • Patent number: 10629205
    Abstract: An approach is provided for identifying an accurate transcription of a sentence. Options for transcriptions of each word in the sentence are determined. Probabilistic scores of the options are determined. Variations of a transcription of the sentence are generated by randomly selecting from the options with the probabilistic scores weighting the selections. Plausibility scores for the variations are generated by performing syntactic, semantic, and redundancy analyses of the variations. Based on the plausibility scores, the probabilistic scores, and the variations, tentative transcriptions of the sentence are determined and refined repeatedly by employing a genetic evolution technique until a final refined tentative transcription is the accurate transcription of the sentence.
    Type: Grant
    Filed: June 12, 2018
    Date of Patent: April 21, 2020
    Assignee: International Business Machines Corporation
    Inventors: Giulia Carnevale, Marco Gianfico, Ciro Ragusa, Roberto Ragusa
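The variation-generation step in the abstract of patent 10629205 (randomly selecting per-word options, weighted by their probabilistic scores) can be sketched as follows. This is a minimal illustration only, not the patented method: the function name `sample_variations`, the example options, and the scores are hypothetical, and the plausibility scoring and genetic refinement steps are not shown.

```python
import random

def sample_variations(word_options, n_variations, seed=0):
    """Build sentence variations by weighted random choice per word.

    word_options: one list per word position, each holding
    (candidate, probability) pairs. Hypothetical input format.
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    variations = []
    for _ in range(n_variations):
        words = []
        for options in word_options:
            candidates = [candidate for candidate, _ in options]
            weights = [score for _, score in options]
            # random.choices draws with probability proportional to weight
            words.append(rng.choices(candidates, weights=weights, k=1)[0])
        variations.append(" ".join(words))
    return variations

# Hypothetical recognizer output for a three-word sentence.
options = [
    [("I", 0.9), ("eye", 0.1)],
    [("scream", 0.5), ("stream", 0.5)],
    [("loudly", 1.0)],
]
variations = sample_variations(options, n_variations=5)
```

Each variation would then be scored for plausibility; that part is left out because the abstract does not specify how the syntactic, semantic, and redundancy analyses are combined.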
  • Patent number: 10621990
    Abstract: Aspects of the present invention provide devices that subtitle streaming video with audio and identify a speaker in a streaming video with audio according to words spoken by the speaker matched to a cognitive print. The cognitive print includes traits classified according to a hierarchical long short-term memory (LSTM) model. The hierarchical LSTM includes layers of LSTMs and each layer corresponds to the classification of one trait. A processor annotates a subtitle of the words spoken by the speaker, which decorates the subtitle with a label representative of the identified speaker, and streams the decorated subtitle with the streaming video with audio.
    Type: Grant
    Filed: April 30, 2018
    Date of Patent: April 14, 2020
    Assignee: International Business Machines Corporation
    Inventors: Jeff Amsterdam, Aaron K. Baughman, Stephen C. Hammer, David A. Provan
  • Patent number: 10614173
    Abstract: The disclosed subject matter provides a system, computer readable storage medium, and a method providing an audio and textual transcript of a communication. A conferencing service may receive audio or audio visual signals from a plurality of different devices that receive voice communications from participants in a communication, such as a chat or teleconference. The audio signals represent voice (speech) communications input into respective different devices by the participants. A translation services server may receive over a separate communication channel the audio signals for translation into a second language. As managed by the translation services server, the audio signals may be converted into textual data. The textual data may be translated into text of different languages based on the language preferences of the end user devices in the teleconference. The translated text may be further translated into audio signals.
    Type: Grant
    Filed: July 9, 2019
    Date of Patent: April 7, 2020
    Assignee: Google LLC
    Inventors: Trausti Kristjansson, John Huang, Yu-Kuan Lin, Hung-ying Tyan, Jakob David Uszkoreit, Joshua James Estelle, Chung-yi Wang, Kirill Buryak, Yusuke Konishi
  • Patent number: 10614793
    Abstract: A system includes one or more memory devices storing instructions, and one or more processors configured to execute the instructions to perform steps of providing automated natural dialogue with a customer. The system may generate one or more events and commands temporarily stored in queues to be processed by one or more of a dialogue management device, an API server, and an NLP device. The dialogue management device may create adaptive responses to customer communications using a customer context, a rules-based platform, and a trained machine learning model.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: April 7, 2020
    Assignee: CAPITAL ONE SERVICES, LLC
    Inventors: Gregory W. Zoller, Scott Karp, Sujay Eliphaz Jacob, Erik Mueller, Stephanie Hay, Adam Roy Paynter
  • Patent number: 10607607
    Abstract: There is provided a control device to improve convenience for a user by resolving or alleviating a disadvantage of a known voice interaction, the control device including: a device control unit configured to control one or more controlled devices; a voice notification unit configured to output user-oriented voice notification regarding at least the one controlled device; and a display control unit configured to cause a display device to display a message corresponding to the voice notification output by the voice notification unit.
    Type: Grant
    Filed: October 4, 2016
    Date of Patent: March 31, 2020
    Assignee: SONY CORPORATION
    Inventor: Hideo Nagasaka
  • Patent number: 10599783
    Abstract: Methods, systems, and computer program products for automatically suggesting a temporal opportunity for writing one or more sequel articles via artificial intelligence are provided herein. A computer-implemented method includes extracting one or more types of information from a prior written document; automatically determining, based on the extracted information, at least one temporal opportunity for generating a follow-up written document to the prior written document; automatically generating a follow-up written document to the prior written document, the follow-up written document being written in a style that indicates that it is in response to the prior written document, in accordance with the at least one determined temporal opportunity, and based on (i) one or more items of information, related to the extracted information, derived from one or more web sources, and (ii) a writing model attributed to a user.
    Type: Grant
    Filed: December 26, 2017
    Date of Patent: March 24, 2020
    Assignee: International Business Machines Corporation
    Inventors: Pranay Lohia, Saket Gurukar, Rishabh Gupta, Himanshu Gupta
  • Patent number: 10593342
    Abstract: An audio signal encoding method is provided. The method comprises: collecting audio signal samples, determining sinusoidal components in subsequent frames, estimating amplitudes and frequencies of the components for each frame, merging the pairs thus obtained into sinusoidal trajectories, splitting particular trajectories into segments, transforming particular trajectories to the frequency domain by means of a digital transform performed on segments longer than the frame duration, quantizing and selecting transform coefficients in the segments, entropy encoding, and outputting the quantized coefficients as output data, wherein segments of different trajectories starting within a particular time are grouped into Groups of Segments (GOS), and the partitioning of trajectories into segments is synchronized with the endpoints of a Group of Segments.
    Type: Grant
    Filed: March 22, 2018
    Date of Patent: March 17, 2020
    Assignees: Huawei Technologies Co., Ltd., ZYLIA SP. Z O.O.
    Inventors: Tomasz Żernicki, Łukasz Januszkiewicz, Panji Setiawan
  • Patent number: 10592611
    Abstract: Embodiments of the present invention provide a system for automatically extracting conversational structure from a voice record based on lexical and acoustic features. The system also aggregates business-relevant statistics and entities from a collection of spoken conversations. The system may infer a coarse-level conversational structure based on fine-level activities identified from extracted acoustic features. The system improves significantly over previous systems by extracting structure based on lexical and acoustic features. This enables extracting conversational structure on a larger scale and finer level of detail than previous systems, and can feed an analytics and business intelligence platform, e.g. for customer service phone calls. During operation, the system obtains a voice record. The system then extracts a lexical feature using automatic speech recognition (ASR). The system extracts an acoustic feature.
    Type: Grant
    Filed: October 24, 2016
    Date of Patent: March 17, 2020
    Assignee: Conduent Business Services, LLC
    Inventors: Jesse Vig, Harish Arsikere, Margaret H. Szymanski, Luke R. Plurkowski, Kyle D. Dent, Daniel G. Bobrow, Daniel Davies, Eric Saund
  • Patent number: 10573326
    Abstract: A method includes decoding a low-band mid channel bitstream to generate a low-band mid signal and a low-band mid excitation signal. The method further includes decoding a high-band mid channel bandwidth extension bitstream to generate a synthesized high-band mid signal. The method also includes determining an inter-channel bandwidth extension (ICBWE) gain mapping parameter corresponding to the synthesized high-band mid signal. The ICBWE gain mapping parameter is based on a selected frequency-domain gain parameter that is extracted from a stereo downmix/upmix parameter bitstream. The method further includes performing a gain scaling operation on the synthesized high-band mid signal based on the ICBWE gain mapping parameter to generate a reference high-band channel and a target high-band channel. The method includes outputting a first audio channel and a second audio channel. The first audio channel is based on the reference high-band channel, and the second audio channel is based on the target high-band channel.
    Type: Grant
    Filed: March 26, 2018
    Date of Patent: February 25, 2020
    Assignee: Qualcomm Incorporated
    Inventors: Venkata Subrahmanyam Chandra Sekhar Chebiyyam, Venkatraman Atti
  • Patent number: 10574827
    Abstract: A method and apparatus for sharing documents during a conference call are disclosed. One example method may include initiating a document sharing operation during a conference call conducted between at least two participants communicating during the conference call. The method may also include transferring the document from one of the two participants to another of the two participants, and recording at least one action performed to the document by the participants during the conference call.
    Type: Grant
    Filed: April 9, 2019
    Date of Patent: February 25, 2020
    Assignee: West Corporation
    Inventors: Mark J. Pettay, Hendryanto Rilantono, Myron P. Sojka
  • Patent number: 10553230
    Abstract: The present disclosure relates to a decoding apparatus, a decoding method, and a program that can switch, as quickly as possible, a plurality of audio encoded bit streams with synchronized reproduction timing to thereby decode and output the plurality of audio encoded bit streams. An aspect of the present disclosure provides a decoding apparatus including: an acquisition unit that acquires a plurality of audio encoded bit streams; a selection unit that determines a boundary position for switching output of the plurality of audio encoded bit streams and that selectively supplies one of the plurality of acquired audio encoded bit streams to a decoding processing unit according to the boundary position; and the decoding processing unit that applies a decoding process including IMDCT processing to the one input through the selection unit, in which the decoding processing unit skips overlap-and-add in the IMDCT processing corresponding to each frame before and after the boundary position.
    Type: Grant
    Filed: October 26, 2016
    Date of Patent: February 4, 2020
    Assignee: Sony Corporation
    Inventors: Mitsuyuki Hatanaka, Toru Chinen, Minoru Tsuji, Hiroyuki Honma
  • Patent number: 10553200
    Abstract: A text-to-speech (TTS) computing device includes a processor and a memory. The TTS computing device is configured to generate a machine pronunciation of a text data according to at least one phonetic rule, and provide the machine pronunciation to a user interface of the TTS computing device such that the machine pronunciation is audibly communicated to a user of the TTS computing device. The TTS computing device is also configured to receive a pronunciation correction of the machine pronunciation from the user via the user interface, and store the pronunciation correction in a TTS data source. The TTS computing device is further configured to assign the pronunciation correction provided by the user to a user profile that corresponds to the text data.
    Type: Grant
    Filed: April 27, 2018
    Date of Patent: February 4, 2020
    Assignee: Mastercard International Incorporated
    Inventor: Jason Jay Lacoss-Arnold
  • Patent number: 10553229
    Abstract: A technology of accurately coding and decoding coefficients which are convertible into linear prediction coefficients even for a frame in which the spectrum variation is great while suppressing an increase in the code amount as a whole is provided. A coding device includes: a first coding unit that obtains a first code by coding coefficients which are convertible into linear prediction coefficients of more than one order; and a second coding unit that obtains a second code by coding at least quantization errors of the first coding unit if (A-1) an index Q commensurate with how high the peak-to-valley height of a spectral envelope is, the spectral envelope corresponding to the coefficients which are convertible into the linear prediction coefficients of more than one order, is larger than or equal to a predetermined threshold value Th1 and/or (B-1) an index Q′ commensurate with how short the peak-to-valley height of the spectral envelope is, is smaller than or equal to a predetermined threshold value Th1′.
    Type: Grant
    Filed: June 3, 2019
    Date of Patent: February 4, 2020
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Takehiro Moriya, Yutaka Kamamoto, Noboru Harada
  • Patent number: 10546598
    Abstract: Audio information defining audio content may be accessed. The audio content may have a duration. The audio content may be segmented into audio segments. Individual audio segments may correspond to a portion of the duration. The audio segments may include a first audio segment corresponding to a first portion of the duration. Energy features, entropy features, frequency features, and/or other features of the audio segments may be determined. Energy features may characterize energy of the audio segments. Entropy features may characterize spectral flatness of the audio segments. Frequency features may characterize highest frequencies of the audio segments. One or more of the audio segments may be identified as containing speech based on the energy features, the entropy features, the frequency features, and/or other information. Storage of the identification of the one or more of the audio segments as containing speech in one or more storage media may be effectuated.
    Type: Grant
    Filed: August 16, 2019
    Date of Patent: January 28, 2020
    Assignee: GoPro, Inc.
    Inventor: Tom Médioni
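The three per-segment features named in the abstract of patent 10546598 (energy, spectral flatness, and a frequency feature) can be computed roughly as below. This is a sketch under stated assumptions, not the patented implementation: the function name `segment_features` is hypothetical, a naive DFT stands in for whatever transform is actually used, and the frequency feature is taken here to be the frequency of the strongest spectral bin, which is only one possible reading of "highest frequencies".

```python
import cmath
import math
import random

def segment_features(samples, sample_rate):
    """Return (energy, spectral_flatness, peak_hz) for one audio segment."""
    n = len(samples)
    # Mean squared amplitude of the segment.
    energy = sum(x * x for x in samples) / n
    # Power spectrum from a naive DFT over bins 1..n/2 (DC bin skipped).
    power = []
    for k in range(1, n // 2 + 1):
        acc = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        power.append(abs(acc) ** 2 + 1e-12)  # small floor avoids log(0)
    # Spectral flatness: geometric mean over arithmetic mean of the power.
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    arithmetic = sum(power) / len(power)
    flatness = geometric / arithmetic
    # Assumed frequency feature: frequency of the strongest bin.
    peak_bin = 1 + max(range(len(power)), key=lambda i: power[i])
    return energy, flatness, peak_bin * sample_rate / n

rate, n = 8000, 64
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(n)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
tone_energy, tone_flatness, tone_peak = segment_features(tone, rate)
noise_energy, noise_flatness, noise_peak = segment_features(noise, rate)
```

A pure tone concentrates its power in one bin and so has flatness near zero, while noise spreads power across bins and scores higher. How the features are thresholded or combined to label a segment as speech is not given in the abstract and is left out here.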
  • Patent number: 10546060
    Abstract: An approach is provided to detect pronouns that are included in textual posts that are found in an online discussion. The textual posts are analyzed using a natural language processing speech classification technique that results in an identification of a noun to which the detected pronoun refers. The system then displays, on a display device, the noun to which the pronoun refers.
    Type: Grant
    Filed: December 19, 2017
    Date of Patent: January 28, 2020
    Assignee: International Business Machines Corporation
    Inventors: Robert H. Grant, Trudy L. Hewitt, Fang Lu
  • Patent number: 10535019
    Abstract: One embodiment provides a method comprising answering one or more incoming phone calls received at one or more pre-specified phone numbers utilizing a bot. The bot is configured to engage in a conversation with a caller initiating an incoming phone call utilizing a voice recording that impersonates a human being. The method further comprises recording each conversation the bot engages in, and classifying each recorded conversation as one of poison data or truthful training data based on content of the recorded conversation and one or more learned detection models for detecting poisoned data.
    Type: Grant
    Filed: July 31, 2018
    Date of Patent: January 14, 2020
    Assignee: International Business Machines Corporation
    Inventors: Nathalie Baracaldo Angel, Pawan R. Chowdhary, Heiko H. Ludwig, Robert J. Moore, Taiga Nakamura
  • Patent number: 10535354
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting notifications in an enterprise system. In one aspect, a method includes actions of obtaining enrollment acoustic data representing an enrollment utterance spoken by a user, obtaining a set of candidate acoustic data representing utterances spoken by other users, determining, for each candidate acoustic data of the set of candidate acoustic data, a similarity score that represents a similarity between the enrollment acoustic data and the candidate acoustic data, selecting a subset of candidate acoustic data from the set of candidate acoustic data based at least on the similarity scores, generating a detection model based on the subset of candidate acoustic data, and providing the detection model for use in detecting an utterance spoken by the user.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: January 14, 2020
    Assignee: Google LLC
    Inventor: Raziel Alvarez Guevara
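The scoring and subset-selection steps in the abstract of patent 10535354 can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the abstract does not say how acoustic data is represented or which similarity measure is used, so fixed-length feature vectors and cosine similarity are stand-ins, and the names `cosine` and `select_similar` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def select_similar(enrollment, candidates, k):
    """Return the names of the k candidates most similar to the enrollment."""
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine(enrollment, item[1]),
        reverse=True,  # highest similarity score first
    )
    return [name for name, _ in ranked[:k]]

# Hypothetical feature vectors for one enrollment and four other users.
enrollment = [1.0, 0.0, 1.0]
candidates = {
    "a": [0.9, 0.1, 1.1],
    "b": [-1.0, 0.5, 0.0],
    "c": [1.0, 0.0, 0.8],
    "d": [0.0, 1.0, 0.0],
}
selected = select_similar(enrollment, candidates, k=2)
```

The selected subset would then be used to train the detection model; that training step is not sketched because the abstract gives no detail about it.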
  • Patent number: 10529317
    Abstract: A neural network training apparatus includes a primary trainer configured to perform a primary training of a neural network model based on clean training data and target data corresponding to the clean training data; and a secondary trainer configured to perform a secondary training of the neural network model on which the primary training has been performed based on noisy training data and an output probability distribution of an output class for the clean training data calculated during the primary training of the neural network model.
    Type: Grant
    Filed: November 4, 2016
    Date of Patent: January 7, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Ho Shik Lee, Hee Youl Choi
  • Patent number: 10529352
    Abstract: An audio signal processing device comprises: an audio input configured to receive an audio signal to be coded; an audio codec configured to apply audio coding to the audio signal, thereby generating coded audio data, having an audio bandwidth, for transmission to a remote device; a network interface configured to receive from the remote device an indication of at least one characteristic of an audio output device of the remote device; and an audio bandwidth selector configured to set an audio bandwidth parameter of the audio codec based on the indication received from the remote device, thereby setting the audio bandwidth of the coded audio data in dependence on the at least one characteristic of the audio output device.
    Type: Grant
    Filed: February 20, 2017
    Date of Patent: January 7, 2020
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Karsten V. Sørensen, Karlheinz Wurm