Patents Examined by Keara Harris
  • Patent number: 10140972
    Abstract: A method of training an acoustic model for a text-to-speech system, the method comprising: receiving speech data; said speech data comprising data corresponding to different values of a first speech factor, and wherein said speech data is unlabeled, such that for a given item of speech data, the value of said first speech factor is unknown; clustering said speech data according to the value of said first speech factor into a first set of clusters; and estimating a first set of parameters to enable the acoustic model to accommodate speech for the different values of the first speech factor, wherein said clustering and said first parameter estimation are jointly performed according to a common maximum likelihood criterion.
    Type: Grant
    Filed: August 22, 2014
    Date of Patent: November 27, 2018
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Langzhou Chen
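    The joint clustering and maximum-likelihood estimation described above is, at its core, an EM-style procedure. A minimal Python sketch of that general flavour (not the patented method; the Gaussian mixture, feature dimensions, and cluster count are illustrative assumptions):
    ```python
    # EM for a Gaussian mixture jointly clusters unlabeled feature vectors and
    # estimates the cluster parameters under a single maximum-likelihood criterion.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    # Stand-in for acoustic feature vectors (e.g., MFCC frames) carrying an unknown
    # speech factor such as speaker or emotion; the true labels are never used.
    features = np.vstack([
        rng.normal(loc=0.0, scale=1.0, size=(200, 13)),
        rng.normal(loc=3.0, scale=1.0, size=(200, 13)),
    ])

    # EM fits cluster means/covariances (the "first set of parameters") and the
    # soft cluster assignments at the same time, maximizing the data likelihood.
    gmm = GaussianMixture(n_components=2, covariance_type="diag", random_state=0)
    gmm.fit(features)

    cluster_ids = gmm.predict(features)          # hard cluster per frame
    posteriors = gmm.predict_proba(features)     # soft cluster assignments
    print("first clusters:", cluster_ids[:4])
    print("avg log-likelihood per frame:", gmm.score(features))
    print("posterior of frame 0:", posteriors[0].round(3))
    ```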
  • Patent number: 10083174
    Abstract: A multilayered context enriched text translation interface includes a simulation layer comprising one or more text objects and a translation layer. The interface displays one or more mimicked views of an application GUI in the simulation layer. Subsequent to a user engaging a text object, the interface displays a prompt for a text translation of the text object within the translation layer. In certain embodiments, the mimicked views are graphical reproductions of the application GUI pages with functionality of one or more text objects of the application GUI disabled. In certain embodiments, the prompt includes an accentuation object to visually accentuate the text object, a text-editing object to receive the text translation of the text object, and a link object to visually connect the accentuation object and the text-editing object.
    Type: Grant
    Filed: October 24, 2017
    Date of Patent: September 25, 2018
    Assignee: International Business Machines Corporation
    Inventors: Amit Bareket, Nadav Parag, Dan Ravid, Tamir Riechberg, Moshe Weiss
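    A hypothetical data model for the two layers described above; the class and field names are illustrative assumptions, not taken from the patent:
    ```python
    # Simulation layer of disabled text objects mirroring the application GUI,
    # plus a translation prompt opened when the user engages a text object.
    from dataclasses import dataclass, field

    @dataclass
    class TextObject:
        object_id: str
        source_text: str
        enabled: bool = False          # mimicked views disable GUI functionality

    @dataclass
    class TranslationPrompt:
        target: TextObject
        accentuation: str = "highlight"   # visually accentuates the text object
        translation: str = ""             # filled in via the text-editing object

    @dataclass
    class SimulationLayer:
        text_objects: list = field(default_factory=list)

    def engage(layer: SimulationLayer, object_id: str) -> TranslationPrompt:
        """When the user engages a text object, open a translation prompt for it."""
        target = next(t for t in layer.text_objects if t.object_id == object_id)
        return TranslationPrompt(target=target)

    layer = SimulationLayer([TextObject("btn_ok", "OK"),
                             TextObject("lbl_title", "Settings")])
    prompt = engage(layer, "lbl_title")
    prompt.translation = "Einstellungen"
    print(prompt.target.source_text, "->", prompt.translation)
    ```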
  • Patent number: 10055402
    Abstract: A device may obtain text to be analyzed to determine semantic connections between sections of the text. The device may identify subject-verb-object (SVO) units included in the text, and may determine SVO unit information that describes the SVO units. The device may analyze the SVO unit information to determine semantic connection information that identifies one or more semantic connections between two or more of the SVO units. The one or more semantic connections may identify relationships between verbs associated with the two or more of the SVO units. The device may generate a semantic network based on the SVO unit information and the semantic connection information, and may provide information regarding the semantic network.
    Type: Grant
    Filed: March 16, 2015
    Date of Patent: August 21, 2018
    Assignee: Accenture Global Services Limited
    Inventors: Shubhashis Sengupta, Roshni Ramesh Ramnani, Subhabrata Das, Anitha Chandran
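    A minimal sketch of building such a semantic network once SVO units are in hand, using networkx; the SVO triples and verb relations are made-up placeholders, and the extraction step itself is omitted:
    ```python
    # Connect SVO units whose verbs are related and store the result as a graph.
    import networkx as nx

    # Hypothetical SVO units of the form (subject, verb, object).
    svo_units = [
        ("system", "receives", "request"),
        ("system", "validates", "request"),
        ("service", "sends", "response"),
    ]

    # Hypothetical verb relationships standing in for the semantic connections.
    verb_relations = {("receives", "validates"): "followed_by",
                      ("validates", "sends"): "followed_by"}

    graph = nx.DiGraph()
    for unit in svo_units:
        graph.add_node(unit)                      # each SVO unit is a node

    for (u, v), relation in verb_relations.items():
        sources = [n for n in graph.nodes if n[1] == u]
        targets = [n for n in graph.nodes if n[1] == v]
        for s in sources:
            for t in targets:
                graph.add_edge(s, t, relation=relation)   # verb-level connection

    print(graph.number_of_nodes(), "units,", graph.number_of_edges(), "connections")
    ```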
  • Patent number: 10019984
    Abstract: Techniques and technologies for diagnosing speech recognition errors are described. In an example implementation, a system for diagnosing speech recognition errors may include an error detection module configured to determine that a speech recognition result is at least partially erroneous, and a recognition error diagnostics module. The recognition error diagnostics module may be configured to (a) perform a first error analysis of the at least partially erroneous speech recognition result to provide a first error analysis result; (b) perform a second error analysis of the at least partially erroneous speech recognition result to provide a second error analysis result; and (c) determine at least one category of recognition error associated with the at least partially erroneous speech recognition result based on a combination of the first error analysis result and the second error analysis result.
    Type: Grant
    Filed: February 27, 2015
    Date of Patent: July 10, 2018
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Shiun-Zu Kuo, Thomas Reutter, Yifan Gong, Mark T. Hanson, Ye Tian, Shuangyu Chang, Jonathan Hamaker, Qi Miao, Yuancheng Tu
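    A toy sketch of the combination step: two independent analyses of an at-least-partially erroneous recognition hypothesis each score candidate error categories, and the diagnostics step merges the results. The analyzer heuristics and category names are illustrative assumptions, not Microsoft's implementation:
    ```python
    # Two simple analyses vote on error categories; the combined scores above a
    # threshold become the diagnosed categories.
    def acoustic_analysis(hypothesis: str, reference: str) -> dict:
        """Toy first analysis: flag likely acoustic confusion on word mismatches."""
        mismatches = sum(h != r for h, r in zip(hypothesis.split(), reference.split()))
        return {"acoustic_confusion": mismatches / max(len(reference.split()), 1)}

    def language_model_analysis(hypothesis: str) -> dict:
        """Toy second analysis: flag an OOV/LM-style error on unusually long tokens."""
        rare = [w for w in hypothesis.split() if len(w) > 10]
        return {"oov_or_lm_error": len(rare) / max(len(hypothesis.split()), 1)}

    def diagnose(hypothesis: str, reference: str, threshold: float = 0.3) -> list:
        scores = {**acoustic_analysis(hypothesis, reference),
                  **language_model_analysis(hypothesis)}
        # Combine the two analysis results into error categories.
        return [category for category, score in scores.items() if score >= threshold]

    print(diagnose("recognize speech authentication", "wreck a nice beach"))
    ```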
  • Patent number: 9984706
    Abstract: Voice activity detection (VAD) is an enabling technology for a variety of speech-based applications. Herein disclosed is a robust VAD algorithm that is also language independent. Rather than classifying short segments of the audio as either “speech” or “silence”, the VAD as disclosed herein employs a soft-decision mechanism. The VAD outputs a speech-presence probability, which is based on a variety of characteristics.
    Type: Grant
    Filed: August 1, 2014
    Date of Patent: May 29, 2018
    Assignee: VERINT SYSTEMS LTD.
    Inventor: Ron Wein
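    A minimal soft-decision sketch in the same spirit: each frame gets a speech-presence probability rather than a hard speech/silence label. The single log-energy feature and the logistic mapping are simplifying assumptions; the patented detector combines several characteristics:
    ```python
    # Frame-wise speech-presence probability from log energy vs. a noise floor.
    import numpy as np

    def speech_presence_probability(audio: np.ndarray, sample_rate: int,
                                    frame_ms: float = 20.0) -> np.ndarray:
        frame_len = int(sample_rate * frame_ms / 1000)
        n_frames = len(audio) // frame_len
        frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)

        log_energy = np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
        noise_floor = np.percentile(log_energy, 10)      # rough noise estimate

        # Logistic soft decision: more energy above the noise floor -> closer to 1.
        return 1.0 / (1.0 + np.exp(-4.0 * (log_energy - noise_floor - 1.0)))

    rng = np.random.default_rng(0)
    signal = np.concatenate([rng.normal(0, 0.01, 16000),     # background noise
                             rng.normal(0, 0.3, 16000)])     # louder "speech"
    probs = speech_presence_probability(signal, 16000)
    print(probs[:3].round(3), probs[-3:].round(3))
    ```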
  • Patent number: 9966080
    Abstract: An audio object encoder comprises a receiver (701) which receives N audio objects. A downmixer (703) downmixes the N audio objects to M audio channels, and a channel circuit (707) derives K audio channels from the M audio channels, where K = 1 or 2 and K < M. A parameter circuit (709) generates audio object upmix parameters for at least part of each of the N audio objects relative to the K audio channels, and an output circuit (705, 711) generates an output data stream comprising the audio object upmix parameters and the M audio channels. An audio object decoder receives the data stream and includes a channel circuit (805) deriving K audio channels from the M-channel downmix, and an object decoder (807) for generating at least part of each of the N audio objects by upmixing the K audio channels based on the audio object upmix parameters. The invention may allow improved object encoding while maintaining backwards compatibility.
    Type: Grant
    Filed: October 29, 2012
    Date of Patent: May 8, 2018
    Assignee: KONINKLIJKE PHILIPS N.V.
    Inventors: Jeroen Gerardus Henricus Koppens, Arnoldus Werner Johannes Oomen, Leon Maria Van De Kerkhof
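    A numerical sketch of the encoder/decoder relationship described above (not the Philips codec): N objects are downmixed to M channels, K = 1 channel is derived from the downmix, and per-object upmix gains relative to that channel are computed by least squares; the decoder approximates each object from the K channels and those parameters. All signal values and matrix shapes are illustrative:
    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N, M, K, T = 4, 2, 1, 48000
    objects = rng.normal(size=(N, T))                 # N audio objects

    downmix_matrix = rng.uniform(0.2, 1.0, size=(M, N))
    channels_m = downmix_matrix @ objects             # M-channel downmix (transmitted)

    channels_k = channels_m.mean(axis=0, keepdims=True)   # derive K = 1 channel

    # Encoder-side upmix parameters: per-object gain g minimizing ||obj - g * k||^2.
    upmix_params = (objects @ channels_k.T) / (channels_k @ channels_k.T)

    # Decoder: reconstruct (part of) each object from the K channels + parameters.
    reconstructed = upmix_params @ channels_k
    rel_error = (np.linalg.norm(objects - reconstructed, axis=1)
                 / np.linalg.norm(objects, axis=1))
    print("per-object reconstruction error:", rel_error.round(3))
    ```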
  • Patent number: 9947335
    Abstract: Embodiments are directed to a companding method and system for reducing coding noise in an audio codec. A compression process reduces the original dynamic range of an initial audio signal by dividing the initial audio signal into a plurality of segments using a defined window shape, calculating a wideband gain in the frequency domain using a non-energy based average of frequency domain samples of the initial audio signal, and applying individual gain values to amplify segments of relatively low intensity and attenuate segments of relatively high intensity. The compressed audio signal is then expanded back to substantially the original dynamic range by an expansion process that applies inverse gain values to amplify segments of relatively high intensity and attenuate segments of relatively low intensity. A QMF filterbank is used to analyze the initial audio signal to obtain a frequency domain representation.
    Type: Grant
    Filed: April 1, 2014
    Date of Patent: April 17, 2018
    Assignees: Dolby Laboratories Licensing Corporation, Dolby International AB
    Inventors: Per Hedelin, Arijit Biswas, Michael Schug, Vinay Melkote
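    A simplified numeric sketch of the companding idea (not the Dolby implementation): per-segment gains derived from a non-energy-based average of spectral magnitudes amplify quiet segments and attenuate loud ones, and the expander applies the inverse gains. The exponent, the segment length, and the use of a plain FFT instead of a QMF filterbank are simplifying assumptions:
    ```python
    import numpy as np

    def segment_gains(signal: np.ndarray, seg_len: int, exponent: float = 0.65) -> np.ndarray:
        n_seg = len(signal) // seg_len
        segs = signal[: n_seg * seg_len].reshape(n_seg, seg_len)
        spectra = np.abs(np.fft.rfft(segs, axis=1))
        # Mean absolute spectral magnitude: an average that is not an energy (L2) mean.
        level = spectra.mean(axis=1) + 1e-9
        return level ** (exponent - 1.0)       # > 1 for quiet segments, < 1 for loud

    def compand(signal, seg_len=256):
        gains = segment_gains(signal, seg_len)
        n_seg = len(gains)
        segs = signal[: n_seg * seg_len].reshape(n_seg, seg_len)
        compressed = (segs * gains[:, None]).ravel()          # reduced dynamic range
        expanded = (compressed.reshape(n_seg, seg_len) / gains[:, None]).ravel()
        return compressed, expanded

    rng = np.random.default_rng(0)
    x = np.concatenate([0.05 * rng.normal(size=4096), 0.8 * rng.normal(size=4096)])
    compressed, expanded = compand(x)
    print("max error after expansion:", np.max(np.abs(expanded - x[: len(expanded)])))
    ```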
  • Patent number: 9934776
    Abstract: A method of selecting training text for a language model, a method of training the language model using the training text, and a computer and computer program for executing the methods are provided. The present invention provides for selecting training text for a language model by: generating a template for selecting training text from a corpus in a first domain according to generation techniques of (i) replacing one or more words in a word string selected from the corpus in the first domain with a special symbol representing any word or word string, and adopting the word string after replacement as the template for selecting the training text, and/or (ii) adopting the word string selected from the corpus in the first domain as the template for selecting the training text; and selecting text covered by the template as the training text from a corpus in a second domain different from the first domain.
    Type: Grant
    Filed: July 20, 2015
    Date of Patent: April 3, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura
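    A small illustrative sketch of the template idea: word strings from the first-domain corpus become templates, optionally with a wildcard symbol standing in for any word or word string, and second-domain sentences covered by a template are selected as training text. The wildcard token and the regex translation are assumptions for the example:
    ```python
    import re

    WILDCARD = "<*>"   # special symbol representing any word or word string

    def template_to_regex(template: str) -> re.Pattern:
        parts = [re.escape(w) if w != WILDCARD else r"\S+(?:\s\S+)*"
                 for w in template.split()]
        return re.compile(r"\b" + r"\s".join(parts) + r"\b")

    # Templates from the first domain: one verbatim, one with a wildcard.
    templates = ["please confirm the reservation",
                 f"please confirm the {WILDCARD}"]

    second_domain = ["Please confirm the payment before Friday.",
                     "The invoice was sent yesterday.",
                     "please confirm the reservation for two nights"]

    patterns = [template_to_regex(t) for t in templates]
    selected = [s for s in second_domain
                if any(p.search(s.lower()) for p in patterns)]
    print(selected)   # sentences covered by at least one template
    ```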
  • Patent number: 9910847
    Abstract: A plurality of documents in each of a plurality of languages can be received. A Latent Semantic Indexing (LSI) index can be created from the plurality of documents. A language classification model can be trained from the LSI index. A document to be identified by language can be received. A vector in the LSI index can be generated for the document to be identified by language. The vector can be evaluated against the language classification model.
    Type: Grant
    Filed: September 30, 2014
    Date of Patent: March 6, 2018
    Assignee: ACCENTURE GLOBAL SERVICES LIMITED
    Inventor: Mark Bittmann
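    A compact stand-in pipeline (a sketch, not Accenture's system): TF-IDF over character n-grams followed by truncated SVD approximates an LSI index, and a linear classifier trained on the resulting vectors labels new documents by language. The tiny corpus and the hyperparameters are illustrative:
    ```python
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    docs = ["the quick brown fox jumps over the lazy dog",
            "a journey of a thousand miles begins with a single step",
            "der schnelle braune fuchs springt ueber den faulen hund",
            "eine reise von tausend meilen beginnt mit einem schritt"]
    labels = ["en", "en", "de", "de"]

    model = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),  # index terms
        TruncatedSVD(n_components=3, random_state=0),             # LSI-style vectors
        LogisticRegression(max_iter=1000),                        # language classifier
    )
    model.fit(docs, labels)

    # New documents are projected into the same LSI space and classified.
    print(model.predict(["the fox begins a journey", "der hund beginnt eine reise"]))
    ```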
  • Patent number: 9892727
    Abstract: A method of selecting training text for a language model, a method of training the language model using the training text, and a computer and computer program for executing the methods are provided. The present invention provides for selecting training text for a language model by: generating a template for selecting training text from a corpus in a first domain according to generation techniques of (i) replacing one or more words in a word string selected from the corpus in the first domain with a special symbol representing any word or word string, and adopting the word string after replacement as the template for selecting the training text, and/or (ii) adopting the word string selected from the corpus in the first domain as the template for selecting the training text; and selecting text covered by the template as the training text from a corpus in a second domain different from the first domain.
    Type: Grant
    Filed: December 10, 2015
    Date of Patent: February 13, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura
  • Patent number: 9865253
    Abstract: The present invention is a system and method for discriminating between human and synthetic speech. The method and system include memory for storing a speaker verification application, a communication network that receives from a client device a speech signal having one or more discriminating features, and a processor for executing instructions stored in memory. The execution of the instructions by the processor extracts the one or more discriminating features from the speech signal and classifies the speech signal as human or synthetic based on the extracted features.
    Type: Grant
    Filed: August 21, 2014
    Date of Patent: January 9, 2018
    Assignee: VoiceCipher, Inc.
    Inventors: Phillip L. De Leon, Steven Spence, Bryan Stewart, Junichi Yamagishi
  • Patent number: 9864935
    Abstract: The present invention is characterized by including: an analysis processing unit 19 that can analyze PDL data of a particular PDL; a text counter 21 that counts the number of times codes outside a character-assigned range are processed during a text process based on a character code table in the analysis by the analysis processing unit 19; and an interruption unit 23 that interrupts the analysis of the PDL data when the count by the text counter 21 exceeds a predetermined first threshold.
    Type: Grant
    Filed: February 15, 2013
    Date of Patent: January 9, 2018
    Assignee: KYOCERA DOCUMENT SOLUTIONS INC.
    Inventor: Toshihiro Seko
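    A schematic sketch of the safeguard described above; the assigned code range, the threshold value, and the function names are illustrative assumptions:
    ```python
    # Count out-of-range character codes while processing PDL text and interrupt
    # the analysis once the count exceeds a first threshold.
    class AnalysisInterrupted(Exception):
        pass

    # Hypothetical character code table: codes 0x20-0x7E are "assigned".
    ASSIGNED_RANGE = range(0x20, 0x7F)
    FIRST_THRESHOLD = 16

    def analyze_pdl_text(byte_stream: bytes) -> int:
        out_of_range = 0
        for code in byte_stream:
            if code not in ASSIGNED_RANGE:
                out_of_range += 1
                if out_of_range > FIRST_THRESHOLD:
                    # Likely corrupt or non-text PDL data: stop the analysis.
                    raise AnalysisInterrupted(
                        f"{out_of_range} out-of-range codes exceeded the threshold")
            # ...normal text processing of the assigned code would go here...
        return out_of_range

    print(analyze_pdl_text(b"Hello PDL text\x01\x02"))
    ```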
  • Patent number: 9858272
    Abstract: A multilayered context enriched text translation interface includes a simulation layer comprising one or more text objects and a translation layer. The interface displays one or more mimicked views of an application GUI in the simulation layer. Subsequent to a user engaging a text object, the interface displays a prompt for a text translation of the text object within the translation layer. In certain embodiments, the mimicked views are graphical reproductions of the application GUI pages with functionality of one or more text objects of the application GUI disabled. In certain embodiments, the prompt includes an accentuation object to visually accentuate the text object, a text-editing object to receive the text translation of the text object, and a link object to visually connect the accentuation object and the text-editing object.
    Type: Grant
    Filed: February 16, 2014
    Date of Patent: January 2, 2018
    Assignee: International Business Machines Corporation
    Inventors: Amit Bareket, Nadav Parag, Dan Ravid, Tamir Riechberg, Moshe Weiss
  • Patent number: 9852128
    Abstract: A method and/or computer program product validates a translation memory against a terminology dictionary of a source and target language. For each source term, occurrences of the particular source term within source segments are identified, where an occurrence is determined according to grammar rules. For each identified source term occurrence in a source segment, a closeness score between the corresponding target term and the corresponding occurrence of that target term in the target segment is calculated. Each identified occurrence of a source term in a source segment is reported, along with the closeness score between that occurrence and the corresponding target term in the target segment.
    Type: Grant
    Filed: April 7, 2014
    Date of Patent: December 26, 2017
    Assignee: International Business Machines Corporation
    Inventor: Christophe D. A. Chenon
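    A toy sketch of the validation idea (not IBM's scoring): for each source-term occurrence found in a source segment, the expected target term is compared against the aligned target segment and a closeness score is reported. The character-level similarity and the sample terminology are assumptions for the example:
    ```python
    from difflib import SequenceMatcher

    terminology = {"hard drive": "disque dur"}      # hypothetical source -> target terms

    translation_memory = [
        ("Replace the hard drive before it fails.",
         "Remplacez le disque dur avant qu'il ne tombe en panne."),
        ("Back up the hard drive weekly.",
         "Sauvegardez le disque rigide chaque semaine."),
    ]

    def closeness(expected: str, segment: str) -> float:
        # Best similarity between the expected target term and any window of the segment.
        words = segment.lower().split()
        n = len(expected.split())
        windows = [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
        return max(SequenceMatcher(None, expected.lower(), w).ratio() for w in windows)

    for source_term, target_term in terminology.items():
        for src_seg, tgt_seg in translation_memory:
            if source_term in src_seg.lower():              # occurrence of the source term
                score = closeness(target_term, tgt_seg)
                print(f"{source_term!r} in {src_seg[:30]!r}... score={score:.2f}")
    ```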
  • Patent number: 9852734
    Abstract: Systems and methods are provided for modifying audio signals. A waveform representing an audio signal changing over time is received. A first time length is selected. A first starting point in the waveform is selected. A first pair of adjacent segments of the waveform is determined based at least in part on the first starting point and the first time length. The two adjacent segments each correspond to the first time length. A first difference measure associated with the first pair of adjacent segments is calculated. In response to the first difference measure being smaller than a threshold, compression or expansion of the waveform is performed based at least in part on the first time length and the first starting point.
    Type: Grant
    Filed: April 11, 2014
    Date of Patent: December 26, 2017
    Assignees: SYNAPTICS INCORPORATED, SYNAPTICS LLC
    Inventors: Zhuojin Sun, Bingsen Xie
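    A numeric sketch of the segment-matching test described above; the difference measure, the threshold, and the parameter names are illustrative assumptions:
    ```python
    # If two adjacent segments of the chosen length are similar enough, the span
    # is a good candidate for time-scale compression (or expansion).
    import numpy as np

    def adjacent_segment_difference(waveform: np.ndarray, start: int, length: int) -> float:
        first = waveform[start:start + length]
        second = waveform[start + length:start + 2 * length]
        return float(np.mean((first - second) ** 2) / (np.mean(first ** 2) + 1e-12))

    def compress_once(waveform, start, length, threshold=0.1):
        if adjacent_segment_difference(waveform, start, length) < threshold:
            # Drop the second of the two similar segments to shorten the waveform.
            return np.concatenate([waveform[:start + length],
                                   waveform[start + 2 * length:]])
        return waveform

    sr = 16000
    t = np.arange(2 * sr) / sr
    tone = np.sin(2 * np.pi * 200 * t)          # periodic signal: adjacent periods match
    period = sr // 200
    shorter = compress_once(tone, start=4000, length=period)
    print(len(tone), "->", len(shorter))
    ```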
  • Patent number: 9852741
    Abstract: Methods, an encoder and a decoder are configured for transition between frames with different internal sampling rates. Linear predictive (LP) filter parameters are converted from a sampling rate S1 to a sampling rate S2. A power spectrum of a LP synthesis filter is computed, at the sampling rate S1, using the LP filter parameters. The power spectrum of the LP synthesis filter is modified to convert it from the sampling rate S1 to the sampling rate S2. The modified power spectrum of the LP synthesis filter is inverse transformed to determine autocorrelations of the LP synthesis filter at the sampling rate S2. The autocorrelations are used to compute the LP filter parameters at the sampling rate S2.
    Type: Grant
    Filed: April 2, 2015
    Date of Patent: December 26, 2017
    Assignee: VOICEAGE CORPORATION
    Inventors: Redwan Salami, Vaclav Eksler
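    A simplified numeric sketch of the conversion steps (not the VoiceAge implementation): compute the LP synthesis filter's power spectrum at S1, map it onto the S2 frequency grid, inverse-transform to autocorrelations, and re-derive LP coefficients at S2 from those autocorrelations. The FFT size, the interpolation used to modify the spectrum, and the toy filter are assumptions:
    ```python
    import numpy as np
    from scipy.signal import freqz
    from scipy.linalg import solve_toeplitz

    def convert_lp(a_s1: np.ndarray, s1: int, s2: int, order: int, n_fft: int = 512):
        # Power spectrum of the synthesis filter 1/A(z) at sampling rate S1.
        _, h = freqz([1.0], a_s1, worN=n_fft // 2 + 1)
        power_s1 = np.abs(h) ** 2

        # Map the spectrum onto the S2 grid: truncate when S2 < S1, extend the last
        # value when S2 > S1 (a crude stand-in for the patented modification).
        freqs_s1 = np.linspace(0, s1 / 2, n_fft // 2 + 1)
        freqs_s2 = np.linspace(0, s2 / 2, n_fft // 2 + 1)
        power_s2 = np.interp(freqs_s2, freqs_s1, power_s1, right=power_s1[-1])

        # Inverse transform of the power spectrum gives the autocorrelation at S2.
        autocorr = np.fft.irfft(power_s2)[: order + 1]

        # Levinson-style step: solve the Toeplitz normal equations for the new LPC.
        predictor = solve_toeplitz(autocorr[:order], autocorr[1:order + 1])
        return np.concatenate(([1.0], -predictor))

    a_12k8 = np.array([1.0, -1.2, 0.5])          # toy LP filter at S1 = 12800 Hz
    print(convert_lp(a_12k8, s1=12800, s2=16000, order=2))
    ```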
  • Patent number: 9852122
    Abstract: Automated analysis of text documents is used to scan text documents in order to find phrases or text fragments taken from other documents, or modified versions of existing ones. A comparatively fast and universally applicable method finds phrases, sentences or even text fragments from other documents. The method includes: all electronic files containing model documents are converted to a given format; meaningful fragments, called “clauses”, are extracted from them; the converted files containing model documents are stored in the database; each electronic file containing a document to be analyzed is converted to the given format; clauses extracted from analyzed documents are compared with clauses extracted from model documents; fractions of clauses from an analyzed document matching clauses from each model document are calculated; the fractions found are then compared with a pre-set threshold value in order to determine whether there are text fragments from a model document in the analyzed one.
    Type: Grant
    Filed: November 16, 2012
    Date of Patent: December 26, 2017
    Assignee: OBSHCHESTVO S OGRANICHENNOY OTVETSTVENNOST'YU “TSENTR INNOVATSIY NATAL'I KASPERSKOY”
    Inventors: Vladimir Anatol'yevich Lapshin, Dmitriy Vsevolodovich Perov, Yekaterina Aleksandrovna Pshekhotskaya, Sergey S. Ryabov
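    A bare-bones sketch of the matching step (illustrative only): clauses are extracted from model documents and from the analyzed document, the fraction of analyzed clauses that also occur in each model document is computed, and that fraction is compared against a preset threshold. The clause extraction rule and the sample texts are assumptions:
    ```python
    import re

    def extract_clauses(text: str) -> set:
        # Toy clause extraction: normalized sentence-like fragments.
        return {c.strip().lower() for c in re.split(r"[.;\n]+", text) if c.strip()}

    model_documents = {
        "nda_template": "The parties agree to keep information confidential. "
                        "This agreement is governed by the laws of the state.",
        "lease_template": "The tenant shall pay rent monthly. "
                          "The landlord maintains the premises.",
    }
    model_clauses = {name: extract_clauses(text) for name, text in model_documents.items()}

    def matching_fractions(analyzed_text: str) -> dict:
        analyzed = extract_clauses(analyzed_text)
        return {name: len(analyzed & clauses) / len(analyzed)
                for name, clauses in model_clauses.items()}

    THRESHOLD = 0.4
    fractions = matching_fractions(
        "The parties agree to keep information confidential. Payment is due on signing.")
    print({name: frac >= THRESHOLD for name, frac in fractions.items()}, fractions)
    ```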
  • Patent number: 9842604
    Abstract: An apparatus includes a user input unit, a display unit, a control unit, and a buffer unit. The display unit includes a speed setting menu. The control unit selects a mode from the speed setting menu in response to the selection signal of the user, and controls a compression ratio of a voice codec and a transfer rate of a modem corresponding to a transmission-side radio, and a reception rate of a modem and a restoration rate of a voice codec corresponding to a reception-side radio, based on the selected mode. The buffer unit performs a storage function if there is a difference between the compression ratio of the voice codec and the transfer rate of the modem or if there is a difference between the reception rate of the modem and the restoration rate of the voice codec.
    Type: Grant
    Filed: August 20, 2014
    Date of Patent: December 12, 2017
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Young Ho Son, CheolYong Park, Tae uk Yang, Jang Hong Yoon, Jeong-Seok Lim, Jung-Gil Park
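    A schematic sketch of the buffering role described above (transmission side only; the names and rates are illustrative assumptions): when the voice codec produces frames faster than the modem's transfer rate for the selected mode, the buffer absorbs the difference:
    ```python
    from collections import deque

    class RateMatchingBuffer:
        def __init__(self):
            self.frames = deque()

        def push(self, frame):
            self.frames.append(frame)       # codec output arrives at the codec rate

        def pop_for_modem(self, frames_per_tick: int):
            # The modem drains only as many frames per tick as its transfer rate allows.
            out = []
            for _ in range(min(frames_per_tick, len(self.frames))):
                out.append(self.frames.popleft())
            return out

    codec_frames_per_tick = 3          # set by the compression ratio of the selected mode
    modem_frames_per_tick = 2          # set by the modem transfer rate of that mode

    buf = RateMatchingBuffer()
    for tick in range(5):
        for i in range(codec_frames_per_tick):
            buf.push(f"frame-{tick}-{i}")
        sent = buf.pop_for_modem(modem_frames_per_tick)
        print(f"tick {tick}: sent {len(sent)}, buffered {len(buf.frames)}")
    ```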
  • Patent number: 9837080
    Abstract: Systems and methods for maintaining speaker recognition performance are provided. A method for maintaining speaker recognition performance, comprises training a plurality of models respectively corresponding to speaker recognition scores from a plurality of speakers over a plurality of sessions, and using the plurality of models to conclude whether a speaker seeking access to an environment is a non-ideal target speaker or a non-ideal non-target speaker. Using the plurality of models to conclude comprises calculating a first probability that the speaker seeking access is the non-ideal target speaker, calculating a second probability that the speaker seeking access is the non-ideal non-target speaker, and determining whether the first probability, the second probability or a sum of the first probability and the second probability is above a probability threshold.
    Type: Grant
    Filed: August 21, 2014
    Date of Patent: December 5, 2017
    Assignee: International Business Machines Corporation
    Inventors: Hagai Aronowitz, Shay Ben-David, David Nahamoo, Jason W. Pelecanos, Orith Toledo-Ronen
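    A statistical sketch of the decision step above (not IBM's models): per-class score models fitted from historical sessions give, for a new access attempt, probabilities of being a non-ideal target or a non-ideal non-target speaker, and each probability and their sum are checked against a probability threshold. The Gaussian score models and the synthetic score data are assumptions:
    ```python
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    history = {
        "ideal target": rng.normal(3.0, 0.5, 500),
        "non-ideal target": rng.normal(1.5, 0.8, 500),
        "non-ideal non-target": rng.normal(-1.0, 0.6, 500),
    }
    models = {name: norm(s.mean(), s.std()) for name, s in history.items()}

    def non_ideal(score: float, prob_threshold: float = 0.6) -> bool:
        likelihoods = {name: m.pdf(score) for name, m in models.items()}
        total = sum(likelihoods.values())
        p_target = likelihoods["non-ideal target"] / total          # first probability
        p_nontarget = likelihoods["non-ideal non-target"] / total   # second probability
        # Flag the attempt if either probability, or their sum, exceeds the threshold.
        return max(p_target, p_nontarget, p_target + p_nontarget) > prob_threshold

    print(non_ideal(1.4), non_ideal(3.1))
    ```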
  • Patent number: 9830906
    Abstract: A speech recognition control device has a plurality of microphones placed at different positions, a speech transmission control unit, and a speech recognition execution control unit. The speech transmission control unit stores data based on the speech input from the microphones and time data related to the ranking of the microphones, assigns ranks to the plurality of microphones using the time data based on a preset condition, and transmits the speech data signal corresponding to each microphone to the speech recognition execution control unit in the order of the ranks. The speech recognition execution control unit executes the speech recognition process according to the order of the speech data signals transmitted from the speech transmission control unit.
    Type: Grant
    Filed: April 8, 2014
    Date of Patent: November 28, 2017
    Assignee: Kojima Industries Corporation
    Inventors: Takashi Inose, Shinobu Nakamura
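    A schematic sketch of the ordering logic above; the assumption that earlier speech onset determines a higher rank, and all names, are illustrative:
    ```python
    from dataclasses import dataclass, field

    @dataclass
    class MicCapture:
        mic_id: str
        onset_time_ms: float              # time data used for ranking
        samples: list = field(default_factory=list)

    def rank_microphones(captures):
        # Preset condition assumed here: earlier speech onset gets a higher rank.
        return sorted(captures, key=lambda c: c.onset_time_ms)

    def run_recognition_in_rank_order(captures, recognize):
        for capture in rank_microphones(captures):
            recognize(capture.mic_id, capture.samples)   # one speech data signal at a time

    captures = [MicCapture("driver", 120.0), MicCapture("passenger", 95.0),
                MicCapture("rear", 240.0)]
    run_recognition_in_rank_order(
        captures, lambda mic, _samples: print("recognizing signal from", mic))
    ```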