Patents Examined by Huyen Vo
  • Patent number: 9477925
    Abstract: The use of a pipelined algorithm that performs parallelized computations to train deep neural networks (DNNs) for performing data analysis may reduce training time. The DNNs may be one of context-independent DNNs or context-dependent DNNs. The training may include partitioning training data into sample batches of a specific batch size. The partitioning may be performed based on rates of data transfers between processors that execute the pipelined algorithm, considerations of accuracy and convergence, and the execution speed of each processor. Other techniques for training may include grouping layers of the DNNs for processing on a single processor, distributing a layer of the DNNs to multiple processors for processing, or modifying an execution order of steps in the pipelined algorithm.
    Type: Grant
    Filed: November 20, 2012
    Date of Patent: October 25, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Frank Torsten Bernd Seide, Gang Li, Dong Yu, Adam C. Eversole, Xie Chen
  • Patent number: 9472184
    Abstract: Embodiments that relate to identifying potential cross-language speech recognition problems are disclosed. For example, in one disclosed embodiment a speech recognition problem detection program receives a target word in a non-native language from a target application. A phonetic transcription of the target word comprising a plurality of target phonetic units is acquired. The program determines that at least one of the target phonetic units is not found in a plurality of native phonetic units associated with a native language. In response, a warning of the potential cross-language speech recognition problem may be outputted for display on a display device. The warning may comprise the target word.
    Type: Grant
    Filed: November 6, 2013
    Date of Patent: October 18, 2016
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Michael Tjalve, Pavan Karnam, Dennis Mooney
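    The check described in this abstract can be sketched as a set-membership test. This is only an illustration: the function names, the unit symbols, and the warning wording are assumptions, not taken from the patent.

```python
def missing_phonetic_units(target_units, native_units):
    """Return target units absent from the native inventory, in first-seen order."""
    native = set(native_units)
    missing = []
    for unit in target_units:
        if unit not in native and unit not in missing:
            missing.append(unit)
    return missing


def cross_language_warning(target_word, target_units, native_units):
    """Return a warning string that includes the target word, or None if no problem is found."""
    missing = missing_phonetic_units(target_units, native_units)
    if not missing:
        return None
    return (
        f"Potential cross-language recognition problem for '{target_word}': "
        f"units not in native set: {', '.join(missing)}"
    )
```

    A caller would pass the phonetic transcription of the target word and the native-language unit inventory, then display the returned string if it is not None.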
  • Patent number: 9448990
    Abstract: A statistical language model (SLM) may be iteratively refined by considering N-gram counts in new data, and blending the information contained in the new data with the existing SLM. A first group of documents is evaluated to determine the probabilities associated with the different N-grams observed in the documents. An SLM is constructed based on these probabilities. A second group of documents is then evaluated to determine the probabilities associated with each N-gram in that second group. The existing SLM is then evaluated to determine how well it explains the probabilities in the second group of documents, and a weighting parameter is calculated from that evaluation. Using the weighting parameter, a new SLM is then constructed as a weighted average of the existing SLM and the new probabilities.
    Type: Grant
    Filed: November 5, 2013
    Date of Patent: September 20, 2016
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Kuansan Wang, Xiaolong Li, Jiangbo Miao, Frederic H. Behr, Jr.
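    The blending step in this abstract can be sketched as a weighted average of two probability tables. The abstract does not say how the weighting parameter is computed, so the sketch takes it as an input. Using relative N-gram frequencies as the probabilities is a simplifying assumption, and all names are hypothetical.

```python
from collections import Counter


def ngram_probs(tokens, n=2):
    """Relative frequency of each N-gram in a token sequence (a simplification)."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()} if total else {}


def blend_models(existing, new, weight):
    """Weighted average: weight * existing + (1 - weight) * new.

    `weight` stands in for the patent's weighting parameter; how it is
    derived from the evaluation of the existing model is not modeled here.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    keys = set(existing) | set(new)
    return {
        key: weight * existing.get(key, 0.0) + (1.0 - weight) * new.get(key, 0.0)
        for key in keys
    }
```

    Because both inputs sum to one, the blended table also sums to one for any weight in [0, 1].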
  • Patent number: 9442693
    Abstract: A method of utilizing a speech assistant designed to provide voice input and speech output capability comprises enabling the use of the speech assistant for communication with a user and terminating the speech assistant when the communication is complete. The method further comprises receiving a notification from a native application associated with the communication and activating a sub-portion of the speech assistant to enable outputting of the notification using speech output, thereby enabling the use of speech output for periodic announcements without enabling the speech assistant.
    Type: Grant
    Filed: June 4, 2013
    Date of Patent: September 13, 2016
    Assignee: Nuance Communications, Inc.
    Inventors: Elizabeth A. Dykstra-Erickson, Jared L. Strawderman
  • Patent number: 9437207
    Abstract: Various disclosed embodiments relate to systems and methods for extracting audio information, e.g., a textual description of speech, from a speech recording while retaining the anonymity of the speaker. In certain embodiments, a third party may perform various aspects of the anonymization and speech processing. Certain embodiments facilitate anonymization in compliance with various legislative requirements even when third parties are involved.
    Type: Grant
    Filed: April 3, 2013
    Date of Patent: September 6, 2016
    Assignee: PULLSTRING, INC.
    Inventors: Oren M Jacob, Martin Reddy, Brian Langner
  • Patent number: 9431010
    Abstract: With respect to speech data 4 of an input speech 2, a speech-recognition device 1 performs, at an internal recognizer 7, recognition processing using an acoustic model 9 to calculate an internal recognition result 10 and its acoustic likelihood. A reading-addition processor 12 acquires an external recognition result 11 from recognition processing of the speech data 4 of the input speech 2 by an external recognizer 19 and adds a reading thereto, and a re-collation processor 15 calculates, using the acoustic model 9, the acoustic likelihood of the external recognition result 11 to provide a re-collation result 16. A result-determination processor 17 compares the acoustic likelihood of the internal recognition result 10 with the acoustic likelihood of the external recognition result 11 included in the re-collation result 16, to thereby determine a final recognition result 18.
    Type: Grant
    Filed: March 6, 2013
    Date of Patent: August 30, 2016
    Assignee: Mitsubishi Electric Corporation
    Inventor: Toshiyuki Hanazawa
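    The result-determination step can be sketched as a comparison of two likelihoods scored by the same acoustic model. The abstract only says the likelihoods are compared, so picking the higher one and keeping the internal result on ties are assumptions, and the type and function names are hypothetical.

```python
from typing import NamedTuple


class RecognitionResult(NamedTuple):
    text: str
    acoustic_likelihood: float  # assumed to come from the same acoustic model
    source: str                 # "internal" or "external"


def determine_final_result(internal, external):
    """Pick the result with the higher acoustic likelihood.

    Assumption: higher likelihood wins and ties keep the internal result.
    """
    if external.acoustic_likelihood > internal.acoustic_likelihood:
        return external
    return internal
```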
  • Patent number: 9431030
    Abstract: A method is provided for detecting a predetermined frequency band in an audio data signal which has previously been coded according to a succession of data blocks, among which at least certain blocks contain respectively at least one set of spectral parameters representing a linear prediction filter. Such a method of detection implements, for a current block among the at least certain blocks and for which at least a plurality of spectral parameters of the set have been previously decoded, acts of: determining, among the plurality of previously decoded spectral parameters, the index of the first spectral parameter closest to a threshold frequency; calculating at least one criterion on the basis of the determined index; and deciding whether the predetermined frequency band is detected in the current block, as a function of the criterion calculated.
    Type: Grant
    Filed: December 11, 2012
    Date of Patent: August 30, 2016
    Assignee: ORANGE
    Inventors: Arnault Nagle, Claude Lamblin
  • Patent number: 9424252
    Abstract: An information providing device: saves a posted document and respective electronic files of a translation thereof in one or more other languages, in association with one another; issues a code image including a two-dimensional code created by encoding a two-dimensional code character string for identifying the electronic file associated with the same identification information; receives, from a user terminal device that accesses the electronic file by decoding the two-dimensional code from the document on which the code image is printed, character code data indicating the language set in the user terminal device, and transmits the electronic file of the translation translated into the language indicated by the character code data to the user terminal device together with information that indicates the posting place of the document, thereby providing the translation and the information that indicates the posting place of the document.
    Type: Grant
    Filed: April 2, 2012
    Date of Patent: August 23, 2016
    Assignee: PIJIN CO. LTD.
    Inventors: Kenji Takaoka, Takao Yano, Masayasu Iwashima, Kenichi Maeda, Zhichen Geng
  • Patent number: 9418653
    Abstract: An operation assisting method comprises: comparing input spoken voices with a preliminarily stored keyword associated with an operation target to determine whether or not the keyword is spoken; determining whether or not similarity between or among the input spoken voices falls within a predetermined range; in a case where it is determined that the keyword is not spoken, determining whether or not eyes of a user are directed at the operation target; and, in a case where the similarity falls within the predetermined range and it is determined that the eyes of the user are directed at the operation target, determining that the keyword is spoken.
    Type: Grant
    Filed: May 18, 2015
    Date of Patent: August 16, 2016
    Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
    Inventors: Takeshi Sekiguchi, Nobuyuki Kunieda
  • Patent number: 9418670
    Abstract: An encoding apparatus includes a noise detector configured to detect noise included in a certain band in accordance with an audio signal, a gain controller configured to perform gain control on the audio signal so that components in the certain band of the audio signal are attenuated when the noise is detected by the noise detector, a bit allocation calculation unit configured to calculate the numbers of bits to be allocated to frequency spectra of the audio signal which have been subjected to the gain control performed by the gain controller in accordance with the frequency spectra, and a quantization unit configured to quantize the frequency spectra of the audio signal which have been subjected to the gain control in accordance with the numbers of the bits.
    Type: Grant
    Filed: May 28, 2015
    Date of Patent: August 16, 2016
    Assignee: Sony Corporation
    Inventors: Yuuki Matsumura, Shiro Suzuki
  • Patent number: 9412391
    Abstract: According to an embodiment, a signal processing device includes a background calculator, a signal generator, an extractor, a similarity calculator, and a mixer. The background calculator is configured to calculate a first background signal in which a speech signal is removed, based on the acoustic signals. The signal generator is configured to generate a reference signal from at least one of the acoustic signals. The extractor is configured to extract a second background signal by removing a speech signal from the reference signal. The similarity calculator is configured to calculate a similarity between feature data of the background signals. The mixer is configured to calculate a weighted sum of the background signals in such a way that a greater weight is given to the first background signal as the similarity is higher and a greater weight is given to the second background signal as the similarity is lower.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: August 9, 2016
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Toshiyuki Ono, Makoto Hirohata, Masashi Nishiyama, Toru Taniguchi
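    The similarity-driven mixing in this abstract can be sketched as a convex combination of the two background signals. Cosine similarity as the measure and the clipped similarity as the weight are assumptions, since the abstract only requires that the first signal's weight rise with similarity. All names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mix_backgrounds(first, second, similarity):
    """Weighted sum of two background signals.

    Assumption: weight w = similarity clipped to [0, 1]; the first signal
    gets w and the second gets 1 - w, so higher similarity favors the first.
    """
    if len(first) != len(second):
        raise ValueError("signals must have the same length")
    w = min(1.0, max(0.0, similarity))
    return [w * x + (1.0 - w) * y for x, y in zip(first, second)]
```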
  • Patent number: 9412376
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying a user in a multi-user environment. One of the methods includes receiving, by a first user device, an audio signal encoding an utterance, obtaining, by the first user device, a first speaker model for a first user of the first user device, obtaining, by the first user device for a second user of a second user device that is co-located with the first user device, a second speaker model for the second user or a second score that indicates a respective likelihood that the utterance was spoken by the second user, and determining, by the first user device, that the utterance was spoken by the first user using (i) the first speaker model and the second speaker model or (ii) the first speaker model and the second score.
    Type: Grant
    Filed: July 22, 2015
    Date of Patent: August 9, 2016
    Assignee: Google Inc.
    Inventors: Raziel Alvarez Guevara, Othar Hansson
  • Patent number: 9412368
    Abstract: A display apparatus includes a voice collecting device which collects a user voice, a communication device which performs communication with an interactive server, and a control device which, when response information corresponding to the user voice sent to the interactive server is received from the interactive server, controls to perform a feature corresponding to the response information, and the control device controls the communication device to receive replacement response information, related to the user voice, through a web search and a social network service (SNS).
    Type: Grant
    Filed: June 14, 2013
    Date of Patent: August 9, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hye-hyun Heo, Ki-suk Kim, Hae-rim Son
  • Patent number: 9403279
    Abstract: A method and apparatus for moving an object. A verbal instruction for moving the object is received. The verbal instruction is converted into text. A logical representation of the verbal instruction is generated. A movement of a robotic system that corresponds to the verbal instruction for moving the object using a model of an environment in which the object and the robotic system are located is identified. A set of commands used by the robotic system for the movement of the robotic system is identified. The set of commands is sent to the robotic system.
    Type: Grant
    Filed: June 13, 2013
    Date of Patent: August 2, 2016
    Assignee: THE BOEING COMPANY
    Inventors: Scott D. G. Smith, Ronald Carl Provine, Mario A. Mendez
  • Patent number: 9400630
    Abstract: Certain implementations of the disclosed technology include systems and methods for an enhanced speech recognition interface. According to an example implementation, a method includes outputting a first icon and second icon for presentation on a display device; responsive to receiving an indication of an input object being maintained at a first location of an input device, causing a recording device to record an audio signal; responsive to receiving an indication that the input object has moved across the input device from the first location of the input device to a second location of the input device, causing the recording device to stop recording the audio signal; outputting text, based on the recorded audio signal, for presentation on the display device; and responsive to receiving an indication of the input object being maintained at the second location of the input device, causing a portion of the text to be removed from presentation on the display device.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: July 26, 2016
    Assignee: Google Inc.
    Inventor: Jakob David Uskoreit
  • Patent number: 9396182
    Abstract: The present invention relates to a new method and system for use of a multi-protocol conference bridge, and more specifically a new multi-language conference bridge system and method of use where different cues, such as an attenuated voice of an original non-interpreted speaker, are used to improve the flow of information over the system.
    Type: Grant
    Filed: April 2, 2015
    Date of Patent: July 19, 2016
    Assignee: ZipDX LLC
    Inventors: David Frankel, Barry Slaughter Olsen
  • Patent number: 9396724
    Abstract: A method includes: acquiring data samples; performing categorized sentence mining in the acquired data samples to obtain categorized training samples for multiple categories; building a text classifier based on the categorized training samples; classifying the data samples using the text classifier to obtain a class vocabulary and a corpus for each category; mining the corpus for each category according to the class vocabulary for the category to obtain a respective set of high-frequency language templates; training on the templates for each category to obtain a template-based language model for the category; training on the corpus for each category to obtain a class-based language model for the category; training on the class vocabulary for each category to obtain a lexicon-based language model for the category; building a speech decoder according to an acoustic model, the class-based language model and the lexicon-based language model for any given field, and the data samples.
    Type: Grant
    Filed: February 14, 2014
    Date of Patent: July 19, 2016
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Feng Rao, Li Lu, Bo Chen, Xiang Zhang, Shuai Yue, Lu Li
  • Patent number: 9378732
    Abstract: A system and method are provided for combining active and unsupervised learning for automatic speech recognition. This process enables a reduction in the amount of human supervision required for training acoustic and language models and an increase in the performance given the transcribed and un-transcribed data.
    Type: Grant
    Filed: August 25, 2015
    Date of Patent: June 28, 2016
    Assignee: Interactions LLC
    Inventors: Dilek Zeynep Hakkani-Tur, Giuseppe Riccardi
  • Patent number: 9372130
    Abstract: To facilitate text-to-speech conversion of a username, a first or last name of a user associated with the username may be retrieved, and a pronunciation of the username may be determined based at least in part on whether the name forms at least part of the username. To facilitate text-to-speech conversion of a domain name having a top level domain and at least one other level domain, a pronunciation for the top level domain may be determined based at least in part upon whether the top level domain is one of a predetermined set of top level domains. Each other level domain may be searched for one or more recognized words therewithin, and a pronunciation of the other level domain may be determined based at least in part on an outcome of the search. The username and domain name may form part of a network address such as an email address, URL or URI.
    Type: Grant
    Filed: June 25, 2015
    Date of Patent: June 21, 2016
    Assignee: BLACKBERRY LIMITED
    Inventors: Matthew Bells, Jennifer Elizabeth Lhotak, Michael Angelo Nanni
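    The username and top-level-domain rules in this abstract can be sketched as follows. The set of known top-level domains, the letter-by-letter fallback, and the function names are assumptions for illustration; the patent's actual pronunciation rules are not given in the abstract.

```python
# Hypothetical "predetermined set" of top-level domains spoken as words.
KNOWN_TLDS = {"com", "org", "net"}


def spell_out(text):
    """Fallback pronunciation: read the characters one at a time."""
    return " ".join(text)


def pronounce_tld(tld):
    """Speak a known TLD as a word; otherwise spell it out."""
    tld = tld.lower()
    return tld if tld in KNOWN_TLDS else spell_out(tld)


def pronounce_username(username, first_name, last_name):
    """Speak an embedded first or last name as a word and spell the rest."""
    lower = username.lower()
    for name in (first_name.lower(), last_name.lower()):
        if name and name in lower:
            start = lower.index(name)
            before, after = lower[:start], lower[start + len(name):]
            parts = [spell_out(before), name, spell_out(after)]
            return " ".join(part for part in parts if part)
    return spell_out(lower)
```

    For example, under these assumptions the username "jsmith" for a user named John Smith is spoken as the letter "j" followed by the word "smith".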
  • Patent number: RE46037
    Abstract: An audio information retrieval method, medium, and system that can rapidly retrieve audio information, even in noisy environments, by extracting a modulation spectrum that is robust against noise, converting features of the extracted modulation spectrum into hash bits, and using a hash table. The audio information retrieval method may include extracting a modulation spectrum from audio data of a compressed domain, converting the extracted modulation spectrum into fingerprint bits, arranging the fingerprint bits in a form of a hash table, converting a received query into an address by a hash function corresponding to the query, and retrieving the audio information by referring to the hash table.
    Type: Grant
    Filed: October 19, 2012
    Date of Patent: June 21, 2016
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Hyoung Gook Kim, Ki Wan Eom, Ji Yeun Kim, Yuan Yuan She, Xuan Zhu
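    The fingerprint-and-hash-table lookup in this abstract can be sketched as below. Deriving each bit from the sign of the difference between adjacent feature values is an assumption, the feature frames stand in for the modulation spectrum, and the class and function names are hypothetical.

```python
from collections import defaultdict


def fingerprint_bits(features):
    """Pack one bit per adjacent pair: 1 if the value rises, else 0.

    Assumption: sign-of-difference bits; the patent's actual conversion
    of the modulation spectrum into bits is not given in the abstract.
    """
    bits = 0
    for previous, current in zip(features, features[1:]):
        bits = (bits << 1) | (1 if current > previous else 0)
    return bits


class FingerprintIndex:
    """Hash table from fingerprint value to (track_id, frame_offset) entries."""

    def __init__(self):
        self._table = defaultdict(list)

    def add(self, track_id, frames):
        for offset, frame in enumerate(frames):
            self._table[fingerprint_bits(frame)].append((track_id, offset))

    def lookup(self, query_frame):
        # The fingerprint acts as the table address for the query.
        return list(self._table.get(fingerprint_bits(query_frame), []))
```

    Small perturbations of the feature values leave the bits unchanged as long as the rise/fall pattern survives, which is the kind of noise tolerance the abstract attributes to the fingerprint.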