Patents Examined by Huyen Vo

Deep neural networks training for speech and pattern recognition

Patent number: 9477925

Abstract: The use of a pipelined algorithm that performs parallelized computations to train deep neural networks (DNNs) for performing data analysis may reduce training time. The DNNs may be one of context-independent DNNs or context-dependent DNNs. The training may include partitioning training data into sample batches of a specific batch size. The partitioning may be performed based on rates of data transfers between processors that execute the pipelined algorithm, considerations of accuracy and convergence, and the execution speed of each processor. Other techniques for training may include grouping layers of the DNNs for processing on a single processor, distributing a layer of the DNNs to multiple processors for processing, or modifying an execution order of steps in the pipelined algorithm.

Type: Grant

Filed: November 20, 2012

Date of Patent: October 25, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventors: Frank Torsten Bernd Seide, Gang Li, Dong Yu, Adam C. Eversole, Xie Chen
Cross-language speech recognition

Patent number: 9472184

Abstract: Embodiments that relate to identifying potential cross-language speech recognition problems are disclosed. For example, in one disclosed embodiment a speech recognition problem detection program receives a target word in a non-native language from a target application. A phonetic transcription of the target word comprising a plurality of target phonetic units is acquired. The program determines that at least one of the target phonetic units is not found in a plurality of native phonetic units associated with a native language. In response, a warning of the potential cross-language speech recognition problem may be outputted for display on a display device. The warning may comprise the target word.

Type: Grant

Filed: November 6, 2013

Date of Patent: October 18, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventors: Michael Tjalve, Pavan Karnam, Dennis Mooney
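
The check described in the abstract of patent 9472184 (flagging target phonetic units that are missing from the native inventory) can be illustrated with a minimal sketch. The function names, the warning wording, and the toy phoneme inventories are assumptions made for illustration; they are not taken from the patent.

```python
def missing_phonetic_units(target_units, native_units):
    """Return target units absent from the native inventory, in order, without repeats."""
    native = set(native_units)
    seen = set()
    missing = []
    for unit in target_units:
        if unit not in native and unit not in seen:
            seen.add(unit)
            missing.append(unit)
    return missing


def cross_language_warning(target_word, target_units, native_units):
    """Build a warning string that includes the target word, or None if no unit is missing."""
    missing = missing_phonetic_units(target_units, native_units)
    if not missing:
        return None
    return (
        f"Potential cross-language recognition problem for '{target_word}': "
        f"units not in native inventory: {', '.join(missing)}"
    )


# Hypothetical toy data: a tiny native inventory and a transcription of a foreign word.
native_inventory = {"a", "b", "k", "t"}
warning = cross_language_warning("Bach", ["b", "a", "x"], native_inventory)
print(warning)
```

A real system would obtain the transcription from a pronunciation lexicon or grapheme-to-phoneme model; here it is passed in directly to keep the sketch self-contained.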
Adaptive construction of a statistical language model

Patent number: 9448990

Abstract: A statistical language model (SLM) may be iteratively refined by considering N-gram counts in new data, and blending the information contained in the new data with the existing SLM. A first group of documents is evaluated to determine the probabilities associated with the different N-grams observed in the documents. An SLM is constructed based on these probabilities. A second group of documents is then evaluated to determine the probabilities associated with each N-gram in that second group. The existing SLM is then evaluated to determine how well it explains the probabilities in the second group of documents, and a weighting parameter is calculated from that evaluation. Using the weighting parameter, a new SLM is then constructed as a weighted average of the existing SLM and the new probabilities.

Type: Grant

Filed: November 5, 2013

Date of Patent: September 20, 2016

Assignee: Microsoft Technology Licensing, LLC

Inventors: Kuansan Wang, Xiaolong Li, Jiangbo Miao, Frederic H. Behr, Jr.
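
The blending step in the abstract of patent 9448990 (a new SLM formed as a weighted average of the existing SLM and the new probabilities) can be sketched as follows. This is a simplified illustration, not the patented method: it uses relative N-gram frequencies rather than conditional probabilities, and it takes the weighting parameter as an input because the abstract does not say how it is computed. All names are hypothetical.

```python
from collections import Counter


def ngram_probabilities(tokens, n=2):
    """Relative frequency of each N-gram in a token list (a simplification)."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()} if total else {}


def blend_models(existing, new, weight):
    """Weighted average: weight * existing + (1 - weight) * new, over the union of N-grams."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    keys = set(existing) | set(new)
    return {
        key: weight * existing.get(key, 0.0) + (1.0 - weight) * new.get(key, 0.0)
        for key in keys
    }


# Hypothetical toy corpora standing in for the first and second groups of documents.
existing_slm = ngram_probabilities("a b a b".split())
new_probs = ngram_probabilities("a b b".split())
blended = blend_models(existing_slm, new_probs, weight=0.5)
print(sorted(blended.items()))
```

Because both inputs sum to 1 and the weights sum to 1, the blended distribution also sums to 1, so no renormalization is needed in this sketch.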
Reducing speech session resource use in a speech assistant

Patent number: 9442693

Abstract: A method of utilizing a speech assistant, the speech assistant designed to provide a voice input and speech output capability, the method comprising, enabling the use of the speech assistant for communication with a user, and terminating the speech assistant when the communication is complete. The method further comprises receiving a notification from a native application associated with the communication, and activating a sub-portion of the speech assistant, to enable outputting of the notification using speech output, thereby enabling the use of speech output for periodic announcements without enabling the speech assistant.

Type: Grant

Filed: June 4, 2013

Date of Patent: September 13, 2016

Assignee: Nuance Communications, Inc.

Inventors: Elizabeth A. Dykstra-Erickson, Jared L. Strawderman
Feature extraction for anonymized speech recognition

Patent number: 9437207

Abstract: Various of the disclosed embodiments relate to systems and methods for extracting audio information, e.g. a textual description of speech, from a speech recording while retaining the anonymity of the speaker. In certain embodiments, a third party may perform various aspects of the anonymization and speech processing. Certain embodiments facilitate anonymization in compliance with various legislative requirements even when third parties are involved.

Type: Grant

Filed: April 3, 2013

Date of Patent: September 6, 2016

Assignee: PULLSTRING, INC.

Inventors: Oren M Jacob, Martin Reddy, Brian Langner
Speech-recognition device and speech-recognition method

Patent number: 9431010

Abstract: With respect to speech data 4 of an input speech 2, a speech-recognition device 1 performs at an internal recognizer 7, recognition processing using an acoustic model 9, to calculate an internal recognition result 10 and its acoustic likelihood. A reading-addition processor 12 acquires an external recognition result 11 from recognition processing of the speech data 4 of the input speech 2 by an external recognizer 19 and adds a reading thereto, and a re-collation processor 15 calculates, using the acoustic model 9, the acoustic likelihood of the external recognition result 11 to provide a re-collation result 16. A result-determination processor 17 compares the acoustic likelihood of the internal recognition result 10 with the acoustic likelihood of the external recognition result 11 included in the re-collation result 16, to thereby determine a final recognition result 18.

Type: Grant

Filed: March 6, 2013

Date of Patent: August 30, 2016

Assignee: Mitsubishi Electric Corporation

Inventor: Toshiyuki Hanazawa
Method of detecting a predetermined frequency band in an audio data signal, detection device and computer program corresponding thereto

Patent number: 9431030

Abstract: A method is provided for detecting a predetermined frequency band in an audio data signal which has previously been coded according to a succession of data blocks, among which at least certain blocks contain respectively at least one set of spectral parameters representing a linear prediction filter. Such a method of detection implements, for a current block among the at least certain blocks and for which at least a plurality of spectral parameters of the set have been previously decoded, acts of: determining, among the plurality of previously decoded spectral parameters, the index of the first spectral parameter closest to a threshold frequency; calculating at least one criterion on the basis of the determined index; and deciding whether the predetermined frequency band is detected in the current block, as a function of the criterion calculated.

Type: Grant

Filed: December 11, 2012

Date of Patent: August 30, 2016

Assignee: ORANGE

Inventors: Arnault Nagle, Claude Lamblin
Information providing device, information providing method, and computer program

Patent number: 9424252

Abstract: An information providing device: saves a posted document and respective electronic files of a translation thereof in one or more other languages, in association with one another; issues a code image including a two-dimensional code created by encoding a two-dimensional code character string for identifying the electronic file associated with the same identification information; receives, from a user terminal device that accesses the electronic file by decoding the two-dimensional code from the document on which the code image is printed, character code data indicating the language set in the user terminal device, and transmits the electronic file of the translation translated into the language indicated by the character code data to the user terminal device together with information that indicates the posting place of the document, thereby providing the translation and the information that indicates the posting place of the document.

Type: Grant

Filed: April 2, 2012

Date of Patent: August 23, 2016

Assignee: PIJIN CO. LTD.

Inventors: Kenji Takaoka, Takao Yano, Masayasu Iwashima, Kenichi Maeda, Zhichen Geng
Operation assisting method and operation assisting device

Patent number: 9418653

Abstract: An operation assisting method comprising comparing input spoken voices with a preliminarily stored keyword associated with an operation target and determining whether or not the keyword is spoken, determining whether or not similarity between or among the input spoken voices falls within a predetermined range. In a case where it is determined that the keyword is not spoken, determining whether or not eyes of a user are directed at the operation target, and in a case of the similarity falling within the predetermined range, determining that the keyword is spoken, in a case of being determined that the eyes of the user are directed at the operation target.

Type: Grant

Filed: May 18, 2015

Date of Patent: August 16, 2016

Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.

Inventors: Takeshi Sekiguchi, Nobuyuki Kunieda
Encoding apparatus, encoding method, and program

Patent number: 9418670

Abstract: An encoding apparatus includes a noise detector configured to detect noise included in a certain band in accordance with an audio signal, a gain controller configured to perform gain control on the audio signal so that components in the certain band of the audio signal are attenuated when the noise is detected by the noise detector, a bit allocation calculation unit configured to calculate the numbers of bits to be allocated to frequency spectra of the audio signal which have been subjected to the gain control performed by the gain controller in accordance with the frequency spectra, and a quantization unit configured to quantize the frequency spectra of the audio signal which have been subjected to the gain control in accordance with the numbers of the bits.

Type: Grant

Filed: May 28, 2015

Date of Patent: August 16, 2016

Assignee: Sony Corporation

Inventors: Yuuki Matsumura, Shiro Suzuki
Signal processing device, signal processing method, and computer program product

Patent number: 9412391

Abstract: According to an embodiment, a signal processing device includes a background calculator, a signal generator, an extractor, a similarity calculator, and a mixer. The background calculator is configured to calculate a first background signal in which a speech signal is removed, based on the acoustic signals. The signal generator is configured to generate a reference signal from at least one of the acoustic signals. The extractor is configured to extract a second background signal by removing a speech signal from the reference signal. The similarity calculator is configured to calculate a similarity between feature data of the background signals. The mixer is configured to calculate a weighted sum of the background signals in such a way that a greater weight is given to the first background signal as the similarity is higher and a greater weight is given to the second background signal as the similarity is lower.

Type: Grant

Filed: December 20, 2013

Date of Patent: August 9, 2016

Assignee: Kabushiki Kaisha Toshiba

Inventors: Toshiyuki Ono, Makoto Hirohata, Masashi Nishiyama, Toru Taniguchi
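
The mixer in the abstract of patent 9412391 computes a weighted sum in which the first background signal gains weight as the similarity rises. A minimal sketch follows, under the assumption that the similarity lies in [0, 1] and is used directly as the weight; the abstract only requires the weight to increase with similarity, so this mapping and the function name are illustrative choices.

```python
def mix_backgrounds(first, second, similarity):
    """Weighted sum of two equal-length background signals.

    Assumption: similarity (clamped to [0, 1]) is the weight of the first signal,
    and (1 - similarity) is the weight of the second signal.
    """
    if len(first) != len(second):
        raise ValueError("signals must have the same length")
    w = min(max(similarity, 0.0), 1.0)
    return [w * a + (1.0 - w) * b for a, b in zip(first, second)]


# Hypothetical sample values: high similarity favours the first background signal.
print(mix_backgrounds([1.0, 1.0], [0.0, 0.0], similarity=0.75))
```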
Speaker verification using co-location information

Patent number: 9412376

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying a user in a multi-user environment. One of the methods includes receiving, by a first user device, an audio signal encoding an utterance, obtaining, by the first user device, a first speaker model for a first user of the first user device, obtaining, by the first user device for a second user of a second user device that is co-located with the first user device, a second speaker model for the second user or a second score that indicates a respective likelihood that the utterance was spoken by the second user, and determining, by the first user device, that the utterance was spoken by the first user using (i) the first speaker model and the second speaker model or (ii) the first speaker model and the second score.

Type: Grant

Filed: July 22, 2015

Date of Patent: August 9, 2016

Assignee: Google Inc.

Inventors: Raziel Alvarez Guevara, Othar Hansson
Display apparatus, interactive system, and response information providing method

Patent number: 9412368

Abstract: A display apparatus includes a voice collecting device which collects a user voice, a communication device which performs communication with an interactive server, and a control device which, when response information corresponding to the user voice sent to the interactive server is received from the interactive server, controls to perform a feature corresponding to the response information, and the control device controls the communication device to receive replacement response information, related to the user voice, through a web search and a social network service (SNS).

Type: Grant

Filed: June 14, 2013

Date of Patent: August 9, 2016

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Hye-hyun Heo, Ki-suk Kim, Hae-rim Son
Robotic system with verbal interaction

Patent number: 9403279

Abstract: A method and apparatus for moving an object. A verbal instruction for moving the object is received. The verbal instruction is converted into text. A logical representation of the verbal instruction is generated. A movement of a robotic system that corresponds to the verbal instruction for moving the object using a model of an environment in which the object and the robotic system are located is identified. A set of commands used by the robotic system for the movement of the robotic system is identified. The set of commands is sent to the robotic system.

Type: Grant

Filed: June 13, 2013

Date of Patent: August 2, 2016

Assignee: THE BOEING COMPANY

Inventors: Scott D. G. Smith, Ronald Carl Provine, Mario A. Mendez
Systems and methods for enhanced speech recognition interface on mobile device

Patent number: 9400630

Abstract: Certain implementations of the disclosed technology include systems and methods for an enhanced speech recognition interface. According to an example implementation, a method includes outputting a first icon and second icon for presentation on a display device; responsive to receiving an indication of an input object being maintained at a first location of an input device, causing a recording device to record an audio signal; responsive to receiving an indication that the input object has moved across the input device from the first location of the input device to a second location of the input device, causing the recording device to stop recording the audio signal; outputting text, based on the recorded audio signal, for presentation on the display device; and responsive to receiving an indication of the input object being maintained at the second location of the input device, causing a portion of the text to be removed from presentation on the display device.

Type: Grant

Filed: December 20, 2013

Date of Patent: July 26, 2016

Assignee: Google Inc.

Inventor: Jakob David Uskoreit
Multi-lingual conference bridge with cues and method of use

Patent number: 9396182

Abstract: The present invention relates to a new method and system for use of a multi-protocol conference bridge, and more specifically a new multi-language conference bridge system and method of use where different cues, such as an attenuated voice of an original non-interpreted speaker, is used to improve the flow of information over the system.

Type: Grant

Filed: April 2, 2015

Date of Patent: July 19, 2016

Assignee: ZipDX LLC

Inventors: David Frankel, Barry Slaughter Olsen
Method and apparatus for building a language model

Patent number: 9396724

Abstract: A method includes: acquiring data samples; performing categorized sentence mining in the acquired data samples to obtain categorized training samples for multiple categories; building a text classifier based on the categorized training samples; classifying the data samples using the text classifier to obtain a class vocabulary and a corpus for each category; mining the corpus for each category according to the class vocabulary for the category to obtain a respective set of high-frequency language templates; training on the templates for each category to obtain a template-based language model for the category; training on the corpus for each category to obtain a class-based language model for the category; training on the class vocabulary for each category to obtain a lexicon-based language model for the category; building a speech decoder according to an acoustic model, the class-based language model and the lexicon-based language model for any given field, and the data samples.

Type: Grant

Filed: February 14, 2014

Date of Patent: July 19, 2016

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Feng Rao, Li Lu, Bo Chen, Xiang Zhang, Shuai Yue, Lu Li
System and method for unsupervised and active learning for automatic speech recognition

Patent number: 9378732

Abstract: A system and method is provided for combining active and unsupervised learning for automatic speech recognition. This process enables a reduction in the amount of human supervision required for training acoustic and language models and an increase in the performance given the transcribed and un-transcribed data.

Type: Grant

Filed: August 25, 2015

Date of Patent: June 28, 2016

Assignee: Interactions LLC

Inventors: Dilek Zeynep Hakkani-Tur, Giuseppe Riccardi
Facilitating text-to-speech conversion of a domain name or a network address containing a domain name

Patent number: 9372130

Abstract: To facilitate text-to-speech conversion of a username, a first or last name of a user associated with the username may be retrieved, and a pronunciation of the username may be determined based at least in part on whether the name forms at least part of the username. To facilitate text-to-speech conversion of a domain name having a top level domain and at least one other level domain, a pronunciation for the top level domain may be determined based at least in part upon whether the top level domain is one of a predetermined set of top level domains. Each other level domain may be searched for one or more recognized words therewithin, and a pronunciation of the other level domain may be determined based at least in part on an outcome of the search. The username and domain name may form part of a network address such as an email address, URL or URI.

Type: Grant

Filed: June 25, 2015

Date of Patent: June 21, 2016

Assignee: BLACKBERRY LIMITED

Inventors: Matthew Bells, Jennifer Elizabeth Lhotak, Michael Angelo Nanni
Method, medium, and system for music retrieval using modulation spectrum

Patent number: RE46037

Abstract: An audio information retrieval method, medium, and system that can rapidly retrieve audio information, even in noisy environments, by extracting a modulation spectrum that is robust against noise, converting features of the extracted modulation spectrum into hash bits, and using a hash table. The audio information retrieval method may include extracting a modulation spectrum from audio data of a compressed domain, converting the extracted modulation spectrum into fingerprint bits, arranging the fingerprint bits in a form of a hash table, converting a received query into an address by a hash function corresponding to the query, and retrieving the audio information by referring to the hash table.

Type: Grant

Filed: October 19, 2012

Date of Patent: June 21, 2016

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Hyoung Gook Kim, Ki Wan Eom, Ji Yeun Kim, Yuan Yuan She, Xuan Zhu
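
The retrieval flow in the abstract of patent RE46037 (features converted to fingerprint bits, stored in a hash table, and looked up by hashing the query) can be illustrated with a small sketch. The bit rule used here, one bit per adjacent feature pair set when the value increases, is an assumption chosen for simplicity and is not the modulation-spectrum fingerprint described in the patent. The track names and feature values are made up.

```python
from collections import defaultdict


def to_fingerprint(features):
    """Pack one bit per adjacent pair: 1 if the value rises, else 0 (illustrative rule)."""
    bits = 0
    for prev, cur in zip(features, features[1:]):
        bits = (bits << 1) | (1 if cur > prev else 0)
    return bits


def build_table(tracks):
    """Map each frame fingerprint to the (track_id, frame_offset) pairs that produced it."""
    table = defaultdict(list)
    for track_id, frames in tracks.items():
        for offset, frame in enumerate(frames):
            table[to_fingerprint(frame)].append((track_id, offset))
    return table


def lookup(table, query_frame):
    """Hash the query frame and return matching entries, or an empty list."""
    return table.get(to_fingerprint(query_frame), [])


# Hypothetical per-frame feature vectors for two tracks.
tracks = {"song_a": [[1, 2, 1], [3, 2, 1]], "song_b": [[0, 5, 9]]}
table = build_table(tracks)
print(lookup(table, [10, 20, 15]))
```

Because the fingerprint depends only on the rise-or-fall pattern, a query with different absolute values but the same pattern still lands in the same bucket, which is a rough stand-in for the noise robustness the abstract attributes to the modulation spectrum.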
