Patents Examined by Huyen Vo
-
Patent number: 9477925
Abstract: The use of a pipelined algorithm that performs parallelized computations to train deep neural networks (DNNs) for performing data analysis may reduce training time. The DNNs may be one of context-independent DNNs or context-dependent DNNs. The training may include partitioning training data into sample batches of a specific batch size. The partitioning may be performed based on rates of data transfers between processors that execute the pipelined algorithm, considerations of accuracy and convergence, and the execution speed of each processor. Other techniques for training may include grouping layers of the DNNs for processing on a single processor, distributing a layer of the DNNs to multiple processors for processing, or modifying an execution order of steps in the pipelined algorithm.
Type: Grant
Filed: November 20, 2012
Date of Patent: October 25, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Frank Torsten Bernd Seide, Gang Li, Dong Yu, Adam C. Eversole, Xie Chen
-
Patent number: 9472184
Abstract: Embodiments that relate to identifying potential cross-language speech recognition problems are disclosed. For example, in one disclosed embodiment a speech recognition problem detection program receives a target word in a non-native language from a target application. A phonetic transcription of the target word comprising a plurality of target phonetic units is acquired. The program determines that at least one of the target phonetic units is not found in a plurality of native phonetic units associated with a native language. In response, a warning of the potential cross-language speech recognition problem may be outputted for display on a display device. The warning may comprise the target word.
Type: Grant
Filed: November 6, 2013
Date of Patent: October 18, 2016
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Michael Tjalve, Pavan Karnam, Dennis Mooney
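The check described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the warning wording, and the toy phonetic inventory are all hypothetical, and the phonetic transcription is assumed to be supplied by the caller.

```python
def missing_units(target_units, native_units):
    """Return the target phonetic units that are absent from the native inventory."""
    native = set(native_units)
    return [unit for unit in target_units if unit not in native]


def check_word(word, transcription, native_units):
    """Return a warning string naming the word if any unit is missing, else None."""
    missing = missing_units(transcription, native_units)
    if missing:
        return (f"Warning: '{word}' uses phonetic units not found in the "
                f"native language: {', '.join(missing)}")
    return None


# Toy inventory and transcription, invented purely for illustration.
NATIVE_UNITS = {"b", "a", "k", "s", "t"}
warning = check_word("bach", ["b", "a", "x"], NATIVE_UNITS)
```

Here `warning` is a string that includes the target word, mirroring the abstract's note that the warning may comprise the target word; a word whose units are all in the inventory yields `None`.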
-
Patent number: 9448990
Abstract: A statistical language model (SLM) may be iteratively refined by considering N-gram counts in new data, and blending the information contained in the new data with the existing SLM. A first group of documents is evaluated to determine the probabilities associated with the different N-grams observed in the documents. An SLM is constructed based on these probabilities. A second group of documents is then evaluated to determine the probabilities associated with each N-gram in that second group. The existing SLM is then evaluated to determine how well it explains the probabilities in the second group of documents, and a weighting parameter is calculated from that evaluation. Using the weighting parameter, a new SLM is then constructed as a weighted average of the existing SLM and the new probabilities.
Type: Grant
Filed: November 5, 2013
Date of Patent: September 20, 2016
Assignee: Microsoft Technology Licensing, LLC
Inventors: Kuansan Wang, Xiaolong Li, Jiangbo Miao, Frederic H. Behr, Jr.
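The final blending step in this abstract, a weighted average of the existing SLM and the new probabilities, can be sketched as follows. This is an illustrative simplification under stated assumptions: probabilities are plain relative frequencies of N-grams, the function names are hypothetical, and the weighting parameter is passed in directly because the abstract does not say how it is calculated.

```python
from collections import Counter


def ngram_probs(tokens, n=2):
    """Relative-frequency estimate of N-gram probabilities from a token list."""
    grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def blend(old_slm, new_probs, weight):
    """Weighted average over the union of N-grams:
    weight * old + (1 - weight) * new, with unseen N-grams treated as 0."""
    grams = set(old_slm) | set(new_probs)
    return {
        gram: weight * old_slm.get(gram, 0.0)
        + (1.0 - weight) * new_probs.get(gram, 0.0)
        for gram in grams
    }


old_slm = ngram_probs("a b a b".split())    # first group of documents
new_probs = ngram_probs("a b c".split())    # second group of documents
blended = blend(old_slm, new_probs, weight=0.5)  # weight is an assumed value
```

Because both inputs are probability distributions and the weights sum to one, the blended values also sum to one.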
-
Patent number: 9442693
Abstract: A method of utilizing a speech assistant, the speech assistant designed to provide a voice input and speech output capability, the method comprising, enabling the use of the speech assistant for communication with a user, and terminating the speech assistant when the communication is complete. The method further comprises receiving a notification from a native application associated with the communication, and activating a sub-portion of the speech assistant, to enable outputting of the notification using speech output, thereby enabling the use of speech output for periodic announcements without enabling the speech assistant.
Type: Grant
Filed: June 4, 2013
Date of Patent: September 13, 2016
Assignee: Nuance Communications, Inc.
Inventors: Elizabeth A. Dykstra-Erickson, Jared L. Strawderman
-
Patent number: 9437207
Abstract: Various of the disclosed embodiments relate to systems and methods for extracting audio information, e.g. a textual description of speech, from a speech recording while retaining the anonymity of the speaker. In certain embodiments, a third party may perform various aspects of the anonymization and speech processing. Certain embodiments facilitate anonymization in compliance with various legislative requirements even when third parties are involved.
Type: Grant
Filed: April 3, 2013
Date of Patent: September 6, 2016
Assignee: PULLSTRING, INC.
Inventors: Oren M Jacob, Martin Reddy, Brian Langner
-
Patent number: 9431010
Abstract: With respect to speech data 4 of an input speech 2, a speech-recognition device 1 performs at an internal recognizer 7, recognition processing using an acoustic model 9, to calculate an internal recognition result 10 and its acoustic likelihood. A reading-addition processor 12 acquires an external recognition result 11 from recognition processing of the speech data 4 of the input speech 2 by an external recognizer 19 and adds a reading thereto, and a re-collation processor 15 calculates, using the acoustic model 9, the acoustic likelihood of the external recognition result 11 to provide a re-collation result 16. A result-determination processor 17 compares the acoustic likelihood of the internal recognition result 10 with the acoustic likelihood of the external recognition result 11 included in the re-collation result 16, to thereby determine a final recognition result 18.
Type: Grant
Filed: March 6, 2013
Date of Patent: August 30, 2016
Assignee: Mitsubishi Electric Corporation
Inventor: Toshiyuki Hanazawa
-
Patent number: 9431030
Abstract: A method is provided for detecting a predetermined frequency band in an audio data signal which has previously been coded according to a succession of data blocks, among which at least certain blocks contain respectively at least one set of spectral parameters representing a linear prediction filter. Such a method of detection implements, for a current block among the at least certain blocks and for which at least a plurality of spectral parameters of the set have been previously decoded, acts of: determining, among the plurality of previously decoded spectral parameters, the index of the first spectral parameter closest to a threshold frequency; calculating at least one criterion on the basis of the determined index; and deciding whether the predetermined frequency band is detected in the current block, as a function of the criterion calculated.
Type: Grant
Filed: December 11, 2012
Date of Patent: August 30, 2016
Assignee: ORANGE
Inventors: Arnault Nagle, Claude Lamblin
-
Patent number: 9424252
Abstract: An information providing device: saves a posted document and respective electronic files of a translation thereof in one or more other languages, in association with one another; issues a code image including a two-dimensional code created by encoding a two-dimensional code character string for identifying the electronic file associated with the same identification information; receives, from a user terminal device that accesses the electronic file by decoding the two-dimensional code from the document on which the code image is printed, character code data indicating the language set in the user terminal device, and transmits the electronic file of the translation translated into the language indicated by the character code data to the user terminal device together with information that indicates the posting place of the document, thereby providing the translation and the information that indicates the posting place of the document.
Type: Grant
Filed: April 2, 2012
Date of Patent: August 23, 2016
Assignee: PIJIN CO. LTD.
Inventors: Kenji Takaoka, Takao Yano, Masayasu Iwashima, Kenichi Maeda, Zhichen Geng
-
Patent number: 9418653
Abstract: An operation assisting method comprising comparing input spoken voices with a preliminarily stored keyword associated with an operation target and determining whether or not the keyword is spoken, determining whether or not similarity between or among the input spoken voices falls within a predetermined range. In a case where it is determined that the keyword is not spoken, determining whether or not eyes of a user are directed at the operation target, and in a case of the similarity falling within the predetermined range, determining that the keyword is spoken, in a case of being determined that the eyes of the user are directed at the operation target.
Type: Grant
Filed: May 18, 2015
Date of Patent: August 16, 2016
Assignee: PANASONIC INTELLECTUAL PROPERTY MANAGEMENT CO., LTD.
Inventors: Takeshi Sekiguchi, Nobuyuki Kunieda
-
Patent number: 9418670
Abstract: An encoding apparatus includes a noise detector configured to detect noise included in a certain band in accordance with an audio signal, a gain controller configured to perform gain control on the audio signal so that components in the certain band of the audio signal are attenuated when the noise is detected by the noise detector, a bit allocation calculation unit configured to calculate the numbers of bits to be allocated to frequency spectra of the audio signal which have been subjected to the gain control performed by the gain controller in accordance with the frequency spectra, and a quantization unit configured to quantize the frequency spectra of the audio signal which have been subjected to the gain control in accordance with the numbers of the bits.
Type: Grant
Filed: May 28, 2015
Date of Patent: August 16, 2016
Assignee: Sony Corporation
Inventors: Yuuki Matsumura, Shiro Suzuki
-
Patent number: 9412391
Abstract: According to an embodiment, a signal processing device includes a background calculator, a signal generator, an extractor, a similarity calculator, and a mixer. The background calculator is configured to calculate a first background signal in which a speech signal is removed, based on the acoustic signals. The signal generator is configured to generate a reference signal from at least one of the acoustic signals. The extractor is configured to extract a second background signal by removing a speech signal from the reference signal. The similarity calculator is configured to calculate a similarity between feature data of the background signals. The mixer is configured to calculate a weighted sum of the background signals in such a way that a greater weight is given to the first background signal as the similarity is higher and a greater weight is given to the second background signal as the similarity is lower.
Type: Grant
Filed: December 20, 2013
Date of Patent: August 9, 2016
Assignee: Kabushiki Kaisha Toshiba
Inventors: Toshiyuki Ono, Makoto Hirohata, Masashi Nishiyama, Toru Taniguchi
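The mixer behavior in this abstract can be illustrated with a small sketch. Everything specific here is an assumption rather than the patented design: cosine similarity stands in for the unspecified similarity measure, the weight is simply the similarity clipped to [0, 1], and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two feature vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mix(first_bg, second_bg, feat_first, feat_second):
    """Weighted sum of two background signals.

    Higher feature similarity gives more weight to the first background
    signal; lower similarity gives more weight to the second.
    """
    weight = min(1.0, max(0.0, cosine_similarity(feat_first, feat_second)))
    return [weight * x + (1.0 - weight) * y for x, y in zip(first_bg, second_bg)]
```

With identical feature vectors the output equals the first background signal; with orthogonal feature vectors it equals the second.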
-
Patent number: 9412376
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying a user in a multi-user environment. One of the methods includes receiving, by a first user device, an audio signal encoding an utterance, obtaining, by the first user device, a first speaker model for a first user of the first user device, obtaining, by the first user device for a second user of a second user device that is co-located with the first user device, a second speaker model for the second user or a second score that indicates a respective likelihood that the utterance was spoken by the second user, and determining, by the first user device, that the utterance was spoken by the first user using (i) the first speaker model and the second speaker model or (ii) the first speaker model and the second score.
Type: Grant
Filed: July 22, 2015
Date of Patent: August 9, 2016
Assignee: Google Inc.
Inventors: Raziel Alvarez Guevara, Othar Hansson
-
Patent number: 9412368
Abstract: A display apparatus includes a voice collecting device which collects a user voice, a communication device which performs communication with an interactive server, and a control device which, when response information corresponding to the user voice sent to the interactive server is received from the interactive server, controls to perform a feature corresponding to the response information, and the control device controls the communication device to receive replacement response information, related to the user voice, through a web search and a social network service (SNS).
Type: Grant
Filed: June 14, 2013
Date of Patent: August 9, 2016
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Hye-hyun Heo, Ki-suk Kim, Hae-rim Son
-
Patent number: 9403279
Abstract: A method and apparatus for moving an object. A verbal instruction for moving the object is received. The verbal instruction is converted into text. A logical representation of the verbal instruction is generated. A movement of a robotic system that corresponds to the verbal instruction for moving the object using a model of an environment in which the object and the robotic system are located is identified. A set of commands used by the robotic system for the movement of the robotic system is identified. The set of commands is sent to the robotic system.
Type: Grant
Filed: June 13, 2013
Date of Patent: August 2, 2016
Assignee: THE BOEING COMPANY
Inventors: Scott D. G. Smith, Ronald Carl Provine, Mario A. Mendez
-
Patent number: 9400630
Abstract: Certain implementations of the disclosed technology include systems and methods for an enhanced speech recognition interface. According to an example implementation, a method includes outputting a first icon and second icon for presentation on a display device; responsive to receiving an indication of an input object being maintained at a first location of an input device, causing a recording device to record an audio signal; responsive to receiving an indication that the input object has moved across the input device from the first location of the input device to a second location of the input device, causing the recording device to stop recording the audio signal; outputting text, based on the recorded audio signal, for presentation on the display device; and responsive to receiving an indication of the input object being maintained at the second location of the input device, causing a portion of the text to be removed from presentation on the display device.
Type: Grant
Filed: December 20, 2013
Date of Patent: July 26, 2016
Assignee: Google Inc.
Inventor: Jakob David Uskoreit
-
Patent number: 9396182
Abstract: The present invention relates to a new method and system for use of a multi-protocol conference bridge, and more specifically a new multi-language conference bridge system and method of use where different cues, such as an attenuated voice of an original non-interpreted speaker, is used to improve the flow of information over the system.
Type: Grant
Filed: April 2, 2015
Date of Patent: July 19, 2016
Assignee: ZipDX LLC
Inventors: David Frankel, Barry Slaughter Olsen
-
Patent number: 9396724
Abstract: A method includes: acquiring data samples; performing categorized sentence mining in the acquired data samples to obtain categorized training samples for multiple categories; building a text classifier based on the categorized training samples; classifying the data samples using the text classifier to obtain a class vocabulary and a corpus for each category; mining the corpus for each category according to the class vocabulary for the category to obtain a respective set of high-frequency language templates; training on the templates for each category to obtain a template-based language model for the category; training on the corpus for each category to obtain a class-based language model for the category; training on the class vocabulary for each category to obtain a lexicon-based language model for the category; building a speech decoder according to an acoustic model, the class-based language model and the lexicon-based language model for any given field, and the data samples.
Type: Grant
Filed: February 14, 2014
Date of Patent: July 19, 2016
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Feng Rao, Li Lu, Bo Chen, Xiang Zhang, Shuai Yue, Lu Li
-
Patent number: 9378732
Abstract: A system and method is provided for combining active and unsupervised learning for automatic speech recognition. This process enables a reduction in the amount of human supervision required for training acoustic and language models and an increase in the performance given the transcribed and un-transcribed data.
Type: Grant
Filed: August 25, 2015
Date of Patent: June 28, 2016
Assignee: Interactions LLC
Inventors: Dilek Zeynep Hakkani-Tur, Giuseppe Riccardi
-
Patent number: 9372130
Abstract: To facilitate text-to-speech conversion of a username, a first or last name of a user associated with the username may be retrieved, and a pronunciation of the username may be determined based at least in part on whether the name forms at least part of the username. To facilitate text-to-speech conversion of a domain name having a top level domain and at least one other level domain, a pronunciation for the top level domain may be determined based at least in part upon whether the top level domain is one of a predetermined set of top level domains. Each other level domain may be searched for one or more recognized words therewithin, and a pronunciation of the other level domain may be determined based at least in part on an outcome of the search. The username and domain name may form part of a network address such as an email address, URL or URI.
Type: Grant
Filed: June 25, 2015
Date of Patent: June 21, 2016
Assignee: BLACKBERRY LIMITED
Inventors: Matthew Bells, Jennifer Elizabeth Lhotak, Michael Angelo Nanni
-
Patent number: RE46037
Abstract: An audio information retrieval method, medium, and system that can rapidly retrieve audio information, even in noisy environments, by extracting a modulation spectrum that is robust against noise, converting features of the extracted modulation spectrum into hash bits, and using a hash table. The audio information retrieval method may include extracting a modulation spectrum from audio data of a compressed domain, converting the extracted modulation spectrum into fingerprint bits, arranging the fingerprint bits in a form of a hash table, converting a received query into an address by a hash function corresponding to the query, and retrieving the audio information by referring to the hash table.
Type: Grant
Filed: October 19, 2012
Date of Patent: June 21, 2016
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Hyoung Gook Kim, Ki Wan Eom, Ji Yeun Kim, Yuan Yuan She, Xuan Zhu
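The hash-table lookup stage of this abstract can be sketched as below. The sketch assumes the fingerprint bits have already been computed and packed into integers; modulation-spectrum extraction is not shown. The function names and the offset-voting scheme are illustrative assumptions, not details taken from the patent.

```python
from collections import Counter, defaultdict


def build_table(tracks):
    """Index fingerprints: maps each fingerprint to the (track_id, position) pairs
    where it occurs. `tracks` is {track_id: [fingerprint_int, ...]}."""
    table = defaultdict(list)
    for track_id, fingerprints in tracks.items():
        for position, fingerprint in enumerate(fingerprints):
            table[fingerprint].append((track_id, position))
    return table


def retrieve(table, query_fingerprints):
    """Return the track whose fingerprints best align with the query, or None.

    Each matching fingerprint votes for a (track, time offset) pair, so a
    run of query fingerprints that lines up with one track accumulates votes.
    """
    votes = Counter()
    for query_pos, fingerprint in enumerate(query_fingerprints):
        for track_id, position in table.get(fingerprint, ()):
            votes[(track_id, position - query_pos)] += 1
    if not votes:
        return None
    (track_id, _offset), _count = votes.most_common(1)[0]
    return track_id


# Toy fingerprints, invented for illustration.
table = build_table({"track_a": [1, 2, 3, 4], "track_b": [5, 6, 7, 8]})
match = retrieve(table, [6, 7])
```

Using a dictionary keyed by fingerprint keeps each lookup at roughly constant time, which is the property the abstract relies on for rapid retrieval.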