Abstract: Disclosed are a method for determining whether a person is drunk after consuming alcohol, capable of analyzing alcohol consumption in the time domain by analyzing a voice, and a recording medium and a terminal for carrying out the same.
Type:
Grant
Filed:
January 24, 2014
Date of Patent:
April 3, 2018
Assignee:
FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION
Inventors:
Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
Abstract: Disclosed are a method for determining whether a person is drunk after consuming alcohol on the basis of a difference among a plurality of formant energies, which are generated by applying linear predictive coding according to a plurality of linear prediction orders, and a recording medium and a terminal for carrying out the method.
Type:
Grant
Filed:
January 28, 2014
Date of Patent:
March 13, 2018
Assignee:
FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION
Inventors:
Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
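The abstract above describes comparing spectral-envelope energies obtained from linear predictive coding at several prediction orders. A minimal sketch of that idea, assuming the autocorrelation method for LPC and treating the between-order energy differences as the feature (the orders 4, 8, and 12 and all function names are illustrative, not taken from the patent):

```python
import numpy as np

def lpc_coefficients(frame, order):
    """LPC coefficients via the autocorrelation (normal-equations) method."""
    r = np.array([np.dot(frame[:len(frame) - k], frame[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:order + 1])
    return np.concatenate(([1.0], -a))     # A(z) = 1 - sum(a_k z^-k)

def envelope_energy(frame, order, n_freq=128):
    """Total energy of the LPC spectral envelope 1/|A(e^jw)|^2."""
    a = lpc_coefficients(frame, order)
    w = np.linspace(0.01, np.pi, n_freq)
    A = np.exp(-1j * np.outer(w, np.arange(len(a)))) @ a   # A(z) on the unit circle
    return float(np.sum(1.0 / np.abs(A) ** 2))

# Envelope energies from several prediction orders; their differences
# form the kind of between-order feature the abstract refers to.
rng = np.random.default_rng(0)
frame = rng.standard_normal(400)
energies = [envelope_energy(frame, p) for p in (4, 8, 12)]
feature = np.diff(energies)
```

How such a difference feature is thresholded to decide intoxication is not specified by the abstract.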
Abstract: An audio encoding apparatus and method that encodes hybrid contents including an object sound, a background sound, and metadata, and an audio decoding apparatus and method that decodes the encoded hybrid contents are provided. The audio encoding apparatus may include a mixing unit to generate an intermediate channel signal by mixing a background sound and an object sound, a matrix information encoding unit to encode matrix information used for the mixing, an audio encoding unit to encode the intermediate channel signal, and a metadata encoding unit to encode metadata including control information of the object sound.
Type:
Grant
Filed:
September 4, 2014
Date of Patent:
February 27, 2018
Assignee:
Electronics and Telecommunications Research Institute
Inventors:
Seung Kwon Beack, Tae Jin Lee, Jong Mo Sung, Kyeong Ok Kang, Jeong Il Seo, Dae Young Jang, Yong Ju Lee, Jin Woong Kim
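The encoder/decoder pair described above mixes an object sound and a background sound into intermediate channel signals and transmits the mixing matrix as side information. A toy numpy sketch of that round trip, assuming a simple invertible 2x2 mixing matrix (the matrix values and signal data are illustrative):

```python
import numpy as np

# Hypothetical invertible mixing matrix, transmitted as matrix information.
M = np.array([[0.8,  0.6],
              [0.6, -0.8]])

background   = np.array([0.1,  0.2, -0.1, 0.0])
object_sound = np.array([0.5, -0.3,  0.2, 0.1])
sources = np.vstack([background, object_sound])

# Encoder side: mix sources into the intermediate channel signals.
intermediate = M @ sources

# Decoder side: recover the sources using the transmitted matrix information.
recovered = np.linalg.inv(M) @ intermediate
```

In the actual codec the intermediate channels would additionally pass through an audio codec, and the metadata would carry control information for the object sound.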
Abstract: This invention is directed to a speech processing system that efficiently performs noise suppression for a plurality of noise sources spread in a lateral direction with respect to a speaker of interest.
Type:
Grant
Filed:
January 16, 2014
Date of Patent:
February 27, 2018
Assignee:
NEC CORPORATION
Inventors:
Masanori Tsujikawa, Ken Hanazawa, Akihiko Sugiyama
Abstract: Disclosed are a method for determining alcohol consumption, capable of analyzing alcohol consumption in the time domain by analyzing the formant slope of a voice signal, and a recording medium and a terminal for carrying out the same. A terminal for determining whether a person is drunk comprises: a voice input unit for generating a voice frame by receiving a voice signal; a voiced/unvoiced sound analysis unit for determining whether a received voice frame corresponds to a voiced sound; a formant frequency extraction unit for extracting a plurality of formant frequencies from the voice frame corresponding to the voiced sound; and an alcohol consumption determination unit for calculating a formant slope between the plurality of formant frequencies and determining the state of alcohol consumption depending on the formant slope, thereby determining whether a person is drunk by analyzing the formant slope of an input voice.
Type:
Grant
Filed:
January 24, 2014
Date of Patent:
February 20, 2018
Assignee:
FOUNDATION OF SOONGSIL UNIVERSITY-INDUSTRY COOPERATION
Inventors:
Myung Jin Bae, Sang Gil Lee, Geum Ran Baek
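The entry above extracts formant frequencies from voiced frames and computes a slope between them. One common way to obtain formant candidates is from the angles of the LPC polynomial roots; the sketch below uses that approach with a synthetic frame (the sample rate, order, and slope definition are illustrative assumptions, not the patent's exact method):

```python
import numpy as np

def formant_frequencies(frame, sample_rate=8000, order=10):
    """Formant candidates from the angles of the LPC polynomial roots."""
    r = np.array([np.dot(frame[:len(frame) - k], frame[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.concatenate(([1.0], -np.linalg.solve(R, r[1:])))
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0.01]          # keep one root per conjugate pair
    freqs = np.angle(roots) * sample_rate / (2 * np.pi)
    return np.sort(freqs)

def formant_slope(freqs):
    """Average rise in Hz per formant index between first and last formant."""
    return (freqs[-1] - freqs[0]) / (len(freqs) - 1)

# Synthetic voiced-like frame with resonances near 700 Hz and 1800 Hz.
rng = np.random.default_rng(1)
t = np.arange(400) / 8000
frame = np.sin(2 * np.pi * 700 * t) + 0.5 * np.sin(2 * np.pi * 1800 * t)
frame += 0.05 * rng.standard_normal(400)
freqs = formant_frequencies(frame)
slope = formant_slope(freqs)
```

The patent's determination unit would then compare such a slope against some criterion for the drunk/sober decision; that criterion is not given in the abstract.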
Abstract: Incoming e-mails, instant messages, SMS, and MMS, are scanned for new language objects such as words, abbreviations, text shortcuts and, in appropriate languages, ideograms, that are placed in a list for use by a text input process of a handheld electronic device to facilitate the generation of text.
Abstract: A method for processing a signal in the form of consecutive sample blocks, the method comprising filtering in a transformed domain of sub-bands, and particularly equalization processing, applied to a current block in the transformed domain, and filtering-adjustment processing that is applied in the transformed domain to at least one block adjacent to the current block.
Abstract: Systems, methods, and apparatuses are presented for a trained language model to be stored in an efficient manner such that the trained language model may be utilized in virtually any computing device to conduct natural language processing. Unlike other natural language processing engines that may be computationally intensive to the point of being capable of running only on high performance machines, the organization of the natural language models according to the present disclosures allows for natural language processing to be performed even on smaller devices, such as mobile devices.
Type:
Grant
Filed:
December 9, 2015
Date of Patent:
December 5, 2017
Assignee:
Sansa AI Inc.
Inventors:
Schuyler D. Erle, Robert J. Munro, Brendan D. Callahan, Gary C. King, Jason Brenier, James B. Robinson
Abstract: The disclosure describes an overall system/method for developing a “speak and touch auto correction interface,” referred to as STACI, which is far superior to existing user interfaces, including the widely adopted QWERTY keyboard. Using STACI, a user speaks and types a word at the same time. The redundant information from the two modes, namely the speech and the letters typed, enables the user to type words sloppily and only partially. The result is a very fast and accurate enhanced keyboard interface enabling document production on computing devices such as phones and tablets.
Abstract: A speech recognition system that automatically sets the volume of output audio based on a sound intensity of a command spoken by a user to adjust the output volume. The system can compensate for variation in the intensity of the captured speech command based on the distance between the speaker and the audio capture device, the pitch of the spoken command and the acoustic profile of the system, and the relative intensity of ambient noise.
Type:
Grant
Filed:
December 4, 2013
Date of Patent:
November 28, 2017
Assignee:
Amazon Technologies, Inc.
Inventors:
Ronald Joseph Degges, Jr., Kerry H Scharfglass, Eric Wang
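The system above sets output volume from the intensity of the spoken command, compensating for ambient noise. A minimal sketch of one such mapping, assuming RMS levels as inputs and a dB-domain signal-to-ambient ratio (the parameter names and the linear clamp are illustrative assumptions):

```python
import math

def output_volume(command_rms, ambient_rms, base=0.5, sensitivity=0.02):
    """Illustrative mapping from spoken-command loudness to output volume.

    A command that is louder relative to ambient noise raises the output
    volume; a quieter one lowers it.  Returns a value clamped to [0, 1].
    """
    snr_db = 20 * math.log10(max(command_rms, 1e-9) / max(ambient_rms, 1e-9))
    return min(1.0, max(0.0, base + sensitivity * snr_db))
```

The patented system additionally compensates for speaker distance, command pitch, and the device's acoustic profile, which this sketch omits.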
Abstract: An audio coding method and apparatus, where the method includes, for each audio frame in audio, when a signal characteristic of the audio frame and a signal characteristic of a previous audio frame meet a preset modification condition, determining a first modification weight according to linear spectral frequency (LSF) differences of the audio frame and LSF differences of the previous audio frame, modifying a linear predictive parameter of the audio frame according to the determined first modification weight, and coding the audio frame according to a modified linear predictive parameter of the audio frame. According to the present disclosure, audio having a wider bandwidth can be coded while a bit rate remains unchanged or a bit rate slightly changes and a spectrum between audio frames is steadier.
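The abstract above derives a modification weight from the LSF differences of the current and previous frames and uses it to steady the spectrum between frames. A heavily hedged sketch of that shape of computation (the weight formula and the blending rule below are illustrative stand-ins, not the patent's actual equations):

```python
import numpy as np

def modification_weight(lsf, prev_lsf):
    """Illustrative weight: approaches 1 when consecutive frames have
    similar LSF spacings, pulling parameters toward the previous frame."""
    d = np.diff(lsf)
    d_prev = np.diff(prev_lsf)
    return float(1.0 / (1.0 + np.sum(np.abs(d - d_prev))))

def modify_parameters(params, prev_params, weight):
    """Blend current and previous linear predictive parameters."""
    return weight * np.asarray(prev_params) + (1.0 - weight) * np.asarray(params)
```

The intended effect, per the abstract, is a steadier inter-frame spectrum so that wider-bandwidth audio can be coded at roughly the same bit rate.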
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for translating terms using numeric representations. One of the methods includes obtaining data that associates each term in a vocabulary of terms in a first language with a respective high-dimensional representation of the term; obtaining data that associates each term in a vocabulary of terms in a second language with a respective high-dimensional representation of the term; receiving a first language term; and determining a translation into the second language of the first language term from the high-dimensional representation of the first language term and the high-dimensional representations of terms in the vocabulary of terms in the second language.
Type:
Grant
Filed:
September 17, 2015
Date of Patent:
October 31, 2017
Assignee:
Google Inc.
Inventors:
Ilya Sutskever, Tomas Mikolov, Jeffrey Adgate Dean, Quoc V. Le
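The method above translates a term by comparing its high-dimensional representation to the representations of terms in the target-language vocabulary. A toy nearest-neighbour sketch with two-dimensional vectors (the embeddings below are made up for illustration; real ones are learned from large corpora, typically with a mapping between the two spaces):

```python
import numpy as np

# Toy embeddings standing in for learned high-dimensional representations.
en = {"dog": np.array([1.0, 0.0]), "cat": np.array([0.0, 1.0])}
es = {"perro": np.array([0.9, 0.1]), "gato": np.array([0.1, 0.9])}

def translate(term, source, target):
    """Nearest-neighbour translation by cosine similarity of embeddings."""
    v = source[term]
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(target, key=lambda w: cos(v, target[w]))
```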
Abstract: Systems and methods providing for secure voice print authentication over a network are disclosed herein. During an enrollment stage, a client's voice is recorded and characteristics of the recording are used to create and store a voice print. When an enrolled client seeks access to secure information over a network, a sample voice recording is created. The sample voice recording is compared to at least one voice print. If a match is found, the client is authenticated and granted access to secure information. Systems and methods providing for a dual use voice analysis system are disclosed herein. Speech recognition is achieved by comparing characteristics of words spoken by a speaker to one or more templates of human language words. Speaker identification is achieved by comparing characteristics of a speaker's speech to one or more templates, or voice prints. The system is adapted to increase or decrease matching constraints depending on whether speaker identification or speech recognition is desired.
Abstract: Embodiments of the invention may be used to implement a rate converter that includes: 6 channels in the forward (audio) path, each channel having a 24-bit signal path, and an end-to-end SNR of 110 dB, all within the 20 Hz to 20 kHz bandwidth. Embodiments may also be used to implement a rate converter having: 2 channels in a reverse path, such as for voice signals, a 16-bit signal path per channel, and an end-to-end SNR of 93 dB, all within the 20 Hz to 20 kHz bandwidth. The rate converter may support sample rates such as 8, 11.025, 12, 16, 22.05, 24, 32, 44.1, 48, and 96 kHz. Further, rate converters according to embodiments may include a gated clock in low-power mode to conserve power.
Abstract: Mechanisms are provided for performing context based synonym filtering for natural language processing. Content is parsed into one or more conceptual units, wherein each conceptual unit comprises a portion of text of the content that is associated with a single concept. For each conceptual unit, a term in the conceptual unit is identified that has a synonym to be utilized during natural language processing of the content. A first measure of relatedness of the term to at least one other term in the conceptual unit is determined. A second measure of relatedness of the synonym of the term to the at least one other term in the conceptual unit is determined. A determination whether or not to utilize the synonym when performing natural language processing on the conceptual unit is made based on the first and second measures of relatedness and natural language processing on the content is performed accordingly.
Type:
Grant
Filed:
June 3, 2016
Date of Patent:
October 17, 2017
Assignee:
International Business Machines Corporation
Inventors:
Kay Mueller, Christopher M. Nolan, William G. Visotski, David E. Wilson
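The filtering above accepts a synonym only when its relatedness to the conceptual unit is comparable to that of the original term. A minimal sketch, assuming cosine similarity over term vectors as the relatedness measure and a simple margin rule (both are illustrative assumptions; the patent does not fix a particular measure):

```python
import numpy as np

def relatedness(a, b):
    """Cosine similarity between two term vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def use_synonym(term_vec, synonym_vec, context_vec, margin=0.1):
    """Accept the synonym only if it is nearly as related to the
    conceptual unit as the original term (illustrative rule)."""
    first = relatedness(term_vec, context_vec)       # first measure
    second = relatedness(synonym_vec, context_vec)   # second measure
    return second >= first - margin
```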
Abstract: Methods and systems for automated language detection for domain names are disclosed. In some embodiments, a method for detecting a language of an Internationalized Domain Name (IDN) comprises receiving, by an I/O interface, a string of characters for the IDN; receiving training data, including a plurality of multi-gram analyses for a set of languages; analyzing, by a processor, the string of characters based on the training data, wherein the analyzing includes extracting a set of multi-grams from the string of characters and comparing the extracted set of multi-grams with the training data; and detecting the language of the IDN based on the results of the analyzing. In some embodiments, the method further comprises comparing the detected language of the IDN with a user-selected language and using the IDN to generate a domain name if the comparing indicates that the detected language of the IDN is consistent with the user-selected language.
Type:
Grant
Filed:
December 15, 2015
Date of Patent:
October 10, 2017
Assignee:
VERISIGN, INC.
Inventors:
Ronald Andrew Hoskinson, Lambert Arians, Marc Anderson, Mahendra Jain
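The detection method above extracts multi-grams from the IDN string and compares them with per-language training data. A toy character-bigram scorer in that spirit (the two one-sentence "profiles" stand in for real training data, which would cover many languages and far more text):

```python
from collections import Counter

def ngrams(s, n):
    """Overlapping character n-grams of a string."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def detect_language(label, profiles, n=2):
    """Score a domain label against per-language n-gram frequency
    profiles (the training data) and return the best-scoring language."""
    grams = Counter(ngrams(label.lower(), n))
    def score(profile):
        return sum(count * profile.get(g, 0) for g, count in grams.items())
    return max(profiles, key=lambda lang: score(profiles[lang]))

# Tiny illustrative profiles built from single sentences.
profiles = {
    "en": Counter(ngrams("the quick brown fox jumps over the lazy dog", 2)),
    "de": Counter(ngrams("der schnelle braune fuchs springt ueber den hund", 2)),
}
```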
Abstract: A system and method for adapting a speech recognition and generation system. The system and method include providing a speech recognition and generation engine that processes speech received from a user and providing a dictionary adaptation module that adds out of vocabulary words to a baseline dictionary of the speech recognition and generation system. Words are added by extracting words that are encountered and adding out of vocabulary words to the baseline dictionary of the speech recognition and generation system.
Type:
Grant
Filed:
November 5, 2013
Date of Patent:
October 3, 2017
Assignee:
GM Global Technology Operations LLC
Inventors:
Ron M. Hecht, Omer Tsimhoni, Timothy J. Grost
Abstract: Provided are a voice audio encoding device, a voice audio decoding device, a voice audio encoding method, and a voice audio decoding method that efficiently perform bit distribution and improve sound quality. A dominant frequency band identification unit identifies a dominant frequency band whose norm factor value is the maximum within the spectrum of an input voice audio signal. Dominant group determination units and a non-dominant group determination unit sort all sub-bands into a dominant group that contains the dominant frequency band and a non-dominant group that contains no dominant frequency band. A group bit distribution unit distributes bits to each group on the basis of the energy and norm variance of each group. A sub-band bit distribution unit redistributes the bits that have been distributed to each group to each sub-band in accordance with the ratio of the norm to the energy of the group.
Type:
Grant
Filed:
November 26, 2013
Date of Patent:
September 19, 2017
Assignee:
PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
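The two-stage allocation above (bits to groups, then to sub-bands within each group) can be sketched as follows, assuming allocation proportional to group energy and then to sub-band norm (a simplification: the patent also uses norm variance in the group stage, which this sketch omits):

```python
import numpy as np

def allocate_bits(group_norms, total_bits):
    """Illustrative two-stage bit allocation.

    group_norms: one 1-D array of sub-band norm factors per group.
    Stage 1 gives each group bits proportional to its energy; stage 2
    redistributes a group's bits to its sub-bands proportional to norm.
    """
    energies = np.array([float(np.sum(n ** 2)) for n in group_norms])
    group_bits = np.floor(total_bits * energies / energies.sum()).astype(int)
    allocation = []
    for norms, bits in zip(group_norms, group_bits):
        share = norms / norms.sum()
        allocation.append(np.floor(bits * share).astype(int))
    return allocation
```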
Abstract: A method of utilizing a speech assistant, the speech assistant designed to provide a voice input and speech output capability, the method comprising, enabling the use of the speech assistant for communication with a user, and terminating the speech assistant when the communication is complete. The method further comprises receiving a notification from a native application associated with the communication, and activating a sub-portion of the speech assistant, to enable outputting of the notification using speech output, thereby enabling the use of speech output for periodic announcements without enabling the speech assistant.
Type:
Grant
Filed:
August 16, 2016
Date of Patent:
September 19, 2017
Assignee:
Nuance Communications, Inc.
Inventors:
Elizabeth A. Dykstra-Erickson, Jared L. Strawderman
Abstract: A method for speaker verification is disclosed. The method comprises using at least one hardware processor for: providing a development set comprising multiple voice samples of multiple speakers uttering a predefined development text, prompting a test text to a target speaker, wherein the test text is different from the development text, and recording a test sample of the target speaker uttering the test text, synthesizing a set of artificial voice samples based on the multiple voice samples, wherein each of the artificial voice samples simulates a different speaker of the multiple speakers uttering the test text, and verifying an identity of the target speaker based on the set of artificial voice samples and on the test sample of the target speaker.
Type:
Grant
Filed:
January 1, 2014
Date of Patent:
September 19, 2017
Assignee:
International Business Machines Corporation
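The verification method above synthesizes artificial voice samples simulating the development speakers uttering the test text and verifies the target against them. One plausible use of such an artificial cohort is score normalization, sketched here with plain cosine similarity over fixed-length speaker vectors (the scoring rule and vector representation are assumptions, not the patent's stated procedure):

```python
import numpy as np

def cohort_normalized_score(test_vec, target_vec, cohort_vecs):
    """Score the test sample against the target speaker and normalize by
    the best-matching artificial cohort sample; accept when the result
    exceeds a tuned threshold (illustrative cohort-normalization rule)."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    raw = cos(test_vec, target_vec)
    cohort = max(cos(test_vec, v) for v in cohort_vecs)
    return raw - cohort
```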