Voice Recognition Patents (Class 704/246)
  • Patent number: 9015046
    Abstract: A method and system for indicating in real time that an interaction is associated with a problem or issue, comprising: receiving a segment of an interaction in which a representative of the organization participates; extracting a feature from the segment; extracting a global feature associated with the interaction; aggregating the feature and the global feature; and classifying the segment or the interaction in association with the problem or issue by applying a model to the feature and the global feature. The method and system may also use features extracted from earlier segments within the interaction. The method and system can also evaluate the model based on features extracted from training interactions and manual tagging assigned to the interactions or segments thereof.
    Type: Grant
    Filed: June 10, 2010
    Date of Patent: April 21, 2015
    Assignee: Nice-Systems Ltd.
    Inventors: Oren Pereg, Moshe Wasserblat, Yuval Lubowich, Ronen Laperdon, Dori Shapira, Vladislav Feigin, Oz Fox-Kahana
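The aggregation-and-classification flow described in the abstract above can be illustrated with a short Python sketch. Everything here is hypothetical: the feature vectors, the linear scoring model, the weights, and the threshold are stand-ins for the trained model the patent describes.

```python
def aggregate_features(segment_features, global_features, earlier_segments=()):
    """Concatenate segment-level features, interaction-level (global)
    features, and optionally features from earlier segments into one
    vector, as the abstract's aggregation step suggests."""
    vector = list(segment_features) + list(global_features)
    for feats in earlier_segments:
        vector.extend(feats)
    return vector


def classify_segment(vector, weights, bias=0.0, threshold=0.5):
    """Toy linear model standing in for a classifier trained on manually
    tagged interactions: flag the segment as problematic if its score
    exceeds the threshold."""
    score = sum(w * x for w, x in zip(weights, vector)) + bias
    return score > threshold, score
```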
  • Patent number: 9014346
    Abstract: A method, apparatus and computer-readable medium for handling incoming calls destined for a called party. The method comprises detecting arrival of an incoming call destined for the called party and attempting to reach the called party by causing a communication device associated with the called party to emit a voice message soliciting a spoken call handling command from the called party. This allows the called party not only to recognize the calling party, but also to decide whether to accept, reject or forward the incoming call without having to physically manipulate the communication device. The network-based example of implementation is compatible with many existing communication devices and has the ability to query the calling party for identification information, whereas the communication device-based example of implementation is compatible with many existing network architectures, and does not require the called party to subscribe to any particular network service.
    Type: Grant
    Filed: September 22, 2006
    Date of Patent: April 21, 2015
    Assignee: BCE Inc.
    Inventor: Jeffrey William Dawson
  • Patent number: 9015044
    Abstract: Implementations of systems, methods and devices described herein enable enhancing the intelligibility of a target voice signal included in a noisy audible signal received by a hearing aid device or the like. In particular, in some implementations, systems, methods and devices are operable to generate a machine readable formant based codebook. In some implementations, the method includes determining whether or not a candidate codebook tuple includes a sufficient amount of new information to warrant either adding the candidate codebook tuple to the codebook or using at least a portion of the candidate codebook tuple to update an existing codebook tuple. Additionally and/or alternatively, in some implementations systems, methods and devices are operable to reconstruct a target voice signal by detecting formants in an audible signal, using the detected formants to select codebook tuples, and using the formant information in the selected codebook tuples to reconstruct the target voice signal.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: April 21, 2015
    Assignee: Malaspina Labs (Barbados) Inc.
    Inventors: Pierre Zakarauskas, Alexander Escott, Clarence S. H. Chu, Shawn E. Stevenson
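The "sufficient amount of new information" test for candidate codebook tuples might look like the following hypothetical sketch. The Euclidean distance metric, the novelty threshold, and the running-average update are illustrative assumptions, not the patent's actual criteria.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def update_codebook(codebook, candidate, novelty_threshold=1.0):
    """Add the candidate tuple if it is far from every existing tuple;
    otherwise use it to update the nearest existing tuple."""
    if not codebook:
        codebook.append(list(candidate))
        return "added"
    nearest = min(codebook, key=lambda t: euclidean(t, candidate))
    if euclidean(nearest, candidate) > novelty_threshold:
        codebook.append(list(candidate))
        return "added"
    for i, value in enumerate(candidate):
        nearest[i] = 0.5 * (nearest[i] + value)  # simple running average
    return "updated"
```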
  • Patent number: 9015043
    Abstract: A computer-implemented method includes receiving an electronic representation of one or more human voices, recognizing words in a first portion of the electronic representation of the one or more human voices, and sending suggested search terms to a display device for display to a user in a text format. The suggested search terms are based on the recognized words in the first portion of the electronic representation of the one or more human voices. A search query is received from the user, which includes one or more of the suggested search terms that were displayed to the user.
    Type: Grant
    Filed: October 1, 2010
    Date of Patent: April 21, 2015
    Assignee: Google Inc.
    Inventor: Scott Jenson
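A minimal sketch of the suggestion step described above, assuming a simple frequency ranking over the recognized words; a production system would use far richer signals (entities, query logs, context) than this.

```python
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in"}


def suggest_terms(recognized_words, max_terms=5):
    """Rank candidate search terms from the recognized words by
    frequency, skipping stopwords; ties break alphabetically."""
    counts = {}
    for word in recognized_words:
        word = word.lower()
        if word not in STOPWORDS:
            counts[word] = counts.get(word, 0) + 1
    return sorted(counts, key=lambda w: (-counts[w], w))[:max_terms]
```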
  • Publication number: 20150106098
    Abstract: A voice input device provided with an input section for inputting a voice of a user, a recognition section for recognizing the voice of the user inputted by the input section, a generation section for generating characters or a command based on a recognition result of the recognition section, a detection section for detecting the device's own posture, and an instruction section for instructing the generation section to generate the command when a detection result of the detection section represents a specific posture, and to generate the characters when the detection result represents a posture other than the specific posture. Accordingly, character input and command input during dictation are correctly distinguished; more specifically, unexpected character input during dictation is avoided.
    Type: Application
    Filed: October 10, 2012
    Publication date: April 16, 2015
    Inventor: Yusuke Inutsuka
  • Publication number: 20150106097
    Abstract: There is provided a method of determining a main speaker that is performed by a first terminal participating in a distributed telepresence service. The method of determining a main speaker according to an embodiment of the invention includes obtaining first feature information for determining a main speaker from an audio input signal, obtaining second feature information for determining a main speaker of a second terminal from the second terminal participating in the distributed telepresence service, and determining a main speaker terminal for providing a video and an audio of a main speaker who is participating in a telepresence and is speaking based on the first feature information for determining a main speaker and the second feature information for determining a main speaker.
    Type: Application
    Filed: February 21, 2014
    Publication date: April 16, 2015
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventor: Hyun-Woo KIM
  • Publication number: 20150106099
    Abstract: An image processing apparatus includes: a voice input receiver configured to receive a voice input of user; a signal processor configured to recognize and process the received voice input received through the voice input receiver; a buffer configured to store the voice input; and a controller configured to determine whether a voice recognition function of the signal processor is activated and control the signal processor to recognize the voice input stored in the buffer in response to the voice recognition function being determined to be activated wherein the controller is further configured to store the received voice input in the buffer in response to the received voice input being input through the voice input receiver while the voice recognition function is not activated, so that the received voice input is recognized by the signal processor when the voice recognition function is activated.
    Type: Application
    Filed: September 23, 2014
    Publication date: April 16, 2015
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Chan-hee CHOI, Kyung-mi PARK, Hee-seob RYU, Chan-sik BOK
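The buffering behavior described above can be sketched as a small state holder. This is a hypothetical illustration: voice input arriving while recognition is inactive is stored, then handed to the recognizer once recognition is activated.

```python
class BufferedVoiceInput:
    """Buffer voice input while the recognition function is inactive;
    drain the buffer to the recognizer upon activation."""

    def __init__(self):
        self.buffer = []
        self.recognized = []
        self.active = False

    def receive(self, chunk):
        if self.active:
            self.recognized.append(chunk)  # recognize immediately
        else:
            self.buffer.append(chunk)      # hold until activation

    def activate(self):
        self.active = True
        self.recognized.extend(self.buffer)  # buffered input first
        self.buffer.clear()
```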
  • Publication number: 20150106089
    Abstract: A computer-implemented method includes listening for audio name information indicative of a name of a computer, with the computer configured to listen for the audio name information in a first power mode that promotes a conservation of power; detecting the audio name information indicative of the name of the computer; after detection of the audio name information, switching to a second power mode that promotes a performance of speech recognition; receiving audio command information; and performing speech recognition on the audio command information.
    Type: Application
    Filed: December 30, 2010
    Publication date: April 16, 2015
    Inventors: Evan H. Parker, Michal R. Grabowski
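The two-power-mode flow above amounts to a small state machine. In this hypothetical sketch, audio is represented as text for simplicity: the low-power mode only checks for the computer's name, and hearing it switches to the high-power mode where the next input is treated as a command for full recognition.

```python
class WakeWordListener:
    """Listen for the computer's name in a low-power mode; switch to a
    high-power mode for full speech recognition of the command."""

    def __init__(self, name):
        self.name = name.lower()
        self.mode = "LOW"

    def hear(self, audio_text):
        if self.mode == "LOW":
            if self.name in audio_text.lower():
                self.mode = "HIGH"   # name detected: switch modes
            return None
        command = audio_text         # full recognition would run here
        self.mode = "LOW"            # drop back to conserve power
        return command
```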
  • Patent number: 9009025
    Abstract: In some implementations, a digital work provider may provide language model information related to a plurality of different contexts, such as a plurality of different digital works. For example, the language model information may include language model difference information identifying a plurality of sequences of one or more words in a digital work that have probabilities of occurrence that differ from probabilities of occurrence in a base language model by a threshold amount. The language model difference information corresponding to a particular context may be used in conjunction with the base language model to recognize an utterance made by a user of a user device. In some examples, the recognition is performed on the user device. In other examples, the utterance and associated context information are sent over a network to a recognition computing device that performs the recognition.
    Type: Grant
    Filed: December 27, 2011
    Date of Patent: April 14, 2015
    Assignee: Amazon Technologies, Inc.
    Inventor: Brandon W. Porter
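The overlay of language model difference information on a base model can be sketched as a simple lookup precedence. The dictionaries, tuple keys, and probability floor here are illustrative assumptions, not the patent's representation.

```python
def combined_probability(sequence, base_lm, diff_lm):
    """Look the word sequence up in the per-work difference table first,
    falling back to the base language model otherwise. The difference
    table only stores sequences whose probability deviates from the base
    model by a threshold, so it stays small enough to ship per work."""
    floor = 1e-9  # assumed floor for unseen sequences
    return diff_lm.get(sequence, base_lm.get(sequence, floor))
```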
  • Patent number: 9009039
    Abstract: Technologies are described herein for noise adaptive training to achieve robust automatic speech recognition. Through the use of these technologies, a noise adaptive training (NAT) approach may use both clean and corrupted speech for training. The NAT approach may normalize the environmental distortion as part of the model training. A set of underlying “pseudo-clean” model parameters may be estimated directly. This may be done without point estimation of clean speech features as an intermediate step. The pseudo-clean model parameters learned from the NAT technique may be used with a Vector Taylor Series (VTS) adaptation. Such adaptation may support decoding noisy utterances during the operating phase of an automatic voice recognition system.
    Type: Grant
    Filed: June 12, 2009
    Date of Patent: April 14, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Lewis Seltzer, James Garnet Droppo, Ozlem Kalinli, Alejandro Acero
  • Patent number: 9002702
    Abstract: Embodiments of the present invention provide an approach for automatically assigning a confidence level to information extracted from a transcription of a voice recording. Specifically, in a typical embodiment, an axiom is extracted from a source associated with the text of the transcription. A confidence level of the source is determined. A confidence level is assigned to the axiom based on the confidence level of the source.
    Type: Grant
    Filed: May 3, 2012
    Date of Patent: April 7, 2015
    Assignee: International Business Machines Corporation
    Inventors: James E. Bostick, John M. Ganci, Jr., John P. Kaemmerer, Craig M. Trim
  • Patent number: 9002705
    Abstract: The present invention provides an interactive device which allows quick utterance recognition results and sequential output thereof, and which diminishes the decrease in recognition rate even when the user's utterance is divided into short frames for quick decisions. The interactive device: sets a recognition section for voice recognition; performs voice recognition for the recognition section; when the voice recognition result includes a key phrase, determines response actions corresponding thereto; and executes the response actions. The interactive device repeatedly updates the set recognition terminal point to a frame which is the predetermined time length ahead of the set recognition terminal point to set a plurality of recognition sections. The interactive device performs voice recognition for each recognition section.
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: April 7, 2015
    Assignee: Honda Motor Co., Ltd.
    Inventors: Yuichi Yoshida, Taku Osada
  • Patent number: 9002706
    Abstract: The invention refers to a method for comparing voice utterances, the method comprising the steps: extracting a plurality of features (201) from a first voice utterance of a given text sample and extracting a plurality of features (201) from a second voice utterance of said given text sample, wherein each feature is extracted as a function of time, and wherein each feature of the second voice utterance corresponds to a feature of the first voice utterance; applying dynamic time warping (202) to one or more time dependent characteristics of the first and/or second voice utterance e.g.
    Type: Grant
    Filed: December 10, 2009
    Date of Patent: April 7, 2015
    Assignee: Agnitio SL
    Inventors: Jesus Antonio Villalba Lopez, Alfonso Ortega Gimenez, Eduardo Lleida Solano, Sara Varela Redondo, Marta Garcia Gomar
  • Patent number: 9002713
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.
    Type: Grant
    Filed: June 9, 2009
    Date of Patent: April 7, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
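The metric-driven resource modification above can be sketched as a simple scaling rule. The metric names, thresholds, and scale factors are illustrative assumptions; the patent covers the general idea of adjusting allocated resources commensurate with recorded metrics.

```python
def adjust_resources(allocated, metrics):
    """Scale the allocated recognition resources commensurate with the
    recorded metrics: low confidence or frequent repeat requests earn
    the speaker more resources; consistently high confidence releases
    some."""
    if metrics["confidence"] < 0.5 or metrics["repeat_requests"] > 2:
        factor = 1.5
    elif metrics["confidence"] > 0.9:
        factor = 0.8
    else:
        factor = 1.0
    return {name: amount * factor for name, amount in allocated.items()}
```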
  • Patent number: 9002703
    Abstract: The community-based generation of audio narrations for a text-based work leverages collaboration of a community of people to provide human-voiced audio readings. During the community-based generation, a collection of audio recordings for the text-based work may be collected from multiple human readers in a community. An audio recording for each section in the text-based work may be selected from the collection of audio recordings. The selected audio recordings may then be combined to produce an audio reading of at least a portion of the text-based work.
    Type: Grant
    Filed: September 28, 2011
    Date of Patent: April 7, 2015
    Assignee: Amazon Technologies, Inc.
    Inventor: Jay A. Crosley
  • Patent number: 9002710
    Abstract: The invention involves the loading and unloading of dynamic section grammars and language models in a speech recognition system. The values of the sections of the structured document are either determined in advance from a collection of documents of the same domain, document type, and speaker; or collected incrementally from documents of the same domain, document type, and speaker; or added incrementally to an already existing set of values. Speech recognition in the context of the given field is constrained to the contents of these dynamic values. If speech recognition fails or produces a poor match within this grammar or section language model, speech recognition against a larger, more general vocabulary that is not constrained to the given section is performed.
    Type: Grant
    Filed: September 12, 2012
    Date of Patent: April 7, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Alwin B. Carus, Larissa Lapshina, Raghu Vemula
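The constrained-then-fallback recognition strategy above can be sketched as follows. The hypothesis scores, value sets, and score threshold are stand-ins for real acoustic and language-model scores in an actual recognizer.

```python
def recognize_with_fallback(hypotheses, section_values, general_vocab,
                            min_score=0.7):
    """Match the utterance against the small section-specific value set
    first; if no candidate scores well enough, fall back to the larger
    general vocabulary."""
    def best(candidates):
        scored = [(score, text) for text, score in hypotheses.items()
                  if text in candidates]
        return max(scored) if scored else None

    hit = best(section_values)
    if hit and hit[0] >= min_score:
        return hit[1], "section"
    hit = best(general_vocab)
    return (hit[1], "general") if hit else (None, "none")
```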
  • Patent number: 9002707
    Abstract: An information processing apparatus includes: a plurality of information input units; an event detection unit that generates event information including estimated position information and estimated identification information of users present in the real space based on analysis of the information from the information input unit; and an information integration processing unit that inputs the event information, and generates target information including a position of each user and user identification information based on the input event information, and signal information representing a probability value of the event generation source, wherein the information integration processing unit includes an utterance source probability calculation unit, and wherein the utterance source probability calculation unit performs a process of calculating an utterance source score as an index value representing an utterance source probability of each target by multiplying weights based on utterance situations by a plurality of d
    Type: Grant
    Filed: November 6, 2012
    Date of Patent: April 7, 2015
    Assignee: Sony Corporation
    Inventor: Keiichi Yamada
  • Patent number: 9001976
    Abstract: A method for speaker adaptation includes receiving a plurality of media files, each associated with a call center agent of a plurality of call center agents and receiving a plurality of terms. Speech processing is performed on at least some of the media files to identify putative instances of at least some of the plurality of terms. Each putative instance is associated with a hit quality that characterizes a quality of recognition of the corresponding term. One or more call center agents for performing speaker adaptation are determined, including identifying call center agents that are associated with at least one media file that includes one or more putative instances with a hit quality below a predetermined threshold. Speaker adaptation is performed for each identified call center agent based on the media files associated with the identified call center agent and the identified instances of the plurality of terms.
    Type: Grant
    Filed: May 3, 2012
    Date of Patent: April 7, 2015
    Assignee: Nexidia, Inc.
    Inventors: Jon A. Arrowood, Robert W. Morris, Marsal Gavalda
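The agent-selection step above (flagging agents whose media files contain low-quality putative hits) reduces to a threshold filter. This sketch assumes hit data as (agent, hit quality) pairs; the threshold value is illustrative.

```python
def agents_needing_adaptation(putative_hits, threshold=0.6):
    """Return the call center agents associated with at least one
    putative term instance whose hit quality falls below the threshold;
    their media files would then drive speaker adaptation."""
    return sorted({agent for agent, quality in putative_hits
                   if quality < threshold})
```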
  • Publication number: 20150095028
    Abstract: Systems and methods for determining an identity of an individual are provided. Audio may be received that includes a key phrase spoken by the individual, and the key phrase may include an identifier spoken by the individual. A key phrase voice print and key phrase text corresponding to the audio may be obtained. The key phrase text may include text corresponding to the identifier spoken by the individual. Voice prints may be retrieved based on the text corresponding to the identifier, and the voice prints may be provided to a voice biometric engine for comparison to the key phrase voice print. The individual may be authenticated based on a comparison of the key phrase voice print to the voice prints. The identifier may include a first name and a last name of the individual.
    Type: Application
    Filed: September 30, 2013
    Publication date: April 2, 2015
    Applicant: Bank of America Corporation
    Inventors: David Karpey, Mark Pender
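The retrieve-then-compare flow above can be sketched as follows. The database layout and `match_fn` are hypothetical: `match_fn` stands in for a real voice biometric engine's scoring of the key-phrase voice print against a stored print.

```python
def authenticate(identifier_text, key_phrase_print, voiceprint_db,
                 match_fn, threshold=0.8):
    """Use the name recognized in the key phrase to narrow the candidate
    voiceprints, then accept the speaker if any candidate matches the
    key-phrase voice print closely enough."""
    candidates = voiceprint_db.get(identifier_text.lower(), [])
    return any(match_fn(key_phrase_print, stored) >= threshold
               for stored in candidates)
```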
  • Patent number: 8996374
    Abstract: Embodiments of the present invention include an apparatus, method, and system for calculating senone scores for multiple concurrent input speech streams. The method can include the following: receiving one or more feature vectors from one or more input streams; accessing the acoustic model one senone at a time; and calculating separate senone scores corresponding to each incoming feature vector. The calculation uses a single read access to the acoustic model for a single senone and calculates a set of separate senone scores for the one or more feature vectors, before proceeding to the next senone in the acoustic model.
    Type: Grant
    Filed: November 6, 2012
    Date of Patent: March 31, 2015
    Assignee: Spansion LLC
    Inventor: Ojas A. Bapat
  • Patent number: 8996382
    Abstract: Systems and methods for inhibiting access to the lips of a speaking person, including a sound receiving device for receiving speech of a person speaking, the person having lips that move when the person speaks; a blocker connected to the device for blocking the lips of the person speaking while the person is speaking; and, in some aspects, such a blocker with a material addition apparatus to provide added material for the breath of a person speaking, e.g., for preventing the spread of disease or to freshen a speaker's breath.
    Type: Grant
    Filed: October 11, 2011
    Date of Patent: March 31, 2015
    Inventor: Guy L. McClung, III
  • Patent number: 8996373
    Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in the depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: March 31, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
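The two-model likelihood comparison above can be sketched in a few lines. Here the speaker models are plain callables returning log-likelihoods; the patent's models would be statistical models of the speaker's speech features in each state.

```python
def detect_state(features, undepressed_model, depressed_model):
    """Score the input voice features under both specific-speaker models
    and let the higher likelihood decide the speaker's state."""
    ll_undep = undepressed_model(features)
    ll_dep = depressed_model(features)
    return "undepressed" if ll_undep >= ll_dep else "depressed"
```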
  • Patent number: 8996368
    Abstract: A feature transform for speech recognition is described. An input speech utterance is processed to produce a sequence of representative speech vectors. A time-synchronous speech recognition pass is performed using a decoding search to determine a recognition output corresponding to the speech input. The decoding search includes, for each speech vector after some first threshold number of speech vectors, estimating a feature transform based on the preceding speech vectors in the utterance and partial decoding results of the decoding search. The current speech vector is then adjusted based on the current feature transform, and the adjusted speech vector is used in a current frame of the decoding search.
    Type: Grant
    Filed: February 22, 2010
    Date of Patent: March 31, 2015
    Assignee: Nuance Communications, Inc.
    Inventor: Daniel Willett
  • Patent number: 8996387
    Abstract: For clearing transaction data selected for a processing, there is generated in a portable data carrier (1) a transaction acoustic signal (003; 103; 203) (S007; S107; S207) upon whose acoustic reproduction by an end device (10) at least transaction data selected for the processing are reproduced superimposed acoustically with a melody specific to a user of the data carrier (1) (S009; S109; S209). The generated transaction acoustic signal (003; 103; 203) is electronically transferred to an end device (10) (S108; S208), which processes the selected transaction data (S011; S121; S216) only when the user of the data carrier (1) confirms vis-à-vis the end device (10) an at least partial match both of the acoustically reproduced melody with the user-specific melody and of the acoustically reproduced transaction data with the selected transaction data (S010; S110, S116; S210).
    Type: Grant
    Filed: September 8, 2009
    Date of Patent: March 31, 2015
    Assignee: Giesecke & Devrient GmbH
    Inventors: Thomas Stocker, Michael Baldischweiler
  • Publication number: 20150088513
    Abstract: A sound processing system is provided and is executed by a processor. The processor acquires a video/audio file from a plurality of video/audio files. The processor controls a video/audio processing chip to build a voiceprint feature model of each section for use in speaker recognition, and to identify the speaker of each section based on a comparison of the built voiceprint feature model of the acquired video/audio file with the voiceprint feature models of speakers stored in a storage unit. The processor generates a tag file recording relationships between the plurality of sections of the acquired video/audio file and the speakers according to the identification result. A sound processing method is also provided.
    Type: Application
    Filed: September 17, 2014
    Publication date: March 26, 2015
    Inventors: HAI-HSING LIN, HSIN-TSUNG TUNG
  • Publication number: 20150088512
    Abstract: For context-based audio filter selection, a type module determines a recipient type for a recipient process of an audio signal. The recipient type includes a human destination recipient type and a speech recognition recipient type. A filter module selects an audio filter in response to the recipient type.
    Type: Application
    Filed: September 20, 2013
    Publication date: March 26, 2015
    Applicant: LENOVO (Singapore) PTE, LTD.
    Inventors: John Miles Hunt, John Weldon Nicholson
  • Patent number: 8990071
    Abstract: A method for managing an interaction of a calling party to a communication partner is provided. The method includes automatically determining if the communication partner expects DTMF input. The method also includes translating speech input to one or more DTMF tones and communicating the one or more DTMF tones to the communication partner, if the communication partner expects DTMF input.
    Type: Grant
    Filed: March 29, 2010
    Date of Patent: March 24, 2015
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yun-Cheng Ju, Stefanie Tomko, Frank Liu, Ivan Tashev
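The translation step above (spoken input to DTMF tones) can be sketched as a word-to-tone mapping. The word list here is an illustrative assumption; a real system would drive a tone generator rather than return a string.

```python
WORD_TO_DTMF = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "star": "*", "pound": "#",
}


def speech_to_dtmf(recognized_words):
    """Translate a recognized spoken sequence into the DTMF string to
    communicate to the partner; words with no DTMF equivalent are
    skipped."""
    return "".join(WORD_TO_DTMF[w] for w in recognized_words
                   if w in WORD_TO_DTMF)
```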
  • Publication number: 20150081300
    Abstract: An embodiment of the present invention relates to a speech recognition system and method using incremental device-based acoustic model adaptation.
    Type: Application
    Filed: April 18, 2014
    Publication date: March 19, 2015
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventor: Dong-Hyun Kim
  • Publication number: 20150081299
    Abstract: A system for use in assisting a user in a social interaction with another person is provided, the system being configured to determine whether the user recognizes the person and, if it is determined that the user does not recognize the person, to provide information to the user about the person. A corresponding method and computer program product for performing the method are also provided.
    Type: Application
    Filed: June 1, 2012
    Publication date: March 19, 2015
    Applicant: Koninklijke Philips N.V.
    Inventors: Radu Serban Jasinschi, Murtaza Bulut, Luca Bellodi
  • Patent number: 8983838
    Abstract: A global speech user interface (GSUI) comprises an input system to receive a user's spoken command, a feedback system along with a set of feedback overlays to give the user information on the progress of his spoken requests, a set of visual cues on the television screen to help the user understand what he can say, a help system, and a model for navigation among applications. The interface is extensible to make it easy to add new applications.
    Type: Grant
    Filed: September 17, 2013
    Date of Patent: March 17, 2015
    Assignee: Promptu Systems Corporation
    Inventors: Adam Jordan, Scott Lynn Maddux, Tim Plowman, Victoria Stanbach, Jody Williams
  • Patent number: 8983837
    Abstract: A computerized alert mode management method of a communication device, the communication device including a sound capture unit. Vocal sounds of the environment around the communication device are captured at regular intervals using the sound capture unit. Voice characteristic information of the captured vocal sounds is extracted using a speech recognition method and/or a voice recognition method. The communication device is controlled to work in one of a plurality of predetermined alert modes according to the extracted voice characteristic information.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: March 17, 2015
    Assignee: Hon Hai Precision Industry Co., Ltd.
    Inventor: Tsung-Jen Chuang
  • Patent number: 8982971
    Abstract: A multi-carrier signal is typically comprised of many equidistant sub-carriers. This results in periodicity of spectrum within the bandwidth of such a multi-carrier signal. An unknown multi-carrier signal with equidistant sub-carriers can thus be sensed together with its sub-carrier spacing by finding a discernable local maximum in the cepstrum (Fourier transform of the log spectrum) of the multi-carrier signal.
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: March 17, 2015
    Assignee: QRC, Inc.
    Inventors: Sinisa Peric, Thomas F. Callahan, III
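The cepstral spacing estimate above can be demonstrated at toy scale: a log spectrum that repeats every s bins produces a cepstral peak at quefrency q = n / s. This sketch uses a naive pure-Python DFT; a real implementation would use an FFT over measured spectra.

```python
import cmath
import math


def cepstrum(log_spectrum):
    """Naive DFT magnitude of the log spectrum: a peak at quefrency q
    means the spectrum repeats every n / q bins (toy scale only)."""
    n = len(log_spectrum)
    return [abs(sum(log_spectrum[k] * cmath.exp(-2j * math.pi * q * k / n)
                    for k in range(n)))
            for q in range(n)]


def estimate_subcarrier_spacing(log_spectrum, min_q=2):
    """Find the discernible maximum of the cepstrum away from the
    lowest quefrencies; the corresponding sub-carrier spacing is
    n / q spectrum bins."""
    ceps = cepstrum(log_spectrum)
    q = max(range(min_q, len(ceps) // 2), key=lambda i: ceps[i])
    return len(log_spectrum) / q
```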
  • Patent number: 8983207
    Abstract: A technique for authenticating a user is described. During this authentication technique, an electronic device (such as a cellular telephone) captures multiple images of the user while the user moves the electronic device in a pre-defined manner (for example, along a path in 3-dimensional space), and determines positions of the electronic device when the multiple images were captured. Then, the electronic device compares the images at the positions with corresponding pre-existing images of the user captured at different points of view. If the comparisons achieve a match condition, the electronic device authenticates the user. In this way, the authentication technique may be used to prevent successful replay attacks.
    Type: Grant
    Filed: January 10, 2013
    Date of Patent: March 17, 2015
    Assignee: Intuit Inc.
    Inventors: Alexander S. Ran, Christopher Z. Lesner, Cynthia J. Osmon
  • Patent number: 8983836
    Abstract: Mechanisms for performing dynamic automatic speech recognition on a portion of multimedia content are provided. Multimedia content is segmented into homogeneous segments of content with regard to speakers and background sounds. For the at least one segment, a speaker providing speech in an audio track of the at least one segment is identified using information retrieved from a social network service source. A speech profile for the speaker is generated using information retrieved from the social network service source, an acoustic profile for the segment is generated based on the generated speech profile, and an automatic speech recognition engine is dynamically configured for operation on the at least one segment based on the acoustic profile. Automatic speech recognition operations are performed on the audio track of the at least one segment to generate a textual representation of speech content in the audio track corresponding to the speaker.
    Type: Grant
    Filed: September 26, 2012
    Date of Patent: March 17, 2015
    Assignee: International Business Machines Corporation
    Inventors: Elizabeth V. Woodward, Shunguo Yan
  • Publication number: 20150073801
    Abstract: There are provided an apparatus and a method for selecting a control object through voice recognition. The apparatus for selecting a control object through voice recognition according to the present invention includes one or more processing devices, in which the one or more processing devices are configured to obtain input information on the basis of a voice of a user, to match the input information to at least one first identification information obtained based on a control object and second identification information corresponding to the first identification information, to obtain matched identification information matched to the input information within the first identification information and the second identification information, and to select a control object corresponding to the matched identification information.
    Type: Application
    Filed: August 29, 2014
    Publication date: March 12, 2015
    Inventors: Jongwon Shin, Semi Kim, Kanglae Jung, Jeongin Doh, Jehseon Youn, Kyeogsun Kim
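    The two-tier matching in the abstract above (first identification information obtained from the control object, plus corresponding second identification information) can be sketched as a simple lookup. The control names and alias strings below are hypothetical.

```python
# Hypothetical sketch of matching a recognized phrase against two tiers of
# identification information per control object; not the patented method.

controls = [
    {"id": "btn_play", "first_id": "play", "second_id": "start playback"},
    {"id": "btn_stop", "first_id": "stop", "second_id": "halt playback"},
]

def select_control(spoken):
    spoken = spoken.strip().lower()
    for ctl in controls:
        # Try the label obtained from the control object first, then the
        # corresponding second identification information.
        if spoken in (ctl["first_id"], ctl["second_id"]):
            return ctl["id"]
    return None
```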
  • Publication number: 20150073799
    Abstract: A voice verifying system, which comprises: a microphone, which is always turned on to output at least one voice signal; a speech determining device, for determining if the voice signal is valid or not according to a reference value, wherein the speech determining device passes the voice signal if the voice signal is valid; a verifying module, for verifying a speech signal generated from the voice signal and for outputting a device activating signal to activate a target device if the speech signal matches a predetermined rule; and a reference value generating device, for generating the reference value according to speech signal information from the verifying module.

    Type: Application
    Filed: September 12, 2013
    Publication date: March 12, 2015
    Applicant: Mediatek Inc.
    Inventors: Liang-Che Sun, Yiou-Wen Cheng, Ting-Yuan Chiu
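    The gating idea above (an always-on front end that only passes frames past an adaptive reference value, with feedback from the verifier) can be sketched as follows. The energy measure, margin, and thresholds are purely illustrative assumptions.

```python
# Sketch of the gating idea: frames reach the verifier only when frame
# energy exceeds an adaptive reference value updated from feedback.
# Thresholds and the margin are illustrative, not from the patent.

def frame_energy(samples):
    return sum(s * s for s in samples) / len(samples)

class SpeechGate:
    def __init__(self, reference=0.01):
        self.reference = reference

    def is_valid(self, frame):
        return frame_energy(frame) > self.reference

    def update_reference(self, noise_energy, margin=4.0):
        # Feedback path: raise the gate above the observed noise floor
        self.reference = noise_energy * margin

gate = SpeechGate()
silence = [0.001] * 160       # near-silent frame
speech = [0.5, -0.4] * 80     # energetic frame
```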
  • Publication number: 20150073800
    Abstract: A speaker-verification digital signature system is disclosed that provides greater confidence in communications having digital signatures because a signing party may be prompted to speak a text-phrase that may be different for each digital signature, thus making it difficult for anyone other than the legitimate signing party to provide a valid signature.
    Type: Application
    Filed: June 9, 2014
    Publication date: March 12, 2015
    Inventors: Pradeep K. BANSAL, Lee BEGEJA, Carroll W. CRESWELL, Jeffrey Joseph FARAH, Benjamin J. STERN, Jay WILPON
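    The anti-replay property above rests on issuing a fresh text-phrase for each signature, which might be sketched like this. The word list and phrase length are illustrative assumptions, not details from the patent.

```python
# Sketch: generate a fresh text-phrase per signing request so a replayed
# recording of an earlier signature cannot be reused. Word list is illustrative.
import secrets

WORDS = ["amber", "falcon", "granite", "meadow", "quartz", "thicket"]

def challenge_phrase(n_words=4):
    # secrets (not random) so the phrase is unpredictable to an attacker
    return " ".join(secrets.choice(WORDS) for _ in range(n_words))
```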
  • Patent number: 8977555
    Abstract: Features are disclosed for generating markers for elements or other portions of an audio presentation so that a speech processing system may determine which portion of the audio presentation a user utterance refers to. For example, an utterance may include a pronoun with no explicit antecedent. The marker may be used to associate the utterance with the corresponding content portion for processing. The markers can be provided to a client device with a text-to-speech (“TTS”) presentation. The markers may then be provided to a speech processing system along with a user utterance captured by the client device. The speech processing system, which may include automatic speech recognition (“ASR”) modules and/or natural language understanding (“NLU”) modules, can generate hints based on the marker. The hints can be provided to the ASR and/or NLU modules in order to aid in processing the meaning or intent of a user utterance.
    Type: Grant
    Filed: December 20, 2012
    Date of Patent: March 10, 2015
    Assignee: Amazon Technologies, Inc.
    Inventors: Fred Torok, Frédéric Johan Georges Deramat, Vikram Kumar Gundeti
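    The marker flow above (markers delivered with the TTS presentation, then returned with the captured utterance as a hint) might be sketched as follows; the data shapes and resolution rule are hypothetical.

```python
# Illustrative: attach a marker to each portion of a TTS presentation, and
# use the marker returned with the utterance as a hint to resolve an
# anaphoric reference ("play that one"). Names are assumptions.

presentation = [
    {"marker": "item-1", "text": "First, Bohemian Rhapsody."},
    {"marker": "item-2", "text": "Second, Hotel California."},
]

def resolve_reference(utterance, active_marker):
    # The NLU side uses the marker as a hint when the utterance contains
    # a pronoun with no explicit antecedent.
    if "that one" in utterance:
        for item in presentation:
            if item["marker"] == active_marker:
                return item["text"]
    return None
```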
  • Patent number: 8976943
    Abstract: Provided is a method and a telephone-based system with voice-verification capabilities that enable a user to safely and securely conduct transactions with his or her online financial transaction program account over the phone in a convenient and user-friendly fashion, without having to depend on an internet connection.
    Type: Grant
    Filed: September 25, 2012
    Date of Patent: March 10, 2015
    Assignee: Ebay Inc.
    Inventor: Will Tonini
  • Patent number: 8977547
    Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold T1; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: March 10, 2015
    Assignee: Mitsubishi Electric Corporation
    Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
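    The stability check above, accept enrollment only when repeated utterances are similar enough, can be sketched with a generic similarity measure. Cosine similarity over feature vectors stands in for whatever measure the patent actually uses, and the threshold is illustrative.

```python
# Sketch of the stability check: compare feature vectors from repeated
# utterances and accept enrollment only when every pairwise similarity
# clears a threshold. Cosine similarity is a stand-in measure.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def registration_acceptable(utterances, threshold=0.9):
    # All pairs of repetitions must be mutually similar
    for i in range(len(utterances)):
        for j in range(i + 1, len(utterances)):
            if cosine(utterances[i], utterances[j]) <= threshold:
                return False
    return True
```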
  • Patent number: 8977549
    Abstract: A natural language business system and method is developed to understand the underlying meaning of a person's speech, such as during a transaction with the business system. The system includes a speech recognition engine, an action classification engine, and a control module. The control module causes the system to execute an inventive method wherein the speech recognition and action classification models may be recursively optimized on an unisolated performance metric that is pertinent to the overall performance of the natural language business system, as opposed to the isolated model-specific criteria previously employed.
    Type: Grant
    Filed: September 26, 2013
    Date of Patent: March 10, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Sabine V. Deligne, Yuqing Gao, Vaibhava Goel, Hong-Kwang Kuo, Cheng Wu
  • Patent number: 8976906
    Abstract: A multi-carrier signal is typically comprised of many equidistant sub-carriers. This results in periodicity of spectrum within the bandwidth of such a multi-carrier signal. An unknown multi-carrier signal with equidistant sub-carriers can thus be sensed together with its sub-carrier spacing by finding a discernible local maximum in the cepstrum (Fourier transform of the log spectrum) of the multi-carrier signal.
    Type: Grant
    Filed: March 29, 2012
    Date of Patent: March 10, 2015
    Assignee: QRC, Inc.
    Inventors: Sinisa Peric, Thomas F. Callahan, III
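    The cepstral detection above can be demonstrated directly: a periodic log spectrum produces a cepstral peak whose quefrency is the inverse of the sub-carrier spacing. The sketch below is a minimal illustration on a synthetic signal, not the patented detector; the low-quefrency cutoff and peak-selection rule are assumptions.

```python
# Sketch of the cepstral approach: the log spectrum of an equidistant
# multi-carrier signal is periodic, so its Fourier transform (the cepstrum)
# peaks at the quefrency 1/spacing. Cutoffs here are illustrative.
import numpy as np

def estimate_subcarrier_spacing(signal, fs, min_q=8):
    power = np.abs(np.fft.fft(signal)) ** 2
    # Cepstrum: inverse transform of the log spectrum
    cepstrum = np.abs(np.fft.ifft(np.log(power + 1e-12)))
    half = len(signal) // 2
    region = cepstrum[min_q:half]       # skip the spectral-envelope region
    peak = region.max()
    # The spacing's quefrency and its multiples peak together; take the
    # first quefrency that (nearly) reaches the global maximum.
    q = min_q + int(np.argmax(region >= 0.99 * peak))
    # Quefrency index q corresponds to q/fs seconds -> spacing fs/q Hz
    return fs / q

# Synthetic multi-carrier signal: 20 sub-carriers spaced 100 Hz apart
fs, n, spacing = 8000, 8000, 100.0
t = np.arange(n) / fs
sig = sum(np.cos(2 * np.pi * k * spacing * t) for k in range(1, 21))
```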
  • Publication number: 20150066509
    Abstract: In a method for encrypting and decrypting a document based on a voiceprint recognition technology on an electronic device, an encryption key is generated and stored in a storage device of the electronic device. A voiceprint is then verified to determine whether it matches a predefined voiceprint. If the voiceprint matches the predefined voiceprint, the encryption key is obtained from the storage device to encrypt a document. When the encrypted document is decrypted, a decryption key is generated to decrypt the encrypted document.
    Type: Application
    Filed: October 22, 2013
    Publication date: March 5, 2015
    Applicants: HON HAI PRECISION INDUSTRY CO., LTD., HONG FU JIN PRECISION INDUSTRY (WUHAN) CO., LTD.
    Inventors: SHI-CHAO WANG, WEN-TING PENG, JIAN LI, YI-HUNG PENG
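    The gating logic above, release the stored key only when the presented voiceprint matches the enrolled one, might be sketched as follows. This is a toy: the distance-based matcher and the XOR keystream are illustrative stand-ins, not a real voiceprint system or cipher.

```python
# Toy sketch of the gating logic (NOT a real cipher or voiceprint matcher):
# the stored key is used only when the presented voiceprint is close enough
# to the enrolled one. All thresholds and functions are illustrative.
import hashlib

def voiceprint_matches(presented, enrolled, tol=0.1):
    dist = max(abs(a - b) for a, b in zip(presented, enrolled))
    return dist <= tol

def xor_stream(data, key):
    # Symmetric toy transform: applying it twice restores the input
    stream = hashlib.sha256(key).digest()
    while len(stream) < len(data):
        stream += hashlib.sha256(stream).digest()
    return bytes(d ^ s for d, s in zip(data, stream))

def encrypt_if_verified(document, presented, enrolled, key):
    if not voiceprint_matches(presented, enrolled):
        return None
    return xor_stream(document, key)
```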
  • Publication number: 20150066508
    Abstract: An apparatus, system, and computer readable media for data pre-processing and processing for voice recognition are described herein. The apparatus includes logic to pre-process multi-channel audio data and logic to resolve a source location. The apparatus also includes logic to perform wide range adaptive beam forming, and logic to perform full voice recognition.
    Type: Application
    Filed: August 30, 2013
    Publication date: March 5, 2015
    Inventor: Gangatharan Jothiswaran
  • Publication number: 20150064666
    Abstract: The present disclosure relates to a control terminal, comprising: a data communication unit for receiving a first user voice by data communication with a first audio device and receiving a second user voice by data communication with a second audio device; a turn information generating unit for generating turn information, which is voice unit information, by using the first and second user voices; and a metalanguage processing unit for determining a conversation pattern of the first and second users by using the turn information, and outputting a reminder message corresponding to a reminder event to the first user when the conversation pattern corresponds to a preset reminder event occurrence condition.
    Type: Application
    Filed: October 7, 2013
    Publication date: March 5, 2015
    Applicant: KOREA ADVANCED INSTITUTE OF SCIENCE AND TECHNOLOGY
    Inventors: June Hwa Song, In Seok Hwang, Chung Kuk Yoo, Chan You Hwang, Young Ki Lee, John Dong Jun Kim, Dong Sun Jennifer Yim, Chul Hong Min
  • Patent number: 8972259
    Abstract: A method and system for teaching non-lexical speech effects includes delexicalizing a first speech segment to provide a first prosodic speech signal, and data indicative of the first prosodic speech signal is stored in a computer memory. The first speech segment is audibly played to a language student and the student is prompted to recite the speech segment. The speech uttered by the student in response to the prompt is recorded.
    Type: Grant
    Filed: September 9, 2010
    Date of Patent: March 3, 2015
    Assignee: Rosetta Stone, Ltd.
    Inventors: Joseph Tepperman, Theban Stanley, Kadri Hacioglu
  • Patent number: 8972265
    Abstract: A content customization service is disclosed. The content customization service may identify one or more speakers in an item of content, and map one or more portions of the item of content to a speaker. A speaker may also be mapped to a voice. In one embodiment, the content customization service obtains portions of audio content synchronized to the mapped portions of the item of content. Each portion of audio content may be associated with a voice to which the speaker of the portion of the item of content is mapped. These portions of audio content may be combined to produce a combined item of audio content with multiple voices.
    Type: Grant
    Filed: June 18, 2012
    Date of Patent: March 3, 2015
    Assignee: Audible, Inc.
    Inventor: Kevin S. Lester
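    The mapping chain above (portions of content mapped to speakers, speakers mapped to voices, portions combined in order) might be sketched like this; the rendering stand-in and data shapes are hypothetical.

```python
# Sketch of the mapping chain: portions of a book map to speakers, speakers
# map to voices, and the per-portion audio is combined in order. The render
# function stands in for an audio lookup keyed by (text, voice).

portions = [
    {"text": "Call me Ishmael.", "speaker": "narrator"},
    {"text": "'Ahoy!'", "speaker": "captain"},
]
voice_for_speaker = {"narrator": "voice_a", "captain": "voice_b"}

def render(portion):
    voice = voice_for_speaker[portion["speaker"]]
    return f"[{voice}] {portion['text']}"

combined = " ".join(render(p) for p in portions)
```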
  • Patent number: 8972258
    Abstract: Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
    Type: Grant
    Filed: May 22, 2014
    Date of Patent: March 3, 2015
    Assignee: Nuance Communications, Inc.
    Inventors: Vaibhava Goel, Peder A. Olsen, Steven J. Rennie, Jing Huang
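    The storage-saving idea above, keeping only a small fraction of adapted parameters as a delta against the baseline model, can be sketched without the MAP math. The keep-fraction and the magnitude-based selection rule are illustrative assumptions, not Nuance's actual criterion.

```python
# Sketch of the sparseness idea (not the patented MAP adaptation): store
# only the largest-magnitude per-parameter changes from adaptation as a
# small delta instead of a full acoustic model.
import numpy as np

def sparse_delta(baseline, adapted, keep_fraction=0.1):
    delta = adapted - baseline
    k = max(1, int(keep_fraction * delta.size))
    # Indices of the k largest-magnitude changes
    idx = np.argsort(np.abs(delta))[-k:]
    sparse = np.zeros_like(delta)
    sparse[idx] = delta[idx]
    return sparse

def apply_delta(baseline, sparse):
    return baseline + sparse
```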
  • Patent number: 8972260
    Abstract: In accordance with one embodiment, a method of generating language models for speech recognition includes identifying a plurality of utterances in training data corresponding to speech, generating a frequency count of each utterance in the plurality of utterances, generating a high-frequency plurality of utterances from the plurality of utterances having a frequency that exceeds a predetermined frequency threshold, generating a low-frequency plurality of utterances from the plurality of utterances having a frequency that is below the predetermined frequency threshold, generating a grammar-based language model using the high-frequency plurality of utterances as training data, and generating a statistical language model using the low-frequency plurality of utterances as training data.
    Type: Grant
    Filed: April 19, 2012
    Date of Patent: March 3, 2015
    Assignee: Robert Bosch GmbH
    Inventors: Fuliang Weng, Zhe Feng, Kui Xu, Lin Zhao
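    The frequency-based split above routes training utterances to one of two language models, which can be sketched directly. The threshold value is illustrative; following the abstract, counts exceeding it go to the grammar-based model and the rest to the statistical model.

```python
# Sketch of the split: count utterances, route frequent ones to the
# grammar-based model's training set and the rest to the statistical
# model's. The threshold is illustrative.
from collections import Counter

def split_training_data(utterances, threshold=3):
    counts = Counter(utterances)
    high = [u for u, c in counts.items() if c > threshold]   # grammar-based
    low = [u for u, c in counts.items() if c <= threshold]   # statistical
    return high, low
```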
  • Patent number: 8972855
    Abstract: A method and apparatus for providing case restoration in a communication network are disclosed. For example, the method obtains one or more content sources from one or more information feeds, and extracts textual information from the one or more content sources obtained from the one or more information feeds. The method then creates or updates a capitalization model based on the textual information.
    Type: Grant
    Filed: December 16, 2008
    Date of Patent: March 3, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Zhu Liu, David Gibbon, Behzad Shahraray
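    A capitalization model of the kind described above can be sketched as simple truecasing: learn each word's most frequent surface form from feed text, then restore case in lowercased input. The unigram model below is a minimal stand-in for whatever model the patent builds.

```python
# Sketch of a capitalization (truecasing) model: learn each word's most
# frequent surface form from feed text, then restore case in lowercased
# input. A unigram model is an illustrative simplification.
from collections import Counter, defaultdict

def train_case_model(sentences):
    forms = defaultdict(Counter)
    for sentence in sentences:
        for token in sentence.split():
            forms[token.lower()][token] += 1
    # Map each lowercase word to its most frequent observed casing
    return {w: c.most_common(1)[0][0] for w, c in forms.items()}

def restore_case(text, model):
    return " ".join(model.get(t, t) for t in text.split())

feed = [
    "AT&T reported earnings in New York",
    "New York traffic was heavy",
]
model = train_case_model(feed)
```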