Specialized Models Patents (Class 704/250)
  • Patent number: 10692503
    Abstract: A voice data processing method and apparatus are provided. The method includes obtaining an I-Vector vector of each of voice samples, and determining a target seed sample in the voice samples. A first cosine distance is calculated between an I-Vector vector of the target seed sample and an I-Vector vector of a target remaining voice sample, where the target remaining voice sample is a voice sample other than the target seed sample in the voice samples. A target voice sample is filtered from the voice samples or the target remaining voice sample according to the first cosine distance, to obtain a target voice sample whose first cosine distance is greater than a first threshold.
    Type: Grant
    Filed: March 3, 2017
    Date of Patent: June 23, 2020
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Xingming Jin, Wei Li, Fangmai Zheng, Fuzhang Wu, Bilei Zhu, Binghua Qian, Ke Li, Yongjian Wu, Feiyue Huang
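The filtering step this abstract describes can be sketched in a few lines. This is a minimal illustration, not the patented method: i-vectors are represented as flat lists, the function names are invented for this sketch, and the abstract's "cosine distance" is computed here as a cosine similarity score (larger means closer), which matches the "greater than a threshold" selection rule.

```python
import math

def cosine_score(a, b):
    """Cosine similarity between two i-vectors (higher = more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def filter_samples(seed_ivector, remaining_ivectors, threshold):
    """Keep only the remaining samples whose score against the
    target seed sample exceeds the first threshold."""
    return [iv for iv in remaining_ivectors
            if cosine_score(seed_ivector, iv) > threshold]
```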
  • Patent number: 10685657
    Abstract: Systems and methods for analyzing digital recordings of the human voice in order to find characteristics unique to an individual. A biometrics engine may use an analytics service in a contact center to supply audio streams based on configured rules and providers for biometric detection. The analytics service may provide call audio data and attributes to connected engines based on a provider-set of selection rules. The connected providers send call audio data and attributes through the analytics service. The engines are notified when a new call is available for processing and can then retrieve chunks of audio data and call attributes by polling an analytics service interface. A mathematical model of the human vocal tract in the call audio data is created and/or matched against existing models. The result is analogous to a fingerprint, i.e., a pattern unique to an individual to within some level of probability.
    Type: Grant
    Filed: February 15, 2019
    Date of Patent: June 16, 2020
    Assignee: VERINT AMERICAS INC
    Inventors: Jamie Richard Williams, Robert John Barnes, Ian Linsdell, Scott M. Bluman
  • Patent number: 10685666
    Abstract: A mechanism is described for facilitating automatic gain adjustment in audio systems according to one embodiment. A method of embodiments, as described herein, includes determining status of one or more of gain settings, mute settings, and boost settings associated with one or more microphones based on a configuration of a computing device including a voice-enabled device. The method may further comprise recommending adjustment of microphone gain based on the configuration and the status of one or more of the gain, mute, and boost settings, and applying the recommended adjustment of the microphone gain.
    Type: Grant
    Filed: April 6, 2018
    Date of Patent: June 16, 2020
    Assignee: INTEL CORPORATION
    Inventors: Przemyslaw Maziewski, Adam Kupryjanow, Lukasz Kurylo, Pawel Trella
  • Patent number: 10666594
    Abstract: Non-limiting examples of the present disclosure describe proactive action by an intelligent personal assistant application/service to improve functionality of one or more applications. In one example, an intelligent personal assistant service may interface with a messaging application to analyze a message thread within the messaging application. The intelligent personal assistant service may analyze the message thread by evaluating context of message content within the message thread. Analysis of the message thread may occur proactively without requiring an explicit request for assistance from a user of a processing device. In response to the analyzing of the message thread, the intelligent personal assistant service may proactively provide a cue that includes content retrieved by the intelligent personal assistant service. An input may be received to include the cue within the message thread. In response to receiving the input, the cue may be displayed within the message thread.
    Type: Grant
    Filed: November 13, 2018
    Date of Patent: May 26, 2020
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zachary Adam Pfriem, Mark Jozef Vitazko, Jared Frew, Jason Michael Nelson
  • Patent number: 10650055
    Abstract: A wearable sound capturing and retrieval system that includes a wearable sound capturing device that comprises a data collection device including at least one microphone configured for capturing sound data adjacent a user in at least a substantially continuous manner. The system may, for example: (1) store the captured sound data; (2) convert the captured sound data to captured textual data; (3) index data selected from: one or more segments of captured sound data and one or more segments of captured textual data; and (4) facilitate retrieval of at least a portion of the indexed data, wherein facilitating the retrieval includes (i) scanning the indexed data to identify one or more key phrases, (ii) retrieving one or more segments of indexed data that was communicated by the user at least about contemporaneously with the one or more key phrases, and (iii) saving the one or more segments of indexed data.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: May 12, 2020
    Assignee: Viesoft, Inc.
    Inventor: Anthony Vierra
  • Patent number: 10621985
    Abstract: A voice recognition device for a vehicle includes: an input device receiving a command and a negative interjection uttered by a user, converting the command into a first voice signal, and converting the negative interjection into a second voice signal; a storage device storing a negative context, an interjection context, and an acoustic model; and a control device receiving the first voice signal, detecting a first start point and a first end point of the first voice signal, receiving the second voice signal after the detection of the first start point and the first end point of the first voice signal, detecting a second start point and a second end point of the second voice signal, and recognizing the second voice signal based on at least one of the negative context, the interjection context, and the acoustic model when the reception of the first voice signal and the second voice signal is completed.
    Type: Grant
    Filed: May 29, 2018
    Date of Patent: April 14, 2020
    Assignees: Hyundai Motor Company, Kia Motors Corporation
    Inventor: Jae Min Joh
  • Patent number: 10565080
    Abstract: A method for monitoring a condition of a system or process includes acquiring sensor data from a plurality of sensors disposed within the system (S41 and S44). The acquired sensor data is streamed in real-time to a computer system (S42 and S44). A discriminative framework is applied to the streaming sensor data using the computer system (S43 and S45). The discriminative framework provides a probability value representing a probability that the sensor data is indicative of an anomaly within the system. The discriminative framework is an integration of a Kalman filter with a logistical function (S41).
    Type: Grant
    Filed: June 11, 2013
    Date of Patent: February 18, 2020
    Assignee: SIEMENS AKTIENGESELLSCHAFT
    Inventors: Chao Yuan, Amit Chakraborty, Holger Hackstein, Leif Wiebking
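The "Kalman filter integrated with a logistic function" idea can be sketched for a single scalar sensor: the filter tracks the expected reading, and the normalized innovation (how surprising each new reading is) is squashed through a logistic function into an anomaly probability. The parameter values and the exact way the innovation feeds the logistic function are assumptions for illustration, not the patented discriminative framework.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def anomaly_probabilities(readings, q=1e-3, r=0.1, scale=5.0):
    """Scalar Kalman filter over a stream of readings; each reading's
    normalized innovation is mapped to a probability in (0, 1)."""
    x, p = readings[0], 1.0  # state estimate and its variance
    probs = []
    for z in readings:
        p += q                      # predict step (random-walk model)
        y = z - x                   # innovation
        s = p + r                   # innovation variance
        # large surprises push the logistic toward 1, small ones toward 0
        probs.append(logistic(scale * (abs(y) / math.sqrt(s)) - scale))
        k = p / s                   # Kalman gain
        x += k * y                  # update state
        p *= (1.0 - k)              # update variance
    return probs
```

On a steady signal the probabilities stay near zero; a sudden spike drives the last probability toward one.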
  • Patent number: 10515655
    Abstract: Techniques for selecting an emotion type code associated with semantic content in an interactive dialog system. In an aspect, fact or profile inputs are provided to an emotion classification algorithm, which selects an emotion type based on the specific combination of fact or profile inputs. The emotion classification algorithm may be rules-based or derived from machine learning. A previous user input may be further specified as input to the emotion classification algorithm. The techniques are especially applicable in mobile communications devices such as smartphones, wherein the fact or profile inputs may be derived from usage of the diverse function set of the device, including online access, text or voice communications, scheduling functions, etc.
    Type: Grant
    Filed: September 4, 2017
    Date of Patent: December 24, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Edward Un, Max Leung
  • Patent number: 10511585
    Abstract: Techniques are provided for smoothing discretized values used, for example, for authentication or identity assurance. An illustrative method comprises obtaining at least one probability of transitioning between at least two discretized values of a given feature; computing a smoothed feature score for the given feature for a transition from a first one of the discretized values to a second one of the discretized values based on the probability of the transition from the first discrete value to the second discrete value; and performing one or more of authenticating a user and verifying an identity of a user based at least in part on the smoothed feature score. The probabilities of transitioning between the discretized values are optionally stored in a transition matrix. Feature scores for first and second discretized values are optionally weighted based on the probability of the transition from the first discrete value to the second discrete value.
    Type: Grant
    Filed: April 27, 2017
    Date of Patent: December 17, 2019
    Assignee: EMC IP Holding Company LLC
    Inventors: Shay Amram, Carmit Sahar, Anatoly Gendelev, Idan Achituve
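The smoothing the abstract describes, weighting feature scores by the probability of transitioning between discretized values, can be sketched as a simple blend. The dictionary-based transition matrix and the specific blending rule are assumptions for illustration.

```python
def smoothed_score(scores, transitions, prev_value, cur_value):
    """Blend the feature scores of two discretized values using the
    probability of transitioning from prev_value to cur_value.
    A likely transition leans toward the new value's score; an
    unlikely one retains more of the previous value's score."""
    p = transitions.get((prev_value, cur_value), 0.0)
    return p * scores[cur_value] + (1.0 - p) * scores[prev_value]
```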
  • Patent number: 10475441
    Abstract: A voice end-point detection device, a system and a method are provided. The voice end-point detection system includes a processor that is configured to determine an end-point detection time, used to detect the end-point of a user's speech, that varies for each user and for each domain. The voice end-point detection system is configured to perform voice recognition, and a database (DB) is configured to store data for the voice recognition by the processor.
    Type: Grant
    Filed: November 13, 2017
    Date of Patent: November 12, 2019
    Assignees: Hyundai Motor Company, Kia Motors Corporation
    Inventors: Kyung Chul Lee, Jae Min Joh
  • Patent number: 10474946
    Abstract: Methods, systems, apparatuses, and computer program products are provided for altering the behavior of an electronic personal assistant based on a situation associated with a mobile device. A situation is sensed with a plurality of sensors to generate sensor data. A situation score is calculated based on the sensor data. Behavior of an electronic personal assistant is altered based on the calculated situation score. In one aspect, the situation is a driving situation in which a driver drives a vehicle on a roadway. In such case, a driving situation score is calculated based on the sensor data, and behavior of the electronic personal assistant is altered based on the calculated driving situation score, such as suspending interactions by the electronic personal assistant with the driver to avoid the driver being distracted.
    Type: Grant
    Filed: June 24, 2016
    Date of Patent: November 12, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Tae Hong Joo, Tarek Z. Elabbady, Mona Soliman Habib
  • Patent number: 10366693
    Abstract: Disclosed herein are methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models, wherein the first-pass blind diarization is on a per-frame basis and the second-pass blind diarization is on a per-word basis, and methods of creating acoustic signatures for a common speaker based only on the statistical models of the speakers in each audio session.
    Type: Grant
    Filed: January 22, 2018
    Date of Patent: July 30, 2019
    Assignee: Verint Systems LTD.
    Inventors: Alex Gorodetski, Ido Shapira, Ron Wein, Oana Sidi
  • Patent number: 10304460
    Abstract: According to an embodiment, a conference support system includes a recognizer, a classifier, a first caption controller, a second caption controller, and a display controller. The recognizer is configured to recognize text data corresponding to speech from a speech section and configured to distinguish between the speech section and a non-speech section in speech data. The classifier is configured to classify the text data into first utterance data representing a principal utterance and second utterance data representing another utterance. The first caption controller is configured to generate first caption data for displaying the first utterance data without waiting for identification of the first utterance data to finish. The second caption controller is configured to generate second caption data for displaying the second utterance data after identification of the second utterance data finishes. The display controller is configured to control a display of the first caption data and the second caption data.
    Type: Grant
    Filed: February 23, 2017
    Date of Patent: May 28, 2019
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Taira Ashikawa, Kosei Fume, Masayuki Ashikawa, Hiroshi Fujimura
  • Patent number: 10304445
    Abstract: A wearable utterance training system includes a wearable utterance training device. The system may, for example: (1) receive one or more target utterances from the user; (2) detect a use of one of the one or more target utterances by the user; and (3) in response, provide one or more responsive effects. The one or more responsive effects may include, for example: (1) providing one or more shocks to the user using the wearable utterance training device; (2) initiating a transfer of money between an account associated with the user and a third party account; (3) creating a public disclosure of the utterance (e.g., by posting the disclosure on one or more social media websites) and/or (4) playing a recording of the user's use of the target utterance or other sound.
    Type: Grant
    Filed: October 13, 2016
    Date of Patent: May 28, 2019
    Assignee: Viesoft, Inc.
    Inventor: Anthony Vierra
  • Patent number: 10276149
    Abstract: Systems, methods, and devices for dynamically outputting TTS content are disclosed. A speech-controlled device captures a spoken command, and sends audio data corresponding thereto to a server(s). The server(s) determines output content responsive to the spoken command. The server(s) may also determine a user that spoke the command and determine an average speech characteristic (e.g., tone, pitch, speed, number of words, etc.) used by the user when speaking commands. The server(s) may also determine a speech characteristic of the presently spoken command, as well as determine a difference between the speech characteristic of the presently spoken command and the average speech characteristic of the user. The server(s) may then cause the speech-controlled device to output audio based on the difference.
    Type: Grant
    Filed: December 21, 2016
    Date of Patent: April 30, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Nancy Yi Liang, Aaron Takayanagi Barnet
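The core comparison here, the difference between the present command's speech characteristics and the user's averages, is straightforward to sketch. The dictionary representation of characteristics (pitch, speed, word count, etc.) is an assumption for illustration.

```python
def characteristic_difference(history, current):
    """Difference between the present utterance's characteristics and
    the user's average over past utterances, per characteristic."""
    diff = {}
    for key, value in current.items():
        values = [h[key] for h in history]
        diff[key] = value - sum(values) / len(values)
    return diff
```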
  • Patent number: 10269345
    Abstract: This relates to systems and processes for operating an automated assistant to process messages. In one example process, an electronic device receives a communication including a text string and determines whether a portion of the text string is associated with a data type of a plurality of data types. The data type is associated with at least one task. In accordance with a determination that the portion of the text string is associated with the data type, the electronic device receives a user input indicative of a task of the at least one task, and in response, causes the task to be performed based on the portion of the text string. In accordance with a determination that the portion of the text string is not associated with the data type, the electronic device foregoes causing the task to be performed based on the portion of the text string.
    Type: Grant
    Filed: September 19, 2016
    Date of Patent: April 23, 2019
    Assignee: Apple Inc.
    Inventors: Jose A. Castillo Sanchez, Garett R. Nell, Kimberly D. Beverett
  • Patent number: 10249314
    Abstract: A voice conversion system for generating realistic, natural-sounding target speech is disclosed. The voice conversion system preferably comprises a neural network for converting the source speech data to estimated target speech data; a global variance correction module; a modulation spectrum correction module; and a waveform generator. The global variance correction module is configured to scale and shift (or normalize and de-normalize) the estimated target speech based on (i) a mean and standard deviation of the source speech data, and further based on (ii) a mean and standard deviation of the estimated target speech data. The modulation spectrum correction module is configured to apply a plurality of filters to the estimated target speech data after it has been scaled and shifted by the global variance correction module. Each filter is designed to correct the trajectory representing the curve of one MCEP coefficient over time.
    Type: Grant
    Filed: July 21, 2017
    Date of Patent: April 2, 2019
    Assignee: OBEN, INC.
    Inventor: Sandesh Aryal
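The scale-and-shift (normalize and de-normalize) step of the global variance correction module can be sketched as follows: normalize the estimated trajectory by its own mean and standard deviation, then de-normalize with reference statistics so its dynamic range matches. The function names and the single-trajectory framing are assumptions for illustration; the patent applies this per MCEP coefficient.

```python
import math

def mean_std(xs):
    """Population mean and standard deviation of a sequence."""
    m = sum(xs) / len(xs)
    var = sum((x - m) ** 2 for x in xs) / len(xs)
    return m, math.sqrt(var)

def variance_correct(estimated, reference_mean, reference_std):
    """Normalize the estimated trajectory by its own statistics, then
    de-normalize with the reference statistics."""
    m, s = mean_std(estimated)
    return [(x - m) / s * reference_std + reference_mean
            for x in estimated]
```

After correction the output trajectory has exactly the reference mean and standard deviation, counteracting the variance shrinkage typical of neural conversion.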
  • Patent number: 10176163
    Abstract: Embodiments herein include a natural language computing system that provides a diagnosis for a participant in the conversation which indicates the likelihood that the participant exhibited a symptom of autism. To provide the diagnosis, the computing system includes a diagnosis system that performs a training process to generate a machine learning model which is then used to evaluate a textual representation of the conversation. For example, the diagnosis system may receive one or more examples of baseline conversations that exhibit symptoms of autism and those that do not. The diagnosis system may annotate the baseline conversations and identify features that are used to identify the symptoms of autism. The system generates a machine learning model that weights the features according to whether the identified features are, or are not, an indicator of autism.
    Type: Grant
    Filed: December 19, 2014
    Date of Patent: January 8, 2019
    Assignee: International Business Machines Corporation
    Inventors: Adam T. Clark, Brian J. Cragun, Anthony W. Eichenlaub, John E. Petri, John C. Unterholzner
  • Patent number: 10158593
    Abstract: Non-limiting examples of the present disclosure describe proactive action by an intelligent personal assistant application/service to improve functionality of one or more applications. In one example, an intelligent personal assistant service may interface with a messaging application to analyze a message thread within the messaging application. The intelligent personal assistant service may analyze the message thread by evaluating context of message content within the message thread. Analysis of the message thread may occur proactively without requiring an explicit request for assistance from a user of a processing device. In response to the analyzing of the message thread, the intelligent personal assistant service may proactively provide a cue that includes content retrieved by the intelligent personal assistant service. An input may be received to include the cue within the message thread. In response to receiving the input, the cue may be displayed within the message thread.
    Type: Grant
    Filed: April 8, 2016
    Date of Patent: December 18, 2018
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Zachary Adam Pfriem, Mark Jozef Vitazko, Jared Frew, Jason Michael Nelson
  • Patent number: 10102760
    Abstract: This disclosure is directed to a system that includes a receiver configured to receive audio data from a vehicle. In some examples, the system includes processing circuitry configured to determine an expected maneuver for the vehicle based on the audio data. In some examples, the processing circuitry is further configured to determine whether to output an alert based on the expected maneuver determined from the audio data.
    Type: Grant
    Filed: August 23, 2017
    Date of Patent: October 16, 2018
    Assignee: Honeywell International Inc.
    Inventors: Stanislav Foltan, Robert Sosovicka, Eva Josth Adamova
  • Patent number: 10048079
    Abstract: A destination determination device for a vehicle includes: a communication unit that performs a wireless communication with a mobile terminal used by an occupant in the vehicle and having a destination search function; a search condition acquisition unit that acquires destination search conditions from the mobile terminal through the communication unit; a search unit that performs a destination search based on an AND search of a combined search condition in which a plurality of acquired destination search conditions are combined together when the search condition acquisition unit acquires the plurality of destination search conditions; and a search result output unit that outputs a search result of the destination search performed by the search unit under the combined search condition.
    Type: Grant
    Filed: June 2, 2015
    Date of Patent: August 14, 2018
    Assignee: DENSO CORPORATION
    Inventors: Takamitsu Suzuki, Takahira Katoh, Takeshi Yamamoto, Yuuko Nakamura
  • Patent number: 9967724
    Abstract: A method and apparatus for changing a persona of a digital assistant is provided herein. During operation a digital assistant will determine a public-safety incident type and then change its persona based on the public-safety incident type.
    Type: Grant
    Filed: May 8, 2017
    Date of Patent: May 8, 2018
    Assignee: MOTOROLA SOLUTIONS, INC.
    Inventors: Guo Dong Gan, Kong Yong Foo, Mun Yew Tham, Bing Qin Lim
  • Patent number: 9916830
    Abstract: Techniques for altering audio being output by a voice-controlled device, or another device, to enable more accurate automatic speech recognition (ASR) by the voice-controlled device. For instance, a voice-controlled device may output audio within an environment using a speaker of the device. While outputting the audio, a microphone of the device may capture sound within the environment and may generate an audio signal based on the captured sound. The device may then analyze the audio signal to identify speech of a user within the signal, with the speech indicating that the user is going to provide a subsequent command to the device. Thereafter, the device may alter the output of the audio (e.g., attenuate the audio, pause the audio, switch from stereo to mono, etc.) to facilitate speech recognition of the user's subsequent command.
    Type: Grant
    Filed: January 13, 2016
    Date of Patent: March 13, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Gregory Michael Hart, William Spencer Worley, III
  • Patent number: 9711145
    Abstract: The subject matter of this specification can be implemented in, among other things, a computer-implemented method for correcting words in transcribed text including receiving speech audio data from a microphone. The method further includes sending the speech audio data to a transcription system. The method further includes receiving a word lattice transcribed from the speech audio data by the transcription system. The method further includes presenting one or more transcribed words from the word lattice. The method further includes receiving a user selection of at least one of the presented transcribed words. The method further includes presenting one or more alternate words from the word lattice for the selected transcribed word. The method further includes receiving a user selection of at least one of the alternate words. The method further includes replacing the selected transcribed word in the presented transcribed words with the selected alternate word.
    Type: Grant
    Filed: November 14, 2016
    Date of Patent: July 18, 2017
    Assignee: Google Inc.
    Inventors: Michael J. LeBeau, William J. Byrne, John Nicholas Jitkoff, Brandon M. Ballinger, Trausti T. Kristjansson
  • Patent number: 9679569
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a dynamic threshold for speaker verification are disclosed. In one aspect, a method includes the actions of receiving, for each of multiple utterances of a hotword, a data set including at least a speaker verification confidence score, and environmental context data. The actions further include selecting from among the data sets, a subset of the data sets that are associated with a particular environmental context. The actions further include selecting a particular data set from among the subset of data sets based on one or more selection criteria. The actions further include selecting, as a speaker verification threshold for the particular environmental context, the speaker verification confidence score. The actions further include providing the speaker verification threshold for use in performing speaker verification of utterances that are associated with the particular environmental context.
    Type: Grant
    Filed: November 3, 2016
    Date of Patent: June 13, 2017
    Assignee: Google Inc.
    Inventors: Jakob Nicolaus Foerster, Diego Melendo Casado
  • Patent number: 9495127
    Abstract: Methods, computer program products and systems are described for converting speech to text. Sound information is received at a computer server system from an electronic device, where the sound information is from a user of the electronic device. A context identifier indicates a context within which the user provided the sound information. The context identifier is used to select, from among multiple language models, a language model appropriate for the context. Speech in the sound information is converted to text using the selected language model. The text is provided for use by the electronic device.
    Type: Grant
    Filed: December 22, 2010
    Date of Patent: November 15, 2016
    Assignee: Google Inc.
    Inventors: Brandon M. Ballinger, Johan Schalkwyk, Michael H. Cohen, Cyril Georges Luc Allauzen
  • Patent number: 9465794
    Abstract: Disclosed is a mobile terminal and control method thereof for inputting a voice to automatically generate a message to be sent during conversation using a mobile messenger. It may include a microphone for inputting a user's voice; a display unit for displaying a mobile messenger; and a controller for inputting and recognizing a user's voice when a mobile messenger is implemented and then converting it into a message to display the message on a message input window of the mobile messenger, sending the displayed message to the other party which has been preset, and displaying the message sent to the other party and a message received from the other party in the sending and receiving order on a send/receive display window of the mobile messenger.
    Type: Grant
    Filed: May 17, 2010
    Date of Patent: October 11, 2016
    Assignee: LG ELECTRONICS INC.
    Inventors: Sun-Hwa Cha, Jong-Keun Youn
  • Patent number: 9460722
    Abstract: In a method of diarization of audio data, audio data is segmented into a plurality of utterances. Each utterance is represented as an utterance model representative of a plurality of feature vectors. The utterance models are clustered. A plurality of speaker models are constructed from the clustered utterance models. A hidden Markov model is constructed of the plurality of speaker models. A sequence of identified speaker models is decoded.
    Type: Grant
    Filed: June 30, 2014
    Date of Patent: October 4, 2016
    Assignee: Verint Systems Ltd.
    Inventors: Oana Sidi, Ron Wein
  • Patent number: 9412392
    Abstract: An electronic device may capture a voice command from a user. The electronic device may store contextual information about the state of the electronic device when the voice command is received. The electronic device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server. The computing equipment may perform a speech recognition operation on the voice command and may process the contextual information. The computing equipment may respond to the voice command. The computing equipment may also transmit information to the electronic device that allows the electronic device to respond to the voice command.
    Type: Grant
    Filed: January 27, 2014
    Date of Patent: August 9, 2016
    Assignee: Apple Inc.
    Inventor: Aram M. Lindahl
  • Patent number: 9378729
    Abstract: Features are disclosed for applying maximum likelihood methods to channel normalization in automatic speech recognition (“ASR”). Feature vectors computed from an audio input of a user utterance can be compared to a Gaussian mixture model. The Gaussian that corresponds to each feature vector can be determined, and statistics (e.g., constrained maximum likelihood linear regression statistics) can then be accumulated for each feature vector. Using these statistics, or some subset thereof, offsets and/or a diagonal transform matrix can be computed for each feature vector. The offsets and/or diagonal transform matrix can be applied to the corresponding feature vector to generate a feature vector normalized based on maximum likelihood methods. The ASR process can then proceed using the transformed feature vectors.
    Type: Grant
    Filed: March 12, 2013
    Date of Patent: June 28, 2016
    Assignee: Amazon Technologies, Inc.
    Inventor: Stan Weidner Salvador
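The final transform the abstract describes, applying a per-dimension offset and a diagonal transform matrix to each feature vector, reduces to an elementwise scale and shift. This sketch shows only that application step, with illustrative names; estimating the offsets and diagonal from accumulated CMLLR statistics is the substantive part of the patent and is not shown.

```python
def apply_channel_transform(feature_vector, diag, offset):
    """Apply a diagonal transform (per-dimension scale) plus an offset
    to one feature vector: out[i] = diag[i] * x[i] + offset[i]."""
    return [d * x + b for x, d, b in zip(feature_vector, diag, offset)]
```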
  • Patent number: 9379884
    Abstract: A symbol clock recovery circuit comprising an ADC, a controllable inverter and a timing detector. A timing detector input terminal is configured to receive an ADC output signal from an ADC output terminal; a timing detector output terminal is configured to provide a digital output signal; and a first timing detector feedback terminal is configured to provide a first feedback signal to the inverter control terminal. The timing detector is configured to determine an error signal associated with the received ADC output signal, and set the first feedback signal in accordance with the error signal.
    Type: Grant
    Filed: May 1, 2015
    Date of Patent: June 28, 2016
    Assignee: NXP B.V.
    Inventors: Massimo Ciacci, Ghiath Al-kadi, Remco van de Beek
  • Patent number: 9223863
    Abstract: Disclosed are various systems, methods, and programs embodied in a computer-readable medium for sound analysis. The sound analysis involves transforming a sound print into a frequency domain in a memory to generate a frequency spectrum. A plurality of signatures are identified in the frequency spectrum. Also, a plurality of frequency ranges associated with the signatures are identified in the sound print. The frequencies associated with a physiological profile are cross-referenced with the frequency ranges to determine if the physiological profile is applicable to the sound print.
    Type: Grant
    Filed: December 5, 2012
    Date of Patent: December 29, 2015
    Assignee: Dean Enterprises, LLC
    Inventor: Vickie A. Dean
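The cross-referencing step, checking a physiological profile's frequencies against the frequency ranges detected in the sound print, can be sketched as an interval-membership test. The all-frequencies-must-match rule is an assumption for illustration; the patent leaves the matching criterion open.

```python
def profile_applies(signature_ranges, profile_frequencies):
    """Cross-reference a profile's frequencies (Hz) against detected
    signature ranges: here the profile applies only if every profile
    frequency falls inside some detected range."""
    return all(any(lo <= f <= hi for lo, hi in signature_ranges)
               for f in profile_frequencies)
```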
  • Patent number: 9122453
    Abstract: The disclosed embodiments illustrate methods and systems for processing one or more crowdsourced tasks. The method comprises converting an audio input received from a crowdworker to one or more phrases by one or more processors in at least one computing device. The audio input is at least a response to a crowdsourced task. A mode of the audio input is selected based on one or more parameters associated with the crowdworker. Thereafter, the one or more phrases are presented on a display of the at least one computing device by the one or more processors. Finally, one of the one or more phrases is selected by the crowdworker as a correct response to the crowdsourced task.
    Type: Grant
    Filed: July 16, 2013
    Date of Patent: September 1, 2015
    Assignee: Xerox Corporation
    Inventor: Shailesh Vaya
  • Patent number: 9075870
    Abstract: A system for detecting related topics and competition topics for a target topic includes an information extracting apparatus configured to generate topic templates and association words from documents created online. The system also includes a related topic detecting apparatus configured to detect and trace related topics and competition topics for the target topic based on the topic templates and the association words.
    Type: Grant
    Filed: September 12, 2012
    Date of Patent: July 7, 2015
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventor: Chung Hee Lee
  • Patent number: 9043207
    Abstract: The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result.
    Type: Grant
    Filed: November 12, 2009
    Date of Patent: May 26, 2015
    Assignee: Agnitio S.L.
    Inventors: Johan Nikolaas Langehoven Brummer, Luis Buera Rodriguez, Marta Garcia Gomar
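The comparison step described above (combine the extracted speaker information per class, then score it against the stored target) might look like the following sketch, using averaged embedding vectors and cosine scoring as a stand-in for whatever representation the patent actually uses; the names and the 0.8 threshold are assumptions:

```python
import numpy as np

def combine(class_embeddings):
    """Combine extracted speaker information for one speaker-dependent class."""
    return np.mean(class_embeddings, axis=0)

def cosine(a, b):
    """Cosine similarity between two speaker embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def matches_target(class_embeddings, target_embedding, threshold=0.8):
    """Decide whether an unknown speaker class is identical with the target."""
    return cosine(combine(class_embeddings), target_embedding) >= threshold
```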
  • Patent number: 9026431
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for semantic parsing with multiple parsers. One of the methods includes obtaining one or more transcribed prompt n-grams from a speech to text recognizer, providing the transcribed prompt n-grams to a first semantic parser that executes on the user device and accesses a first knowledge base for results responsive to the spoken prompt, providing the transcribed prompt n-grams to a second semantic parser that accesses a second knowledge base for results responsive to the spoken prompt, the first knowledge base including first data not included in the second knowledge base, receiving a result responsive to the spoken prompt from the first semantic parser or the second semantic parser, wherein the result is selected from the knowledge base associated with the semantic parser that provided the result to the user device, and performing an operation based on the result.
    Type: Grant
    Filed: July 30, 2013
    Date of Patent: May 5, 2015
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Diego Melendo Casado, Fadi Biadsy
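A toy rendering of the two-parser arrangement above, with each "parser" reduced to a dictionary lookup over its own knowledge base (entirely illustrative; the real semantic parsers are far richer than a lookup):

```python
def resolve(transcribed_ngrams, on_device_kb, server_kb):
    """Try the on-device parser's knowledge base first, then the second parser.

    Returns (result, source) so the caller knows which knowledge base
    supplied the answer, or (None, None) when neither has one.
    """
    query = " ".join(transcribed_ngrams)
    for source, kb in (("on_device", on_device_kb), ("server", server_kb)):
        if query in kb:
            return kb[query], source
    return None, None
```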
  • Patent number: 9009045
    Abstract: Methods and systems for model-driven candidate sorting for evaluating digital interviews are described. In one embodiment, a model-driven candidate-sorting tool selects a data set of digital interview data for sorting. The data set includes candidate data for interviewing candidates (also referred to herein as interviewees). The model-driven candidate-sorting tool analyzes the candidate data for the respective interviewing candidate to identify digital interview cues and applies the digital interview cues to a prediction model to predict an achievement index for the respective interviewing candidate. This is performed without reviewer input at the model-driven candidate-sorting tool. The list of interview candidates is sorted according to the predicted achievement indices, and the sorted list is presented to the reviewer in a user interface.
    Type: Grant
    Filed: February 18, 2014
    Date of Patent: April 14, 2015
    Assignee: HireVue, Inc.
    Inventors: Loren Larsen, Benjamin Taylor
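The sorting step above reduces to ranking candidates by a model-predicted achievement index. A minimal sketch with a stand-in prediction model (the field names and the additive toy model are assumptions, not HireVue's method):

```python
def sort_candidates(candidates, predict):
    """Sort interviewing candidates by predicted achievement index, best first."""
    return sorted(candidates, key=lambda c: predict(c["cues"]), reverse=True)

def toy_model(cues):
    """Stand-in prediction model: a simple sum over extracted interview cues."""
    return sum(cues)
```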
  • Patent number: 9002707
    Abstract: An information processing apparatus includes: a plurality of information input units; an event detection unit that generates event information including estimated position information and estimated identification information of users present in the real space based on analysis of the information from the information input unit; and an information integration processing unit that inputs the event information, and generates target information including a position of each user and user identification information based on the input event information, and signal information representing a probability value of the event generation source, wherein the information integration processing unit includes an utterance source probability calculation unit, and wherein the utterance source probability calculation unit performs a process of calculating an utterance source score as an index value representing an utterance source probability of each target by multiplying weights based on utterance situations by a plurality of d
    Type: Grant
    Filed: November 6, 2012
    Date of Patent: April 7, 2015
    Assignee: Sony Corporation
    Inventor: Keiichi Yamada
  • Patent number: 9002709
    Abstract: Provided is a voice recognition system capable of, while suppressing negative influences from sound not to be recognized, correctly estimating utterance sections that are to be recognized. A voice segmenting means calculates voice feature values, and segments voice sections or non-voice sections by comparing the voice feature values with a threshold value. Then, the voice segmenting means determines, to be first voice sections, those segmented sections or sections obtained by adding a margin to the front and rear of each of those segmented sections. On the basis of voice and non-voice likelihoods, a search means determines, to be second voice sections, sections to which voice recognition is to be applied. A parameter updating means updates the threshold value and the margin. The voice segmenting means determines the first voice sections by using the one of the threshold value and the margin which has been updated by the parameter updating means.
    Type: Grant
    Filed: November 26, 2010
    Date of Patent: April 7, 2015
    Assignee: NEC Corporation
    Inventor: Takayuki Arakawa
  • Patent number: 8996387
    Abstract: For clearing transaction data selected for a processing, there is generated in a portable data carrier (1) a transaction acoustic signal (003; 103; 203) (S007; S107; S207) upon whose acoustic reproduction by an end device (10) at least transaction data selected for the processing are reproduced superimposed acoustically with a melody specific to a user of the data carrier (1) (S009; S109; S209). The generated transaction acoustic signal (003; 103; 203) is electronically transferred to an end device (10) (S108; S208), which processes the selected transaction data (S011; S121; S216) only when the user of the data carrier (1) confirms vis-à-vis the end device (10) an at least partial match both of the acoustically reproduced melody with the user-specific melody and of the acoustically reproduced transaction data with the selected transaction data (S010; S110, S116; S210).
    Type: Grant
    Filed: September 8, 2009
    Date of Patent: March 31, 2015
    Assignee: Giesecke & Devrient GmbH
    Inventors: Thomas Stocker, Michael Baldischweiler
  • Patent number: 8996373
    Abstract: A state detection device includes: a first model generation unit to generate a first specific speaker model obtained by modeling speech features of a specific speaker in an undepressed state; a second model generation unit to generate a second specific speaker model obtained by modeling speech features of the specific speaker in the depressed state; a likelihood calculation unit to calculate a first likelihood as a likelihood of the first specific speaker model with respect to input voice, and a second likelihood as a likelihood of the second specific speaker model with respect to the input voice; and a state determination unit to determine a state of the speaker of the input voice using the first likelihood and the second likelihood.
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: March 31, 2015
    Assignee: Fujitsu Limited
    Inventors: Shoji Hayakawa, Naoshi Matsuo
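The two-model likelihood comparison above can be illustrated with single-Gaussian "speaker models" over one scalar feature (the real models would be trained over full spectral features; the means and variances below are invented for the example):

```python
import math

def gaussian_loglik(x, mean, var):
    """Log-likelihood of one feature value under a 1-D Gaussian speaker model."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def detect_state(features, undepressed_model, depressed_model):
    """Return the state whose specific-speaker model better explains the input."""
    l1 = sum(gaussian_loglik(x, *undepressed_model) for x in features)
    l2 = sum(gaussian_loglik(x, *depressed_model) for x in features)
    return "depressed" if l2 > l1 else "undepressed"
```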
  • Patent number: 8977547
    Abstract: A voice recognition system includes: a voice input unit 11 for inputting a voice uttered a plurality of times; a registering voice data storage unit 12 for storing voice data uttered the plurality of times and input into the voice input unit 11; an utterance stability verification unit 13 for determining a similarity between the voice data uttered the plurality of times that are read from the registering voice data storage unit 12, and determining that registration of the voice data is acceptable when the similarity is greater than a threshold T1; and a standard pattern creation unit 14 for creating a standard pattern by using the voice data where the utterance stability verification unit 13 determines that registration is acceptable.
    Type: Grant
    Filed: October 8, 2009
    Date of Patent: March 10, 2015
    Assignee: Mitsubishi Electric Corporation
    Inventors: Michihiro Yamazaki, Jun Ishii, Hiroki Sakashita, Kazuyuki Nogi
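The stability check above (pairwise similarity of repeated utterances against a threshold before building a standard pattern) can be sketched as follows; normalized correlation stands in for whatever similarity measure the patent computes:

```python
import numpy as np

def similarity(a, b):
    """Normalized correlation between two equal-length feature sequences."""
    a = (a - a.mean()) / (a.std() + 1e-9)
    b = (b - b.mean()) / (b.std() + 1e-9)
    return float(np.mean(a * b))

def registration_acceptable(utterances, threshold=0.9):
    """Accept registration only when every pair of repeated utterances is similar."""
    for i in range(len(utterances)):
        for j in range(i + 1, len(utterances)):
            if similarity(utterances[i], utterances[j]) <= threshold:
                return False
    return True

def standard_pattern(utterances):
    """Create a standard pattern by averaging the accepted utterances."""
    return np.mean(utterances, axis=0)
```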
  • Patent number: 8972266
    Abstract: A speaker intent analysis system and method for validating the truthfulness and intent of a plurality of participants' responses to questions. A computer stores, retrieves, and transmits a series of questions to be answered audibly by participants. The participants' answers are received by a data processor. The data processor analyzes and records the participants' speech parameters for determining the likelihood of dishonesty. In addition to analyzing participants' speech parameters for distinguishing stress or other abnormality, the processor may be equipped with voice recognition software to screen responses that, while not dishonest, are indicative of possible malfeasance on the part of the participants. Once the responses are analyzed, the processor produces an output that is indicative of the participant's credibility. The output may be sent to proper parties and/or devices such as a web page, computer, e-mail, PDA, pager, database, report, etc. for appropriate action.
    Type: Grant
    Filed: June 12, 2012
    Date of Patent: March 3, 2015
    Inventor: David Bezar
  • Patent number: 8954327
    Abstract: A voice data analyzing device comprises speaker model deriving means, which derives speaker models as models each specifying character of voice of each speaker from voice data including a plurality of utterances to each of which a speaker label as information for identifying a speaker has been assigned, and speaker co-occurrence model deriving means, which derives a speaker co-occurrence model as a model representing the strength of co-occurrence relationship among the speakers from session data obtained by segmenting the voice data in units of sequences of conversation by use of the speaker models derived by the speaker model deriving means.
    Type: Grant
    Filed: June 3, 2010
    Date of Patent: February 10, 2015
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
  • Patent number: 8935169
    Abstract: According to one embodiment, an electronic apparatus includes an acquiring module and a display process module. The acquiring module is configured to acquire information regarding a plurality of persons using information of video content data, the plurality of persons appearing in a plurality of sections in the video content data. The display process module is configured to display (i) a time bar representative of a sequence of the video content data, (ii) information regarding a first person appearing in a first section of the sections, and (iii) information regarding a second person different from the first person, the second person appearing in a second section of the sections. A first area of the time bar corresponding to the first section is displayed in a first form, and a second area of the time bar corresponding to the second section is displayed in a second form different from the first form.
    Type: Grant
    Filed: September 14, 2012
    Date of Patent: January 13, 2015
    Assignee: Kabushiki Kaisha Toshiba
    Inventor: Tetsuya Fujii
  • Patent number: 8935151
    Abstract: A source language sentence is tagged with non-lexical tags, such as part-of-speech tags, and is parsed using a lexicalized parser trained in the source language. A target language sentence that is a translation of the source language sentence is tagged with non-lexical labels (e.g., part-of-speech tags) and is parsed using a delexicalized parser that has been trained in the source language to produce k-best parses. The best parse is selected based on the parse's alignment with the lexicalized parse of the source language sentence. The selected best parse can be used to update the parameter vector of a lexicalized parser for the target language.
    Type: Grant
    Filed: December 7, 2011
    Date of Patent: January 13, 2015
    Assignee: Google Inc.
    Inventors: Slav Petrov, Ryan McDonald, Keith Hall
  • Publication number: 20140334682
    Abstract: A monitoring device is provided, which includes an inputter configured to receive an input of a plurality of images captured at separate positions and a plurality of sound sources heard at separate positions, a saliency map generator configured to generate a plurality of mono saliency maps for the plurality of images and to generate a dynamic saliency map using the plurality of mono saliency maps generated, a position determinator configured to determine the positions of the sound sources through analysis of the plurality of sound sources, a scan path recognizer configured to generate scan paths of the plurality of images based on the generated dynamic saliency map and the determined positions of the sound sources, and an outputter configured to output the generated scan paths.
    Type: Application
    Filed: December 5, 2012
    Publication date: November 13, 2014
    Applicants: KYUNGPOCK NATIONAL INDUSTRY ACADEMIC COOPERATION FOUNDATION, INDUSTRY-UNIVERSITY COOPERATION FOUNDATION SOGANG UNIVERSITY
    Inventors: Minho Lee, Young-Min Jang, Sungmoon Jeong, Bumhwi Kim, Hyung-Min Park, Minook Kim
  • Patent number: 8874442
    Abstract: Device, system, and method of liveness detection using voice biometrics. For example, a method comprises: generating a first matching score based on a comparison between: (a) a voice-print from a first text-dependent audio sample received at an enrollment stage, and (b) a second text-dependent audio sample received at an authentication stage; generating a second matching score based on a text-independent audio sample; and generating a liveness score by taking into account at least the first matching score and the second matching score.
    Type: Grant
    Filed: April 17, 2013
    Date of Patent: October 28, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Almog Aley-Raz, Nir Moshe Krause, Michael Itzhak Salmon, Ran Yehoshua Gazit
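The score fusion this abstract describes can be as simple as a weighted combination of the text-dependent and text-independent matching scores; the weights and decision threshold below are placeholders, not values from the patent:

```python
def liveness_score(td_score, ti_score, w_td=0.6, w_ti=0.4):
    """Fuse text-dependent and text-independent matching scores (weights assumed)."""
    return w_td * td_score + w_ti * ti_score

def is_live(td_score, ti_score, threshold=0.7):
    """Accept the speaker as live when the fused score clears the threshold."""
    return liveness_score(td_score, ti_score) >= threshold
```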
  • Patent number: 8868409
    Abstract: In some implementations, audio data for an utterance is provided over a network. At a client device and over the network, information is received that indicates candidate transcriptions for the utterance and semantic information for the candidate transcriptions. A semantic parser is used at the client device to evaluate each of at least a plurality of the candidate transcriptions. One of the candidate transcriptions is selected based on at least the received semantic information and the output of the semantic parser for the plurality of candidate transcriptions that are evaluated.
    Type: Grant
    Filed: January 16, 2014
    Date of Patent: October 21, 2014
    Assignee: Google Inc.
    Inventors: Pedro J. Moreno Mengibar, Fadi Biadsy, Diego Melendo Casado
  • Patent number: 8854232
    Abstract: A handheld electronic device includes a reduced QWERTY keyboard and is enabled with disambiguation software. The device provides output in the form of a default output and a number of variants. The output is based largely upon the frequency, i.e., the likelihood that a user intended a particular output, but various features of the device provide additional variants that are not based solely on frequency and rather are provided by various logic structures resident on the device. The device enables editing during text entry and also provides a learning function that allows the disambiguation function to adapt to provide a customized experience for the user. The disambiguation function can be selectively disabled and an alternate keystroke interpretation system provided.
    Type: Grant
    Filed: June 30, 2011
    Date of Patent: October 7, 2014
    Assignee: BlackBerry Limited
    Inventors: Vadim Fux, Michael G. Elizarov, Sergey V. Kolomiets