Clustering Patents (Class 704/245)
-
Patent number: 9860669
Abstract: An audio apparatus includes a receiver configured to receive audio data and audio transducer position data for a plurality of audio transducers; and a renderer configured to render the audio data by generating audio transducer drive signals for the audio transducers from the audio data. Further, a clusterer is configured to cluster the audio transducers into a set of clusters in response to the audio transducer position data and to distances between audio transducers in accordance with a distance metric. A render controller is configured to adapt the rendering in response to the clustering. The apparatus is configured to select array processing techniques for specific subsets that contain audio transducers that are sufficiently close, and allows automatic adaptation to audio transducer configurations, thereby, e.g., allowing a user increased flexibility in positioning loudspeakers.
Type: Grant
Filed: May 6, 2014
Date of Patent: January 2, 2018
Assignee: KONINKLIJKE PHILIPS N.V.
Inventors: Werner Paulus Josephus De Bruijn, Arnoldus Werner Johannes Oomen, Aki Sakari Haermae
-
Patent number: 9837068
Abstract: A method for verifying at least one sound sample to be used in generating a sound detection model in an electronic device includes receiving a first sound sample; extracting a first acoustic feature from the first sound sample; receiving a second sound sample; extracting a second acoustic feature from the second sound sample; and determining whether the second acoustic feature is similar to the first acoustic feature.
Type: Grant
Filed: April 8, 2015
Date of Patent: December 5, 2017
Assignee: QUALCOMM Incorporated
Inventors: Sunkuk Moon, Minho Jin, Haiying Xia, Hesu Huang, Warren Frederick Dale
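The similarity test described in this abstract can be illustrated with a minimal sketch: compare two acoustic feature vectors (e.g., averaged MFCCs) by cosine similarity against a threshold. The feature choice and the 0.9 threshold are assumptions for illustration, not details from the patent.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two acoustic feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def samples_similar(feat1, feat2, threshold=0.9):
    """Accept a second sound sample only if its feature vector
    resembles the first sample's feature vector."""
    return cosine_similarity(feat1, feat2) >= threshold
```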
-
Patent number: 9830931
Abstract: One embodiment of the present invention sets forth a technique for determining a set of sound parameters associated with a sound type. The technique includes receiving, via a network and from each of a first plurality of remote computing devices, an audio recording of a first sound type and a descriptor associated with the first sound type. The technique further includes processing the audio recordings via a processor to determine a first set of sound parameters associated with the first sound type. The technique further includes receiving a request associated with the descriptor from at least one remote computing device and, in response, transmitting the first set of sound parameters associated with the first sound type to the at least one remote computing device.
Type: Grant
Filed: December 31, 2015
Date of Patent: November 28, 2017
Assignee: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED
Inventors: Donald Joseph Butts, Brandon Stacey
-
Patent number: 9714884
Abstract: A statistical basic classification model of acoustic features generated for at least one reference object is automatically adapted by a data processing unit based on acoustic features of a noise generated by an object to be investigated to obtain an individually adapted statistical classification model. The data processing unit then classifies the state of the noise-generating object based on the individually adapted statistical classification model.
Type: Grant
Filed: April 29, 2009
Date of Patent: July 25, 2017
Assignee: SIEMENS AKTIENGESELLSCHAFT
Inventors: Joachim Hofer, Lutz Leutelt
-
Patent number: 9691391
Abstract: Systems and methods to perform speaker clustering determine which audio segments appear to include sound generated by the same speaker. Speaker clustering is based on creating a graph in which a node represents an audio segment and an edge between two nodes represents a relationship and/or correspondence that reflects a probability, likelihood, or other indication that the two nodes represent audio segments of the same speaker. This graph is analyzed to detect individual communities of nodes that associate to an individual speaker.
Type: Grant
Filed: May 21, 2015
Date of Patent: June 27, 2017
Assignee: KnuEdge Incorporated
Inventor: Rodney Gateau
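A rough sketch of the graph idea described here: treat each audio segment as a node, add an edge when a same-speaker score passes a threshold, and read off connected components as speaker clusters. Connected components are a crude stand-in for the community detection the patent describes, and the scores and threshold below are invented for illustration.

```python
def cluster_segments(n_segments, similarities, threshold=0.5):
    """Group audio segments whose pairwise same-speaker score passes
    a threshold, via union-find connected components."""
    parent = list(range(n_segments))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for (i, j), score in similarities.items():
        if score >= threshold:  # keep only likely same-speaker edges
            union(i, j)

    groups = {}
    for seg in range(n_segments):
        groups.setdefault(find(seg), []).append(seg)
    return sorted(groups.values())
```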
-
Patent number: 9672814
Abstract: Software that trains an artificial neural network for generating vector representations for natural language text, by performing the following steps: (i) receiving, by one or more processors, a set of natural language text; (ii) generating, by one or more processors, a set of first metadata for the set of natural language text, where the first metadata is generated using supervised learning method(s); (iii) generating, by one or more processors, a set of second metadata for the set of natural language text, where the second metadata is generated using unsupervised learning method(s); and (iv) training, by one or more processors, an artificial neural network adapted to generate vector representations for natural language text, where the training is based, at least in part, on the received natural language text, the generated set of first metadata, and the generated set of second metadata.
Type: Grant
Filed: May 8, 2015
Date of Patent: June 6, 2017
Assignee: International Business Machines Corporation
Inventors: Liangliang Cao, James J. Fan, Chang Wang, Bing Xiang, Bowen Zhou
-
Patent number: 9666192
Abstract: Methods and apparatus for reducing latency in speech recognition applications. The method comprises receiving first audio comprising speech from a user of a computing device, detecting an end of speech in the first audio, generating an ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech, determining whether a valid action can be performed by a speech-enabled application installed on the computing device using the ASR result, and processing second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the ASR result.
Type: Grant
Filed: May 26, 2015
Date of Patent: May 30, 2017
Assignee: Nuance Communications, Inc.
Inventor: Mark Fanty
-
Patent number: 9659560
Abstract: Software that trains an artificial neural network for generating vector representations for natural language text, by performing the following steps: (i) receiving, by one or more processors, a set of natural language text; (ii) generating, by one or more processors, a set of first metadata for the set of natural language text, where the first metadata is generated using supervised learning method(s); (iii) generating, by one or more processors, a set of second metadata for the set of natural language text, where the second metadata is generated using unsupervised learning method(s); and (iv) training, by one or more processors, an artificial neural network adapted to generate vector representations for natural language text, where the training is based, at least in part, on the received natural language text, the generated set of first metadata, and the generated set of second metadata.
Type: Grant
Filed: September 30, 2015
Date of Patent: May 23, 2017
Assignee: International Business Machines Corporation
Inventors: Liangliang Cao, James J. Fan, Chang Wang, Bing Xiang, Bowen Zhou
-
Patent number: 9641968
Abstract: A system for sharing moment experiences is described. A system receives moment data from an input to a mobile device. The system receives geographic location information, time information, and contextual information that is local to the mobile device. The system creates a message about the moment data based on the geographic location information, the time information, and the contextual information. The system outputs the moment data with the message.
Type: Grant
Filed: May 15, 2015
Date of Patent: May 2, 2017
Assignee: Krumbs, Inc.
Inventors: Neilesh Jain, Ramesh Jain, Pinaki Sinha
-
Patent number: 9620148
Abstract: Systems, vehicles, and methods for limiting speech-based access to an audio metadata database are described herein. Audio metadata databases described herein include a plurality of audio metadata entries. Each audio metadata entry includes metadata information associated with at least one audio file. Embodiments described herein determine when a size of the audio metadata database reaches a threshold size, and limit which of the plurality of audio metadata entries may be accessed in response to the speech input signal when the size of the audio metadata database reaches the threshold size.
Type: Grant
Filed: July 1, 2013
Date of Patent: April 11, 2017
Assignee: Toyota Motor Engineering & Manufacturing North America, Inc.
Inventor: Eric Randell Schmidt
-
Patent number: 9595260
Abstract: A modeling device comprises a front end which receives enrollment speech data from each target speaker, a reference anchor set generation unit which generates a reference anchor set using the enrollment speech data based on an anchor space, and a voice print generation unit which generates voice prints based on the reference anchor set and the enrollment speech data. By taking the enrollment speech and a speaker adaptation technique into account, anchor models of a smaller size can be generated, so reliable and robust speaker recognition is possible with a smaller reference anchor set.
Type: Grant
Filed: December 10, 2010
Date of Patent: March 14, 2017
Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
Inventors: Haifeng Shen, Long Ma, Bingqi Zhang
-
Patent number: 9576582
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: February 23, 2016
Date of Patent: February 21, 2017
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
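The "weighted sum of acoustic models" idea can be sketched with one-dimensional Gaussians standing in for full acoustic models: the restructured model for a dictionary phoneme scores an observation as a weighted mix of the unchanged native phoneme models. The models and weights below are toy values, not the patent's.

```python
import math

def gaussian_pdf(x, mean, var):
    """1-D Gaussian density, a stand-in for a full acoustic model."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def restructured_likelihood(x, native_models, weights):
    """Score x under a restructured phoneme model: a weighted sum of
    the unchanged native phoneme models.

    native_models: {phoneme: (mean, var)} -- the fixed native inventory
    weights: {phoneme: weight} -- derived from the new speaker's lattice
    """
    return sum(w * gaussian_pdf(x, *native_models[p]) for p, w in weights.items())
```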
-
Patent number: 9524291
Abstract: Techniques involving visual display of information related to matching user utterances against graph patterns are described. In one or more implementations, an utterance of a user is obtained that has been indicated as corresponding to a graph pattern through linguistic analysis. The utterance is displayed in a user interface as a representation of the graph pattern.
Type: Grant
Filed: October 6, 2010
Date of Patent: December 20, 2016
Assignee: Virtuoz SA
Inventors: Dan Teodosiu, Elizabeth Ireland Powers, Pierre Serge Vincent LeRoy, Sebastien Jean-Marie Christian Saunier
-
Patent number: 9514391
Abstract: In an image classification method, a feature vector representing an input image is generated by unsupervised operations including extracting local descriptors from patches distributed over the input image, and a classification value for the input image is generated by applying a neural network (NN) to the feature vector. Extracting the feature vector may include encoding the local descriptors extracted from each patch using a generative model, such as Fisher vector encoding, aggregating the encoded local descriptors to form a vector, projecting the vector into a space of lower dimensionality, for example using Principal Component Analysis (PCA), and normalizing the feature vector of lower dimensionality to produce the feature vector representing the input image. A set of mid-level features representing the input image may be generated as the output of an intermediate layer of the NN.
Type: Grant
Filed: April 20, 2015
Date of Patent: December 6, 2016
Assignee: XEROX CORPORATION
Inventors: Florent C. Perronnin, Diane Larlus-Larrondo
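A minimal sketch of the aggregate-project-normalize stages of this pipeline, with mean pooling standing in for Fisher-vector aggregation and a caller-supplied component matrix standing in for learned PCA components; all numbers are illustrative.

```python
def aggregate(encoded_descriptors):
    """Mean-pool encoded local descriptors into a single vector
    (a simplification of Fisher-vector aggregation)."""
    dim = len(encoded_descriptors[0])
    n = len(encoded_descriptors)
    return [sum(d[i] for d in encoded_descriptors) / n for i in range(dim)]

def project(vec, components):
    """Project onto given component rows (stand-in for learned PCA)."""
    return [sum(c[i] * vec[i] for i in range(len(vec))) for c in components]

def l2_normalize(vec):
    """Normalize the reduced vector to unit length."""
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm else vec
```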
-
Patent number: 9449051
Abstract: According to one embodiment, a topic extracting apparatus extracts each term from a target document set, and calculates an appearance frequency of each term and a document frequency with which each term appears. The topic extracting apparatus acquires a document set of appearance documents with respect to each extracted term, calculates a topic degree, extracts each term whose topic degree is not lower than a predetermined value as a topic word, and calculates freshness of the extracted topic word based on an appearance date and time. The topic extracting apparatus presents the extracted topic words in order of freshness and also presents the number of appearance documents of each presented topic word per unit span.
Type: Grant
Filed: September 10, 2013
Date of Patent: September 20, 2016
Assignees: KABUSHIKI KAISHA TOSHIBA, TOSHIBA SOLUTIONS CORPORATION
Inventors: Hideki Iwasaki, Kazuyuki Goto, Shigeru Matsumoto, Yasunari Miyabe, Mikito Kobayashi
-
Patent number: 9411829
Abstract: Disclosed herein is a system and method that facilitate searching and/or browsing of images by clustering, or grouping, the images into a set of image clusters using facets, such as without limitation visual properties or visual characteristics, of the images, and representing each image cluster by a representative image selected for the image cluster. A map-reduce based probabilistic topic model may be used to identify one or more images belonging to each image cluster and update model parameters.
Type: Grant
Filed: June 10, 2013
Date of Patent: August 9, 2016
Assignee: Yahoo! Inc.
Inventors: Jia Li, Nadav Golbandi, XianXing Zhang
-
Patent number: 9412381
Abstract: A triple factor authentication in one step method and system is disclosed. According to one embodiment, an Integrated Voice Biometrics Cloud Security Gateway (IVCS Gateway) intercepts an access request to a resource server from a user using a user device. IVCS Gateway then authenticates the user by placing a call to the user device and sending a challenge message prompting the user to respond by voice. After receiving the voice sample of the user, the voice sample is compared against a stored voice biometrics record for the user. The voice sample is also converted into a text phrase and compared against a stored secret text phrase. In an alternative embodiment, an IVCS Gateway that is capable of making non-binary access decisions and associating multiple levels of access with a single user or group is described.
Type: Grant
Filed: March 30, 2011
Date of Patent: August 9, 2016
Assignee: ACK3 BIONETICS PRIVATE LTD.
Inventor: Sajit Bhaskaran
-
Patent number: 9378742
Abstract: Disclosed are an apparatus for recognizing voice using multiple acoustic models according to the present invention and a method thereof. An apparatus for recognizing voice using multiple acoustic models includes a voice data database (DB) configured to store voice data collected in various noise environments; a model generating means configured to perform classification for each speaker and environment based on the collected voice data, and to generate an acoustic model of a binary tree structure as the classification result; and a voice recognizing means configured to extract feature data of voice data when the voice data is received from a user, to select multiple models from the generated acoustic model based on the extracted feature data, to recognize the voice data in parallel based on the selected multiple models, and to output a word string corresponding to the voice data as the recognition result.
Type: Grant
Filed: March 18, 2013
Date of Patent: June 28, 2016
Assignee: Electronics and Telecommunications Research Institute
Inventor: Dong Hyun Kim
-
Patent number: 9373338
Abstract: An automatic speech recognition engine receives an acoustic-echo processed signal from an acoustic-echo processing (AEP) module, where said echo processed signal contains mainly the speech from the near-end talker. The automatic speech recognition engine analyzes the content of the acoustic-echo processed signal to determine whether words or keywords are present. Based upon the results of this analysis, the automatic speech recognition engine produces a value reflecting the likelihood that some words or keywords are detected. Said value is provided to the AEP module. Based upon the value, the AEP module determines if there is double talk and processes the incoming signals accordingly to enhance its performance.
Type: Grant
Filed: June 25, 2012
Date of Patent: June 21, 2016
Assignee: Amazon Technologies, Inc.
Inventors: Ramya Gopalan, Kavitha Velusamy, Wai C. Chu, Amit S. Chhetri
-
Patent number: 9336774
Abstract: Methods, systems, and apparatus for pattern recognition. One aspect includes a pattern recognizing engine that includes multiple pattern recognizer processors that form a hierarchy of pattern recognizer processors. The pattern recognizer processors include a child pattern recognizer processor at a lower level in the hierarchy and a parent pattern recognizer processor at a higher level of the hierarchy, where the child pattern recognizer processor is configured to provide a first complex recognition output signal to a pattern recognizer processor at a higher level than the child pattern recognizer processor, and the parent pattern recognizer processor is configured to receive as an input a second complex recognition output signal from a pattern recognizer processor at a lower level than the parent pattern recognizer processor.
Type: Grant
Filed: April 22, 2013
Date of Patent: May 10, 2016
Assignee: Google Inc.
Inventor: Raymond C. Kurzweil
-
Patent number: 9305547
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: April 28, 2015
Date of Patent: April 5, 2016
Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
-
Patent number: 9305553
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a computer-based method includes receiving a speech corpus at a speech management server system that includes multiple speech recognition engines tuned to different speaker types; using the speech recognition engines to associate the received speech corpus with a selected one of multiple different speaker types; and sending a speaker category identification code that corresponds to the associated speaker type from the speech management server system over a network. The speaker category identification code can be used by any one of the speech-interactive applications coupled to the network to select an appropriate one of multiple application-accessible speech recognition engines tuned to the different speaker types in response to an indication that a user accessing the application is associated with a particular one of the speaker category identification codes.
Type: Grant
Filed: April 28, 2011
Date of Patent: April 5, 2016
Inventor: William S. Meisel
-
Patent number: 9275044
Abstract: A method and system are provided for finding synonyms which are more contextually relevant to the intended use of a particular word. The system finds a list of synonyms for the input word and also finds a list of synonyms for an additional word entered by the user to approximate the intended usage of the input word. These two lists of synonyms are compared to find words common to both lists, and the common words are presented to the user as potential synonyms which are appropriate for the intended use.
Type: Grant
Filed: March 6, 2013
Date of Patent: March 1, 2016
Assignee: SearchLeaf, LLC
Inventors: Thomas Lund, Bryce Lund
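The core idea, intersecting two synonym lists, can be sketched directly; the `thesaurus` mapping below is a hypothetical stand-in for whatever synonym source the system queries.

```python
def contextual_synonyms(word, context_word, thesaurus):
    """Synonyms of `word` that are also synonyms of `context_word`,
    i.e. the intersection of the two synonym lists."""
    common = set(thesaurus.get(word, [])) & set(thesaurus.get(context_word, []))
    return sorted(common)
```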
-
Patent number: 9262694
Abstract: Provided is a technology which enables further improvement of the determination accuracy in pattern matching processing. A dictionary learning device 1 includes a score calculation unit 2 and a learning unit 3. The score calculation unit 2 calculates a matching score representing the degree of similarity between a sample pattern, which is a sample of a pattern likely to be subjected to pattern matching processing, and a degradation pattern resulting from a degrading processing on the sample pattern. The learning unit 3 learns a quality dictionary based on the calculated matching score and the degradation pattern. The quality dictionary is used in a processing to evaluate the degradation degree (quality) of a matching target pattern, i.e., the pattern of an object on which the pattern matching processing is carried out.
Type: Grant
Filed: December 12, 2012
Date of Patent: February 16, 2016
Assignee: NEC Corporation
Inventor: Masato Ishii
-
Patent number: 9122931
Abstract: An object identification method is provided. The method includes dividing an input video into a number of video shots, each containing one or more video frames. The method also includes detecting target-class object occurrences and related-class object occurrences in each video shot. Further, the method includes generating hint information including a small subset of frames representing the input video and performing object tracking and recognition based on the hint information. The method also includes fusing tracking and recognition results and outputting labeled objects based on the combined tracking and recognition results.
Type: Grant
Filed: October 25, 2013
Date of Patent: September 1, 2015
Assignee: TCL RESEARCH AMERICA INC.
Inventors: Liang Peng, Haohong Wang
-
Patent number: 9098576
Abstract: Systems and methods for audio matching are disclosed herein. In one embodiment, a system includes both interest point mixing and fingerprint mixing by using multiple interest point detection methods in parallel. Since multiple interest point detection methods are used in parallel, accuracy of audio matching is improved across a wide variety of audio signals. In addition, the scalability of the disclosed audio matching system is increased by matching the fingerprint of an audio sample with a fingerprint of a reference sample versus matching an entire spectrogram. Accordingly, a more accurate and more general solution to audio matching can be accomplished.
Type: Grant
Filed: October 17, 2011
Date of Patent: August 4, 2015
Assignee: Google Inc.
Inventors: Matthew Sharifi, Gheorghe Postelnicu, George Tzanetakis, Dominik Roblek
-
Patent number: 9053579
Abstract: A system and method generate a graph lattice from exemplary images. At least one processor receives exemplary data graphs of the exemplary images and generates graph lattice nodes of size one from primitives. Until a termination condition is met, the at least one processor repeatedly: 1) generates candidate graph lattice nodes from accepted graph lattice nodes; 2) selects one or more candidate graph lattice nodes preferentially discriminating exemplary data graphs which are less discriminable than other exemplary data graphs using the accepted graph lattice nodes; and 3) promotes the selected graph lattice nodes to accepted status. The graph lattice is formed from the accepted graph lattice nodes and relations between the accepted graph lattice nodes.
Type: Grant
Filed: June 19, 2012
Date of Patent: June 9, 2015
Assignee: Palo Alto Research Center Incorporated
Inventor: Eric Saund
-
Patent number: 9047286
Abstract: Content from multiple different stations can be divided into segments based on time. Matched segments associated with each station can be identified by comparing content included in a first segment associated with a first station, to content included in a second segment associated with a second station. Syndicated content can be identified and tagged based, at least in part, on a relationship between sequences of matched segments on different stations. Various embodiments also include identifying main sequences associated with each station under consideration, removing some of the main sequences, and consolidating remaining main sequences based on various threshold criteria.
Type: Grant
Filed: December 17, 2009
Date of Patent: June 2, 2015
Assignee: iHeartMedia Management Services, Inc.
Inventors: Periklis Beltas, Philippe Generali, David C. Jellison, Jr.
-
Patent number: 9026442
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
Type: Grant
Filed: August 14, 2014
Date of Patent: May 5, 2015
Assignee: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
-
Patent number: 9020816
Abstract: A method, system and apparatus are shown for identifying non-language speech sounds in a speech or audio signal. An audio signal is segmented and feature vectors are extracted from the segments of the audio signal. The segment is classified using a hidden Markov model (HMM) that has been trained on sequences of these feature vectors. Post-processing components can be utilized to enhance classification. An embodiment is described in which the hidden Markov model is used to classify a segment as a language speech sound or one of a variety of non-language speech sounds. Another embodiment is described in which the hidden Markov model is trained using discriminative learning.
Type: Grant
Filed: August 13, 2009
Date of Patent: April 28, 2015
Assignee: 21CT, Inc.
Inventor: Matthew McClain
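As an illustration of classifying a segment with competing HMMs (a simplification of the patent's trained-HMM classifier), the sketch below scores a discrete observation sequence under each model with the forward algorithm and picks the most likely label. The one-state "speech" and "cough" models in the usage below are toy values, not trained parameters.

```python
import math

def forward_log_prob(obs, start, trans, emit):
    """Log-likelihood of a discrete observation sequence under an HMM
    (forward algorithm; adequate for short sequences without scaling)."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
                 for s in range(n)]
    return math.log(sum(alpha))

def classify_segment(obs, models):
    """models: {label: (start, trans, emit)}; return the most likely label."""
    return max(models, key=lambda label: forward_log_prob(obs, *models[label]))
```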
-
Patent number: 9009038
Abstract: A method for analyzing a digital audio signal associated with a baby cry, comprising the steps of: (a) processing the digital audio signal using a spectral analysis to generate a spectral data; (b) processing the digital audio signal using a time-frequency analysis to generate a time-frequency characteristic; (c) categorizing the baby cry into one of a basic type and a special type based on the spectral data; (d) if the baby cry is of the basic type, determining a basic need based on the time-frequency characteristic and a predetermined lookup table; and (e) if the baby cry is of the special type, determining a special need by inputting the time-frequency characteristic into a pre-trained artificial neural network.
Type: Grant
Filed: May 22, 2013
Date of Patent: April 14, 2015
Assignee: National Taiwan Normal University
Inventors: Jon-Chao Hong, Chao-Hsin Wu, Mei-Yung Chen
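Steps (c) and (d), the basic/special split and the lookup-table path, can be sketched as below. The 250-600 Hz "basic" band and the signature labels are invented for illustration; the patent routes special-type cries to a pre-trained neural network, which is only stubbed here as a string result.

```python
def classify_cry(spectral_peak_hz, tf_signature, lookup_table):
    """Route a cry: basic-band cries go through the lookup table,
    anything else is deferred to the (unimplemented) neural network."""
    if 250 <= spectral_peak_hz <= 600:  # assumed 'basic type' band
        return lookup_table.get(tf_signature, "unknown basic need")
    return "special: refer to neural network"
```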
-
Publication number: 20150081298
Abstract: In a speech processing apparatus, an acquisition unit is configured to acquire a speech. A separation unit is configured to separate the speech into a plurality of sections in accordance with a prescribed rule. A calculation unit is configured to calculate a degree of similarity in each combination of the sections. An estimation unit is configured to estimate, with respect to each section, a direction of arrival of the speech. A correction unit is configured to group the sections whose directions of arrival are mutually similar into a same group and correct the degree of similarity with respect to the combination of the sections in the same group. A clustering unit is configured to cluster the sections by using the corrected degree of similarity.
Type: Application
Filed: September 12, 2014
Publication date: March 19, 2015
Applicant: Kabushiki Kaisha Toshiba
Inventors: Ning DING, Yusuke KIDA, Makoto HIROHATA
-
Publication number: 20150073798
Abstract: Technologies for automatic domain model generation include a computing device that accesses an n-gram index of a web corpus. The computing device generates a semantic graph of the web corpus for a relevant domain using the n-gram index. The semantic graph includes one or more related entities that are related to a seed entity. The computing device performs similarity discovery to identify and rank contextual synonyms within the domain. The computing device maintains a domain model including intents representing actions in the domain and slots representing parameters of actions or entities in the domain. The computing device performs intent discovery to discover intents and intent patterns by analyzing the web corpus using the semantic graph. The computing device performs slot discovery to discover slots, slot patterns, and slot values by analyzing the web corpus using the semantic graph. Other embodiments are described and claimed.
Type: Application
Filed: September 8, 2014
Publication date: March 12, 2015
Inventors: Yael Karov, Eran Levy, Sari Brosh-Lipstein
-
Patent number: 8972261
Abstract: A computer-implemented system and method for voice transcription error reduction is provided. Speech utterances are obtained from a voice stream and each speech utterance is associated with a transcribed value and a confidence score. Those utterances with transcription values associated with lower confidence scores are identified as questionable utterances. One of the questionable utterances is selected from the voice stream. A predetermined number of questionable utterances from other voice streams and having transcribed values similar to the transcribed value of the selected questionable utterance are identified as a pool of related utterances. A further transcribed value is received for each of a plurality of the questionable utterances in the pool of related utterances. A transcribed message is generated for the voice stream using those transcribed values with higher confidence scores and the further transcribed value for the selected questionable utterance.
Type: Grant
Filed: February 3, 2014
Date of Patent: March 3, 2015
Assignee: Intellisist, Inc.
Inventor: David Milstein
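The final re-transcription step can be approximated by a majority vote over the further transcribed values gathered from the pool of related utterances; this sketch ignores the confidence-score bookkeeping the patent describes.

```python
from collections import Counter

def resolve_questionable(further_transcriptions):
    """Pick the most common re-transcription for a low-confidence
    utterance from its pool of related utterances (majority vote)."""
    value, _count = Counter(further_transcriptions).most_common(1)[0]
    return value
```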
-
Publication number: 20150051910
Abstract: A natural language understanding system performs automatic unsupervised clustering of dialog data from a natural language dialog application. A log parser automatically extracts structured dialog data from application logs. A dialog generalizing module generalizes the extracted dialog data to generalization identifier vectors. A data clustering module automatically clusters the dialog data based on the generalization identifier vectors using an unsupervised density-based clustering algorithm without a predefined number of clusters and without a predefined distance threshold in an iterative approach based on a hierarchical ordering of the generalization.
Type: Application
Filed: August 19, 2013
Publication date: February 19, 2015
Applicant: Nuance Communications, Inc.
Inventor: Jean-Francois Lavallée
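Density-based clustering with no predefined cluster count can be sketched with a toy one-dimensional DBSCAN-style routine. Note this simplification still uses a fixed `eps` distance, whereas the patent avoids a predefined threshold via an iterative hierarchical approach; the parameters below are illustrative only.

```python
def dbscan_1d(points, eps=1.0, min_pts=2):
    """Toy DBSCAN over 1-D points: clusters emerge from density,
    so the number of clusters is never fixed in advance."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(points)) if abs(points[j] - points[i]) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # mark as noise for now
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = [k for k in range(len(points)) if abs(points[k] - points[j]) <= eps]
            if len(j_neighbors) >= min_pts:  # expand only from core points
                queue.extend(k for k in j_neighbors if labels[k] is None)
        cluster += 1
    return labels
```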
-
Publication number: 20150032452
Abstract: A method for identifying concepts in a plurality of interactions includes: filtering, on a processor, the interactions based on intervals; creating, on the processor, a plurality of sentences from the filtered interactions; computing, on the processor, a saliency of each of the sentences; pruning away, on the processor, sentences with low saliency for generating a set of informative sentences; clustering, on the processor, the sentences of the set of informative sentences for generating a plurality of sentence clusters, each of the clusters corresponding to a concept of the concepts; computing, on the processor, a saliency of each of the clusters; and naming, on the processor, each of the clusters.
Type: Application
Filed: July 26, 2013
Publication date: January 29, 2015
Applicant: GENESYS TELECOMMUNICATIONS LABORATORIES, INC.
Inventors: Amir Lev-Tov, Avraham Faizakof, David Ollinger, Yochai Konig
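The abstract does not specify the saliency measure. As one plausible stand-in, summed TF-IDF weight makes sentences built only from words common to every interaction score low, so they can be pruned away before clustering:

```python
import math
from collections import Counter

def saliency_scores(sentences):
    """Score each sentence by the summed TF-IDF weight of its distinct
    words (an assumed stand-in for the patent's saliency measure).
    Sentences of words shared by every interaction score low."""
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))   # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        scores.append(sum((tf[w] / len(d)) * math.log(n / df[w]) for w in set(d)))
    return scores
```

Low-saliency sentences can then be dropped with a simple threshold before the clustering step.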
-
Patent number: 8942979
Abstract: An acoustic processing apparatus is provided. The acoustic processing apparatus includes a first extracting unit configured to extract a first acoustic model that corresponds with a first position among positions set in a speech recognition target area, a second extracting unit configured to extract at least one second acoustic model that corresponds with, respectively, at least one second position in proximity to the first position, and an acoustic model generating unit configured to generate a third acoustic model based on the first acoustic model, the second acoustic model, or a combination thereof.
Type: Grant
Filed: July 28, 2011
Date of Patent: January 27, 2015
Assignee: Samsung Electronics Co., Ltd.
Inventors: Nam-Hoon Kim, Jeong-Su Kim, Jeong-Mi Cho
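Reduced to plain parameter vectors, the "combination thereof" in the claim can be sketched as a weighted interpolation between the model at the first position and the models at nearby second positions. The vector representation and the 0.7 weight are assumptions for illustration only:

```python
def combine_models(primary, neighbors, weight=0.7):
    """Generate a third (position-adapted) model as a weighted average of
    the first-position model and the mean of the nearby-position models.
    Models are reduced to flat parameter vectors for this sketch."""
    n = len(neighbors)
    return [weight * p + (1 - weight) * sum(vs) / n
            for p, *vs in zip(primary, *neighbors)]
```
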
-
Publication number: 20150025887
Abstract: In a method of diarization of audio data, audio data is segmented into a plurality of utterances. Each utterance is represented as an utterance model representative of a plurality of feature vectors. The utterance models are clustered. A plurality of speaker models are constructed from the clustered utterance models. A hidden Markov model is constructed of the plurality of speaker models. A sequence of identified speaker models is decoded.
Type: Application
Filed: June 30, 2014
Publication date: January 22, 2015
Applicant: VERINT SYSTEMS LTD.
Inventors: Oana Sidi, Ron Wein
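The final step, decoding a sequence of speaker models from an HMM, maps naturally onto Viterbi decoding. A toy sketch with 1-D unit-variance Gaussian speaker models and a "sticky" self-transition (both simplifications; real diarization systems use multivariate models over acoustic feature vectors):

```python
import math

def decode_speakers(frames, speaker_means, stay=0.9):
    """Viterbi decode of a speaker sequence.  States are speaker models
    (here 1-D Gaussians with unit variance, given by their means); the
    sticky transition probability discourages rapid speaker switching."""
    n = len(speaker_means)
    switch = (1.0 - stay) / (n - 1)

    def emit(s, x):                        # log N(x; mean_s, 1), up to a constant
        return -0.5 * (x - speaker_means[s]) ** 2

    scores = [emit(s, frames[0]) for s in range(n)]
    back = []
    for x in frames[1:]:
        prev, scores = scores, []
        back.append([])
        for s in range(n):
            best_prev = max(range(n), key=lambda t: prev[t] +
                            math.log(stay if t == s else switch))
            back[-1].append(best_prev)
            scores.append(prev[best_prev] +
                          math.log(stay if best_prev == s else switch) + emit(s, x))
    path = [max(range(n), key=lambda s: scores[s])]
    for bp in reversed(back):              # follow back-pointers to the start
        path.append(bp[path[-1]])
    return path[::-1]
```
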
-
Patent number: 8938390
Abstract: In one embodiment, a method for detecting autism in a natural language environment using a microphone, sound recorder, and a computer programmed with software for the specialized purpose of processing recordings captured by the microphone and sound recorder combination, the computer programmed to execute the method, includes segmenting an audio signal captured by the microphone and sound recorder combination using the computer programmed for the specialized purpose into a plurality of recording segments. The method further includes determining which of the plurality of recording segments correspond to a key child. The method further includes determining which of the plurality of recording segments that correspond to the key child are classified as key child recordings.
Type: Grant
Filed: February 27, 2009
Date of Patent: January 20, 2015
Assignee: LENA Foundation
Inventors: Dongxin D. Xu, Terrance D. Paul
-
Patent number: 8930190
Abstract: An audio processing device including a feature calculation unit, a boundary calculation unit and a judgment unit, detects points of change of audio features from an audio signal in an AV content. The feature calculation unit calculates, for each unit section of the audio signal, section feature data expressing features of the audio signal in the unit section. The boundary calculation unit calculates, for each target unit section among the unit sections of the audio signal, a piece of boundary information relating to at least one boundary of a similarity section. The similarity section consists of consecutive unit sections, inclusive of the target unit section, which each have similar section feature data. The judgment unit calculates a priority of each boundary indicated by one or more of the pieces of boundary information and judges whether the boundary is a scene change point based on the priority.
Type: Grant
Filed: March 11, 2013
Date of Patent: January 6, 2015
Assignee: Panasonic Intellectual Property Corporation of America
Inventors: Tomohiro Konuma, Tsutomu Uenoyama
-
Publication number: 20150006175
Abstract: The present invention relates to an apparatus and a method for recognizing continuous speech having a large vocabulary. In the present invention, the large vocabulary of a continuous-speech task containing many words of the same kind is divided into a reasonable number of clusters; a representative word is selected for each cluster and a first recognition pass is performed against the representative words; then, if a representative word is recognized by use of the result of the first pass, re-recognition is performed against all words in the cluster to which the recognized representative word belongs.
Type: Application
Filed: June 13, 2014
Publication date: January 1, 2015
Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventors: Ki-Young PARK, Yun-Keun LEE, Hoon CHUNG
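The two-pass lookup reduces to a pair of `max` calls once a word-scoring function is given. Here `difflib`'s string similarity stands in for a real acoustic match score, and the cluster table is invented example data:

```python
import difflib

def two_pass_recognize(observed, clusters):
    """Two-pass large-vocabulary lookup: score only one representative
    word per cluster first, then re-recognize against every word in the
    winning representative's cluster.

    clusters -- {representative_word: [all words in that cluster]}
    """
    def score(word):                       # stand-in for an acoustic score
        return difflib.SequenceMatcher(None, observed, word).ratio()

    best_rep = max(clusters, key=score)    # pass 1: representatives only
    return max(clusters[best_rep], key=score)   # pass 2: full cluster

# Invented example clusters of similar-sounding words:
clusters = {
    "recognize": ["recognize", "recognise", "recognizer"],
    "telephone": ["telephone", "telephony"],
}
```
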
-
Publication number: 20140358541
Abstract: Reliable speaker-based clustering of speech utterances allows improved speaker recognition and speaker-based speech segmentation. According to at least one example embodiment, an iterative bottom-up speaker-based clustering approach employs voiceprints of speech utterances, such as i-vectors. At each iteration, a clustering confidence score in terms of Silhouette Width Criterion (SWC) values is evaluated, and a pair of nearest clusters is merged into a single cluster. The pair of nearest clusters merged is determined based on a similarity score indicative of similarity between voiceprints associated with different clusters. A final clustering pattern is then determined as a set of clusters associated with an iteration corresponding to the highest clustering confidence score evaluated. The SWC used may further be a modified SWC enabling detection of an early stop of the iterative approach.
Type: Application
Filed: May 31, 2013
Publication date: December 4, 2014
Inventors: Daniele Ernesto Colibro, Claudio Vair, Kevin R. Farrell
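Using toy 2-D vectors in place of i-vectors, the iterative bottom-up loop with a plain (unmodified) SWC looks roughly like this; average linkage is assumed for the nearest-pair similarity:

```python
import math

def silhouette_width(points, labels):
    """Mean Silhouette Width Criterion; singleton clusters contribute 0."""
    total = 0.0
    for i, p in enumerate(points):
        same = [math.dist(p, q) for j, q in enumerate(points)
                if labels[j] == labels[i] and j != i]
        if not same:
            continue
        a = sum(same) / len(same)                       # intra-cluster distance
        b = min(sum(math.dist(p, q) for j, q in enumerate(points)
                    if labels[j] == c) / labels.count(c)
                for c in set(labels) if c != labels[i]) # nearest other cluster
        total += (b - a) / max(a, b)
    return total / len(points)

def cluster_voiceprints(vectors):
    """Merge the nearest pair of clusters (average linkage) at each
    iteration, score each pattern with the SWC, and return the labeling
    from the iteration with the highest confidence score."""
    clusters = [[i] for i in range(len(vectors))]
    best_score, best_labels = -2.0, None
    while len(clusters) > 1:
        a, b = min(((x, y) for x in range(len(clusters))
                    for y in range(x + 1, len(clusters))),
                   key=lambda xy: sum(math.dist(vectors[i], vectors[j])
                                      for i in clusters[xy[0]]
                                      for j in clusters[xy[1]]) /
                                  (len(clusters[xy[0]]) * len(clusters[xy[1]])))
        clusters[a] += clusters.pop(b)
        if len(clusters) < 2:
            break                                       # SWC needs >= 2 clusters
        labels = [0] * len(vectors)
        for c, members in enumerate(clusters):
            for i in members:
                labels[i] = c
        score = silhouette_width(vectors, labels)
        if score > best_score:
            best_score, best_labels = score, labels
    return best_labels
```
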
-
Patent number: 8892436
Abstract: A method of recognizing speech is provided. The method includes the operations of (a) dividing first speech that is input to a speech recognizing apparatus into frames; (b) converting the frames of the first speech into frames of second speech by applying conversion rules to the divided frames, respectively; and (c) recognizing, by the speech recognizing apparatus, the frames of the second speech, wherein (b) comprises converting the frames of the first speech into the frames of the second speech by reflecting at least one frame from among the frames that are previously positioned with respect to a frame of the first speech.
Type: Grant
Filed: October 19, 2011
Date of Patent: November 18, 2014
Assignees: Samsung Electronics Co., Ltd., Seoul National University Industry Foundation
Inventors: Ki-wan Eom, Chang-woo Han, Tae-gyoon Kang, Nam-soo Kim, Doo-hwa Hong, Jae-won Lee, Hyung-joon Lim
-
Patent number: 8892438
Abstract: An apparatus, a method, and a machine-readable medium are provided for characterizing differences between two language models. A group of utterances from each of a group of time domains are examined. One of a significant word change or a significant word class change within the plurality of utterances is determined. A first cluster of utterances including a word or a word class corresponding to the one of the significant word change or the significant word class change is generated from the utterances. A second cluster of utterances not including the word or the word class corresponding to the one of the significant word change or the significant word class change is generated from the utterances.
Type: Grant
Filed: September 14, 2010
Date of Patent: November 18, 2014
Assignee: AT&T Intellectual Property II, L.P.
Inventors: Allen Louis Gorin, John Grothendieck, Jeremy Huntley Greet Wright
-
Publication number: 20140337027
Abstract: A voice processing device includes: an acquirer which acquires feature quantities of vowel sections included in voice data; a classifier which classifies, among the acquired feature quantities, feature quantities corresponding to a plurality of same vowels into a plurality of clusters for respective vowels with unsupervised classification; and a determiner which determines a combination of clusters corresponding to the same speaker from clusters classified for the plurality of vowels.
Type: Application
Filed: April 11, 2014
Publication date: November 13, 2014
Applicant: CASIO COMPUTER CO., LTD.
Inventor: Hiroyasu IDE
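The abstract leaves the unsupervised classifier unspecified; k-means over the per-vowel feature vectors is one common choice and serves as a stand-in here. The deterministic initialization from the first k points is a simplification for the sketch:

```python
import math

def kmeans(points, k, iters=20):
    """Unsupervised classification of vowel feature vectors into k
    clusters (plain k-means; deterministic init for reproducibility)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # assign every feature vector to its nearest centroid
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return labels
```

A speaker can then be tracked by intersecting the cluster memberships obtained independently for each vowel, as the determiner in the claim does.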
-
Patent number: 8886535
Abstract: A method of optimizing the calculation of matching scores between phone states and acoustic frames across a matrix of an expected progression of phone states aligned with an observed progression of acoustic frames within an utterance is provided. The matrix has a plurality of cells, each associated with a characteristic acoustic frame and a characteristic phone state. A first set and second set of cells that meet a threshold probability of matching a first phone state or a second phone state, respectively, are determined. The phone states are stored on a local cache of a first core and a second core, respectively. The first and second sets of cells are also provided to the first core and second core, respectively. Further, matching scores of each characteristic state and characteristic observation of each cell of the first set of cells and of the second set of cells are calculated.
Type: Grant
Filed: January 23, 2014
Date of Patent: November 11, 2014
Assignee: Accumente, LLC
Inventors: Jike Chong, Ian Richard Lane, Senaka Wimal Buthpitiya
-
Patent number: 8880107
Abstract: In one embodiment, a method provides for monitoring and analyzing communications of a monitored user on behalf of a monitoring user, to determine whether the communication includes a violation. For example, SMS messages, MMS messages, IMs, e-mails, social network site postings or voice mails of a child may be monitored on behalf of a parent. In one embodiment, an algorithm is used to analyze a normalized version of the communication, which algorithm is retrained using results of past analysis, to determine a probability of a communication including a violation.
Type: Grant
Filed: January 28, 2011
Date of Patent: November 4, 2014
Assignee: Protext Mobility, Inc.
Inventors: Edward Movsesyan, Igor Slavinsky
-
Publication number: 20140316784
Abstract: Technology for improving the predictive accuracy of input word recognition on a device by dynamically updating the lexicon of recognized words based on the word choices made by similar users. The technology collects users' vocabulary choices (e.g., words that each user uses, or adds to or removes from a word recognition dictionary), associates users who make similar choices, aggregates related vocabulary choices, filters the words, and sends words identified as likely choices for that user to the user's device. Clusters may include, for example, users in a particular location (e.g., sets of people who use words such as "Puyallup," "Gloucester," or "Waiheke"), users with a particular professional or hobby vocabulary, or application-specific vocabulary (e.g., word choices in map searches or email messages).
Type: Application
Filed: April 24, 2013
Publication date: October 23, 2014
Inventors: Ethan R. Bradford, Simon Corston, David J. Kay, Donni McCray, Keith Trnka
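The aggregate-and-filter step can be sketched directly: count vocabulary choices across a cluster of similar users and keep words used by enough of them that the target user's dictionary lacks. The `min_users` threshold is an illustrative filter, not from the publication:

```python
from collections import Counter

def suggest_words(cluster_user_vocab, user_vocab, min_users=2):
    """Aggregate vocabulary choices across a cluster of similar users and
    return words chosen by at least min_users of them that the target
    user's word recognition dictionary does not yet contain."""
    counts = Counter(w for vocab in cluster_user_vocab for w in vocab)
    return sorted(w for w, c in counts.items()
                  if c >= min_users and w not in user_vocab)
```
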
-
Publication number: 20140303978
Abstract: A method and apparatus are provided for automatically acquiring grammar fragments for recognizing and understanding fluently spoken language. Grammar fragments representing a set of syntactically and semantically similar phrases may be generated using three probability distributions: of succeeding words, of preceding words, and of associated call-types. The similarity between phrases may be measured by applying Kullback-Leibler distance to these three probability distributions. Phrases being close in all three distances may be clustered into a grammar fragment.
Type: Application
Filed: March 4, 2014
Publication date: October 9, 2014
Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Kazuhiro Arai, Allen L. Gorin, Giuseppe Riccardi, Jeremy H. Wright
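A sketch of the Kullback-Leibler distance over discrete word distributions, symmetrized and summed across the three contexts named in the abstract. The additive smoothing constant is an assumption so that words unseen in one distribution do not give infinite distance:

```python
import math

def kl_distance(p, q, eps=1e-9):
    """D(p || q) between two discrete word distributions over the union
    vocabulary, smoothed (eps) so unseen words stay finite."""
    vocab = set(p) | set(q)
    return sum((p.get(w, 0.0) + eps) *
               math.log((p.get(w, 0.0) + eps) / (q.get(w, 0.0) + eps))
               for w in vocab)

def phrase_distance(contexts_a, contexts_b):
    """Distance between two phrases: symmetrized KL summed over their
    succeeding-word, preceding-word, and call-type distributions.
    Phrases close in all three are candidates for one grammar fragment."""
    return sum(kl_distance(a, b) + kl_distance(b, a)
               for a, b in zip(contexts_a, contexts_b))
```
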
-
Patent number: 8838448
Abstract: A method is described for use with automatic speech recognition using discriminative criteria for speaker adaptation. An adaptation evaluation is performed of speech recognition performance data for speech recognition system users. Adaptation candidate users are identified based on the adaptation evaluation for whom an adaptation process is likely to improve system performance.
Type: Grant
Filed: April 5, 2012
Date of Patent: September 16, 2014
Assignee: Nuance Communications, Inc.
Inventors: Dan Ning Jiang, Vaibhava Goel, Dimitri Kanevsky, Yong Qin