Clustering Patents (Class 704/245)
  • Patent number: 10789958
    Abstract: Methods, computer program products, and systems are presented. The methods include, for instance: obtaining a media file containing speech and identifying speakers in clusters separated by disfluencies and changes of speaker. Clusters are re-segmented and rearranged during diarization. Speaker identifications for the clusters in the media file are produced.
    Type: Grant
    Filed: September 30, 2019
    Date of Patent: September 29, 2020
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Aaron K. Baughman, Stephen C. Hammer
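    A minimal sketch of the clustering step this abstract points at (patent 10559311 below shares the same abstract), assuming per-segment speaker embeddings are already computed; the linkage method and distance threshold are illustrative choices, not the patented procedure.

```python
# Illustrative diarization-style clustering (not the patented method):
# group per-segment speaker embeddings by agglomerative clustering.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_segments(embeddings: np.ndarray, threshold: float = 0.7) -> np.ndarray:
    """Assign a speaker-cluster label to each speech segment.

    embeddings: (n_segments, dim) array, one embedding per segment;
    how the embeddings are computed is outside this sketch.
    """
    z = linkage(embeddings, method="average", metric="cosine")
    return fcluster(z, t=threshold, criterion="distance")

# Example: four segments from two clearly distinct "speakers".
segs = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]])
print(cluster_segments(segs))  # e.g. [1 1 2 2]
```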
  • Patent number: 10607111
    Abstract: Described is a system for classifying novel objects in imagery. In operation, the system extracts salient patches from a plurality of unannotated images using a multi-layer network. Activations of the multi-layer network are clustered into key attributes, with the key attributes being displayed to a user on a display, thereby prompting the user to annotate the key attributes with class labels. An attribute database is then generated based on the user-prompted annotations of the key attributes. A test image can then be passed through the system, allowing the system to classify at least one object in the test image by identifying an object class in the attribute database. Finally, a device can be caused to operate or maneuver based on the classification of the at least one object in the test image.
    Type: Grant
    Filed: February 4, 2019
    Date of Patent: March 31, 2020
    Assignee: HRL Laboratories, LLC
    Inventors: Soheil Kolouri, Charles E. Martin, Kyungnam Kim, Heiko Hoffmann
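    A hedged sketch of the attribute-discovery step above: cluster activations of salient patches into key attributes, each of which a user would then annotate with a class label. The use of k-means and the dimensions are assumptions; the abstract does not name a clustering algorithm.

```python
# Hypothetical sketch: cluster multi-layer-network activations of salient
# patches into "key attributes" for a user to annotate with class labels.
import numpy as np
from sklearn.cluster import KMeans

def find_key_attributes(patch_activations: np.ndarray, n_attributes: int = 5):
    """patch_activations: (n_patches, dim) activations from the network."""
    km = KMeans(n_clusters=n_attributes, n_init=10, random_state=0)
    labels = km.fit_predict(patch_activations)
    return labels, km.cluster_centers_   # centers act as attribute prototypes

rng = np.random.default_rng(0)
acts = rng.normal(size=(200, 64))        # stand-in for real activations
labels, prototypes = find_key_attributes(acts)
# Each cluster would be displayed for annotation; the labeled prototypes
# would then populate the attribute database used to classify test images.
```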
  • Patent number: 10592611
    Abstract: Embodiments of the present invention provide a system for automatically extracting conversational structure from a voice record based on lexical and acoustic features. The system also aggregates business-relevant statistics and entities from a collection of spoken conversations. The system may infer a coarse-level conversational structure based on fine-level activities identified from extracted acoustic features. The system improves significantly over previous systems by extracting structure based on lexical and acoustic features. This enables extracting conversational structure on a larger scale and finer level of detail than previous systems, and can feed an analytics and business intelligence platform, e.g. for customer service phone calls. During operation, the system obtains a voice record. The system then extracts a lexical feature using automatic speech recognition (ASR). The system extracts an acoustic feature.
    Type: Grant
    Filed: October 24, 2016
    Date of Patent: March 17, 2020
    Assignee: Conduent Business Services, LLC
    Inventors: Jesse Vig, Harish Arsikere, Margaret H. Szymanski, Luke R. Plurkowski, Kyle D. Dent, Daniel G. Bobrow, Daniel Davies, Eric Saund
  • Patent number: 10559303
    Abstract: The method comprises receiving first audio comprising speech from a user of a computing device, detecting an end of speech in the first audio, generating an ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech, determining whether a valid action can be performed by a speech-enabled application installed on the computing device using the ASR result, and processing second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the ASR result.
    Type: Grant
    Filed: May 23, 2016
    Date of Patent: February 11, 2020
    Assignee: Nuance Communications, Inc.
    Inventor: Mark Fanty
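    A control-flow sketch of the logic this abstract describes; the toy endpointer and the callables are hypothetical stand-ins, not Nuance's implementation.

```python
# Sketch: recognize speech up to a detected endpoint, act on the result if
# a valid action exists, otherwise keep listening and re-recognize.
def detect_end_of_speech(audio: list) -> int:
    """Toy endpointer: end of speech = first run of three silent samples."""
    for i in range(len(audio) - 2):
        if audio[i] == audio[i + 1] == audio[i + 2] == 0:
            return i
    return len(audio)

def handle_audio(audio, recognize, can_perform, get_more_audio):
    end = detect_end_of_speech(audio)
    result = recognize(audio[:end])        # ASR on speech before the endpoint
    if can_perform(result):                # a valid action exists: use it
        return result
    more = get_more_audio()                # otherwise process second audio
    return recognize(audio[:end] + more)   # and re-recognize with it

audio = [3, 5, 2, 0, 0, 0, 4]              # toy samples; zeros mark silence
print(handle_audio(audio,
                   recognize=lambda a: f"{len(a)} samples recognized",
                   can_perform=lambda r: True,
                   get_more_audio=lambda: []))
```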
  • Patent number: 10559311
    Abstract: Methods, computer program products, and systems are presented. The methods include, for instance: obtaining a media file containing speech and identifying speakers in clusters separated by disfluencies and changes of speaker. Clusters are re-segmented and rearranged during diarization. Speaker identifications for the clusters in the media file are produced.
    Type: Grant
    Filed: March 31, 2017
    Date of Patent: February 11, 2020
    Assignee: International Business Machines Corporation
    Inventors: Aaron K. Baughman, Stephen C. Hammer
  • Patent number: 10553206
    Abstract: According to one embodiment, a voice keyword detection apparatus includes a memory and a circuit coupled with the memory. The circuit calculates a first score for a first sub-keyword and a second score for a second sub-keyword. The circuit detects the first and second sub-keywords based on the first and second scores. The circuit determines, when the first sub-keyword is detected from one or more first frames, to accept the first sub-keyword. The circuit determines, when the second sub-keyword is detected from one or more second frames, whether to accept the second sub-keyword based on a start time and/or an end time of the one or more first frames and a start time and/or an end time of the one or more second frames.
    Type: Grant
    Filed: August 30, 2017
    Date of Patent: February 4, 2020
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Hiroshi Fujimura
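    A minimal sketch of the timing test described above: accept the second sub-keyword only if it starts soon after the first sub-keyword's frames end. The gap tolerance is an assumed parameter, not a value from the patent.

```python
# Accept the second sub-keyword based on the end time of the first
# sub-keyword's frames and the start time of the second's (assumed rule).
def accept_second_subkeyword(first_end: float, second_start: float,
                             max_gap: float = 0.5) -> bool:
    """Times are in seconds, taken from the detected frames."""
    return 0.0 <= second_start - first_end <= max_gap

# First sub-keyword ends at t=1.2 s, second starts at t=1.4 s -> accepted.
print(accept_second_subkeyword(1.2, 1.4))   # True
print(accept_second_subkeyword(1.2, 2.5))   # False: gap too large
```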
  • Patent number: 10535339
    Abstract: According to an embodiment, a speech recognition result output device includes a storage and processing circuitry. The storage is configured to store a language model for speech recognition. The processing circuitry is coupled to the storage and configured to acquire a phonetic sequence, convert the phonetic sequence into a phonetic sequence feature vector, convert the phonetic sequence feature vector into graphemes using the language model, and output the graphemes.
    Type: Grant
    Filed: June 15, 2016
    Date of Patent: January 14, 2020
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Hiroshi Fujimura
  • Patent number: 10496930
    Abstract: With reference to information that stores a co-occurrence probability for each of a plurality of words in association with each distribution destination, the apparatus extracts, from a message to be distributed, an unknown word that is not included in the plurality of words, where the co-occurrence probability indicates the probability that each word is included in a message to be distributed to each distribution destination.
    Type: Grant
    Filed: September 11, 2017
    Date of Patent: December 3, 2019
    Assignee: FUJITSU LIMITED
    Inventors: Yukihiro Watanabe, Ken Yokoyama, Masahiro Asaoka, Hiroshi Otsuka, Reiko Kondo
  • Patent number: 10459980
    Abstract: A display system for an issue comprises an input unit, a display unit, and a processing unit. The input unit receives an initial keyword corresponding to an issue. The display unit displays at least one derivative issue generated from the issue during a time period according to time-based characteristics. The processing unit, coupled to the input unit and the display unit, obtains tags of the subject contents of web pages and obtains a group of present keywords according to the co-occurrence correlation of the tags. The processing unit analyzes the correlation between the present keywords calculated based on social voice, analyzes the overlap rate of the present keywords compared with the initial keywords, and compares the correlation between the present keywords with the correlation between the initial keywords calculated based on social voice, in order to determine whether at least one derivative issue has been generated.
    Type: Grant
    Filed: April 20, 2016
    Date of Patent: October 29, 2019
    Assignee: Institute For Information Industry
    Inventors: Tai-Ta Kuo, Ping-I Chen
  • Patent number: 10402742
    Abstract: A method includes accessing a first sensor log and a corresponding first reference log. Each of the first sensor log and the first reference log includes a series of measured values of a parameter according to a first time series. The method also includes accessing a second sensor log and a corresponding second reference log. Each of the second sensor log and the second reference log includes a series of measured values of a parameter according to a second time series. The method also includes dynamically time warping the first reference log and/or the second reference log by a first transformation between the first time series and a common time-frame and/or a second transformation between the second time series and the common time-frame. The method also includes generating first and second warped sensor logs by applying each transformation to the corresponding one of the first and second sensor logs.
    Type: Grant
    Filed: December 12, 2017
    Date of Patent: September 3, 2019
    Assignee: Palantir Technologies Inc.
    Inventors: Ezra Spiro, Andre Frederico Cavalheiro Menck, Peter Maag, Thomas Powell
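    A compact sketch of dynamic time warping in its classic dynamic-programming form, the general technique the abstract names (this is not Palantir's implementation): it returns the index pairs that align one log's time series onto another's timeline.

```python
# Classic DTW: build a cumulative-cost table, then backtrack the alignment.
import numpy as np

def dtw_path(a: np.ndarray, b: np.ndarray):
    """Return the DTW alignment path between 1-D series a and b."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1],
                                 cost[i - 1, j - 1])
    path, i, j = [], n, m                 # backtrack from the end
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

ref = np.array([0.0, 1.0, 2.0, 3.0])
log = np.array([0.0, 0.0, 1.1, 2.0, 2.9])   # same signal, shifted in time
print(dtw_path(ref, log))  # index pairs mapping log samples onto ref's timeline
```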
  • Patent number: 10380997
    Abstract: Systems and methods are disclosed for generating internal state representations of a neural network during processing and using the internal state representations for classification or search. In some embodiments, the internal state representations are generated from the output activation functions of a subset of nodes of the neural network. The internal state representations may be used for classification by training a classification model using internal state representations and corresponding classifications. The internal state representations may be used for search by producing a search feature from a search input and comparing the search feature with one or more feature representations to find the feature representation with the highest degree of similarity.
    Type: Grant
    Filed: August 22, 2018
    Date of Patent: August 13, 2019
    Assignee: Deepgram, Inc.
    Inventors: Jeff Ward, Adam Sypniewski, Scott Stephenson
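    A toy sketch of the general idea (an assumption, not Deepgram's code): take the hidden-layer activations of a network as the internal state representation, then search an index of such representations by cosine similarity.

```python
# Use intermediate activations as features; search by cosine similarity.
import numpy as np

def internal_state(x: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Tiny 2-layer net; the hidden activation is the internal representation."""
    hidden = np.tanh(x @ w1)        # activations of a subset of nodes
    _ = np.tanh(hidden @ w2)        # final output, unused for search
    return hidden

def search(query_feat: np.ndarray, index: np.ndarray) -> int:
    """Return the index entry with highest cosine similarity to the query."""
    sims = index @ query_feat / (
        np.linalg.norm(index, axis=1) * np.linalg.norm(query_feat) + 1e-9)
    return int(np.argmax(sims))

rng = np.random.default_rng(1)
w1, w2 = rng.normal(size=(16, 8)), rng.normal(size=(8, 4))
index = np.stack([internal_state(rng.normal(size=16), w1, w2)
                  for _ in range(10)])
query = internal_state(rng.normal(size=16), w1, w2)
print(search(query, index))         # position of the most similar feature
```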
  • Patent number: 10325602
    Abstract: Systems, methods, devices, and other techniques for training and using a speaker verification neural network. A computing device may receive data that characterizes a first utterance. The computing device provides the data that characterizes the utterance to a speaker verification neural network. Subsequently, the computing device obtains, from the speaker verification neural network, a speaker representation that indicates speaking characteristics of a speaker of the first utterance. The computing device determines whether the first utterance is classified as an utterance of a registered user of the computing device. In response to determining that the first utterance is classified as an utterance of the registered user of the computing device, the device may perform an action for the registered user of the computing device.
    Type: Grant
    Filed: August 2, 2017
    Date of Patent: June 18, 2019
    Assignee: Google LLC
    Inventors: Hasim Sak, Ignacio Lopez Moreno, Alan Sean Papir, Li Wan, Quan Wang
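    A hedged sketch of the final verification step: compare the speaker representation produced by the network against the registered user's enrolled representation. Cosine similarity and the threshold are assumptions.

```python
# Classify an utterance as the registered user's by comparing speaker
# representations (the representations would come from the neural network).
import numpy as np

def is_registered_user(utt_repr: np.ndarray, enrolled_repr: np.ndarray,
                       threshold: float = 0.8) -> bool:
    """Cosine similarity against an assumed decision threshold."""
    cos = utt_repr @ enrolled_repr / (
        np.linalg.norm(utt_repr) * np.linalg.norm(enrolled_repr))
    return cos >= threshold

enrolled = np.array([0.9, 0.1, 0.4])
utterance = np.array([0.85, 0.15, 0.35])
print(is_registered_user(utterance, enrolled))  # True -> perform the action
```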
  • Patent number: 10269356
    Abstract: There is provided a system comprising a microphone, configured to receive an input speech from an individual, an analog-to-digital (A/D) converter to convert the input speech to digital form and generate a digitized speech, a memory storing an executable code and an age estimation database, a hardware processor executing the executable code to receive the digitized speech, identify a plurality of boundaries in the digitized speech delineating a plurality of phonemes in the digitized speech, extract a plurality of formant-based feature vectors from each phoneme in the digitized speech based on at least one of a formant position, a formant bandwidth, and a formant dispersion, compare the plurality of formant-based feature vectors with age determinant formant-based feature vectors of the age estimation database, determine the age of the individual when the comparison finds a match in the age estimation database, and communicate an age-appropriate response to the individual.
    Type: Grant
    Filed: August 22, 2016
    Date of Patent: April 23, 2019
    Assignee: Disney Enterprises, Inc.
    Inventors: Rita Singh, Jill Fain Lehman
  • Patent number: 10249294
    Abstract: A speech recognition method capable of automatic generation of phones according to the present invention includes: learning a feature vector of speech data in an unsupervised manner; generating a phone set by clustering acoustic features selected based on the unsupervised learning result; allocating a sequence of phones to the speech data on the basis of the generated phone set; and generating an acoustic model on the basis of the sequence of phones and the speech data to which the sequence of phones is allocated.
    Type: Grant
    Filed: July 11, 2017
    Date of Patent: April 2, 2019
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Dong Hyun Kim, Young Jik Lee, Sang Hun Kim, Seung Hi Kim, Min Kyu Lee, Mu Yeol Choi
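    An illustrative sketch of the phone-set step: cluster acoustic feature vectors and treat cluster identities as automatically generated phones. K-means and the cluster count are assumptions; the abstract does not name a specific clustering algorithm.

```python
# Generate a "phone set" by clustering acoustic features; cluster ids then
# serve as phone labels for allocating phone sequences to speech data.
import numpy as np
from sklearn.cluster import KMeans

def generate_phone_set(features: np.ndarray, n_phones: int = 40) -> KMeans:
    """features: (n_frames, dim) acoustic features learned unsupervisedly."""
    return KMeans(n_clusters=n_phones, n_init=10, random_state=0).fit(features)

rng = np.random.default_rng(2)
feats = rng.normal(size=(1000, 13))            # stand-in for learned features
phones = generate_phone_set(feats, n_phones=8)
sequence = phones.predict(feats[:20])          # phone sequence for an utterance
print(sequence)
```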
  • Patent number: 10199036
    Abstract: A network device for implementing voice input comprises an input-obtaining module for obtaining voice input information, a sequence-determining module for determining an input character sequence corresponding to the voice input information based on a voice recognition model, an accuracy-determining module for determining appearance-probability information corresponding to word segments in the input character sequence so as to obtain accuracy information of the word segments, and a transmitting module for transmitting, to a user device, the input character sequence and the accuracy information of the word segments corresponding to the voice input information.
    Type: Grant
    Filed: December 17, 2013
    Date of Patent: February 5, 2019
    Assignee: Baidu Online Network Technology (Beijing) Co., LTD.
    Inventors: Yangyang Lu, Lei Jia
  • Patent number: 10147438
    Abstract: Embodiments of the invention include methods, systems, and computer program products for role modeling. Aspects of the invention include receiving, by a processor, audio data, wherein the audio data includes a plurality of audio conversations for one or more speakers. One or more segments of each of the plurality of audio conversations are partitioned. A speaker is associated with each of the one or more segments. The one or more segments of each of the plurality of audio conversations are labeled with roles utilizing a speaker recognition engine. Speakers are clustered based at least in part on the number of times the speakers are present in an audio conversation.
    Type: Grant
    Filed: March 2, 2017
    Date of Patent: December 4, 2018
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Kenneth W. Church, Jason W. Pelecanos, Josef Vopicka, Weizhong Zhu
  • Patent number: 10141009
    Abstract: Methods, systems, and apparatuses for audio event detection, where the determination of a type of sound data is made at the cluster level rather than at the frame level. The techniques provided are thus more robust to the local behavior of features of an audio signal or audio recording. The audio event detection is performed by using Gaussian mixture models (GMMs) to classify each cluster or by extracting an i-vector from each cluster. Each cluster may be classified based on an i-vector classification using a support vector machine or probabilistic linear discriminant analysis. The audio event detection significantly reduces potential smoothing error and avoids any dependency on accurate window-size tuning. Segmentation may be performed using a generalized likelihood ratio and a Bayesian information criterion, and the segments may be clustered using hierarchical agglomerative clustering. Audio frames may be clustered using K-means and GMMs.
    Type: Grant
    Filed: May 31, 2017
    Date of Patent: November 27, 2018
    Assignee: Pindrop Security, Inc.
    Inventors: Elie Khoury, Matthew Garland
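    The abstract names hierarchical agglomerative clustering for grouping segments; below is a minimal sketch of that stage alone, with the GLR/BIC segmentation and the GMM/i-vector classification omitted.

```python
# Hierarchical agglomerative clustering of per-segment feature vectors.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_audio_segments(segment_feats: np.ndarray,
                           n_clusters: int = 3) -> np.ndarray:
    z = linkage(segment_feats, method="ward")     # agglomerative merging
    return fcluster(z, t=n_clusters, criterion="maxclust")

rng = np.random.default_rng(3)
# Three synthetic groups of segment features around different means.
feats = np.vstack([rng.normal(loc=c, size=(10, 5)) for c in (0.0, 3.0, 6.0)])
print(cluster_audio_segments(feats))
# Each resulting cluster would then be classified (GMM or i-vector based).
```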
  • Patent number: 10121466
    Abstract: Speech recognition systems that use voice templates may create (or update) voice templates for a particular user by training (or re-training). If training results in a vocabulary with similar voice templates, then the speech recognition system's performance may suffer. The present invention embraces methods for training a speech recognition system that prevent voice-template similarity. In these methods, a trained word's voice template may be evaluated for similarity to other vocabulary templates prior to enrolling the voice template into the vocabulary. If template similarity is found, then a user may be prompted to retrain the system using an alternate word. Alternatively, the user may be prompted to retrain the system with the word spoken more clearly. This dynamic enrollment training analysis ensures that all templates in the vocabulary are distinct.
    Type: Grant
    Filed: February 11, 2015
    Date of Patent: November 6, 2018
    Assignee: Hand Held Products, Inc.
    Inventor: John Pecorari
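    A minimal sketch of the dynamic enrollment check described above, with the distance measure and threshold as assumptions: a new voice template is enrolled only if it is sufficiently distinct from every template already in the vocabulary.

```python
# Reject a new voice template that is too close to any stored template.
import numpy as np

def can_enroll(new_template: np.ndarray, vocabulary: list,
               min_distance: float = 1.0) -> bool:
    """Enroll only if the new template is distinct from every stored one."""
    return all(np.linalg.norm(new_template - t) >= min_distance
               for t in vocabulary)

vocab = [np.array([0.0, 1.0]), np.array([2.0, 2.0])]
print(can_enroll(np.array([0.1, 1.1]), vocab))  # False -> prompt to retrain
print(can_enroll(np.array([5.0, 5.0]), vocab))  # True  -> add to vocabulary
```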
  • Patent number: 10109280
    Abstract: In a method of diarization of audio data, audio data is segmented into a plurality of utterances. Each utterance is represented as an utterance model representative of a plurality of feature vectors. The utterance models are clustered. A plurality of speaker models are constructed from the clustered utterance models. A hidden Markov model is constructed of the plurality of speaker models. A sequence of identified speaker models is decoded.
    Type: Grant
    Filed: December 12, 2017
    Date of Patent: October 23, 2018
    Assignee: VERINT SYSTEMS LTD.
    Inventors: Oana Sidi, Ron Wein
  • Patent number: 9947314
    Abstract: Software that trains an artificial neural network for generating vector representations for natural language text, by performing the following steps: (i) receiving, by one or more processors, a set of natural language text; (ii) generating, by one or more processors, a set of first metadata for the set of natural language text, where the first metadata is generated using supervised learning method(s); (iii) generating, by one or more processors, a set of second metadata for the set of natural language text, where the second metadata is generated using unsupervised learning method(s); and (iv) training, by one or more processors, an artificial neural network adapted to generate vector representations for natural language text, where the training is based, at least in part, on the received natural language text, the generated set of first metadata, and the generated set of second metadata.
    Type: Grant
    Filed: February 21, 2017
    Date of Patent: April 17, 2018
    Assignee: International Business Machines Corporation
    Inventors: Liangliang Cao, James J. Fan, Chang Wang, Bing Xiang, Bowen Zhou
  • Patent number: 9875743
    Abstract: Disclosed herein are methods of diarizing audio data using first-pass blind diarization and second-pass blind diarization that generate speaker statistical models, wherein the first-pass blind diarization is on a per-frame basis and the second-pass blind diarization is on a per-word basis, and methods of creating acoustic signatures for a common speaker based only on the statistical models of the speakers in each audio session.
    Type: Grant
    Filed: January 26, 2016
    Date of Patent: January 23, 2018
    Assignee: VERINT SYSTEMS LTD.
    Inventors: Alex Gorodetski, Ido Shapira, Ron Wein, Oana Sidi
  • Patent number: 9860669
    Abstract: An audio apparatus includes a receiver configured to receive audio data and audio transducer position data for a plurality of audio transducers; and a renderer configured to render the audio data by generating audio transducer drive signals for the audio transducers from the audio data. Further, a clusterer is configured to cluster the audio transducers into a set of clusters in response to the audio transducer position data and to distances between audio transducers in accordance with a distance metric. A render controller is configured to adapt the rendering in response to the clustering. The apparatus is configured to select array processing techniques for specific subsets that contain audio transducers that are sufficiently close, thereby allowing automatic adaptation to audio transducer configurations and, e.g., giving a user increased flexibility in positioning loudspeakers.
    Type: Grant
    Filed: May 6, 2014
    Date of Patent: January 2, 2018
    Assignee: KONINKLIJKE PHILIPS N.V.
    Inventors: Werner Paulus Josephus De Bruijn, Arnoldus Werner Johannes Oomen, Aki Sakari Haermae
  • Patent number: 9837068
    Abstract: A method for verifying at least one sound sample to be used in generating a sound detection model in an electronic device includes receiving a first sound sample; extracting a first acoustic feature from the first sound sample; receiving a second sound sample; extracting a second acoustic feature from the second sound sample; and determining whether the second acoustic feature is similar to the first acoustic feature.
    Type: Grant
    Filed: April 8, 2015
    Date of Patent: December 5, 2017
    Assignee: QUALCOMM Incorporated
    Inventors: Sunkuk Moon, Minho Jin, Haiying Xia, Hesu Huang, Warren Frederick Dale
  • Patent number: 9830931
    Abstract: One embodiment of the present invention sets forth a technique for determining a set of sound parameters associated with a sound type. The technique includes receiving, via a network and from each of a first plurality of remote computing devices, an audio recording of a first sound type and a descriptor associated with the first sound type. The technique further includes processing the audio recordings via a processor to determine a first set of sound parameters associated with the first sound type. The technique further includes receiving a request associated with the descriptor from at least one remote computing device and, in response, transmitting the first set of sound parameters associated with the first sound type to the at least one remote computing device.
    Type: Grant
    Filed: December 31, 2015
    Date of Patent: November 28, 2017
    Assignee: HARMAN INTERNATIONAL INDUSTRIES, INCORPORATED
    Inventors: Donald Joseph Butts, Brandon Stacey
  • Patent number: 9714884
    Abstract: A statistical basic classification model of acoustic features generated for at least one reference object is automatically adapted by a data processing unit based on acoustic features of a noise generated by an object to be investigated to obtain an individually adapted statistical classification model. The data processing unit then classifies the state of the noise-generating object based on the individually adapted statistical classification model.
    Type: Grant
    Filed: April 29, 2009
    Date of Patent: July 25, 2017
    Assignee: SIEMENS AKTIENGESELLSCHAFT
    Inventors: Joachim Hofer, Lutz Leutelt
  • Patent number: 9691391
    Abstract: Systems and methods to perform speaker clustering determine which audio segments appear to include sound generated by the same speaker. Speaker clustering is based on creating a graph in which a node represents an audio segment and an edge between two nodes represents a relationship and/or correspondence that reflects a probability, likelihood, or other indication that the two nodes represent audio segments of the same speaker. This graph is analyzed to detect individual communities of nodes that associate to an individual speaker.
    Type: Grant
    Filed: May 21, 2015
    Date of Patent: June 27, 2017
    Assignee: KnuEdge Incorporated
    Inventor: Rodney Gateau
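    A short sketch of the graph formulation described above: nodes are audio segments, edge weights reflect the likelihood that two segments share a speaker, and detected communities correspond to speakers. The modularity-based community algorithm is an assumption; the abstract does not name one.

```python
# Speaker clustering as community detection on a segment-similarity graph.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

g = nx.Graph()
# Edge weight ~ likelihood the two segments share a speaker (illustrative).
g.add_weighted_edges_from([("seg1", "seg2", 0.90), ("seg2", "seg3", 0.80),
                           ("seg4", "seg5", 0.95), ("seg3", "seg4", 0.10)])
communities = greedy_modularity_communities(g, weight="weight")
print([sorted(c) for c in communities])  # one community per inferred speaker
```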
  • Patent number: 9672814
    Abstract: Software that trains an artificial neural network for generating vector representations for natural language text, by performing the following steps: (i) receiving, by one or more processors, a set of natural language text; (ii) generating, by one or more processors, a set of first metadata for the set of natural language text, where the first metadata is generated using supervised learning method(s); (iii) generating, by one or more processors, a set of second metadata for the set of natural language text, where the second metadata is generated using unsupervised learning method(s); and (iv) training, by one or more processors, an artificial neural network adapted to generate vector representations for natural language text, where the training is based, at least in part, on the received natural language text, the generated set of first metadata, and the generated set of second metadata.
    Type: Grant
    Filed: May 8, 2015
    Date of Patent: June 6, 2017
    Assignee: International Business Machines Corporation
    Inventors: Liangliang Cao, James J. Fan, Chang Wang, Bing Xiang, Bowen Zhou
  • Patent number: 9666192
    Abstract: Methods and apparatus for reducing latency in speech recognition applications. The method comprises receiving first audio comprising speech from a user of a computing device, detecting an end of speech in the first audio, generating an ASR result based, at least in part, on a portion of the first audio prior to the detected end of speech, determining whether a valid action can be performed by a speech-enabled application installed on the computing device using the ASR result, and processing second audio when it is determined that a valid action cannot be performed by the speech-enabled application using the ASR result.
    Type: Grant
    Filed: May 26, 2015
    Date of Patent: May 30, 2017
    Assignee: Nuance Communications, Inc.
    Inventor: Mark Fanty
  • Patent number: 9659560
    Abstract: Software that trains an artificial neural network for generating vector representations for natural language text, by performing the following steps: (i) receiving, by one or more processors, a set of natural language text; (ii) generating, by one or more processors, a set of first metadata for the set of natural language text, where the first metadata is generated using supervised learning method(s); (iii) generating, by one or more processors, a set of second metadata for the set of natural language text, where the second metadata is generated using unsupervised learning method(s); and (iv) training, by one or more processors, an artificial neural network adapted to generate vector representations for natural language text, where the training is based, at least in part, on the received natural language text, the generated set of first metadata, and the generated set of second metadata.
    Type: Grant
    Filed: September 30, 2015
    Date of Patent: May 23, 2017
    Assignee: International Business Machines Corporation
    Inventors: Liangliang Cao, James J. Fan, Chang Wang, Bing Xiang, Bowen Zhou
  • Patent number: 9641968
    Abstract: A system for sharing moment experiences is described. A system receives moment data from an input to a mobile device. The system receives geographic location information, time information, and contextual information that is local to the mobile device. The system creates a message about the moment data based on the geographic location information, the time information, and the contextual information. The system outputs the moment data with the message.
    Type: Grant
    Filed: May 15, 2015
    Date of Patent: May 2, 2017
    Assignee: Krumbs, Inc.
    Inventors: Neilesh Jain, Ramesh Jain, Pinaki Sinha
  • Patent number: 9620148
    Abstract: Systems, vehicles, and methods for limiting speech-based access to an audio metadata database are described herein. Audio metadata databases described herein include a plurality of audio metadata entries. Each audio metadata entry includes metadata information associated with at least one audio file. Embodiments described herein determine when a size of the audio metadata database reaches a threshold size, and limit which of the plurality of audio metadata entries may be accessed in response to the speech input signal when the size of the audio metadata database reaches the threshold size.
    Type: Grant
    Filed: July 1, 2013
    Date of Patent: April 11, 2017
    Assignee: Toyota Motor Engineering & Manufacturing North America, Inc.
    Inventor: Eric Randell Schmidt
  • Patent number: 9595260
    Abstract: A modeling device comprises a front end which receives enrollment speech data from each target speaker, a reference anchor set generation unit which generates a reference anchor set using the enrollment speech data based on an anchor space, and a voice print generation unit which generates voice prints based on the reference anchor set and the enrollment speech data. By taking the enrollment speech and a speaker adaptation technique into account, anchor models of a smaller size can be generated, so reliable and robust speaker recognition is possible with a smaller reference anchor set.
    Type: Grant
    Filed: December 10, 2010
    Date of Patent: March 14, 2017
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Haifeng Shen, Long Ma, Bingqi Zhang
  • Patent number: 9576582
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: February 23, 2016
    Date of Patent: February 21, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
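    A toy sketch of the model-restructuring idea shared by this abstract and its continuations below (patents 9305547 and 9026442): each dictionary phoneme's acoustic model becomes a weighted sum of the native-speech phoneme models. The vector representation of a model and the weights here are illustrative stand-ins, not AT&T's formulation.

```python
# Build a speaker-adapted phoneme model as a convex combination of the
# native-speech phoneme models, weighted by lattice plausibility counts.
import numpy as np

def custom_phoneme_model(native_models: np.ndarray,
                         weights: np.ndarray) -> np.ndarray:
    """native_models: (n_phonemes, dim) mean vectors of native acoustic models.
    weights: how often each native phoneme was plausible for this speaker."""
    w = weights / weights.sum()
    return w @ native_models          # weighted sum of native models

native = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])  # toy /a/, /e/, /o/
w = np.array([0.7, 0.2, 0.1])         # lattice counts for the speaker's "/a/"
print(custom_phoneme_model(native, w))  # speaker-adapted model for "/a/"
```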
  • Patent number: 9524291
    Abstract: Techniques involving visual display of information related to matching user utterances against graph patterns are described. In one or more implementations, an utterance of a user is obtained that has been indicated as corresponding to a graph pattern through linguistic analysis. The utterance is displayed in a user interface as a representation of the graph pattern.
    Type: Grant
    Filed: October 6, 2010
    Date of Patent: December 20, 2016
    Assignee: Virtuoz SA
    Inventors: Dan Teodosiu, Elizabeth Ireland Powers, Pierre Serge Vincent LeRoy, Sebastien Jean-Marie Christian Saunier
  • Patent number: 9514391
    Abstract: In an image classification method, a feature vector representing an input image is generated by unsupervised operations including extracting local descriptors from patches distributed over the input image, and a classification value for the input image is generated by applying a neural network (NN) to the feature vector. Extracting the feature vector may include encoding the local descriptors extracted from each patch using a generative model, such as Fisher vector encoding, aggregating the encoded local descriptors to form a vector, projecting the vector into a space of lower dimensionality, for example using Principal Component Analysis (PCA), and normalizing the feature vector of lower dimensionality to produce the feature vector representing the input image. A set of mid-level features representing the input image may be generated as the output of an intermediate layer of the NN.
    Type: Grant
    Filed: April 20, 2015
    Date of Patent: December 6, 2016
    Assignee: XEROX CORPORATION
    Inventors: Florent C. Perronnin, Diane Larlus-Larrondo
  • Patent number: 9449051
    Abstract: According to one embodiment, a topic extracting apparatus extracts each term from a target document set and calculates, for each term, an appearance frequency and a document frequency (the number of documents in which the term appears). The topic extracting apparatus acquires a document set of appearance documents with respect to each extracted term, calculates a topic degree, extracts each term whose topic degree is not lower than a predetermined value as a topic word, and calculates the freshness of each extracted topic word based on an appearance date and time. The topic extracting apparatus presents the extracted topic words in order of freshness and also presents the number of appearance documents of each presented topic word per unit span.
    Type: Grant
    Filed: September 10, 2013
    Date of Patent: September 20, 2016
    Assignees: KABUSHIKI KAISHA TOSHIBA, TOSHIBA SOLUTIONS CORPORATION
    Inventors: Hideki Iwasaki, Kazuyuki Goto, Shigeru Matsumoto, Yasunari Miyabe, Mikito Kobayashi
  • Patent number: 9411829
    Abstract: Disclosed herein is a system and method that facilitate searching and/or browsing of images by clustering, or grouping, the images into a set of image clusters using facets, such as without limitation visual properties or visual characteristics, of the images, and representing each image cluster by a representative image selected for the image cluster. A map-reduce based probabilistic topic model may be used to identify one or more images belonging to each image cluster and update model parameters.
    Type: Grant
    Filed: June 10, 2013
    Date of Patent: August 9, 2016
    Assignee: Yahoo! Inc.
    Inventors: Jia Li, Nadav Golbandi, XianXing Zhang
  • Patent number: 9412381
    Abstract: A method and system for triple-factor authentication in one step are disclosed. According to one embodiment, an Integrated Voice Biometrics Cloud Security Gateway (IVCS Gateway) intercepts an access request to a resource server from a user using a user device. The IVCS Gateway then authenticates the user by placing a call to the user device and sending a challenge message prompting the user to respond by voice. After the voice sample of the user is received, it is compared against a stored voice biometrics record for the user. The voice sample is also converted into a text phrase and compared against a stored secret text phrase. In an alternative embodiment, an IVCS Gateway that is capable of making non-binary access decisions and associating multiple levels of access with a single user or group is described.
    Type: Grant
    Filed: March 30, 2011
    Date of Patent: August 9, 2016
    Assignee: ACK3 BIONETICS PRIVATE LTD.
    Inventor: Sajit Bhaskaran
  • Patent number: 9378742
    Abstract: Disclosed are an apparatus and method for recognizing voice using multiple acoustic models according to the present invention. An apparatus for recognizing voice using multiple acoustic models includes a voice data database (DB) configured to store voice data collected in various noise environments; a model generating means configured to perform classification for each speaker and environment based on the collected voice data, and to generate an acoustic model of a binary tree structure as the classification result; and a voice recognizing means configured to extract feature data of voice data when the voice data is received from a user, to select multiple models from the generated acoustic model based on the extracted feature data, to recognize the voice data in parallel based on the selected multiple models, and to output a word string corresponding to the voice data as the recognition result.
    Type: Grant
    Filed: March 18, 2013
    Date of Patent: June 28, 2016
    Assignee: Electronics and Telecommunications Research Institute
    Inventor: Dong Hyun Kim
  • Patent number: 9373338
    Abstract: An automatic speech recognition engine receives an acoustic-echo processed signal from an acoustic-echo processing (AEP) module, where said echo processed signal contains mainly the speech from the near-end talker. The automatic speech recognition engine analyzes the content of the acoustic-echo processed signal to determine whether words or keywords are present. Based upon the results of this analysis, the automatic speech recognition engine produces a value reflecting the likelihood that some words or keywords are detected. Said value is provided to the AEP module. Based upon the value, the AEP module determines if there is double talk and processes the incoming signals accordingly to enhance its performance.
    Type: Grant
    Filed: June 25, 2012
    Date of Patent: June 21, 2016
    Assignee: Amazon Technologies, Inc.
    Inventors: Ramya Gopalan, Kavitha Velusamy, Wai C. Chu, Amit S. Chhetri
  • Patent number: 9336774
    Abstract: Methods, systems, and apparatus for pattern recognition. One aspect includes a pattern recognizing engine that includes multiple pattern recognizer processors that form a hierarchy of pattern recognizer processors. The pattern recognizer processors include a child pattern recognizer processor at a lower level in the hierarchy and a parent pattern recognizer processor at a higher level of the hierarchy, where the child pattern recognizer processor is configured to provide a first complex recognition output signal to a pattern recognizer processor at a higher level than the child pattern recognizer processor, and the parent pattern recognizer processor is configured to receive as an input a second complex recognition output signal from a pattern recognizer processor at a lower level than the parent pattern recognizer processor.
    Type: Grant
    Filed: April 22, 2013
    Date of Patent: May 10, 2016
    Assignee: Google Inc.
    Inventor: Raymond C. Kurzweil
  • Patent number: 9305553
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a computer-based method includes receiving a speech corpus at a speech management server system that includes multiple speech recognition engines tuned to different speaker types; using the speech recognition engines to associate the received speech corpus with a selected one of multiple different speaker types; and sending a speaker category identification code that corresponds to the associated speaker type from the speech management server system over a network. The speaker category identification code can be used by any of the speech-interactive applications coupled to the network to select an appropriate one of multiple application-accessible speech recognition engines tuned to the different speaker types in response to an indication that a user accessing the application is associated with a particular one of the speaker category identification codes.
    Type: Grant
    Filed: April 28, 2011
    Date of Patent: April 5, 2016
    Inventor: William S. Meisel
  • Patent number: 9305547
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: April 28, 2015
    Date of Patent: April 5, 2016
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 9275044
    Abstract: A method and system are provided for finding synonyms which are more contextually relevant to the intended use of a particular word. The system finds a list of synonyms for the input word and also finds a list of synonyms for an additional word entered by the user to approximate the intended usage of the input word. These two lists of synonyms are compared to find words common to both lists, and the common words are presented to the user as potential synonyms which are appropriate for the intended use.
    Type: Grant
    Filed: March 6, 2013
    Date of Patent: March 1, 2016
    Assignee: SearchLeaf, LLC
    Inventors: Thomas Lund, Bryce Lund
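    The comparison step in the abstract above reduces to a set intersection; a direct sketch follows (the synonym lookup itself is outside the sketch, and the example words are illustrative).

```python
# Intersect the synonym lists of the input word and of a second word the
# user supplies to approximate the intended usage.
def contextual_synonyms(input_syns: set, context_syns: set) -> set:
    """Words common to both lists are the contextually relevant synonyms."""
    return input_syns & context_syns

# "bright" intended as in "bright student"; the user supplies "smart".
print(contextual_synonyms({"luminous", "clever", "vivid", "intelligent"},
                          {"clever", "intelligent", "sharp", "quick"}))
# -> {'clever', 'intelligent'} (set order may vary)
```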
  • Patent number: 9262694
    Abstract: Provided is a technology which enables further improvement of the accuracy of determination in pattern matching processing. A dictionary learning device 1 includes a score calculation unit 2 and a learning unit 3. The score calculation unit 2 calculates a matching score representing the degree of similarity between a sample pattern, which is a sample of a pattern likely to be subjected to pattern matching processing, and a degradation pattern resulting from a degrading processing applied to the sample pattern. The learning unit 3 learns a quality dictionary based on the calculated matching score and the degradation pattern. The quality dictionary is used in a processing that evaluates the degradation degree (quality) of a matching-target pattern, that is, the pattern of an object on which the pattern matching processing is carried out.
    Type: Grant
    Filed: December 12, 2012
    Date of Patent: February 16, 2016
    Assignee: NEC Corporation
    Inventor: Masato Ishii
  • Patent number: 9122931
    Abstract: An object identification method is provided. The method includes dividing an input video into a number of video shots, each containing one or more video frames. The method also includes detecting target-class object occurrences and related-class object occurrences in each video shot. Further, the method includes generating hint information including a small subset of frames representing the input video and performing object tracking and recognition based on the hint information. The method also includes fusing tracking and recognition results and outputting labeled objects based on the combined tracking and recognition results.
    Type: Grant
    Filed: October 25, 2013
    Date of Patent: September 1, 2015
    Assignee: TCL RESEARCH AMERICA INC.
    Inventors: Liang Peng, Haohong Wang
  • Patent number: 9098576
    Abstract: Systems and methods for audio matching are disclosed herein. In one embodiment, a system includes both interest point mixing and fingerprint mixing by using multiple interest point detection methods in parallel. Since multiple interest point detection methods are used in parallel, accuracy of audio matching is improved across a wide variety of audio signals. In addition the scalability of the disclosed audio matching system is increased by matching the fingerprint of an audio sample with a fingerprint of a reference sample versus matching an entire spectrogram. Accordingly, a more accurate and more general solution to audio matching can be accomplished.
    Type: Grant
    Filed: October 17, 2011
    Date of Patent: August 4, 2015
    Assignee: Google Inc.
    Inventors: Matthew Sharifi, Gheorghe Postelnicu, George Tzanetakis, Dominik Roblek
  • Patent number: 9053579
    Abstract: A system and method generate a graph lattice from exemplary images. At least one processor receives exemplary data graphs of the exemplary images and generates graph lattice nodes of size one from primitives. Until a termination condition is met, the at least one processor repeatedly: 1) generates candidate graph lattice nodes from accepted graph lattice nodes; 2) selects one or more candidate graph lattice nodes preferentially discriminating exemplary data graphs which are less discriminable than other exemplary data graphs using the accepted graph lattice nodes; and 3) promotes the selected graph lattice nodes to accepted status. The graph lattice is formed from the accepted graph lattice nodes and relations between the accepted graph lattice nodes.
    Type: Grant
    Filed: June 19, 2012
    Date of Patent: June 9, 2015
    Assignee: Palo Alto Research Center Incorporated
    Inventor: Eric Saund
  • Patent number: 9047286
    Abstract: Content from multiple different stations can be divided into segments based on time. Matched segments associated with each station can be identified by comparing content included in a first segment associated with a first station, to content included in a second segment associated with a second station. Syndicated content can be identified and tagged based, at least in part, on a relationship between sequences of matched segments on different stations. Various embodiments also include identifying main sequences associated with each station under consideration, removing some of the main sequences, and consolidating remaining main sequences based on various threshold criteria.
    Type: Grant
    Filed: December 17, 2009
    Date of Patent: June 2, 2015
    Assignee: iHeartMedia Management Services, Inc.
    Inventors: Periklis Beltas, Philippe Generali, David C. Jellison, Jr.
  • Patent number: 9026442
    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
    Type: Grant
    Filed: August 14, 2014
    Date of Patent: May 5, 2015
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal