Machine Learning Patents (Class 706/12)

Adding prototype information into probabilistic models

Patent number: 8010341

Abstract: Mechanisms are disclosed for incorporating prototype information into probabilistic models for automated information processing, mining, and knowledge discovery. Examples of these models include Hidden Markov Models (HMMs), Latent Dirichlet Allocation (LDA) models, and the like. The prototype information injects prior knowledge to such models, thereby rendering them more accurate, effective, and efficient. For instance, in the context of automated word labeling, additional knowledge is encoded into the models by providing a small set of prototypical words for each possible label. The net result is that words in a given corpus are labeled and are therefore in condition to be summarized, identified, classified, clustered, and the like.

Type: Grant

Filed: September 13, 2007

Date of Patent: August 30, 2011

Assignee: Microsoft Corporation

Inventors: Kannan Achan, Moises Goldszmidt, Lev Ratinov
Method, apparatus, and system for clustering and classification

Patent number: 8010466

Abstract: The invention provides a method, apparatus and system for classifying and clustering electronic data streams such as email, images and sound files for identification, sorting and efficient storage. The method further utilizes learning machines in combination with hashing schemes to cluster and classify documents. In one embodiment, hash apparatuses and methods taxonomize clusters. In yet another embodiment, clusters of documents utilize a geometric hash to contain the documents in a data corpus without the overhead of search and storage.

Type: Grant

Filed: June 10, 2009

Date of Patent: August 30, 2011

Assignee: TW Vericept Corporation

Inventor: Seth Patinkin
Correlating data indicating subjective user states associated with multiple users with data indicating objective occurrences

Patent number: 8010663

Abstract: A computationally implemented method includes, but is not limited to: acquiring subjective user state data including data indicating incidence of at least a first subjective user state associated with a first user and data indicating incidence of at least a second subjective user state associated with a second user; acquiring objective occurrence data including data indicating incidence of at least a first objective occurrence and data indicating incidence of at least a second objective occurrence; and correlating the subjective user state data with the objective occurrence data. In addition to the foregoing, other method aspects are described in the claims, drawings, and text forming a part of the present disclosure.

Type: Grant

Filed: March 25, 2009

Date of Patent: August 30, 2011

Assignee: The Invention Science Fund I, LLC

Inventors: Shawn P. Firminger, Jason Garms, Edward K. Y. Jung, Chris D. Karkanias, Eric C. Leuthardt, Royce A. Levien, Robert W. Lord, Mark A. Malamud, John D. Rinaldo, Jr., Clarence T. Tegreene, Kristin M. Tolle, Lowell L. Wood, Jr.
Soliciting data indicating at least one subjective user state in response to acquisition of data indicating at least one objective occurrence

Patent number: 8010662

Abstract: A computationally implemented method includes, but is not limited to: acquiring objective occurrence data including data indicating occurrence of at least one objective occurrence; soliciting, in response to the acquisition of the objective occurrence data, subjective user state data including data indicating occurrence of at least one subjective user state associated with a user; acquiring the subjective user state data; and correlating the subjective user state data with the objective occurrence data. In addition to the foregoing, other method aspects are described in the claims, drawings, and text forming a part of the present disclosure.

Type: Grant

Filed: February 25, 2009

Date of Patent: August 30, 2011

Assignee: The Invention Science Fund I, LLC

Inventors: Shawn P. Firminger, Jason Garms, Edward K. Y. Jung, Chris D. Karkanias, Eric C. Leuthardt, Royce A. Levien, Robert W. Lord, Mark A. Malamud, John D. Rinaldo, Jr., Clarence T. Tegreene, Kristin M. Tolle, Lowell L. Wood, Jr.
Combining active and semi-supervised learning for spoken language understanding

Patent number: 8010357

Abstract: Combined active and semi-supervised learning to reduce an amount of manual labeling when training a spoken language understanding model classifier. The classifier may be trained with human-labeled utterance data. Ones of a group of unselected utterance data may be selected for manual labeling via active learning. The classifier may be changed, via semi-supervised learning, based on the selected ones of the unselected utterance data.

Type: Grant

Filed: January 12, 2005

Date of Patent: August 30, 2011

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Dilek Z. Hakkani-Tur, Robert Elias Schapire, Gokhan Tur
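The selection step in the abstract above (patent 8010357) can be illustrated with a minimal sketch. It is not the patented method: it assumes the classifier reports a confidence score per utterance, and the function name `split_by_confidence` and the thresholds `low` and `high` are hypothetical. Low-confidence utterances are routed to manual labeling (active learning), while high-confidence ones keep the classifier's own label (semi-supervised learning).

```python
def split_by_confidence(predictions, low=0.4, high=0.9):
    """Partition classifier predictions by confidence.

    predictions: list of (utterance, predicted_label, confidence) tuples.
    Returns (to_label_manually, machine_labeled, undecided).
    The thresholds are illustrative assumptions, not values from the patent.
    """
    to_label_manually = []   # active learning: ask a human
    machine_labeled = []     # semi-supervised: keep the classifier's label
    undecided = []           # left unlabeled in this round
    for utterance, label, confidence in predictions:
        if confidence < low:
            to_label_manually.append(utterance)
        elif confidence >= high:
            machine_labeled.append((utterance, label))
        else:
            undecided.append(utterance)
    return to_label_manually, machine_labeled, undecided


predictions = [
    ("check my balance", "balance", 0.97),
    ("um the thing from before", "billing", 0.22),
    ("talk to someone", "agent", 0.65),
]
manual, auto, rest = split_by_confidence(predictions)
```

In a full loop, the manually labeled and machine-labeled utterances would be added to the training set and the classifier retrained before the next round.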
Learning and community-based web aggregation techniques

Patent number: 8010674

Abstract: Some embodiments of the present invention provide a system that facilitates access to a website from an application. During operation, the system obtains community data associated with interactions between a set of users and the website and examines the community data to identify an interactivity request made by the website to users of the website. Next, the system obtains user-specific data from a new user of the application, which includes a response to the interactivity request from the new user. Finally, the system uses the user-specific data to automate access to the website for the new user.

Type: Grant

Filed: March 31, 2008

Date of Patent: August 30, 2011

Assignee: Intuit Inc.

Inventor: Spencer W. Fong
Hypothesis development based on selective reported events

Patent number: 8010664

Abstract: A computationally implemented method includes, but is not limited to: acquiring events data including data indicating incidence of a first one or more reported events and data indicating incidence of a second one or more reported events, at least one of the first one or more reported events and the second one or more reported events being associated with a user; determining an events pattern based selectively on the incidences of the first one or more reported events and the second one or more reported events; and developing a hypothesis associated with the user based, at least in part, on the determined events pattern. In addition to the foregoing, other method aspects are described in the claims, drawings, and text forming a part of the present disclosure.

Type: Grant

Filed: May 28, 2009

Date of Patent: August 30, 2011

Assignee: The Invention Science Fund I, LLC

Inventors: Shawn P. Firminger, Jason Garms, Edward K. Y. Jung, Chris D. Karkanias, Eric C. Leuthardt, Royce A. Levien, Robert W. Lord, Mark A. Malamud, John D. Rinaldo, Jr., Clarence T. Tegreene, Kristin M. Tolle, Lowell L. Wood, Jr.
Learning Term Weights from the Query Click Field for Web Search

Publication number: 20110208735

Abstract: Described is a technology by which a term frequency function for web click data is machine learned from raw click features extracted from a query log or the like and training data. Also described is combining the term frequency function with other functions/click features to learn a relevance function for use in ranking document relevance to a query.

Type: Application

Filed: February 23, 2010

Publication date: August 25, 2011

Applicant: Microsoft Corporation

Inventors: Jianfeng Gao, Krysta M. Svore
Mining Correlation Between Locations Using Location History

Publication number: 20110208425

Abstract: Techniques describe determining a correlation between identified locations to recommend a location that may be of interest to an individual user. The process constructs a location model to identify locations. To construct the model, the process uses global positioning system (GPS) logs of geospatial locations collected over time and identifies trajectories representing trips of the individual user and extracts stay points from the trajectories. Each stay point represents a geographical region where the individual user stayed over a time threshold within a distance threshold. A location history is formulated for the individual user based on a sequence of the extracted stay points to identify locations. The process determines a correlation between identified locations. The process integrates travel experiences of individual users who have visited the locations in a weighted manner and identifies a common travel sequence which the individual users followed between the locations.

Type: Application

Filed: February 23, 2010

Publication date: August 25, 2011

Applicant: Microsoft Corporation

Inventors: Yu Zheng, Lizhu Zhang, Xing Xie
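The stay-point step described in the abstract above (publication 20110208425) can be sketched roughly as follows. This is one common way to extract stay points from a GPS log, offered as an assumption rather than the applicants' exact procedure; the function names and the default thresholds (`dist_m`, `min_secs`) are hypothetical.

```python
import math


def haversine_m(a, b):
    """Great-circle distance in metres between two points given as (lat, lon, ...) in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))


def extract_stay_points(track, dist_m=200.0, min_secs=1200.0):
    """track: list of (lat, lon, unix_seconds), sorted by time.

    Returns a list of (mean_lat, mean_lon, arrive, leave) tuples.
    """
    stays = []
    i, n = 0, len(track)
    while i < n:
        j = i + 1
        # Extend the run while points stay within dist_m of the anchor point i.
        while j < n and haversine_m(track[i], track[j]) <= dist_m:
            j += 1
        run = track[i:j]
        if run[-1][2] - run[0][2] >= min_secs:
            lat = sum(p[0] for p in run) / len(run)
            lon = sum(p[1] for p in run) / len(run)
            stays.append((lat, lon, run[0][2], run[-1][2]))
            i = j      # continue after the stay
        else:
            i += 1     # no stay anchored here; try the next point
    return stays


track = [
    (39.9000, 116.4000, 0),
    (39.9001, 116.4001, 900),
    (39.9000, 116.4000, 1800),
    (40.0000, 116.5000, 2400),   # far away: ends the stay
]
stays = extract_stay_points(track)
```

The sequence of stay points per user would then form the location history on which the correlation between locations is computed.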
ASSISTING WITH UPDATING A MODEL FOR DIAGNOSING FAILURES IN A SYSTEM

Publication number: 20110208680

Abstract: The method includes obtaining system model data representing a set of failures in a system including a plurality of components, a set of symptoms and relationships between at least some of the failures and symptoms. The system model data is used to create a Bayesian Network. Failure cases data is also obtained, where each failure case describes the presence/absence of at least one of the symptoms and the presence/absence of at least one of the failures. A learning operation on the Bayesian Network using the failure cases data is then performed and the contribution made by at least some of the failure cases to updating the parameters of the Bayesian Network during the learning operation is assessed. Information representing the assessed contribution of the at least some failure cases is displayed.

Type: Application

Filed: September 30, 2009

Publication date: August 25, 2011

Applicant: BAE SYSTEMS plc

Inventors: Richard Lee Bovey, Erdem Turker Senalp
TROUBLE PATTERN CREATING PROGRAM AND TROUBLE PATTERN CREATING APPARATUS

Publication number: 20110208679

Abstract: A computer readable, non-transitory medium has stored therein a trouble pattern creating program. The program causes a computer to execute: (a) extracting, from a plurality of log messages that are output from an information system having a plurality of configuration items and that are output in a predetermined period of time, configuration items that output the log messages; (b) calculating a degree of relationship between the configuration items extracted in the (a) extracting; (c) executing learning of the rate of the number of occurrences of troubles in the information system in the number of times the log messages are output, the learning is executed by a number of times corresponding to the degree of relationship calculated in the (b) calculating; and (d) creating, in accordance with a result of the learning in the (c) executing, a trouble pattern message that is output when a trouble occurs.

Type: Application

Filed: February 17, 2011

Publication date: August 25, 2011

Applicant: Fujitsu Limited

Inventors: Yukihiro WATANABE, Masazumi Matsubara, Atsuji Sekiguchi, Yuji Wada, Yasuhide Matsumoto
INTRUSION DETECTION SYSTEM ALERTS MECHANISM

Publication number: 20110208677

Abstract: A system and method for analyzing Intrusion Detection System (IDS) alert data associated with a computer network is described. The method includes applying first association rules to obtained IDS alert data associated with a computer network and processing the obtained IDS alert data with the first association rules. Analyst feedback data associated with the processed obtained IDS alert data is received, and a training data set from the analyst feedback data is received. New association rules are determined based upon the training data set, and the new association rules are outputted to a display of a computing device. Outputting the new association rules may include outputting patterns within the IDS alert data of false positive alerts. The new association rules may be applied back to the obtained IDS alert data.

Type: Application

Filed: May 4, 2011

Publication date: August 25, 2011

Applicant: BANK OF AMERICA LEGAL DEPARTMENT

Inventors: Mian Zhou, Sean Kenric Catlett
MECHANICAL SHOCK FEATURE EXTRACTION FOR OVERSTRESS EVENT REGISTRATION

Publication number: 20110208678

Abstract: An electronic system includes an accelerometer. A method for excessive mechanical shock feature extraction for overstress event registration and cumulative tracking includes obtaining a sample from the accelerometer. Feature extraction is performed on the sample using empirical mode decomposition (EMD) to produce a plurality of modes. A pattern classifier is utilized for processing the plurality of modes to determine if the sample classifies as a shock event. If the sample classifies as a shock event, a shock event counter is incremented. If the shock event counter reaches a specified count, an indication to a user is generated.

Type: Application

Filed: February 19, 2010

Publication date: August 25, 2011

Applicant: ORACLE INTERNATIONAL CORPORATION

Inventors: Anton A. Bougaev, Aleksey M. Urmanov, David K. McElfresh, Kenny C. Gross
Multimedia file reproducing apparatus and method

Patent number: 8005768

Abstract: An apparatus and method to check whether a user likes a multimedia file based on the user's emotional reaction index for the multimedia file, and to repeatedly reproduce the multimedia file if the user likes it. The multimedia file reproducing apparatus can include an emotional reaction index calculation unit to calculate an emotional reaction index based on a physical reaction signal of a user; a like/dislike checking unit to check whether the user likes or dislikes a corresponding audio file based on the calculated emotional reaction index; a list generation unit to generate a list of audio files that the user likes based on an average of emotional reaction indices for each audio file and the user's preference for each audio file; and a reproduction management unit to control the reproduction of the corresponding audio file based on whether the user likes or dislikes the corresponding audio file and to reproduce the audio files in the generated list.

Type: Grant

Filed: November 27, 2007

Date of Patent: August 23, 2011

Assignee: SAMSUNG Electronics Co., Ltd.

Inventors: Gyung-Hye Yang, Seung-Nyung Chung
Gradient based training method for a support vector machine

Patent number: 8005293

Abstract: A training method for a support vector machine, including executing an iterative process on a training set of data to determine parameters defining the machine, the iterative process being executed on the basis of a differentiable form of a primal optimization problem for the parameters, the problem being defined on the basis of the parameters and the data set.

Type: Grant

Filed: April 11, 2001

Date of Patent: August 23, 2011

Assignee: Telstra New Wave Pty Ltd

Inventors: Adam Kowalczyk, Trevor Bruce Anderson
Correlating subjective user states with objective occurrences associated with a user

Patent number: 8005948

Abstract: A computationally implemented method includes, but is not limited to: acquiring subjective user state data including at least a first subjective user state and a second subjective user state; acquiring objective context data including at least a first context data indicative of a first objective occurrence associated with a user and a second context data indicative of a second objective occurrence associated with the user; and correlating the subjective user state data with the objective context data. In addition to the foregoing, other method aspects are described in the claims, drawings, and text forming a part of the present disclosure.

Type: Grant

Filed: November 26, 2008

Date of Patent: August 23, 2011

Assignee: The Invention Science Fund I, LLC

Inventors: Shawn P. Firminger, Jason Garms, Edward K. Y. Jung, Chris D. Karkanias, Eric C. Leuthardt, Royce A. Levien, Robert W. Lord, Mark A. Malamud, John D. Rinaldo, Jr., Clarence T. Tegreene, Kristin M. Tolle, Lowell L. Wood, Jr.
Segment-based change detection method in multivariate data stream

Patent number: 8005771

Abstract: A method and framework are described for detecting changes in a multivariate data stream. A training set is formed by sampling time windows in a data stream containing data reflecting normal conditions. A histogram is created to summarize each window of data, and data within the histograms are clustered to form test distribution representatives to minimize the bulk of training data. Test data is then summarized using histograms representing time windows of data and data within the test histograms are clustered. The test histograms are compared to the training histograms using nearest neighbor techniques on the clustered data. Distances from the test histograms to the test distribution representatives are compared to a threshold to identify anomalies.

Type: Grant

Filed: September 24, 2008

Date of Patent: August 23, 2011

Assignee: Siemens Corporation

Inventors: Terrence Chen, Chao Yuan, Abdul Saboor Sheikh, Claus Neubauer
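The histogram comparison described in the abstract above (patent 8005771) can be illustrated with a simplified sketch. Several things here are assumptions: it handles a single variable rather than a multivariate stream, it uses every training histogram as a representative instead of clustering them, and it measures distance with the L1 norm. All function names are hypothetical.

```python
def histogram(window, lo, hi, bins):
    """Normalised histogram of the values in one non-empty time window."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in window:
        # Clamp out-of-range values into the first or last bin.
        k = min(bins - 1, max(0, int((x - lo) / width)))
        counts[k] += 1
    total = len(window)
    return [c / total for c in counts]


def l1(p, q):
    """L1 distance between two histograms with the same bins."""
    return sum(abs(a - b) for a, b in zip(p, q))


def is_anomalous(test_window, representatives, threshold, lo, hi, bins):
    """Flag a window whose nearest training representative is farther than threshold."""
    h = histogram(test_window, lo, hi, bins)
    nearest = min(l1(h, r) for r in representatives)
    return nearest > threshold


# Training windows sampled under normal conditions.
normal_windows = [[1, 2, 2, 3], [2, 2, 3, 3], [1, 2, 3, 3]]
reps = [histogram(w, 0, 10, 5) for w in normal_windows]

quiet = is_anomalous([2, 3, 2, 2], reps, 0.5, 0, 10, 5)
shifted = is_anomalous([8, 9, 9, 8], reps, 0.5, 0, 10, 5)
```

Here `quiet` is False because the window matches a training histogram exactly, and `shifted` is True because all of its mass falls in a bin never seen in training.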
Computer-implemented method of generating association rules from data stream and data mining system

Patent number: 8005769

Abstract: A method of generating association rules from a data stream, which is a non-limited data set composed of transactions, includes: when itemsets in the generated transactions and the counts of the itemsets are managed using a prefix tree, and each node of the prefix tree has information on the count of a specific itemset corresponding to the node and a specific item, updating the information of a node corresponding to the itemset or adding a new node on the basis of the itemset included in the generated transaction and the count of the itemset; comparing the support of the itemset corresponding to each of the nodes of the prefix tree with a minimum support to select frequent itemsets; and visiting all or some of the nodes corresponding to the selected frequent itemsets and generating the association rule on the basis of the information of each of the visited nodes.

Type: Grant

Filed: February 19, 2008

Date of Patent: August 23, 2011

Assignee: Lee, Won Suk

Inventor: Won Suk Lee
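The prefix-tree bookkeeping described in the abstract above (patent 8005769) can be sketched as follows. This is an illustrative simplification, not the patented algorithm: it inserts every subset of each transaction, which is exponential in transaction length, so a practical stream miner would need to prune or delay insertions. The class and function names are hypothetical.

```python
from itertools import combinations


class Node:
    def __init__(self, item=None):
        self.item = item      # item stored at this node
        self.count = 0        # count of the itemset spelled by the path to here
        self.children = {}


def add_transaction(root, transaction):
    """Update or create the node of every itemset contained in the transaction."""
    items = sorted(set(transaction))
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            node = root
            for item in itemset:
                node = node.children.setdefault(item, Node(item))
            node.count += 1


def frequent_itemsets(root, n_transactions, min_support):
    """Walk the tree and return {itemset: count} for itemsets meeting min_support."""
    found = {}
    stack = [((), root)]
    while stack:
        prefix, node = stack.pop()
        for item, child in node.children.items():
            # Support never grows when an itemset is extended, so prune here.
            if child.count / n_transactions >= min_support:
                itemset = prefix + (item,)
                found[itemset] = child.count
                stack.append((itemset, child))
    return found


def rules(found, min_confidence):
    """Return (antecedent, consequent, confidence) triples from frequent itemsets."""
    out = []
    for itemset, count in found.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(itemset, size):
                consequent = tuple(i for i in itemset if i not in antecedent)
                confidence = count / found[antecedent]
                if confidence >= min_confidence:
                    out.append((antecedent, consequent, confidence))
    return out


stream = [["bread", "milk"], ["bread", "milk", "eggs"], ["milk"], ["bread", "milk"]]
root = Node()
for t in stream:
    add_transaction(root, t)
found = frequent_itemsets(root, len(stream), min_support=0.5)
result = rules(found, min_confidence=0.8)
```

With these four transactions, the only rule that survives both thresholds is bread implies milk, with confidence 1.0.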
System and method of classifying events

Patent number: 8005767

Abstract: The present invention enables identification of events such as targets. From training target event data, a very large number of clusters are formed for each class based on Euclidean distance using a repetitive k-means clustering process. Features from each cluster are identified by extracting their dominant eigenvectors. Once all of the dominant eigenvectors have been identified, they define the relevant space of the cluster. New target event data is compared to each cluster by projecting it onto the relevant and noise spaces. The more the data lies within the relevant space and the less it lies within the noise space, the more similar the data is to a cluster. The new target event data is then classified based on the training target event data.

Type: Grant

Filed: June 1, 2007

Date of Patent: August 23, 2011

Assignee: The United States of America as represented by the Secretary of the Navy

Inventor: Vincent A. Cassella
Parallel generation of a bayesian network

Patent number: 8005770

Abstract: A method for generating a Bayesian network in a parallel manner is based on an initial model having a plurality of nodes. Each node corresponds to a variable of a data set and has a local distribution associated therewith. The method includes assigning a plurality of subsets of the nodes to a respective plurality of constructors. The plurality of constructors is operated in a parallel manner to identify edges to add between nodes in the initial model. The identified edges are added to the initial model to generate the Bayesian network. The edges indicate dependency between nodes connected by the edges.

Type: Grant

Filed: June 9, 2008

Date of Patent: August 23, 2011

Assignee: Microsoft Corporation

Inventors: Chi Cao Minh, Max Chickering, John Feo, Jaime Hwacinski, Anitha Panapakkam, Khaled Sedky
SYSTEM AND METHOD FOR DETERMINING AN AUTHORITY RANK FOR REAL TIME SEARCHING

Publication number: 20110202513

Abstract: The present invention is directed towards a method and system for processing a real time increase in search requests for a common event. The method and system includes detecting an activity spike in user search request activity based on monitoring of user search requests over a defined period of time and determining source locations associated with the activity spike based on user search result activities. The method and system further includes associating the source locations with the user search request and thereupon applying a machine-learning model to determine a plurality of common features operative to cause the activity spike, including determining associations between the source locations and the activity spike.

Type: Application

Filed: February 16, 2010

Publication date: August 18, 2011

Applicant: YAHOO! INC.

Inventor: Vik Singh
ANALYZING PARALLEL TOPICS FROM CORRELATED DOCUMENTS

Publication number: 20110202484

Abstract: Access is obtained to a parallel corpus including a problem corpus and a solution corpus. A first plurality of topics are mined from the problem corpus and a second plurality of topics are mined from the solution corpus. A transition probability from the first plurality of topics to the second plurality of topics is determined, to identify a most appropriate one of the topics from the solution corpus for a given one of the topics from the problem corpus.

Type: Application

Filed: February 18, 2010

Publication date: August 18, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Nikolaos Anerousis, Abhijit Bose, Jimeng Sun, Duo Zhang
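The transition-probability step in the abstract above (publication 20110202484) can be illustrated by simple counting. This sketch assumes topic mining has already assigned one dominant topic to each problem document and to its paired solution document, which is a simplification; the function names and the example topics are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(pairs):
    """Estimate P(solution topic | problem topic) from (problem, solution) topic pairs."""
    counts = defaultdict(Counter)
    for problem_topic, solution_topic in pairs:
        counts[problem_topic][solution_topic] += 1
    return {
        p: {s: c / sum(row.values()) for s, c in row.items()}
        for p, row in counts.items()
    }


def best_solution_topic(probs, problem_topic):
    """Return the solution topic with the highest transition probability."""
    row = probs[problem_topic]
    return max(row, key=row.get)


pairs = [
    ("disk full", "purge logs"),
    ("disk full", "purge logs"),
    ("disk full", "add storage"),
    ("login fails", "reset password"),
]
probs = transition_probabilities(pairs)
best = best_solution_topic(probs, "disk full")
```

A fuller model would work with per-document topic mixtures rather than a single label per document.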
STATISTICAL MODEL LEARNING DEVICE, STATISTICAL MODEL LEARNING METHOD, AND PROGRAM

Publication number: 20110202487

Abstract: A statistical model learning device is provided to efficiently select data effective in improving the quality of statistical models. A data classification means 601 refers to structural information 611 generally possessed by a data which is a learning object, and extracts a plurality of subsets 613 from the training data 612. A statistical model learning means 602 utilizes the plurality of subsets 613 to create statistical models 614 respectively. A data recognition means 603 utilizes the respective statistical models 614 to recognize other data 615 different from the training data 612 and acquires each recognition result 616. An information amount calculation means 604 calculates information amounts of the other data 615 from a degree of discrepancy among the statistical models of the recognition results. A data selection means 605 selects the data with a large information amount and adds the same to the training data 612.

Type: Application

Filed: July 22, 2009

Publication date: August 18, 2011

Applicant: NEC CORPORATION

Inventor: Takafumi Koshinaka
PCC/QOS RULE CREATION

Publication number: 20110202485

Abstract: Various exemplary embodiments relate to a method and related network node and machine-readable storage medium including one or more of the following: receiving, at the PCRN, the application request message; determining at least one requested service flow from the application request message; for each requested service flow of the at least one requested service flow, generating a new PCC rule based on the application request message; and providing each new PCC rule to a Policy and Charging Enforcement Node (PCEN). Various exemplary embodiments further include an application request message including at least one media component and at least one media subcomponent, and the step of determining, for each media subcomponent, a requested service flow from the media subcomponent.

Type: Application

Filed: February 18, 2010

Publication date: August 18, 2011

Applicant: Alcatel-Lucent Canada Inc.

Inventors: Kevin Scott Cutler, Fernando Cuervo, Mike Vihtari, Ajay Kirit Pandya
Healthcare Information Technology System for Predicting Development of Cardiovascular Conditions

Publication number: 20110202486

Abstract: Described herein is a framework for predicting development of a cardiovascular condition of interest in a patient. The framework involves determining, based on prior domain knowledge relating to the cardiovascular condition of interest, a risk score as a function of patient data. The patient data may include both genetic data and non-genetic data. In one implementation, the risk score is used to categorize the patient into at least one of multiple risk categories, the multiple risk categories being associated with different strategies to prevent the onset of the cardiovascular condition. The results generated by the framework may be presented to a physician to facilitate interpretation, risk assessment and/or clinical decision support.

Type: Application

Filed: March 14, 2011

Publication date: August 18, 2011

Inventors: Glenn Fung, Faisal Farooq, Bharat R. Rao, Stephan B. Felix, Till Ittermann, Heyo K. Kroemer, Rainer Rettig, Henry Volzke
Method And Apparatus For Creating State Estimation Models In Machine Condition Monitoring

Publication number: 20110202488

Abstract: In a machine condition monitoring technique, related sensors are grouped together in clusters to improve the performance of state estimation models. To form the clusters, the entire set of sensors is first analyzed using a Gaussian process regression (GPR) to make a prediction of each sensor from the others in the set. A dependency analysis of the GPR then uses thresholds to determine which sensors are related. Related sensors are then placed together in clusters. State estimation models utilizing the clusters of sensors may then be trained.

Type: Application

Filed: September 25, 2009

Publication date: August 18, 2011

Applicant: Siemens Corporation

Inventor: Chao Yuan
USER-CENTRIC SOFT KEYBOARD PREDICTIVE TECHNOLOGIES

Publication number: 20110202876

Abstract: An apparatus and method are disclosed for providing feedback and guidance to touch screen device users to improve text entry user experience and performance by generating input history data including character probabilities, word probabilities, and touch models. According to one embodiment, a method comprises receiving first input data, automatically learning user tendencies based on the first input data to generate input history data, receiving second input data, and generating auto-corrections or suggestion candidates for one or more words of the second input data based on the input history data. The user can then select one of the suggestion candidates to replace a selected word with the selected suggestion candidate.

Type: Application

Filed: March 22, 2010

Publication date: August 18, 2011

Applicant: Microsoft Corporation

Inventors: Eric Norman Badger, Drew Elliot Linerud, Itai Almog, Timothy S. Paek, Parthasarathy Sundararajan, Dmytro Rudchenko, Asela J Gunawardana
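The learn-then-suggest flow in the abstract above (publication 20110202876) can be sketched in a few lines. Only word probabilities are modelled here, as raw counts; character probabilities and touch models are left out. The class `InputHistory` and its methods are hypothetical names, and prefix completion is an assumed stand-in for the candidate generation the application describes.

```python
from collections import Counter


class InputHistory:
    """Word counts learned from what the user has typed so far."""

    def __init__(self):
        self.word_counts = Counter()

    def learn(self, text):
        """Update the history from the first input data."""
        self.word_counts.update(text.lower().split())

    def suggest(self, prefix, k=3):
        """Return up to k known words starting with prefix, most frequent first."""
        prefix = prefix.lower()
        matches = [(w, c) for w, c in self.word_counts.items() if w.startswith(prefix)]
        # Sort by descending count, breaking ties alphabetically.
        matches.sort(key=lambda wc: (-wc[1], wc[0]))
        return [w for w, _ in matches[:k]]


history = InputHistory()
history.learn("see you there then thanks there")
candidates = history.suggest("th")
```

The user would then pick one of `candidates` to replace the partially typed word.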
Classification for small collections of high-value entities

Patent number: 8001060

Abstract: A method and system for classifying small collections of high-value entities with missing data. The invention includes: collecting measurement variables for a set of entity cases for which classifications are known; calibrating standard weights for each measurement variable based on historical data; computing compensating weights for each entity case that has missing data; computing case scores for each of one or more dimensions as a sum-product of compensating weights and variables associated with each dimension; executing an iterative process that finds a specific combination of compensating weights that best classifies the entity cases in terms of distinct scores; and applying a resulting model, which is determined by the specific combination of compensating weights, to classify other entity cases for which the classifications are unknown.

Type: Grant

Filed: May 9, 2007

Date of Patent: August 16, 2011

Assignee: International Business Machines Corporation

Inventor: John A. Ricketts
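The sum-product scoring in the abstract above (patent 8001060) can be illustrated under one loudly stated assumption: that a compensating weight is the standard weight rescaled so the weights of the variables actually present keep the original total. The patent may define compensation differently, and the iterative search over weight combinations is not shown. Function and variable names are hypothetical.

```python
def compensating_weights(standard_weights, case):
    """Rescale standard weights over the variables that are present in the case.

    Assumption: missing variables (value None or absent) hand their weight
    to the present ones in proportion to the standard weights.
    """
    present = [k for k in standard_weights if case.get(k) is not None]
    if not present:
        raise ValueError("case has no measured variables")
    total = sum(standard_weights.values())
    total_present = sum(standard_weights[k] for k in present)
    scale = total / total_present
    return {k: standard_weights[k] * scale for k in present}


def case_score(standard_weights, case):
    """Score one dimension as the sum-product of compensating weights and variables."""
    weights = compensating_weights(standard_weights, case)
    return sum(w * case[k] for k, w in weights.items())


standard = {"revenue": 0.5, "growth": 0.3, "tenure": 0.2}
case = {"revenue": 0.8, "growth": None, "tenure": 0.5}
score = case_score(standard, case)
```

In this example the missing `growth` value shifts its weight onto `revenue` and `tenure`, giving a score of (0.5 × 0.8 + 0.2 × 0.5) / 0.7.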
Method and apparatus for reward-based learning of improved policies for management of a plurality of application environments supported by a data processing system

Patent number: 8001063

Abstract: In one embodiment, the present invention is a method for reward-based learning of improved systems management policies. One embodiment of the inventive method involves obtaining a decision-making entity and a reward mechanism. The decision-making entity manages a plurality of application environments supported by a data processing system, where each application environment operates on data input to the data processing system. The reward mechanism generates numerical measures of value responsive to actions performed in states of the application environments. The decision-making entity and the reward mechanism are applied to the application environments, and results achieved through this application are processed in accordance with reward-based learning to derive a policy. The reward mechanism and the policy are then applied to the application environments, and the results of this application are processed in accordance with reward-based learning to derive a new policy.

Type: Grant

Filed: June 30, 2008

Date of Patent: August 16, 2011

Assignee: International Business Machines Corporation

Inventors: Gerald James Tesauro, Rajarshi Das, Nicholas K. Jong, Jeffrey O. Kephart
First and second unsupervised learning processes combined using a supervised learning apparatus

Patent number: 8001061

Abstract: A data processing apparatus includes first and second unsupervised learning process units and a supervised learning process unit. The first unsupervised learning process unit classifies data of a first data group according to unsupervised learning, to perform dimension reduction for the first data group and to obtain a first classified data group. The second unsupervised learning process unit classifies data of a second data group according to the unsupervised learning, to perform dimension reduction for the second data group and to obtain a second classified data group. The supervised learning process unit performs supervised learning using, as a teacher, the first classified data group obtained by the first unsupervised learning process unit and the second classified data group obtained by the second unsupervised learning process unit to determine a mapping relation between the first classified data group and the second classified data group.

Type: Grant

Filed: June 26, 2007

Date of Patent: August 16, 2011

Assignee: Fuji Xerox Co., Ltd.

Inventors: Shinichiro Serizawa, Tomoyuki Ito
Fuzzy-learning-based extraction of time-series behavior

Patent number: 8001074

Abstract: Systems and methods for extracting or analyzing time-series behavior are described. Some embodiments of computer-implemented methods include generating fuzzy rules from time series data. Certain embodiments also include resolving conflicts between fuzzy rules according to how the data is clustered. Some embodiments further include extracting a model of the time-series behavior via defuzzification and making that model accessible. Advantageously, to resolve conflicts between fuzzy rules, some embodiments define Gaussian functions for each conflicting data point, sum the Gaussian functions according to how the conflicting data points are clustered, and resolve the conflict based on the results of summing the Gaussian functions. Some embodiments use both crisp and non-trivially fuzzy regions and/or both crisp and non-trivially fuzzy membership functions.

Type: Grant

Filed: January 31, 2008

Date of Patent: August 16, 2011

Assignee: Quest Software, Inc.

Inventor: Wai Yip To
Supervised learning using multi-scale features from time series events and scale space decompositions

Patent number: 8001062

Abstract: Disclosed herein is a method, a system and a computer program product for generating a statistical classification model used by a computer system to determine a class associated with an unlabeled time series event. Initially, a set of labeled time series events is received. A set of time series features is identified for a selected set of the labeled time series events. A plurality of scale space decompositions is generated based on the set of time series features. A plurality of multi-scale features is generated based on the plurality of scale space decompositions. A first subset of the plurality of multi-scale features that correspond at least in part to a subset of space or time points within a time series event that contain feature data that distinguish the time series event as belonging to a class of time series events that corresponds to the class label is identified.

Type: Grant

Filed: December 7, 2007

Date of Patent: August 16, 2011

Assignee: Google Inc.

Inventors: Ullas Gargi, Jay Yagnik
SYSTEMS AND METHODS FOR EFFICIENTLY RANKING ADVERTISEMENTS BASED ON RELEVANCY AND CLICK FEEDBACK

Publication number: 20110196739

Abstract: The present invention provides a method and system for ranking and selecting advertisements based on relevancy, click feedback and click over expected click (COEC) data. Advertisements may be described as contextual, page-embedded advertisements appearing on publisher websites. The method and system includes storing page-advertisement relevancy features in a vector space model and historical impression and click features in a click feedback model and analyzing data in the vector space model and click feedback model. The method and system further includes storing empirical click-through data in a serving log and analyzing data therein. The method and system then generates a regression model based on the analyzed data, which is stored in a regression storage module. The method and system receives requests for advertisement content from client devices, selects a plurality of candidate advertisements based on the generated regression model and provides a plurality of advertisements to a client device.

Type: Application

Filed: February 5, 2010

Publication date: August 11, 2011

Inventors: Ruofei Zhang, Wei Li, Jianchang Mao
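The click over expected click (COEC) figure named in the abstract of publication 20110196739 above can be illustrated with a small sketch. The abstract does not define the computation, so the version below assumes a common formulation in which expected clicks are impressions weighted by a per-position reference click-through rate. The function name, the log format, and the reference rates are all hypothetical.

```python
# Hypothetical sketch: COEC as observed clicks divided by expected clicks.
# REFERENCE_CTR holds made-up placeholder rates, not data from the application.
REFERENCE_CTR = {1: 0.10, 2: 0.05, 3: 0.02}

def coec(log):
    """log: iterable of (position, impressions, clicks) tuples for one ad."""
    rows = list(log)
    clicks = sum(c for _, _, c in rows)
    # Expected clicks: impressions at each position times that position's reference CTR.
    expected = sum(REFERENCE_CTR[p] * n for p, n, _ in rows)
    if expected == 0:
        return 0.0  # no impressions recorded, so no evidence either way
    return clicks / expected

ad_log = [(1, 1000, 120), (2, 2000, 90)]
score = coec(ad_log)  # values above 1 mean the ad out-performs its positions
```

A ratio of this kind could then serve as one input feature to a regression model alongside relevancy features, which is the role the abstract describes for click feedback.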
SYSTEM, METHOD, AND APPARATUS FOR GENERATING A SCRIPT TO PERFORM A TASK AT A TARGET WEB INTERFACE INSTANCE

Publication number: 20110196853

Abstract: A computer-implemented method for automatically generating a script for a target web interface instance. Embodiments include receiving a task description of a task to be completed on a target web interface instance. The computer-implemented method also includes repeating steps until the task is completed. The repeating steps include determining from the target web interface instance a plurality of actions that may be performed on the target web interface instance and using the task description, predicting which action of the plurality of actions from the target web interface instance is an action most likely to be selected. The repeating steps also include performing the action most likely to be selected, thus proceeding to a first web interface instance and setting the first web interface instance as the target web interface instance.

Type: Application

Filed: February 8, 2010

Publication date: August 11, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Jeffrey P. Bigham, Clemens Drews, Tessa A. Lau, Ian A. R. Li, Jeffrey W. Nichols
DATA CLASSIFICATION USING MACHINE LEARNING TECHNIQUES

Publication number: 20110196870

Abstract: Systems, methods and computer program products for classifying documents are presented. Systems, methods and computer program products for analyzing documents, e.g., associated with legal discovery are also presented. Systems, methods and computer program products for cleaning up data are also presented. Systems, methods and computer program products for verifying an association of an invoice with an entity are also presented. Systems, methods and computer program products for managing medical records are presented. Systems, methods and computer program products for face recognition are presented.

Type: Application

Filed: April 19, 2011

Publication date: August 11, 2011

Applicant: KOFAX, INC.

Inventors: Mauritius A.R. Schmidtler, Roland Borrey, Anthony Sarah
System and method for content-based object ranking to facilitate information lifecycle management

Patent number: 7996409

Abstract: A method to manage objects in an information lifecycle management system is provided. The method includes determining a score for each of the objects based on a score of at least one feature within respective ones of each of the objects, where the score of the at least one feature is associated with a valuation of the at least one feature. The method also includes managing each of the objects based on the score for each of the objects, wherein higher scored objects are managed preferentially.

Type: Grant

Filed: December 28, 2006

Date of Patent: August 9, 2011

Assignee: International Business Machines Corporation

Inventors: Windsor Wee Sun Hsu, Shauchi Ong
Classification via semi-Riemannian spaces

Patent number: 7996343

Abstract: Described is using semi-Riemannian geometry in supervised learning to learn a discriminant subspace for classification, e.g., labeled samples are used to learn the geometry of a semi-Riemannian submanifold. For a given sample, the K nearest classes of that sample are determined, along with the nearest samples that are in other classes, and the nearest samples in that sample's same class. The distances between these samples are computed, and used in computing a metric matrix. The metric matrix is used to compute a projection matrix that corresponds to the discriminant subspace. In online classification, as a new sample is received, it is projected into a feature space by use of the projection matrix and classified accordingly.

Type: Grant

Filed: September 30, 2008

Date of Patent: August 9, 2011

Assignee: Microsoft Corporation

Inventors: Deli Zhao, Zhouchen Lin, Xiaoou Tang
Methods and systems for searching for color themes, suggesting color theme tags, and estimating tag descriptiveness

Patent number: 7996341

Abstract: Embodiments of the present disclosure assess how well a tag describes a color theme by estimating a descriptiveness value for the tag for the color theme. Some embodiments determine descriptiveness values for a tag based on weighted color attributes determined from the tag's existing use in a color theme collection. Descriptiveness values are used generally in color theme searching and to suggest tags for a color theme, among other things.

Type: Grant

Filed: December 20, 2007

Date of Patent: August 9, 2011

Assignee: Adobe Systems Incorporated

Inventor: Hendrik Kueck
Method and system for L1-based robust distribution clustering of multinomial distributions

Patent number: 7996340

Abstract: A workforce analysis method for solving an L1-based clustering problem of multinomial distributions of workforce data includes acquiring workforce allocation data, arranging the workforce allocation data in sets of fraction data with respect to the L1 distances, clustering the sets of fraction data to a corresponding set of cluster centers, or L1 distances for each set, minimizing the sets of fraction data based on the cluster centers or L1 distances, and outputting analysis results of the clustering problem.

Type: Grant

Filed: December 19, 2007

Date of Patent: August 9, 2011

Assignee: International Business Machines Corporation

Inventor: Hisashi Kashima
Systems, methods and computer program products for supervised dimensionality reduction with mixed-type features and labels

Patent number: 7996342

Abstract: Systems, methods and computer program products for supervised dimensionality reduction. Exemplary embodiments include a method including receiving an input in the form of a data matrix X of size N×D, wherein N is a number of samples, D is a dimensionality, a vector Y of size N×1, hidden variables U of a number K, a data type of the matrix X and the vector Y, and a trade-off constant alpha; selecting loss functions in the form of Lx(X,UV) and Ly(Y,UW) appropriate for the type of data in the matrix X and the vector Y, where U, V and W are matrices, selecting corresponding sets of update rules RU, RV and RW for updating the matrices U,V and W, learning U, V and W that provide a minimum total loss L(U,V,W)=Lx(X,UV)+alpha*Ly(Y,UW), and returning matrices U, V and W.

Type: Grant

Filed: February 15, 2008

Date of Patent: August 9, 2011

Assignee: International Business Machines Corporation

Inventors: Genady Grabarnik, Irina Rish
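The total loss L(U,V,W)=Lx(X,UV)+alpha*Ly(Y,UW) in the abstract of patent 7996342 above can be made concrete with a minimal sketch. It assumes squared error for both Lx and Ly and uses plain gradient steps in place of the update rules RU, RV and RW, which the abstract does not specify. All function names, the learning rate, and the toy data are hypothetical.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def residual(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sq_norm(A):
    return sum(a * a for row in A for a in row)

def total_loss(X, Y, U, V, W, alpha):
    # L(U,V,W) = Lx(X,UV) + alpha*Ly(Y,UW); squared error is an assumption here.
    return sq_norm(residual(matmul(U, V), X)) + alpha * sq_norm(residual(matmul(U, W), Y))

def descend(M, G, lr):
    return [[m - lr * g for m, g in zip(rm, rg)] for rm, rg in zip(M, G)]

def fit(X, Y, K, alpha=1.0, lr=0.01, iters=500, seed=0):
    """Gradient descent on the total loss; returns U (N x K), V (K x D), W (K x 1)."""
    rng = random.Random(seed)
    N, D = len(X), len(X[0])
    init = lambda r, c: [[rng.uniform(-0.1, 0.1) for _ in range(c)] for _ in range(r)]
    U, V, W = init(N, K), init(K, D), init(K, 1)
    for _ in range(iters):
        Rx = residual(matmul(U, V), X)  # N x D reconstruction error on X
        Ry = residual(matmul(U, W), Y)  # N x 1 prediction error on Y
        GUx = matmul(Rx, transpose(V))
        GUy = matmul(Ry, transpose(W))
        GU = [[2 * gx + 2 * alpha * gy for gx, gy in zip(rx, ry)]
              for rx, ry in zip(GUx, GUy)]
        GV = [[2 * g for g in row] for row in matmul(transpose(U), Rx)]
        GW = [[2 * alpha * g for g in row] for row in matmul(transpose(U), Ry)]
        U, V, W = descend(U, GU, lr), descend(V, GV, lr), descend(W, GW, lr)
    return U, V, W

# Toy data: 4 samples, 3 features, one real-valued label per sample.
X = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 1.0, 2.0], [2.0, 0.0, 2.0]]
Y = [[1.0], [0.0], [1.0], [2.0]]
U, V, W = fit(X, Y, K=2)
```

The rows of U are the K-dimensional hidden representations of the samples. For binary or count data, a different Lx or Ly (for example a logistic or Poisson loss) would be swapped in, which is the point of the mixed-type framing in the title.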
Method and system for generating object classification models

Patent number: 7996339

Abstract: A method for generating object classification models is disclosed. Initially, a set of training data is fed into a training algorithm to generate a first object classification model. A set of field data is then applied to the first object classification model to produce a set of field object classifications. The number of data in the set of field data is significantly less than the number of data in the set of training data. Finally, the set of field object classifications and the set of field data are fed into the training algorithm to generate a second object classification model. The second object classification model can be utilized for predicting object classifications.

Type: Grant

Filed: September 17, 2004

Date of Patent: August 9, 2011

Assignee: International Business Machines Corporation

Inventors: Ameha Aklilu, Raed Hijer, Wilson Velez
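The two-stage flow in the abstract of patent 7996339 above (train a first model, classify field data with it, then retrain on the field data and its predicted classifications) can be sketched as follows. The nearest-centroid "training algorithm" on one-dimensional features is a stand-in chosen for brevity, not the method claimed in the patent, and all names and data are hypothetical.

```python
def train(samples, labels):
    """Toy training algorithm: one centroid per class on 1-D features."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def classify(model, x):
    """Assign x to the class whose centroid is closest."""
    return min(model, key=lambda y: abs(x - model[y]))

# Step 1: training data -> first object classification model.
train_x = [0.0, 1.0, 2.0, 8.0, 9.0, 10.0]
train_y = ["a", "a", "a", "b", "b", "b"]
first_model = train(train_x, train_y)

# Step 2: a smaller set of field data -> field object classifications.
field_x = [3.0, 7.0]
field_y = [classify(first_model, x) for x in field_x]

# Step 3: field data plus its classifications -> second model.
second_model = train(field_x, field_y)
```

In this toy run the second model's centroids sit at the field values rather than the training values, which shows how the second model adapts to the field distribution.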
SIMILARITY FUNCTION IN ONLINE ADVERTISING BID OPTIMIZATION

Publication number: 20110191170

Abstract: The present invention provides methods and systems for use in bid optimization in connection with advertisement serving impression opportunities available in an auction-based online advertising exchange. Methods are presented in which, based in part on historical advertisement performance information, a Kalman filter-based model is used in forecasting performance of a set of possible advertisement impressions served over a future period of time. Forecasted performance information is used in determining an optimized bid in connection with an available opportunity. A similarity function, including non-linearly determined feature weighting, can be used in determining most similar forecasted impressions to the available opportunity.

Type: Application

Filed: February 2, 2010

Publication date: August 4, 2011

Applicant: Yahoo! Inc.

Inventors: Ruofei Zhang, Ying Cui
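To give a sense of what a Kalman filter-based forecast, as mentioned in the abstract of publication 20110191170 above, might look like in its simplest form, here is a scalar random-walk filter applied to a series of observed performance values such as click-through rates. The state model, the noise variances q and r, and the function name are assumptions for illustration. The application's actual model is not described in the abstract.

```python
def kalman_forecast(observations, q=1e-5, r=1e-2, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    q: assumed process-noise variance, r: assumed observation-noise variance.
    Returns the filtered estimate after the last observation, which under a
    random-walk model is also the one-step-ahead forecast.
    """
    x, p = x0, p0
    for z in observations:
        p = p + q            # predict: state unchanged, uncertainty grows by q
        k = p / (p + r)      # Kalman gain
        x = x + k * (z - x)  # correct the estimate toward the observation
        p = (1.0 - k) * p    # uncertainty shrinks after the update
    return x

history = [0.05] * 20  # made-up historical click-through rates
forecast = kalman_forecast(history)
```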
SYSTEM AND METHOD TO ESTIMATE REGION OF TISSUE ACTIVATION

Publication number: 20110191275

Abstract: A computer-implemented method for determining the volume of activation of neural tissue. In one embodiment, the method uses one or more parametric equations that define a volume of activation, wherein the parameters for the one or more parametric equations are given as a function of an input vector that includes stimulation parameters. After receiving input data that includes values for the stimulation parameters and defining the input vector using the input data, the input vector is applied to the function to obtain the parameters for the one or more parametric equations. The parametric equation is solved to obtain a calculated volume of activation.

Type: Application

Filed: August 26, 2010

Publication date: August 4, 2011

Applicant: THE CLEVELAND CLINIC FOUNDATION

Inventors: J. Luis Lujan, Ashu Chaturvedi, Cameron C. McIntyre
AUTOMATIC DATA MINING PROCESS CONTROL

Publication number: 20110191277

Abstract: A data mining system includes a planning and learning module which receives as input a knowledge model and a set of goals and automatically produces as output a plurality of plans. The system includes a data mining processing unit which receives the plans as instructions and automatically creates results which are provided back to the planning and learning module as feedback. A method for data mining includes the steps of receiving as input at a planning and learning module a knowledge model and a set of goals. There is the step of automatically producing as output of the planning and learning module a plurality of plans from the input. There is the step of receiving by a data mining processing unit the plans as instructions. There is the step of automatically creating results by the data mining processing unit. There is the step of providing back to the planning and learning module the results as feedback.

Type: Application

Filed: August 29, 2008

Publication date: August 4, 2011

Inventors: José Luis Agúndez Dominguez, Jesus Renero Quintero
L1 Projections with Box Constraints

Publication number: 20110191400

Abstract: Similarities between simplex projection with upper bounds and L1 projection are explored. Criteria for a-priori determination of sequence in which various constraints become active are derived, and this sequence is used to develop efficient algorithms for projecting a vector onto the L1-ball while observing box constraints. Three projection methods are presented. The first projection method performs exact projection in O(n^2) worst case complexity, where n is the space dimension. Using novel criteria for ordering constraints, the second projection method has a worst case complexity of O(n log n). The third projection method is a worst case linear time algorithm having O(n) complexity. The upper bounds defined for the projected entries guide the L1-ball projection to more meaningful predictions.

Type: Application

Filed: August 10, 2010

Publication date: August 4, 2011

Inventors: Mithun Das Gupta, Jing Xiao, Sanjeev Kumar
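The projection problem in the abstract of publication 20110191400 above can be illustrated with a short sketch. It assumes the box constraints take the form |x_i| <= b_i and finds the shrinkage threshold by bisection. This is a simple numerical substitute, not one of the three exact methods the abstract mentions, and the function name and tolerance are hypothetical.

```python
def project_l1_box(v, z, b, tol=1e-12):
    """Euclidean projection of v onto {x : sum|x_i| <= z, |x_i| <= b_i}.

    Each entry becomes sign(v_i) * clip(|v_i| - theta, 0, b_i); theta >= 0 is
    chosen so the L1 budget z is met, and is 0 when clipping alone suffices.
    """
    def mass(theta):
        return sum(min(max(abs(vi) - theta, 0.0), bi) for vi, bi in zip(v, b))

    theta = 0.0
    if mass(0.0) > z:
        lo, hi = 0.0, max(abs(vi) for vi in v)  # mass(hi) == 0 <= z
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if mass(mid) > z:
                lo = mid
            else:
                hi = mid
        theta = hi  # hi always satisfies the L1 budget
    return [(1.0 if vi >= 0 else -1.0) * min(max(abs(vi) - theta, 0.0), bi)
            for vi, bi in zip(v, b)]

x = project_l1_box([3.0, -2.0, 0.5], z=2.0, b=[1.0, 1.0, 1.0])
```

Bisection costs O(n) per evaluation and a fixed number of evaluations for a given tolerance, but it is only approximate. The sorted and linear-time methods described in the abstract aim at exact solutions.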
Joint Embedding for Item Association

Publication number: 20110191374

Abstract: Methods and systems to associate semantically-related items of a plurality of item types using a joint embedding space are disclosed. The disclosed methods and systems are scalable to large, web-scale training data sets. According to an embodiment, a method for associating semantically-related items of a plurality of item types includes embedding training items of a plurality of item types in a joint embedding space configured in a memory coupled to at least one processor, learning one or more mappings into the joint embedding space for each of the item types to create a trained joint embedding space and one or more learned mappings, and associating one or more embedded training items with a first item based upon a distance in the trained joint embedding space from the first item to each said associated embedded training items. Exemplary item types that may be embedded in the joint embedding space include images, annotations, audio and video.

Type: Application

Filed: February 1, 2011

Publication date: August 4, 2011

Applicant: Google Inc.

Inventors: Samy Bengio, Jason Weston
OPEN INFORMATION EXTRACTION FROM THE WEB

Publication number: 20110191276

Abstract: To implement open information extraction, a new extraction paradigm has been developed in which a system makes a single data-driven pass over a corpus of text, extracting a large set of relational tuples without requiring any human input. Using training data, a Self-Supervised Learner employs a parser and heuristics to determine criteria that will be used by an extraction classifier (or other ranking model) for evaluating the trustworthiness of candidate tuples that have been extracted from the corpus of text, by applying heuristics to the corpus of text. The classifier retains tuples with a sufficiently high probability of being trustworthy. A redundancy-based assessor assigns a probability to each retained tuple to indicate a likelihood that the retained tuple is an actual instance of a relationship between a plurality of objects comprising the retained tuple. The retained tuples comprise an extraction graph that can be queried for information.

Type: Application

Filed: December 16, 2010

Publication date: August 4, 2011

Applicant: University of Washington through its Center for Commercialization

Inventors: Michael J. Cafarella, Michele Banko, Oren Etzioni
Evaluating ontologies

Publication number: 20110191273

Abstract: A method for providing an evaluation/verification of the correctness of an ontology is described. The method includes loading a first ontology associated with a first rule set. An extended ontology and an extended rule set are generated based at least in part on the first ontology and the first rule set. The extended rule set is applied to the extended ontology. The method also includes determining (e.g., by a data processor) a correctness of the extended ontology. Results are generated which include the correctness. Apparatus and computer readable media are also described.

Type: Application

Filed: February 2, 2010

Publication date: August 4, 2011

Applicant: International Business Machines Corporation

Inventors: Genady Grabarnik, Zhen Liu, Anand Ranganathan, Anton V. Riabov, Irina Rish, Larisa Shwartz
IMAGE TAGGING BASED UPON CROSS DOMAIN CONTEXT

Publication number: 20110191271

Abstract: A method described herein includes receiving a digital image, wherein the digital image includes a first element that corresponds to a first domain and a second element that corresponds to a second domain. The method also includes automatically assigning a label to the first element in the digital image based at least in part upon a computed probability that the label corresponds to the first element, wherein the probability is computed through utilization of a first model that is configured to infer labels for elements in the first domain and a second model that is configured to infer labels for elements in the second domain. The first model receives data that identifies learned relationships between elements in the first domain and elements in the second domain, and the probability is computed by the first model based at least in part upon the learned relationships.

Type: Application

Filed: February 4, 2010

Publication date: August 4, 2011

Applicant: Microsoft Corporation

Inventors: Simon John Baker, Ashish Kapoor, Gang Hua, Dahua Lin
Deep-Structured Conditional Random Fields for Sequential Labeling and Classification

Publication number: 20110191274

Abstract: Described is a technology by which a deep-structured (multiple layered) conditional random field model is trained and used for classification of sequential data. Sequential data is processed at each layer, from the lowest layer to a final (highest) layer, to output data in the form of conditional probabilities of classes given the sequential input data. Each higher layer inputs the conditional probability data and the sequential data jointly to output further probability data, and so forth, until the final layer which outputs the classification data. Also described is layer-by-layer training, supervised or unsupervised. Unsupervised training may process raw features to minimize average frame-level conditional entropy while maximizing state occupation entropy, or to minimize reconstruction error.

Type: Application

Filed: January 29, 2010

Publication date: August 4, 2011

Applicant: Microsoft Corporation

Inventors: Dong Yu, Li Deng, Shizhen Wang
