Data Mining Patents (Class 707/776)

System and method for ranking a posting

Patent number: 8311792

Abstract: A method for training a ranking application. The method includes ranking the help postings to create an initial ranking using initial parameter values, and storing user interactions with the help postings to obtain stored interactions. Simulations are performed using the stored interactions to generate revised parameter values for the ranking application. Performing the simulations includes calculating relevance values from the stored interactions, creating a test posting, assigning, to the test posting, an initial score and a relevance value randomly selected from the relevance values to generate a test ranking, and simulating user interactions with the test ranking to generate simulated rankings. The simulated rankings are analyzed to obtain revised parameter values. The method further includes ranking, using the revised parameter values, the help postings to generate a revised ranking, and displaying the help postings in the forum according to the revised ranking.

Type: Grant

Filed: December 23, 2009

Date of Patent: November 13, 2012

Assignee: Intuit Inc.

Inventors: Igor A. Podgorny, Floyd J. Morgan, Derek Szydlowski
Resource description framework network construction device and method using an ontology schema having class dictionary and mining rule

Patent number: 8312041

Abstract: RDF network construction device and method using an ontology schema having class dictionaries and mining rules. The RDF network construction device includes an ontology schema storing module, a class managing module, a mining rule managing module, a mining pattern creating module, and an RDF triple creating module.

Type: Grant

Filed: October 5, 2010

Date of Patent: November 13, 2012

Assignee: Korea Institute of Science and Technology Information

Inventors: Han Min Jung, Pyung Kim, Seung Woo Lee, Mi Kyung Lee, Dong Min Seo, Won Kyung Sung
Systems, Methods and Computer Program Products for Identifying a Potentially Valuable Patent for Acquisition

Publication number: 20120284198

Abstract: Systems, methods and computer program products are provided for identifying patents of value for acquisition. Through analysis of actions by patent holders, accused infringers, competitors to patent holders, non-practicing entities and challengers of patents, patents of potential economic value can be identified. The acquisition of these patent rights can facilitate the development of a valuable patent portfolio.

Type: Application

Filed: May 2, 2011

Publication date: November 8, 2012

Applicant: ARTICLE ONE PARTNERS HOLDINGS

Inventor: Cheryl Milone
Method and computer program product for using data mining tools to automatically compare an investigated unit and a benchmark unit

Patent number: 8306997

Abstract: Sources of operational problems in business transactions often show themselves in relatively small pockets of data, which are called trouble hot spots. Identifying these hot spots from internal company transaction data is generally a fundamental step in the problem's resolution, but this analysis process is greatly complicated by huge numbers of transactions and large numbers of transaction variables to analyze. A suite of practical modifications to data mining techniques and logistic regressions is provided to tailor them for finding trouble hot spots. This approach thus allows the use of efficient automated data mining tools to quickly screen large numbers of candidate variables for their ability to characterize hot spots. One application is the screening of variables which distinguish a suspected hot spot from a reference set.

Type: Grant

Filed: May 27, 2011

Date of Patent: November 6, 2012

Assignee: Verizon Services Corp.

Inventor: James Howard Drew
Associating database log records into logical groups

Patent number: 8306945

Abstract: A method, system and medium for organizing and associating log records into logically related groups is described. One or more input sources, possibly from different systems/subsystems, are input to a log correlation method. As the log records are processed, the fields are interrogated to determine which log records are related to each other. As further log records are processed, more information about previously unidentifiable relationships is determined. After this later information is known, log records that could not previously be associated with any other log records are added to the existing association. The system engineer is therefore presented with the pertinent information for monitoring, administering and diagnosing system activities.

Type: Grant

Filed: October 8, 2007

Date of Patent: November 6, 2012

Assignee: BMC Software, Inc.

Inventors: Larry Morris, Dale G. Wood
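The log-association abstract above (patent 8306945) describes records being linked through shared field values, with late-arriving records connecting earlier ones that could not be associated at first. A minimal sketch of that idea follows, assuming a union-find structure over record indices. The field names `session_id` and `txn_id` and the function `group_log_records` are hypothetical illustrations, not taken from the patent.

```python
from collections import defaultdict


def group_log_records(records, key_fields):
    """Group records that share a value in any key field (union-find sketch)."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # (field, value) -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for field in key_fields:
            value = rec.get(field)
            if value is None:
                continue
            key = (field, value)
            if key in first_seen:
                union(idx, first_seen[key])
            else:
                first_seen[key] = idx

    groups = defaultdict(list)
    for idx in range(len(records)):
        groups[find(idx)].append(idx)
    return sorted(groups.values())


# Hypothetical records from two sources; record 2 carries both identifiers,
# so it ties the earlier, previously unrelated records 0 and 1 together.
records = [
    {"source": "app", "session_id": "s1"},
    {"source": "db", "txn_id": "t9"},
    {"source": "app", "session_id": "s1", "txn_id": "t9"},
    {"source": "db", "txn_id": "t4"},
]
groups = group_log_records(records, ["session_id", "txn_id"])
print(groups)
```

Union-find is only one plausible way to merge groups as new relationships appear; the patent does not state which data structure it uses.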
USING WEB-MINING TO ENRICH DIRECTORY SERVICE DATABASES AND SOLICITING SERVICE SUBSCRIPTIONS

Publication number: 20120278361

Abstract: A system and method are disclosed for augmenting information in business directory databases and communicating with businesses. Using the enriched business directory database and Web mining technology, customized email messages are sent inviting businesses to enter their enriched business information into the directory or to subscribe to other paid services provided by the directory service.

Type: Application

Filed: July 8, 2012

Publication date: November 1, 2012

Inventors: Narendra Gupta, Mazin Gilbert, Benjamin J. Stern
System and method for capturing and reporting online sessions

Patent number: 8301653

Abstract: The present invention discloses a computer system for reporting online sessions and a computer enabled method utilizing the same. The computer system is made up of an icon that preferably appears on a user screen. The icon is capable of capturing a screen session on the user screen and saving it within a recording. The recording may then be communicated to a database server that is capable of extracting a plurality of target components from said recording, and is capable of storing them in a database. The database may contain a benchmark content of the plurality of target components. Target components may then be compared against the benchmark content in a variety of ways to determine whether the level of target components is above or below reasonable and socially accepted levels.

Type: Grant

Filed: January 25, 2010

Date of Patent: October 30, 2012

Inventors: Glenn Adamousky, Dennis Nagy
Visualization recommendations based on measure metadata

Patent number: 8301585

Abstract: A system includes reception of metadata associated with two or more measures, determination of a compatibility of the two or more measures based on the metadata, and determination of a first score associated with a first visualization based on the compatibility. Some aspects include determination of a second score associated with a second visualization based on the compatibility, comparison of the first score and the second score, and recommendation of one of the first visualization or the second visualization based on the comparison.

Type: Grant

Filed: December 7, 2009

Date of Patent: October 30, 2012

Assignee: Business Objects Software Limited

Inventors: Nicolas Mourey, Aurélien Theraud
Method for reconstructing protein database and a method for screening proteins by using the same method

Patent number: 8296300

Abstract: The present invention relates to a method for reconstructing a protein database for identifying a protein, and to a method for screening a protein using the same method. The method for reconstructing the protein database and the method for identifying the protein are very useful for the investigation of endogenous proteins and their functions and interactions, and can further be used effectively for the development of diagnostic and therapeutic agents for various diseases.

Type: Grant

Filed: August 18, 2006

Date of Patent: October 23, 2012

Assignee: Korea Basic Science Institute

Inventors: Kyung-Hoon Kwon, Jong Shin Yoo
High speed network data extractor

Patent number: 8291058

Abstract: The present invention describes a system and method of extracting and storing data elements from network packets, thus performing the task of data mining. In one embodiment of the present invention incoming packets are decomposed one protocol layer at a time to extract data elements contained in the protocol headers. Layer-specific parsers perform deep packet inspection in order to extract data elements from upper-level protocols. Extracted data is arranged in rows, which are subsequently stored into a memory-based accumulator. After some length of time the accumulator is flushed to disk files. Another process reads the flushed disk files row-by-row, inserting each row into a relational database. Standard SQL operations are performed on the relational database in order to generate and display reports of the collected data.

Type: Grant

Filed: February 19, 2010

Date of Patent: October 16, 2012

Assignee: Intrusion, Inc.

Inventors: Tommy Joe Head, Daris A. Nevil
POTENTIAL COMMUNICATION RECIPIENT PREDICTION

Publication number: 20120260188

Abstract: One or more techniques and/or systems are provided for identifying potential recipients for a communication (e.g., email, instant message, content sharing platform, etc.) a user is presently preparing based at least in part upon a user's communication history. That is, information about the user's interactions with past recipients of his/her communications are compared with information known about the present communication and potential recipients of the present communication are identified based upon this comparison. Moreover, in one embodiment, based upon the past interactions of the user with others, one or more communication groups can be identified and presented to the user. In this way, a user may select a communication group and recipients included in the communication group can be added as recipients for the communication the user is presently preparing without the user having to manually create such groups, for example.

Type: Application

Filed: April 6, 2011

Publication date: October 11, 2012

Applicant: Microsoft Corporation

Inventors: Seung-Hae Park, Suraj Samaranayake, Arcadiy Gregory Kantor, Sarah Filman, Omar H. Shahine, Piero Sierra, Stephen Liffick, Anthony Frey, Siddhartha Parmar
KNOWLEDGE-BASED DATA MINING SYSTEM

Publication number: 20120259890

Abstract: In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.

Type: Application

Filed: June 18, 2012

Publication date: October 11, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Matthew Denesuk, Daniel Frederick Gruhl, Sridhar Rajagopalan, Andrew S. Tomkins
Knowledge discovery and dissemination of text by mining with words

Patent number: 8285728

Abstract: The present invention enables the construction and deployment of semantic search engines, which learn. Here, the learning necessarily occurs in unsupervised mode given the huge amount of text being searched and the resultant impracticality of providing human feedback. That is, it acquires a lexicon of synonyms and the context for their proper application by searching a moving window of text and compiling the results. No human operator is involved. This allows for the practical search, using machine learning techniques, of very large textual bases.

Type: Grant

Filed: August 24, 2010

Date of Patent: October 9, 2012

Assignee: The United States of America as represented by the Secretary of the Navy

Inventor: Stuart Rubin
Indexing and searching manufacturing process related information

Patent number: 8285744

Abstract: Indexing agents and/or data brokers are leveraged to provide search query results related to manufacturing processes. The indexing agents allow different manufacturing configuration data types to be “sub-indexed,” allowing them to be easily searched. In one instance, the sub-indices can be aggregated together to create an overall index to facilitate in query searches of the configuration data. Separate indexing agents can be utilized for indexing contents of the configuration components for the human-machine interface (HMI) and control system and the like. Data brokers can be employed to facilitate in responding to query searches by indexing/searching real-time process variables (tags) and historical data in persistent storage. A search engine can then be employed to aggregate the search results and present them to a user in a selectable fashion. User selected results are then rendered in the proper format and displayed to the user.

Type: Grant

Filed: September 30, 2005

Date of Patent: October 9, 2012

Assignee: Rockwell Automation Technologies, Inc.

Inventors: Eric G. Dorgelo, Kevin G. Gordon, Clifton H. Bromley, Douglas J. Reichard, Marc D. Semkow, Shafin A. Virji
System and method for the structured display of items

Patent number: 8285715

Abstract: A system and method for displaying items by receiving a query, identifying query feature values for similarity features associated with the query, identifying items each having item feature values for similarity features thereof, for each of the identified items and for each of the similarity features, determining a feature distance between the query feature value and the item feature value, and presenting the identified items in a two-dimensional array of cells. The array defines reference vectors corresponding to the similarity features. The identified items are positioned within the array relative to the origin cell, for any of the cells, by determining difference function values for the identified items, each based on the feature distances of the item, the reference vectors and the coordinates of the cell, and placing one of the items in the cell based upon the determined difference function values.

Type: Grant

Filed: August 17, 2009

Date of Patent: October 9, 2012

Assignee: Ugmode, Inc.

Inventors: Arlo Mukai Faria, Ajeet Ganesh Shankar
User query mining for advertising matching

Patent number: 8285745

Abstract: Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user and segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results, and displaying one or more highly semantically relevant keyword sets after they have been filtered by a threshold.

Type: Grant

Filed: August 31, 2007

Date of Patent: October 9, 2012

Assignee: Microsoft Corporation

Inventors: Hua Li, HuaJun Zeng, Jian Hu, Zheng Chen, Jian Wang
Proactive information security management

Patent number: 8285748

Abstract: A method and apparatus for proactive information security management is described. One embodiment, for example, is a computer-implemented method for controlling access to sensitive information, the method comprising: maintaining access constraint data that can be used to control access to the sensitive information, wherein the access constraint data includes match pattern data and apply pattern data; receiving a semantic query from a querier requesting access to the sensitive information; based on the match pattern data, determining whether the semantic query should be constrained according to the apply pattern data; where said semantic query should be constrained according to the apply pattern data, rewriting the semantic query according to the apply pattern data to produce a rewritten query; executing the rewritten query against a database that contains the sensitive information; and returning any results of executing the rewritten query.

Type: Grant

Filed: May 26, 2009

Date of Patent: October 9, 2012

Assignee: Oracle International Corporation

Inventors: John S. Thomas, Aravind Yalamanchi, Idriss Mekrez, Matt Topper
SYSTEMS, METHODS, AND MEDIA FOR GENERATING HIERARCHICAL FUSED RISK SCORES

Publication number: 20120254243

Abstract: Systems, methods, and media for generating fused risk scores for determining fraud in call data are provided herein. Some exemplary methods include generating a fused risk score used to determine fraud from call data by generating a fused risk score for a leg of call data, via a fuser module of an analysis system, the fused risk score being generated by fusing together two or more uniquely calculated fraud risk scores, each of the uniquely calculated fraud risk scores being generated by a sub-module of the analysis system; and storing the fused risk score in a storage device that is communicatively couplable with the fuser module.

Type: Application

Filed: March 8, 2012

Publication date: October 4, 2012

Inventors: Torsten Zeppenfeld, N. Nikki Mirghafori, Lisa Guerra, Richard Gutierrez, Anthony Rajakumar
METHODS AND SYSTEMS FOR MINING ASSOCIATION RULES

Publication number: 20120254242

Abstract: Systems, methods, and computer-readable code stored on a non-transitory medium for mining association rules include determining a minimum support threshold and a minimum confidence threshold for association rule mining; determining a sampling model; sampling transactions from a transaction dataset; mining association rules from the sampled transactions; and transmitting mined association rules.

Type: Application

Filed: May 19, 2011

Publication date: October 4, 2012

Applicant: INFOSYS TECHNOLOGIES LIMITED

Inventors: Balasubramanian Kanagasabapathi, K. Antony Arokia Durai Raj
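The association-rule abstract above (publication 20120254242) lists the steps of setting thresholds, sampling transactions, and mining rules from the sample. A small sketch of those steps follows, assuming simple random sampling without replacement as the sampling model and restricting rules to single-item antecedents and consequents. Both choices, and the name `mine_pair_rules`, are illustrative assumptions rather than the method claimed in the application.

```python
import random
from itertools import combinations


def mine_pair_rules(transactions, sample_size, min_support, min_confidence, seed=0):
    """Mine one-item -> one-item rules from a random sample of transactions."""
    rng = random.Random(seed)
    sample = rng.sample(transactions, min(sample_size, len(transactions)))
    n = len(sample)

    item_count = {}
    pair_count = {}
    for transaction in sample:
        items = sorted(set(transaction))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # fraction of sampled transactions with both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Hypothetical transaction dataset; the sample here covers all four rows.
transactions = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread", "milk"],
]
rules = mine_pair_rules(transactions, sample_size=4, min_support=0.5, min_confidence=0.8)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

With a sample smaller than the dataset, the support and confidence values are estimates, which is the trade-off a sampling model has to manage.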
MULTIPLE CRITERIA DECISION ANALYSIS

Publication number: 20120254241

Abstract: Embodiments of the present disclosure set forth a method for selecting a preferred data set. The method includes generating a candidate data set based on a first data set having a first join attribute, and a first aggregate attribute and a second data set having a second join attribute compatible with the first join attribute, and a second aggregate attribute, wherein the candidate data set includes a total attribute having a value that is based on aggregating a value associated with the first aggregate attribute and a value associated with the second aggregate attribute; and selecting the preferred data set from the candidate data set based on the total attribute.

Type: Application

Filed: March 28, 2011

Publication date: October 4, 2012

Applicant: INDIAN INSTITUTE OF TECHNOLOGY KANPUR

Inventors: ARNAB BHATTACHARYA, PALVALI TEJA B
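The multiple-criteria abstract above (publication 20120254241) describes joining two data sets on compatible join attributes, computing a total attribute from their aggregate attributes, and selecting a preferred result based on that total. A minimal sketch follows, assuming the aggregation is a sum and that "preferred" means the smallest total. Both are assumptions for illustration, as are the function name and the flight-leg style example data.

```python
def select_preferred(first, second, join_attr, agg_first, agg_second):
    """Join two data sets, add a 'total' attribute, and pick the smallest total."""
    by_key = {}
    for row in second:
        by_key.setdefault(row[join_attr], []).append(row)

    candidates = []
    for a in first:
        for b in by_key.get(a[join_attr], []):
            candidates.append(
                {"first": a["id"], "second": b["id"], "total": a[agg_first] + b[agg_second]}
            )

    preferred = min(candidates, key=lambda c: c["total"]) if candidates else None
    return candidates, preferred


# Hypothetical data: two legs of a trip that must meet at the same hub.
inbound = [
    {"id": "A1", "hub": "DEL", "cost": 120},
    {"id": "A2", "hub": "BOM", "cost": 90},
]
outbound = [
    {"id": "B1", "hub": "DEL", "cost": 60},
    {"id": "B2", "hub": "BOM", "cost": 110},
    {"id": "B3", "hub": "BLR", "cost": 10},
]
candidates, preferred = select_preferred(inbound, outbound, "hub", "cost", "cost")
print(preferred)
```

Note that the cheapest individual rows (A2 and B3) do not form the preferred pair, because only joined rows compete on the aggregated total.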
Travelogue-based contextual map generation

Patent number: 8281246

Abstract: A map user interface control provides functionality for displaying a map in conjunction with the display of a Web page. The map control operates in combination with a location extraction component that analyzes the contents of the Web page to identify locations mentioned therein. Once the location extraction component has identified the locations mentioned in the Web page, a map is generated that encompasses the locations identified in the Web page. Once the map has been generated, the map control displays the map in conjunction with the display of the Web page. The map might include visual indicators corresponding to the locations mentioned in the Web page. The map might also include visual indicators corresponding to other locations near the locations identified in the Web page that have been identified using co-occurrence values generated through an analysis of a set of travelogues.

Type: Grant

Filed: September 29, 2009

Date of Patent: October 2, 2012

Assignee: Microsoft Corporation

Inventors: Rong Xiao, Jiangming Yang, Lei Zhang, Xingrong Chen
Collaboration swarming

Patent number: 8280846

Abstract: A swarm can develop around a piece of content. The swarm can include the original content, changes to the original content, the persons contributing the changes, and metadata, such as comments contributed by members of the swarm. A swarm can also include statistics generated about the content, such as the size of the swarm, the growth and/or death rates of the swarm, the longevity of the swarm, the intensity of the swarm, the persistence of the swarm, and the direction of the swarm. Swarms and their behaviors can be used to validate or invalidate content.

Type: Grant

Filed: January 19, 2010

Date of Patent: October 2, 2012

Assignee: Novell, Inc.

Inventors: Andrew Fox, David Marshall LaPalomento, Ian Edward Roughley, Scott A. Isaacson
System, method, and computer program product for retrieving presentation settings from a database

Patent number: 8280864

Abstract: A system, method, and computer program product are provided for retrieving presentation settings from a database. In use, presentation capabilities information associated with media hardware is received. Further, a plurality of presentation settings is retrieved from a database, utilizing the presentation capabilities information.

Type: Grant

Filed: December 17, 2007

Date of Patent: October 2, 2012

Assignee: NVIDIA Corporation

Inventors: William S. Herz, Alexander E. Soohoo
Abstracting events for data mining

Patent number: 8280899

Abstract: An event is described herein as being representable by a quantified abstraction of the event. The event includes at least one predicate, and the at least one predicate has at least one constant symbol corresponding thereto. An instance of the constant symbol corresponding to the event is identified, and the instance of the constant symbol is replaced by a free variable to obtain an abstracted predicate. Thus, a quantified abstraction of the event is composed as a pair: the abstracted predicate and a mapping between the free variable and an instance of the constant symbol that corresponds to the predicate. A data mining algorithm is executed over abstracted, quantified events to ascertain a correlation between the event and another event.

Type: Grant

Filed: October 14, 2009

Date of Patent: October 2, 2012

Assignee: Microsoft Corporation

Inventors: David Lo, Ganesan Ramalingam, Venkatesh-Prasad Ranganath, Kapil Vaswani
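The event-abstraction abstract above (patent 8280899) describes replacing an instance of a constant symbol in a predicate with a free variable, and keeping a mapping from that variable back to the constant. The sketch below shows one way to represent this, assuming an event is a predicate name plus a tuple of arguments. The representation, the variable naming scheme `X0, X1, ...`, and the function `abstract_event` are hypothetical.

```python
def abstract_event(predicate, args, constants):
    """Replace chosen constants with free variables; return (abstracted predicate, mapping)."""
    var_for = {}  # constant -> free variable name
    abstracted_args = []
    for arg in args:
        if arg in constants:
            if arg not in var_for:
                var_for[arg] = f"X{len(var_for)}"
            abstracted_args.append(var_for[arg])
        else:
            abstracted_args.append(arg)
    mapping = {var: const for const, var in var_for.items()}
    return (predicate, tuple(abstracted_args)), mapping


# Hypothetical trace events that mention the same file handle.
opened, opened_map = abstract_event("open", ("file42", "read"), {"file42"})
closed, closed_map = abstract_event("close", ("file42",), {"file42"})
print(opened, opened_map)
print(closed, closed_map)

# Two abstracted events can be related when their variables bind the same constant.
same_object = opened_map["X0"] == closed_map["X0"]
print(same_object)
```

A mining algorithm run over the abstracted predicates can then find patterns such as "open(X0, read) is followed by close(X0)" without being tied to one specific file.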
NETWORK GRAPH EVOLUTION RULE GENERATION

Publication number: 20120246196

Abstract: A network's evolution is characterized by graph evolution rules. A graph that represents an evolutionary network is mined to identify evolutional patterns of the network, and graph evolution rules are generated using identified evolutional patterns. The generated graph evolution rules represent the evolutional patterns of the network.

Type: Application

Filed: June 6, 2012

Publication date: September 27, 2012

Inventors: Francesco Bonchi, Aristides Gionis, Michele Berlingerio, Björn Bringmann
Clique based clustering for named entity recognition system

Patent number: 8275608

Abstract: A soft clustering method comprises (i) grouping items into non-exclusive cliques based on features associated with the items, and (ii) clustering the non-exclusive cliques using a hard clustering algorithm to generate item groups on the basis of mutual similarity of the features of the items constituting the cliques. In some named entity recognition embodiments illustrated herein as examples, named entities together with contexts are grouped into cliques based on mutual context similarity. Each clique includes a plurality of different named entities having mutual context similarity. The cliques are clustered to generate named entity groups on the basis of mutual similarity of the contexts of the named entities constituting the cliques.

Type: Grant

Filed: July 3, 2008

Date of Patent: September 25, 2012

Assignee: Xerox Corporation

Inventors: Julien Ah-Pine, Guillaume Jacquet
NAMED ENTITY DATABASE OR MINING RULE DATABASE UPDATE APPARATUS AND METHOD USING NAMED ENTITY DATABASE AND MINING RULE MERGED ONTOLOGY SCHEMA

Publication number: 20120233213

Abstract: A mining rule database update apparatus using a named entity dictionary and a mining rule combined with an ontology schema includes: a named entity dictionary and mining rule database storage module storing the named entity dictionary and a mining rule database; a mining pattern generation module recognizing terminology from text and converting the terminology into the mining pattern; a named entity and mining rule search module searching for a corresponding named entity and a mining rule from the named entity dictionary and the mining rule database using the recognized terminology and the mining pattern; and a mining rule database update module estimating a relationship name using a named entity of the recognized terminology and the ontology schema, generating a corresponding mining rule, and storing the generated mining rule in the mining rule database depending on a user's selection.

Type: Application

Filed: May 22, 2012

Publication date: September 13, 2012

Applicant: KOREA INSTITUTE OF SCIENCE & TECHNOLOGY INFORMATION

Inventors: Han Min JUNG, Pyung KIM, Seung Woo LEE, Mi Kyung LEE, Dong Min SEO, Won Kyung SUNG
Processing Medical Records

Publication number: 20120233215

Abstract: Processing medical records may be provided. First, medical records may be received from a plurality of sources. The medical records may then be converted into a computer-readable form. Once converted, the medical records may be searched for certain key words, phrases, or symbols. These searched key words, phrases and symbols may correspond to data of interest within the medical records. Once located, the searched key words, phrases and symbols may be extracted from the medical records, as well as an area of the records surrounding the located key words, phrases and symbols. Finally, the extracted data may be used to generate a summary report.

Type: Application

Filed: March 10, 2011

Publication date: September 13, 2012

Inventor: Everett Darryl Walker
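The medical-records abstract above (publication 20120233215) describes searching converted records for key words or phrases, extracting each hit together with the surrounding area of the record, and generating a summary report. A minimal sketch of the search-and-extract step follows, assuming the records are already plain text and that the "surrounding area" is a fixed character window. The window size, function names, and sample text are illustrative assumptions.

```python
import re


def extract_with_context(text, keywords, window=30):
    """Find each keyword (case-insensitive) and return it with surrounding text."""
    hits = []
    for keyword in keywords:
        for match in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
            start = max(0, match.start() - window)
            end = min(len(text), match.end() + window)
            hits.append(
                {"keyword": keyword, "position": match.start(), "context": text[start:end]}
            )
    return sorted(hits, key=lambda hit: hit["position"])


def summary_report(hits):
    """Render one line per hit, in document order."""
    return "\n".join(f'{hit["keyword"]}: ...{hit["context"]}...' for hit in hits)


# Hypothetical record text, not real patient data.
record = "Patient reports chest pain. BP 140/90 recorded on admission."
hits = extract_with_context(record, ["BP", "chest pain"], window=10)
print(summary_report(hits))
```

A production system would likely match on word boundaries and handle scanned documents through OCR first; neither is shown here.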
NAMED ENTITY DATABASE OR MINING RULE DATABASE UPDATE APPARATUS AND METHOD USING NAMED ENTITY DATABASE AND MINING RULE MERGED ONTOLOGY SCHEMA

Publication number: 20120233214

Abstract: A mining rule database update apparatus using a named entity dictionary and a mining rule combined with an ontology schema includes: a named entity dictionary and mining rule database storage module storing the named entity dictionary and a mining rule database; a named entity and mining rule search module searching for a corresponding mining rule and a named entity from the mining rule database and the named entity dictionary using a terminology included in an inputted mining pattern and the mining pattern; and a mining rule database update module estimating a relationship name using a named entity of the terminology and the ontology schema, generating a corresponding mining rule, and storing the generated mining rule in the mining rule database depending on user's selection.

Type: Application

Filed: May 22, 2012

Publication date: September 13, 2012

Applicant: KOREA INSTITUTE OF SCIENCE & TECHNOLOGY INFORMATION

Inventors: Han Min Jung, Pyung Kim, Seung Woo Lee, Mi Kyung Lee, Dong Min Seo, Won Kyung Sung
METHOD AND APPARATUS FOR WORD QUALITY MINING AND EVALUATING

Publication number: 20120221602

Abstract: A method and an apparatus for word quality mining and evaluating are disclosed. The method includes: calculating a Document Frequency (DF) of a word in mass categorized data; evaluating the word in multiple single-aspects according to the DF of the word; and evaluating the word in multiple aspects according to the multiple single aspect evaluations to obtain an importance weight of the word. According to the solution of the present invention, the importance of the word in the mass categorized data may be evaluated, and words with high quality may be obtained through an integrated evaluation.

Type: Application

Filed: May 7, 2012

Publication date: August 30, 2012

Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Huaijun Liu, Zhongbo Jiang, Gaolin Fang
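The word-quality abstract above (publication 20120221602) starts from the document frequency (DF) of a word in categorized data, scores the word on several single aspects, and combines those into an importance weight. The sketch below computes DF per category and then combines two single-aspect scores. The two aspects (category concentration and rarity) and their combination by multiplication are assumptions chosen for illustration; the application's actual measures are not given in the abstract.

```python
from collections import Counter


def document_frequency(docs_by_category):
    """Count, overall and per category, how many documents contain each word."""
    overall = Counter()
    per_category = {}
    for category, docs in docs_by_category.items():
        counts = Counter()
        for doc in docs:
            counts.update(set(doc.lower().split()))  # each doc counts a word once
        per_category[category] = counts
        overall.update(counts)
    return overall, per_category


def importance_weight(word, overall, per_category, n_docs):
    """Combine two assumed single-aspect scores into one weight."""
    total = overall[word]
    if total == 0:
        return 0.0
    # Aspect 1 (assumed): share of the word's DF that falls in its strongest category.
    concentration = max(counts[word] for counts in per_category.values()) / total
    # Aspect 2 (assumed): rarity, so words present in every document score zero.
    rarity = 1.0 - total / n_docs
    return concentration * rarity


# Hypothetical categorized corpus.
corpus = {
    "sports": ["the match was great", "the team won the match"],
    "tech": ["the phone is great", "the chip is fast"],
}
overall, per_category = document_frequency(corpus)
n_docs = sum(len(docs) for docs in corpus.values())
for word in ("match", "great", "the"):
    print(word, overall[word], importance_weight(word, overall, per_category, n_docs))
```

In this toy corpus a category-specific word ("match") outranks a word spread evenly across categories ("great"), and a word that appears everywhere ("the") gets zero weight.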
Search relevance in business intelligence systems through networked ranking

Patent number: 8250024

Abstract: Method and system for optimizing search results in a business intelligence system. A member is selected in the business intelligence system having a user space, a content space, a data space, a master-data space and a metadata space. A relationship is determined between the member and a plurality of objects in the user space, the content space, the data space, the master-data space, or the metadata space. A ranking of the member is calculated based on the relationship. A relevance of the member in the business intelligence system is calculated using the ranking, thereby optimizing search results of the business intelligence system using the relevance of the object.

Type: Grant

Filed: May 12, 2009

Date of Patent: August 21, 2012

Assignee: International Business Machines Corporation

Inventors: Graham Douglas Mackintosh, John Andrew Kowal
REAL-TIME INFORMATION MINING

Publication number: 20120209879

Abstract: Embodiments of the invention are related to identifying a user's intent dynamically from at least a set of metadata associated with the user, wherein the set of metadata is associated with a user input, and providing to the user a set of labeled instances on determination of a user's intent, the set of labeled instances being directly related to user's intent, where the set of labeled instances are obtained in real-time from a set of information repositories.

Type: Application

Filed: February 11, 2011

Publication date: August 16, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Nilanjan Banerjee, Dipanjan Chakraborty, Anupam Joshi, Sumit Mittal, Seema Nagar, Angshu Rai, Koustuv Dasgupta
METHOD OF CONSTRUCTING A MIXTURE MODEL

Publication number: 20120209880

Abstract: A method of constructing a general mixture model of a dataset includes partitioning the dataset into at least two subsets according to predefined criteria, generating a subset mixture model for each of the at least two subsets, and then combining the mixture models from each subset to generate a general mixture model.

Type: Application

Filed: February 15, 2011

Publication date: August 16, 2012

Applicant: GENERAL ELECTRIC COMPANY

Inventors: Robert Edward Callan, Brian Larder
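The mixture-model abstract above (publication 20120209880) describes partitioning a dataset, fitting a mixture model to each subset, and combining the subset models into a general mixture model. A minimal sketch of the partition-and-combine structure follows. To stay within the standard library, each subset model here is a single Gaussian component (a one-component mixture), and components are combined by scaling their weights by the subset's share of the data. Both simplifications, and all names, are assumptions rather than the method in the application.

```python
from statistics import fmean, pvariance


def fit_subset_model(values):
    """Stand-in subset model: a mixture with one Gaussian component."""
    return [{"weight": 1.0, "mean": fmean(values), "var": pvariance(values)}]


def build_general_mixture(dataset, partition_key):
    """Partition the data, fit each subset, and merge components into one mixture."""
    subsets = {}
    for x in dataset:
        subsets.setdefault(partition_key(x), []).append(x)

    total = len(dataset)
    general = []
    for values in subsets.values():
        share = len(values) / total  # subset's fraction of the whole dataset
        for component in fit_subset_model(values):
            general.append({**component, "weight": component["weight"] * share})
    return general


# Hypothetical one-dimensional data, partitioned by a predefined threshold.
data = [1.0, 2.0, 3.0, 10.0, 12.0]
mixture = build_general_mixture(data, partition_key=lambda x: x >= 5)
for component in mixture:
    print(component)
```

Because each subset's component weights sum to one and the shares sum to one, the combined weights also sum to one, so the result is a valid mixture.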
Identifying Task Groups for Organizing Search Results

Publication number: 20120209835

Abstract: Computer-readable media and computerized methods for automatically organizing search results according to task groups are provided. The methods involve aggregating a gallery of entities (e.g., search queries that share a common categorization) into a query class and assigning a dictionary (e.g., list of terms that are drawn from various sources) to the query class. The task groups are identified from the list of terms within the dictionary. The process of identification includes analyzing patterns of user search behavior to select terms from the list of terms, which reflect popular user search intents, and ranking the selected terms based on predetermined parameters to produce an ordering. Based on the ordering, a set of the selected terms that are highest ranked are declared the task groups. The task groups are employed to arrange the search results on a UI display and to provide a consistent and intuitive format for refining a search.

Type: Application

Filed: April 24, 2012

Publication date: August 16, 2012

Applicant: MICROSOFT CORPORATION

Inventors: Sanaz Ahari, Xiaoxin Yin, Farid Hosseini, Sarthak Shah, Adam Troy, Dan Fain, Brian MacDonald, Nikhil Dandekar, Michael Cameron
SYSTEM, METHOD, AND COMPUTER READABLE MEDIA FOR IDENTIFYING A USER-INITIATED LOG FILE RECORD IN A LOG FILE

Publication number: 20120209882

Abstract: A system, a method, and a computer readable media for identifying a user-initiated log file record in a log file are provided. The log file has a user-initiated log file record and a repeating pattern of log file records automatically generated by a software program. The system allows a user to identify first and second timestamp values corresponding to first and second times which identify a time interval of interest in the log file. The system further analyzes the log file to identify the user-initiated log file record having a timestamp value between the first and second timestamp values. The system further identifies the repeating pattern of log file records in the log file.

Type: Application

Filed: April 26, 2012

Publication date: August 16, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Danny Yen-Fu Chen, David A. Cox, Sheryl S. Kinstler, Fabian F. Morgan
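A rough sketch of the idea in the abstract above: select records inside a user-chosen time interval and separate them from the automatically generated repeating pattern. The record layout `(timestamp, message)`, the function name, the inclusive interval bounds, and the rule "a message that recurs is automated" are all assumptions made for illustration; the abstract does not say how the repeating pattern is detected.

```python
from collections import Counter
from datetime import datetime


def split_records(records, start, end):
    """Return (user_initiated, repeating) for a list of (timestamp, message)."""
    counts = Counter(message for _, message in records)
    # Assumption: any message seen more than once belongs to the repeating pattern.
    repeating = {m for m, n in counts.items() if n > 1}
    user_initiated = [
        (ts, m) for ts, m in records
        if start <= ts <= end and m not in repeating
    ]
    return user_initiated, repeating


log = [
    (datetime(2012, 4, 26, 9, 0), "heartbeat"),
    (datetime(2012, 4, 26, 9, 5), "user clicked Export"),
    (datetime(2012, 4, 26, 9, 10), "heartbeat"),
    (datetime(2012, 4, 26, 9, 20), "heartbeat"),
]
found, pattern = split_records(
    log, datetime(2012, 4, 26, 9, 1), datetime(2012, 4, 26, 9, 15)
)
```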
METHODS, SYSTEMS AND SOFTWARE FOR SEARCHING ACRONYMS, PHRASES, AND WORD GROUPINGS IN ELECTRONIC DOCUMENTS

Publication number: 20120209881

Abstract: Methods, systems and software for searching electronic documents allow a user to enter a single or multiple letters of a word group, name, phrase, or the like, and see hits that include the data inputted. The search tool solves the problem of finding the full words and ultimately the meaning of an acronym when reading a web page, word processing document, or other electronic searchable material. The search tool also solves the problem of searching through a document for particular word groups, phrases, and names, and may be especially useful where the exact spelling is unknown. The search tool may allow consumers the ability to search based on one or more characters of each name or word independently of the remaining characters in the name, phrase or word search.

Type: Application

Filed: February 10, 2012

Publication date: August 16, 2012

Inventor: Bruce Eliot Ross
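One way to picture the search described in the abstract above is a regular expression that matches consecutive words by their leading characters only. This is a hedged sketch: `find_word_groups` is a hypothetical helper, and the matching rules (whitespace-separated words, case-insensitive prefixes) are assumptions rather than details from the application.

```python
import re


def find_word_groups(text, prefixes):
    """Find runs of consecutive words whose beginnings match the given prefixes."""
    # One sub-pattern per word: a word boundary, the typed prefix, then the rest of the word.
    pattern = r"\s+".join(r"\b" + re.escape(p) + r"\w*" for p in prefixes)
    return [m.group(0) for m in re.finditer(pattern, text, flags=re.IGNORECASE)]


text = "The Resource Description Framework (RDF) is a data model."
hits = find_word_groups(text, ["r", "d", "f"])
```

Searching with the letters of an acronym expands it to the full phrase when that phrase appears in the document, and longer prefixes such as `["res", "desc"]` narrow the match when the exact spelling is unknown.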
Network architecture with load balancing, fault tolerance and distributed querying

Patent number: 8239535

Abstract: A network architecture with load balancing, fault tolerance and distributed querying comprises a plurality of front-end servers, a plurality of back-end servers, and a database. The front-end servers are coupled to a network to receive data requests from client devices. The front-end servers are each coupled to the plurality of back-end servers. The front-end servers handle data requests at a macro level and divide the request into sub-requests that are sent to the plurality of back-end servers. The back-end servers are coupled to the database to retrieve data. Each data request is distributed across the plurality of back-end servers according to workload. The front-end servers are fault tolerant in that they can respond to a request for data without all of the back-end servers being responsive or providing data.

Type: Grant

Filed: December 20, 2005

Date of Patent: August 7, 2012

Assignee: Adobe Systems Incorporated

Inventors: Christopher Reid Error, Michael Paul Bailey
System and method for determining driving factor in a data cube

Patent number: 8234298

Abstract: Method and system for determining a driving factor for a data value of interest in a multidimensional database, by collecting a context for the data value of interest in the multidimensional database. The data value of interest has dimensional levels with dimensional members outside the drill path of the data value of interest. The dimensional levels are enumerated in a list. A query using the dimensional members of the dimensional level is executed. A variance is calculated for the set of query results. A driving factor for the data value of interest is determined based on the variance. The driving factor is added to the context of the data value of interest.

Type: Grant

Filed: July 25, 2007

Date of Patent: July 31, 2012

Assignee: International Business Machines Corporation

Inventors: Stewart James Winter, Randy Mark Westman, Murray John Reid, Andrew Alexander Leikucs, William Todd MacCulloch
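The variance step in the abstract above can be illustrated with a small sketch. It assumes the query results for each dimensional level are already available as a list of measure values, one per dimensional member, and that the level with the highest variance is taken as the driving factor; the abstract only says the factor is determined "based on the variance", so that selection rule and the function name are assumptions.

```python
from statistics import pvariance


def driving_factor(results_by_level):
    """results_by_level maps a dimensional level to its per-member measure values."""
    variances = {
        level: pvariance(values) for level, values in results_by_level.items()
    }
    # Assumption: the level whose members differ the most drives the value of interest.
    best = max(variances, key=variances.get)
    return best, variances


level, variances = driving_factor({
    "Region": [100, 102, 98],
    "Product": [20, 150, 130],
})
```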
Cross-domain clusterability evaluation for cross-guided data clustering based on alignment between data domains

Patent number: 8229929

Abstract: A system and associated method for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.

Type: Grant

Filed: January 6, 2010

Date of Patent: July 24, 2012

Assignee: International Business Machines Corporation

Inventors: Jeffrey M. Achtermann, Indrajit Bhattacharya, Kevin W. English, Jr., Shantanu R. Godbole, Sachindra Joshi, Ashwin Srinivasan, Ashish Verma
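The score described in the abstract above is simple enough to write down directly. The sketch below assumes the linear combination takes the convex form `a * target + (1 - a) * matchability` with a trade-off parameter `a` in [0, 1]; the abstract confirms a linear combination and an averaged matchability but not this exact weighting, and the function names are hypothetical.

```python
def pair_matchability(target_side, source_side):
    """Average of target-side and source-side matchability, as in the abstract."""
    return (target_side + source_side) / 2.0


def cross_domain_clusterability(target_clusterability, target_side, source_side,
                                trade_off=0.5):
    """Assumed convex combination of target clusterability and pair matchability."""
    if not 0.0 <= trade_off <= 1.0:
        raise ValueError("trade_off must lie in [0, 1]")
    match = pair_matchability(target_side, source_side)
    return trade_off * target_clusterability + (1.0 - trade_off) * match


score = cross_domain_clusterability(0.8, 0.6, 0.4, trade_off=0.25)
```

With `trade_off=1.0` the score depends only on how clusterable the target domain is; with `trade_off=0.0` it depends only on how well the two domains' centroids align.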
Text mining device, text mining method, and text mining program

Patent number: 8229956

Abstract: With respect to each part at which a word included in a characteristic condition defining a characteristic text set designated by a user through the input device appears in text, the characteristic condition assurance degree calculating unit of the text mining device obtains a reliability of the word from the word reliability storage unit to operate a value of a characteristic condition assurance degree for each text by predetermined operation based on all the obtained reliabilities. The characteristic condition assurance degree calculating unit executes operation such that when a value of each reliability is large, a value of a degree of assurance becomes large. The representative text output unit outputs text whose characteristic condition assurance degree is the highest among texts whose characteristic condition assurance degrees are calculated together with its characteristic condition assurance degree.

Type: Grant

Filed: November 30, 2006

Date of Patent: July 24, 2012

Assignee: NEC Corporation

Inventors: Takahiro Ikeda, Yoshihiro Ikeda, legal representative, Satoshi Nakazawa, Yousuke Sakao, Kenji Satoh
Method and apparatus for distribution-independent outlier detection in streaming data

Patent number: 8224622

Abstract: The present invention relates to an iterative method and an apparatus for distribution-independent detection of intermediate outliers and outliers in the distribution tail of streamed data. A considerable sequence of streamed data is sequentially read and subsequently assigned to matching bins. The bins are adaptively allocated when, where and if they are needed. Each bin range expands concurrently with the distribution range of the accumulating items assigned to the bin, adding a margin. For every N'th read item, overlapping or adjoining bins are merged, whereupon the bins are assessed for insider preclusion. Information regarding outliers is extracted from the remaining outlier bins when the entire data sequence has been processed.

Type: Grant

Filed: July 27, 2009

Date of Patent: July 17, 2012

Assignee: Telefonaktiebolaget L M Ericsson (Publ)

Inventors: N Hari Kumar, J Mohamed Zahoor
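A heavily simplified sketch of the binning loop in the abstract above: items join a bin whose margin-extended range covers them, new bins are allocated on demand, overlapping or adjoining bins are merged every N items, and sparsely populated bins are reported at the end. The fixed `margin`, the `min_count` threshold standing in for "insider preclusion", and all names are assumptions; the abstract does not give these details.

```python
def merge_bins(bins, margin):
    """Merge bins whose ranges overlap or lie within `margin` of each other."""
    merged = []
    for low, high, items in sorted(bins, key=lambda b: b[0]):
        if merged and low <= merged[-1][1] + margin:
            merged[-1][1] = max(merged[-1][1], high)
            merged[-1][2].extend(items)
        else:
            merged.append([low, high, list(items)])
    return merged


def detect_outliers(stream, margin=1.0, merge_every=4, min_count=3):
    """Return items left in sparsely populated bins after one pass over the stream."""
    bins = []  # each bin is [low, high, items]
    for i, x in enumerate(stream, start=1):
        for b in bins:
            if b[0] - margin <= x <= b[1] + margin:
                # The bin range expands with the items assigned to it.
                b[0], b[1] = min(b[0], x), max(b[1], x)
                b[2].append(x)
                break
        else:
            bins.append([x, x, [x]])  # allocate a bin only when needed
        if i % merge_every == 0:
            bins = merge_bins(bins, margin)
    bins = merge_bins(bins, margin)
    # Assumption: bins holding at least min_count items are "insiders".
    return sorted(x for b in bins if len(b[2]) < min_count for x in b[2])


outliers = detect_outliers([10, 11, 10.5, 50, 9.8, 10.2, 11.3, -30, 10.9])
```

No distribution is assumed anywhere in the loop; outliers are identified purely by how few neighbours share their bin.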
System, method, and computer readable media for identifying a user-initiated log file record in a log file

Patent number: 8219582

Abstract: A system, a method, and a computer readable media for identifying a user-initiated log file record in a log file are provided. The log file has a user-initiated log file record and a repeating pattern of log file records automatically generated by a software program. The system allows a user to identify first and second timestamp values corresponding to first and second times which identify a time interval of interest in the log file. The system further analyzes the log file to identify the user-initiated log file record having a timestamp value between the first and second timestamp values. The system further identifies the repeating pattern of log file records in the log file.

Type: Grant

Filed: April 25, 2008

Date of Patent: July 10, 2012

Assignee: International Business Machines Corporation

Inventors: Danny Yen-Fu Chen, David A. Cox, Sheryl S. Kinstler, Fabian F. Morgan
Methods and systems for mining websites

Patent number: 8219583

Abstract: Mining of websites that in one embodiment includes obtaining web usage data of user sessions of a website, wherein the website has a hierarchical structure with granular levels and has mapping from each webpage of the website into the hierarchical structure, mapping the user sessions to the hierarchical structure of the website resulting in hierarchical user sessions, initiating an edit distance metrics to determine similarity in the hierarchical user sessions, and clustering similar hierarchical user sessions into groups.

Type: Grant

Filed: November 10, 2008

Date of Patent: July 10, 2012

Assignee: NBCUniversal Media, LLC

Inventors: Abha Moitra, Steven Matt Gustafson, Feng Xue
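The pipeline in the abstract above (map sessions onto the site hierarchy, compare them with an edit distance, cluster similar ones) might look like the sketch below. The page-to-hierarchy mapping, the greedy clustering rule, and all function names are hypothetical; the abstract does not specify which edit distance variant or clustering algorithm is used, so plain Levenshtein distance is substituted here.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def to_hierarchical(session, page_to_node, level):
    """Replace each page with its hierarchy path truncated to `level` segments."""
    return [tuple(page_to_node[page][:level]) for page in session]


def cluster_sessions(sessions, max_distance):
    """Greedy grouping: join the first cluster whose first member is close enough."""
    clusters = []
    for s in sessions:
        for c in clusters:
            if edit_distance(s, c[0]) <= max_distance:
                c.append(s)
                break
        else:
            clusters.append([s])
    return clusters


page_to_node = {
    "/news/a": ("news", "politics"), "/news/b": ("news", "world"),
    "/sports/x": ("sports", "nba"), "/sports/y": ("sports", "nfl"),
}
raw = [["/news/a", "/news/b"], ["/news/b", "/news/a"], ["/sports/x", "/sports/y"]]
sessions = [to_hierarchical(s, page_to_node, level=1) for s in raw]
groups = cluster_sessions(sessions, max_distance=1)
```

Choosing a coarser `level` makes sessions that visit different pages in the same section look identical, which is the point of comparing sessions at a chosen granularity.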
System and method for determining valid citation patterns in electronic documents

Patent number: 8219566

Abstract: A system and method are provided for comparing portions of document text with potential citation components, determining if individual portions correspond to a citation component, and determining if a set of portions correspond to a valid citation pattern. A set of valid citation patterns is provided. Each citation pattern may include a specified combination of citation components. The invention further relates to identifying potential citation components from text in a document, analyzing a pattern of the identified citation components by comparing the pattern to a set of stored citation patterns to determine if the potential citation is a type of citation, and if so, is it a valid (and/or invalid) citation pattern. Once citation patterns have been determined in the document, annotations may be inserted into the document, and subsequent action may be taken, for example, generating a list of citations, providing research services, error-handling, and/or providing other options related to the citations.

Type: Grant

Filed: August 30, 2011

Date of Patent: July 10, 2012

Assignee: Litera Corp.

Inventor: Tony Rolle
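To make the two-stage check in the abstract above concrete, the sketch below first labels each token as a potential citation component and then compares the resulting sequence with a stored set of valid patterns. The component names, the tiny reporter list, and the two patterns are hypothetical examples chosen for illustration, not the patent's actual rule set.

```python
import re

# Hypothetical component recognizers, tried in order.
COMPONENTS = [
    ("YEAR", re.compile(r"^\(\d{4}\)$")),
    ("NUMBER", re.compile(r"^\d+$")),
    ("REPORTER", re.compile(r"^(U\.S\.|F\.2d|F\.3d)$")),
]

# Hypothetical stored patterns: volume, reporter, page, optional year.
VALID_PATTERNS = {
    ("NUMBER", "REPORTER", "NUMBER"),
    ("NUMBER", "REPORTER", "NUMBER", "YEAR"),
}


def label_components(text):
    """Label each whitespace-separated token with the first matching component."""
    labels = []
    for token in text.split():
        for name, rx in COMPONENTS:
            if rx.match(token):
                labels.append(name)
                break
        else:
            labels.append("OTHER")
    return tuple(labels)


def is_valid_citation(text):
    """True when the component sequence matches one of the stored patterns."""
    return label_components(text) in VALID_PATTERNS


ok = is_valid_citation("410 U.S. 113 (1973)")
```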
Domain independent system and method of automating data aggregation

Patent number: 8214392

Abstract: A computer automated method and system of presenting data. The method may include the steps of inputting a set of user-defined instructions into a remotely located computer database system via a public network connection, inputting a user query into the computer database system via the public network connection, mining the computer database system for data relevant to the user query, creating a data set comprising the data relevant to the user query, and aggregating data in the data set using domain metrics selected based on any of predefined and configurable rules and past user usage. The aggregation may further include tagging all data attributes in the data set based on database metadata and inputs from a user, wherein the data attributes comprise any of data identifications (IDs), data grouping attributes, and data measure attributes.

Type: Grant

Filed: December 22, 2010

Date of Patent: July 3, 2012

Assignee: Semantifi, Inc.

Inventors: Sreenivasa R Pragada, Viswanath Dasari
Knowledge-based data mining system

Patent number: 8214391

Abstract: In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.

Type: Grant

Filed: May 8, 2002

Date of Patent: July 3, 2012

Assignee: International Business Machines Corporation

Inventors: Matthew Denesuk, Daniel Frederick Gruhl, Kevin Snow McCurley, Sridhar Rajagopalan, Andrew S. Tomkins
SYSTEM, METHOD AND COMPUTER PROGRAM FOR MULTI-DIMENSIONAL TEMPORAL DATA MINING

Publication number: 20120166484

Abstract: The present invention provides a system, method and computer program for multi-dimensional temporal abstraction and data mining. The invention comprises collecting and optionally cleaning multi-dimensional data, the multi-dimensional data including a plurality of data streams; temporally abstracting the multi-dimensional data; and relatively aligning the temporally abstracted multi-dimensional data based on a shared time point of interest.

Type: Application

Filed: July 22, 2010

Publication date: June 28, 2012

Inventor: Carolyn Patricia McGregor
LARGE-SCALE EVENT EVALUATION USING REALTIME PROCESSORS

Publication number: 20120158783

Abstract: Large-scale event processing systems are often designed to perform data mining operations by storing a large set of events in a massive database, applying complex queries to the records of the events, and generating reports and notifications. However, because such queries are performed on very large data sets, the processing of the queries often introduces a significant delay between the occurrence of the events and the reporting or notification thereof. Instead, a large-scale event processing system may be devised as a large state machine organized according to an evaluation plan, comprising a graph of event processors that, in realtime, evaluate each event in an event stream to update an internal state of the event processor, and to perform responses when response conditions are met. The continuous monitoring and evaluation of the stream of events may therefore enable the event processing system to provide realtime responses and notifications of complex queries.

Type: Application

Filed: December 20, 2010

Publication date: June 21, 2012

Applicant: Microsoft Corporation

Inventors: Nir Nice, Daniel Sitton, Dror Kremer, Michael Feldman
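The contrast the abstract above draws, evaluating each event as it arrives instead of querying a stored data set afterwards, can be sketched with a small stateful processor. The class, the event layout (a dict with a `"type"` key), and the threshold condition are assumptions for illustration, and a flat list of processors stands in for the graph-shaped evaluation plan the abstract describes.

```python
class ThresholdProcessor:
    """Counts matching events and responds once a threshold is reached."""

    def __init__(self, event_type, threshold, respond):
        self.event_type = event_type
        self.threshold = threshold
        self.respond = respond
        self.count = 0  # internal state updated per event

    def evaluate(self, event):
        if event["type"] == self.event_type:
            self.count += 1
            if self.count == self.threshold:
                # The response condition is met while the stream is still flowing.
                self.respond(self.event_type, self.count)


def run(stream, processors):
    """Feed every event to every processor as it arrives."""
    for event in stream:
        for processor in processors:
            processor.evaluate(event)


alerts = []
processor = ThresholdProcessor(
    "login_failed", 3, lambda kind, n: alerts.append((kind, n))
)
events = [{"type": t} for t in
          ["login_failed", "page_view", "login_failed", "login_failed", "page_view"]]
run(events, [processor])
```

No event is stored: the processor keeps only a counter, so the notification fires on the third failure without any query over historical records.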
Network graph evolution rule generation

Patent number: 8204904

Abstract: A network's evolution is characterized by graph evolution rules. A graph, formed by merging multiple graphs representing the multiple snapshots of the network, that represents an evolutionary network is mined to identify evolutional patterns of the network. A pattern is selected from the identified patterns. Graph evolution rules are generated using identified evolutional patterns. The generated graph evolution rules represent the evolutional patterns of the network, the rules indicating that any occurrence of a child pattern of the selected pattern implies a corresponding occurrence of the selected pattern.

Type: Grant

Filed: September 30, 2009

Date of Patent: June 19, 2012

Assignee: Yahoo! Inc.

Inventors: Francesco Bonchi, Aristides Gionis, Michele Berlingerio, Björn Bringmann
Encoding Data Stored in a Column-Oriented Manner

Publication number: 20120143913

Abstract: Data stored in a column-oriented manner is encoded using a data mining algorithm for finding column patterns among a set of data tuples, where each data tuple contains a set of columns, and the data mining algorithm treats all columns and all column combinations and column ordering similarly or in the same manner when looking for column patterns. Column values are ordered occurring in the column patterns based on their frequencies into a prefix tree, where the prefix tree defines a pattern order. The data tuples are sorted according to the pattern order, resulting in sorted data tuples, and columns of the sorted data tuples are encoded using run-length encoding.

Type: Application

Filed: August 10, 2011

Publication date: June 7, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Felix Beier, Oliver Draese, Knut Stolze
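The final two steps in the abstract above, sorting tuples so that equal column values sit next to each other and then run-length encoding each column, can be sketched as follows. The sort key used here (per-column value frequency, most frequent first) is a simplified stand-in for the prefix-tree pattern order the abstract describes, and the function names are hypothetical.

```python
from collections import Counter
from itertools import groupby


def run_length_encode(column):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in group)) for value, group in groupby(column)]


def encode_columns(tuples):
    """Sort tuples by column-value frequency, then run-length encode each column."""
    n_cols = len(tuples[0])
    freq = [Counter(t[c] for t in tuples) for c in range(n_cols)]
    # Assumption: frequent values first, ties broken by the value itself.
    ordered = sorted(
        tuples,
        key=lambda t: tuple((-freq[c][t[c]], t[c]) for c in range(n_cols)),
    )
    return [run_length_encode([t[c] for t in ordered]) for c in range(n_cols)]


rows = [("DE", "red"), ("US", "blue"), ("DE", "blue"), ("DE", "red"), ("US", "red")]
encoded = encode_columns(rows)
```

Sorting first matters because run-length encoding only saves space when equal values are adjacent; in the example the first column shrinks from five values to two runs.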
