Data Mining Patents (Class 707/776)
-
Patent number: 8311792
Abstract: A method for training a ranking application. The method includes ranking the help postings to create an initial ranking using initial parameter values, and storing user interactions with the help postings to obtain stored interactions. Simulations are performed using the stored interactions to generate revised parameter values for the ranking application. Performing the simulations includes calculating relevance values from the stored interactions, creating a test posting, assigning, to the test posting, an initial score and a relevance value randomly selected from the relevance values to generate a test ranking, and simulating user interactions with the test ranking to generate simulated rankings. The simulated rankings are analyzed to obtain revised parameter values. The method further includes ranking, using the revised parameter values, the help postings to generate a revised ranking, and displaying the help postings in the forum according to the revised ranking.
Type: Grant
Filed: December 23, 2009
Date of Patent: November 13, 2012
Assignee: Intuit Inc.
Inventors: Igor A. Podgorny, Floyd J. Morgan, Derek Szydlowski
-
Patent number: 8312041
Abstract: RDF network construction device and method using an ontology schema having class dictionaries and mining rules. The RDF network construction device includes an ontology schema storing module, a class managing module, a mining rule managing module, a mining pattern creating module, and an RDF triple creating module.
Type: Grant
Filed: October 5, 2010
Date of Patent: November 13, 2012
Assignee: Korea Institute of Science and Technology Information
Inventors: Han Min Jung, Pyung Kim, Seung Woo Lee, Mi Kyung Lee, Dong Min Seo, Won Kyung Sung
-
Publication number: 20120284198
Abstract: Systems, methods and computer program products are provided for identifying patents of value for acquisition. Through analysis of actions by patent holders, accused infringers, competitors to patent holders, non-practicing entities and challengers of patents, patents of potential economic value can be identified. The acquisition of these patent rights can facilitate the development of a valuable patent portfolio.
Type: Application
Filed: May 2, 2011
Publication date: November 8, 2012
Applicant: ARTICLE ONE PARTNERS HOLDINGS
Inventor: Cheryl Milone
-
Patent number: 8306997
Abstract: Sources of operational problems in business transactions often show themselves in relatively small pockets of data, which are called trouble hot spots. Identifying these hot spots from internal company transaction data is generally a fundamental step in the problem's resolution, but this analysis process is greatly complicated by huge numbers of transactions and large numbers of transaction variables to analyze. A suite of practical modifications are provided to data mining techniques and logistic regressions to tailor them for finding trouble hot spots. This approach thus allows the use of efficient automated data mining tools to quickly screen large numbers of candidate variables for their ability to characterize hot spots. One application is the screening of variables which distinguish a suspected hot spot from a reference set.
Type: Grant
Filed: May 27, 2011
Date of Patent: November 6, 2012
Assignee: Verizon Services Corp.
Inventor: James Howard Drew
-
Patent number: 8306945
Abstract: A method, system and medium for organizing and associating log records into logically related groups is described. One or more input sources from, possibly, different systems/subsystems are input to a log correlation method. As the log records are processed the fields are interrogated to determine which log records are related to each other. As further log records are processed more information about previously unidentifiable relationships is determined. After this later information is known, log records that could previously not be associated with any other log records are added to the existing association. The system engineer is therefore presented with the pertinent information for monitoring, administrating and diagnosing system activities.
Type: Grant
Filed: October 8, 2007
Date of Patent: November 6, 2012
Assignee: BMC Software, Inc.
Inventors: Larry Morris, Dale G. Wood
-
Publication number: 20120278361
Abstract: A system and method are provided for augmenting information in business directory databases and communicating with businesses. Using the enriched business directory database and Web mining technology, customized email messages are sent inviting businesses to enter their enriched business information into the directory or even subscribe to other paid services provided by the directory service.
Type: Application
Filed: July 8, 2012
Publication date: November 1, 2012
Inventors: Narendra Gupta, Mazin Gilbert, Benjamin J. Stern
-
Patent number: 8301653
Abstract: The present invention discloses a computer system for reporting online sessions and a computer enabled method utilizing the same. The computer system is made up of an icon that preferably appears on a user screen. The icon is capable of capturing a screen session on the user screen and saving it within a recording. The recording may then be communicated to a database server that is capable of extracting a plurality of target components from said recording, and is capable of storing them in a database. The database may contain a benchmark content of the plurality of target components. Target components may then be compared against the benchmark content in a variety of ways to determine whether the level of target components is above or below reasonable and socially accepted levels.
Type: Grant
Filed: January 25, 2010
Date of Patent: October 30, 2012
Inventors: Glenn Adamousky, Dennis Nagy
-
Patent number: 8301585
Abstract: A system includes reception of metadata associated with two or more measures, determination of a compatibility of the two or more measures based on the metadata, and determination of a first score associated with a first visualization based on the compatibility. Some aspects include determination of a second score associated with a second visualization based on the compatibility, comparison of the first score and the second score, and recommendation of one of the first visualization or the second visualization based on the comparison.
Type: Grant
Filed: December 7, 2009
Date of Patent: October 30, 2012
Assignee: Business Objects Software Limited
Inventors: Nicolas Mourey, Aurélien Theraud
-
Patent number: 8296300
Abstract: The present invention relates to a method for reconstructing protein database for identifying a protein and a method for screening a protein using the same, more precisely a method for reconstructing protein database, and a method for identifying a protein using the same. The method for reconstructing protein database and the method for identifying the protein of the invention are very useful for the investigation of endogenous proteins and their functions and interactions, and are further effectively used for the development of diagnostic and therapeutic agents for various diseases.
Type: Grant
Filed: August 18, 2006
Date of Patent: October 23, 2012
Assignee: Korea Basic Science Institute
Inventors: Kyung-Hoon Kwon, Jong Shin Yoo
-
Patent number: 8291058
Abstract: The present invention describes a system and method of extracting and storing data elements from network packets, thus performing the task of data mining. In one embodiment of the present invention incoming packets are decomposed one protocol layer at a time to extract data elements contained in the protocol headers. Layer-specific parsers perform deep packet inspection in order to extract data elements from upper-level protocols. Extracted data is arranged in rows, which are subsequently stored into a memory-based accumulator. After some length of time the accumulator is flushed to disk files. Another process reads the flushed disk files row-by-row, inserting each row into a relational database. Standard SQL operations are performed on the relational database in order to generate and display reports of the collected data.
Type: Grant
Filed: February 19, 2010
Date of Patent: October 16, 2012
Assignee: Intrusion, Inc.
Inventors: Tommy Joe Head, Daris A. Nevil
-
Publication number: 20120260188
Abstract: One or more techniques and/or systems are provided for identifying potential recipients for a communication (e.g., email, instant message, content sharing platform, etc.) a user is presently preparing based at least in part upon a user's communication history. That is, information about the user's interactions with past recipients of his/her communications are compared with information known about the present communication and potential recipients of the present communication are identified based upon this comparison. Moreover, in one embodiment, based upon the past interactions of the user with others, one or more communication groups can be identified and presented to the user. In this way, a user may select a communication group and recipients included in the communication group can be added as recipients for the communication the user is presently preparing without the user having to manually create such groups, for example.
Type: Application
Filed: April 6, 2011
Publication date: October 11, 2012
Applicant: Microsoft Corporation
Inventors: Seung-Hae Park, Suraj Samaranayake, Arcadiy Gregory Kantor, Sarah Filman, Omar H. Shahine, Piero Sierra, Stephen Liffick, Anthony Frey, Siddhartha Parmar
-
Publication number: 20120259890
Abstract: In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.
Type: Application
Filed: June 18, 2012
Publication date: October 11, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Matthew Denesuk, Daniel Frederick Gruhl, Sridhar Rajagopalan, Andrew S. Tomkins
-
Patent number: 8285728
Abstract: The present invention enables the construction and deployment of semantic search engines, which learn. Here, the learning necessarily occurs in unsupervised mode given the huge amount of text being searched and the resultant impracticality of providing human feedback. That is, it acquires a lexicon of synonyms and the context for their proper application by searching a moving window of text and compiling the results. No human operator is involved. This allows for the practical search, using machine learning techniques, of very large textual bases.
Type: Grant
Filed: August 24, 2010
Date of Patent: October 9, 2012
Assignee: The United States of America as represented by the Secretary of the Navy
Inventor: Stuart Rubin
-
Patent number: 8285744
Abstract: Indexing agents and/or data brokers are leveraged to provide search query results related to manufacturing processes. The indexing agents allow different manufacturing configuration data types to be “sub-indexed,” allowing them to be easily searched. In one instance, the sub-indices can be aggregated together to create an overall index to facilitate in query searches of the configuration data. Separate indexing agents can be utilized for indexing contents of the configuration components for the human-machine interface (HMI) and control system and the like. Data brokers can be employed to facilitate in responding to query searches by indexing/searching real-time process variables (tags) and historical data in persistent storage. A search engine can then be employed to aggregate the search results and present them to a user in a selectable fashion. User selected results are then rendered in the proper format and displayed to the user.
Type: Grant
Filed: September 30, 2005
Date of Patent: October 9, 2012
Assignee: Rockwell Automation Technologies, Inc.
Inventors: Eric G. Dorgelo, Kevin G. Gordon, Clifton H. Bromley, Douglas J. Reichard, Marc D. Semkow, Shafin A. Virji
-
Patent number: 8285715
Abstract: A system and method for displaying items by receiving a query, identifying query feature values for similarity features associated with the query, identifying items each having item feature values for similarity features thereof, for each of the identified items and for each of the similarity features, determining a feature distance between the query feature value and the item feature value, and presenting the identified items in a two dimensional array of cells. The array defines reference vectors corresponding to the similarity features. The identified items are positioned within the array relative to the origin cell, for any of the cells, by determining difference function values for the identified items each based on the feature distances of the item, the reference vectors and the coordinates of the cell, and placing one of the items in the cell based upon the determined difference function values.
Type: Grant
Filed: August 17, 2009
Date of Patent: October 9, 2012
Assignee: Ugmode, Inc.
Inventors: Arlo Mukai Faria, Ajeet Ganesh Shankar
-
Patent number: 8285745
Abstract: Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user and segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results and displaying one or more highly semantically relevant keyword sets after being filtered by a threshold.
Type: Grant
Filed: August 31, 2007
Date of Patent: October 9, 2012
Assignee: Microsoft Corporation
Inventors: Hua Li, HuaJun Zeng, Jian Hu, Zheng Chen, Jian Wang
-
Patent number: 8285748
Abstract: A method and apparatus for proactive information security management is described. In one embodiment, for example, a computer-implemented method for controlling access to sensitive information, the method comprising: maintaining access constraint data that can be used to control access to the sensitive information, wherein the access constraint data includes match pattern data and apply pattern data; receiving a semantic query from a querier requesting access to the sensitive information; based on the match pattern data, determining whether the semantic query should be constrained according to the apply pattern data; where said semantic query should be constrained according to the apply pattern data, rewriting the semantic query according to the apply pattern data to produce a rewritten query; executing the rewritten query against a database that contains the sensitive information; and returning any results of executing the rewritten query.
Type: Grant
Filed: May 26, 2009
Date of Patent: October 9, 2012
Assignee: Oracle International Corporation
Inventors: John S. Thomas, Aravind Yalamanchi, Idriss Mekrez, Matt Topper
-
Publication number: 20120254243
Abstract: Systems, methods, and media for generating fused risk scores for determining fraud in call data are provided herein. Some exemplary methods include generating a fused risk score used to determine fraud from call data by generating a fused risk score for a leg of call data, via a fuser module of an analysis system, the fused risk score being generated by fusing together two or more uniquely calculated fraud risk scores, each of the uniquely calculated fraud risk scores being generated by a sub-module of the analysis system; and storing the fused risk score in a storage device that is communicatively couplable with the fuser module.
Type: Application
Filed: March 8, 2012
Publication date: October 4, 2012
Inventors: Torsten Zeppenfeld, N. Nikki Mirghafori, Lisa Guerra, Richard Gutierrez, Anthony Rajakumar
-
Publication number: 20120254242
Abstract: Systems, methods, and computer-readable code stored on a non-transitory media for mining association rules include determining a minimum support threshold and a minimum confidence threshold for association rule mining; determining a sampling model; sampling transactions from a transaction dataset; mining association rules from the sampled transactions; and transmitting mined association rules.
Type: Application
Filed: May 19, 2011
Publication date: October 4, 2012
Applicant: INFOSYS TECHNOLOGIES LIMITED
Inventors: Balasubramanian Kanagasabapathi, K. Antony Arokia Durai Raj
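Illustration (not taken from the application): the steps named in this abstract, choosing support and confidence thresholds, sampling transactions, and mining rules from the sample, can be sketched in a few lines of Python. The function name `mine_rules`, the uniform random sampling, and the restriction to one-item-to-one-item rules are all assumptions made for brevity; the application's actual sampling model is not described here.

```python
import random
from itertools import combinations

def mine_rules(transactions, min_support, min_confidence, sample_size, seed=0):
    """Mine pairwise rules lhs -> rhs from a uniform random sample (hypothetical helper)."""
    rng = random.Random(seed)
    sample = rng.sample(transactions, min(sample_size, len(transactions)))
    n = len(sample)
    item_count = {}
    pair_count = {}
    for transaction in sample:
        items = sorted(set(transaction))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1
    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of sampled transactions containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]  # estimate of P(rhs | lhs) on the sample
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules

baskets = [["bread", "milk"], ["bread", "milk"], ["bread"], ["milk", "eggs"]]
for rule in mine_rules(baskets, min_support=0.5, min_confidence=0.6, sample_size=4):
    print(rule)
```

Because support and confidence are estimated on the sample only, smaller samples trade accuracy for speed; a real system would likely bound that error when picking the sample size.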
-
Publication number: 20120254241
Abstract: Embodiments of the present disclosure set forth a method for selecting a preferred data set. The method includes generating a candidate data set based on a first data set having a first join attribute, and a first aggregate attribute and a second data set having a second join attribute compatible with the first join attribute, and a second aggregate attribute, wherein the candidate data set includes a total attribute having a value that is based on aggregating a value associated with the first aggregate attribute and a value associated with the second aggregate attribute; and selecting the preferred data set from the candidate data set based on the total attribute.
Type: Application
Filed: March 28, 2011
Publication date: October 4, 2012
Applicant: INDIAN INSTITUTE OF TECHNOLOGY KANPUR
Inventors: ARNAB BHATTACHARYA, PALVALI TEJA B
-
Patent number: 8281246
Abstract: A map user interface control provides functionality for displaying a map in conjunction with the display of a Web page. The map control operates in combination with a location extraction component that analyzes the contents of the Web page to identify locations mentioned therein. Once the location extraction component has identified the locations mentioned in the Web page, a map is generated that encompasses the locations identified in the Web page. Once the map has been generated, the map control displays the map in conjunction with the display of the Web page. The map might include visual indicators corresponding to the locations mentioned in the Web page. The map might also include visual indicators corresponding to other locations near the locations identified in the Web page that have been identified using co-occurrence values generated through an analysis of a set of travelogues.
Type: Grant
Filed: September 29, 2009
Date of Patent: October 2, 2012
Assignee: Microsoft Corporation
Inventors: Rong Xiao, Jiangming Yang, Lei Zhang, Xingrong Chen
-
Patent number: 8280846
Abstract: A swarm can develop around a piece of content. The swarm can include the original content, changes to the original content, the persons contributing the changes, and metadata, such as comments contributed by members of the swarm. A swarm can also include statistics generated about the content, such as the size of the swarm, the growth and/or death rates of the swarm, the longevity of the swarm, the intensity of the swarm, the persistence of the swarm, and the direction of the swarm. Swarms and their behaviors can be used to validate or invalidate content.
Type: Grant
Filed: January 19, 2010
Date of Patent: October 2, 2012
Assignee: Novell, Inc.
Inventors: Andrew Fox, David Marshall LaPalomento, Ian Edward Roughley, Scott A. Isaacson
-
Patent number: 8280864
Abstract: A system, method, and computer program product are provided for retrieving presentation settings from a database. In use, presentation capabilities information associated with media hardware is received. Further, a plurality of presentation settings is retrieved from a database, utilizing the presentation capabilities information.
Type: Grant
Filed: December 17, 2007
Date of Patent: October 2, 2012
Assignee: NVIDIA Corporation
Inventors: William S. Herz, Alexander E. Soohoo
-
Patent number: 8280899
Abstract: An event is described herein as being representable by a quantified abstraction of the event. The event includes at least one predicate, and the at least one predicate has at least one constant symbol corresponding thereto. An instance of the constant symbol corresponding to the event is identified, and the instance of the constant symbol is replaced by a free variable to obtain an abstracted predicate. Thus, a quantified abstraction of the event is composed as a pair: the abstracted predicate and a mapping between the free variable and an instance of the constant symbol that corresponds to the predicate. A data mining algorithm is executed over abstracted, quantified events to ascertain a correlation between the event and another event.
Type: Grant
Filed: October 14, 2009
Date of Patent: October 2, 2012
Assignee: Microsoft Corporation
Inventors: David Lo, Ganesan Ramalingam, Venkatesh-Prasad Ranganath, Kapil Vaswani
-
Publication number: 20120246196
Abstract: A network's evolution is characterized by graph evolution rules. A graph that represents an evolutionary network is mined to identify evolutional patterns of the network, and graph evolution rules are generated using identified evolutional patterns. The generated graph evolution rules represent the evolutional patterns of the network.
Type: Application
Filed: June 6, 2012
Publication date: September 27, 2012
Inventors: Francesco Bonchi, Aristides Gionis, Michele Berlingerio, Björn Bringmann
-
Patent number: 8275608
Abstract: A soft clustering method comprises (i) grouping items into non-exclusive cliques based on features associated with the items, and (ii) clustering the non-exclusive cliques using a hard clustering algorithm to generate item groups on the basis of mutual similarity of the features of the items constituting the cliques. In some named entity recognition embodiments illustrated herein as examples, named entities together with contexts are grouped into cliques based on mutual context similarity. Each clique includes a plurality of different named entities having mutual context similarity. The cliques are clustered to generate named entity groups on the basis of mutual similarity of the contexts of the named entities constituting the cliques.
Type: Grant
Filed: July 3, 2008
Date of Patent: September 25, 2012
Assignee: Xerox Corporation
Inventors: Julien Ah-Pine, Guillaume Jacquet
-
Publication number: 20120233213
Abstract: A mining rule database update apparatus using a named entity dictionary and a mining rule combined with an ontology schema includes: a named entity dictionary and mining rule database storage module storing the named entity dictionary and a mining rule database; a mining pattern generation module recognizing terminology from text and converting the terminology into the mining pattern; a named entity and mining rule search module searching for a corresponding named entity and a mining rule from the named entity dictionary and the mining rule database using the recognized terminology and the mining pattern; and a mining rule database update module estimating a relationship name using a named entity of the recognized terminology and the ontology schema, generating a corresponding mining rule, and storing the generated mining rule in the mining rule database depending on a user's selection.
Type: Application
Filed: May 22, 2012
Publication date: September 13, 2012
Applicant: KOREA INSTITUTE OF SCIENCE & TECHNOLOGY INFORMATION
Inventors: Han Min JUNG, Pyung KIM, Seung Woo LEE, Mi Kyung LEE, Dong Min SEO, Won Kyung SUNG
-
Publication number: 20120233215
Abstract: Processing medical records may be provided. First, medical records may be received from a plurality of sources. The medical records may then be converted into a computer-readable form. Once converted, the medical records may be searched for certain key words, phrases, or symbols. These searched key words, phrases and symbols may correspond to data of interest within the medical records. Once located, the searched key words, phrases and symbols may be extracted from the medical records, as well as an area of the records surrounding the located key words, phrases and symbols. Finally, the extracted data may be used to generate a summary report.
Type: Application
Filed: March 10, 2011
Publication date: September 13, 2012
Inventor: Everett Darryl Walker
-
Publication number: 20120233214
Abstract: A mining rule database update apparatus using a named entity dictionary and a mining rule combined with an ontology schema includes: a named entity dictionary and mining rule database storage module storing the named entity dictionary and a mining rule database; a named entity and mining rule search module searching for a corresponding mining rule and a named entity from the mining rule database and the named entity dictionary using a terminology included in an inputted mining pattern and the mining pattern; and a mining rule database update module estimating a relationship name using a named entity of the terminology and the ontology schema, generating a corresponding mining rule, and storing the generated mining rule in the mining rule database depending on a user's selection.
Type: Application
Filed: May 22, 2012
Publication date: September 13, 2012
Applicant: KOREA INSTITUTE OF SCIENCE & TECHNOLOGY INFORMATION
Inventors: Han Min Jung, Pyung Kim, Seung Woo Lee, Mi Kyung Lee, Dong Min Seo, Won Kyung Sung
-
Publication number: 20120221602
Abstract: A method and an apparatus for word quality mining and evaluating are disclosed. The method includes: calculating a Document Frequency (DF) of a word in mass categorized data; evaluating the word in multiple single-aspects according to the DF of the word; and evaluating the word in multiple aspects according to the multiple single aspect evaluations to obtain an importance weight of the word. According to the solution of the present invention, the importance of the word in the mass categorized data may be evaluated, and words with high quality may be obtained through an integrated evaluation.
Type: Application
Filed: May 7, 2012
Publication date: August 30, 2012
Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Huaijun Liu, Zhongbo Jiang, Gaolin Fang
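Illustration (not taken from the application): a minimal Python sketch of the three steps in this abstract, computing a word's document frequency over categorized documents, scoring it on individual aspects derived from that DF, and combining the scores into one importance weight. The two aspects used here (a normalized inverse document frequency and a per-category concentration) and the plain average are assumptions chosen for the example; the application does not state them here.

```python
import math
from collections import Counter, defaultdict

def document_frequency(docs):
    """docs is a list of (category, words); returns overall DF and DF per category."""
    df = Counter()
    df_by_category = defaultdict(Counter)
    for category, words in docs:
        for word in set(words):  # count each word once per document
            df[word] += 1
            df_by_category[category][word] += 1
    return df, df_by_category

def importance_weight(word, docs):
    """Combine two assumed single-aspect scores, each in [0, 1], by averaging."""
    df, df_by_category = document_frequency(docs)
    n_docs = len(docs)
    if df[word] == 0:
        return 0.0
    # Aspect 1: rarity, a normalized inverse document frequency.
    rarity = math.log(n_docs / df[word]) / math.log(n_docs) if n_docs > 1 else 0.0
    # Aspect 2: concentration, the largest share of the word's DF in one category.
    concentration = max(c[word] for c in df_by_category.values()) / df[word]
    return (rarity + concentration) / 2

corpus = [
    ("sports", ["goal", "the"]),
    ("sports", ["match", "the"]),
    ("finance", ["stock", "the"]),
    ("finance", ["bond", "the"]),
]
print(importance_weight("goal", corpus), importance_weight("the", corpus))
```

A word such as "the" that appears in every document and is spread evenly across categories scores low on both aspects, while a category-specific word scores high.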
-
Patent number: 8250024
Abstract: Method and system for optimizing search results in a business intelligence system. A member is selected in the business intelligence system having a user space, a content space, a data space, a master-data space and a metadata space. A relationship is determined between the member and a plurality of objects in the user space, the content space, the data space, the master-data space, or the metadata space. A ranking of the member is calculated based on the relationship. A relevance of the member in the business intelligence system is calculated using the ranking, thereby optimizing search results of the business intelligence system using the relevance of the object.
Type: Grant
Filed: May 12, 2009
Date of Patent: August 21, 2012
Assignee: International Business Machines Corporation
Inventors: Graham Douglas Mackintosh, John Andrew Kowal
-
Publication number: 20120209879
Abstract: Embodiments of the invention are related to identifying a user's intent dynamically from at least a set of metadata associated with the user, wherein the set of metadata is associated with a user input, and providing to the user a set of labeled instances on determination of the user's intent, the set of labeled instances being directly related to the user's intent, where the set of labeled instances are obtained in real-time from a set of information repositories.
Type: Application
Filed: February 11, 2011
Publication date: August 16, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Nilanjan Banerjee, Dipanjan Chakraborty, Anupam Joshi, Sumit Mittal, Seema Nagar, Angshu Rai, Koustuv Dasgupta
-
Publication number: 20120209880
Abstract: A method of constructing a general mixture model of a dataset includes partitioning the dataset into at least two subsets according to predefined criteria, generating a subset mixture model for each of the at least two subsets, and then combining the mixture models from each subset to generate a general mixture model.
Type: Application
Filed: February 15, 2011
Publication date: August 16, 2012
Applicant: GENERAL ELECTRIC COMPANY
Inventors: Robert Edward Callan, Brian Larder
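Illustration (not taken from the application): the partition, fit, and combine sequence in this abstract can be sketched as below. The per-subset fit is a deliberate stand-in (a single Gaussian component per subset), and rescaling component weights by subset size is one common way to merge mixtures; both are assumptions, as are all function names.

```python
from statistics import fmean, pvariance

def fit_subset_model(values):
    """Stand-in for a real mixture fit: one (weight, mean, variance) component."""
    return [(1.0, fmean(values), pvariance(values))]

def combine_models(models, sizes):
    """Merge subset mixtures, rescaling each component weight by its subset's share."""
    total = sum(sizes)
    combined = []
    for components, size in zip(models, sizes):
        for weight, mean, variance in components:
            combined.append((weight * size / total, mean, variance))
    return combined

def general_mixture(data, key):
    """Partition data by key(x), fit each subset, and combine the subset models."""
    subsets = {}
    for x in data:
        subsets.setdefault(key(x), []).append(x)
    groups = list(subsets.values())
    models = [fit_subset_model(group) for group in groups]
    return combine_models(models, [len(group) for group in groups])

mixture = general_mixture([1.0, 2.0, 3.0, 10.0, 12.0], key=lambda x: x >= 5)
for weight, mean, variance in mixture:
    print(weight, mean, variance)
```

The combined weights sum to one by construction, so the result is itself a valid mixture regardless of how many components each subset model contributes.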
-
Publication number: 20120209835
Abstract: Computer-readable media and computerized methods for automatically organizing search results according to task groups are provided. The methods involve aggregating a gallery of entities (e.g., search queries that share a common categorization) into a query class and assigning a dictionary (e.g., list of terms that are drawn from various sources) to the query class. The task groups are identified from the list of terms within the dictionary. The process of identification includes analyzing patterns of user search behavior to select terms from the list of terms, which reflect popular user search intents, and ranking the selected terms based on predetermined parameters to produce an ordering. Based on the ordering, a set of the selected terms that are highest ranked are declared the task groups. The task groups are employed to arrange the search results on a UI display and to provide a consistent and intuitive format for refining a search.
Type: Application
Filed: April 24, 2012
Publication date: August 16, 2012
Applicant: MICROSOFT CORPORATION
Inventors: Sanaz Ahari, Xiaoxin Yin, Farid Hosseini, Sarthak Shah, Adam Troy, Dan Fain, Brian MacDonald, Nikhil Dandekar, Michael Cameron
-
Publication number: 20120209882
Abstract: A system, a method, and a computer readable media for identifying a user-initiated log file record in a log file are provided. The log file has a user-initiated log file record and a repeating pattern of log file records automatically generated by a software program. The system allows a user to identify first and second timestamp values corresponding to first and second times which identify a time interval of interest in the log file. The system further analyzes the log file to identify the user-initiated log file record having a timestamp value between the first and second timestamp values. The system further identifies the repeating pattern of log file records in the log file.
Type: Application
Filed: April 26, 2012
Publication date: August 16, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Danny Yen-Fu Chen, David A. Cox, Sheryl S. Kinstler, Fabian F. Morgan
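Illustration (not taken from the application): one simple way to separate user-initiated records from automatically generated ones inside a time interval. Treating any message that occurs at least `repeat_threshold` times as part of the repeating pattern is an assumption made for this sketch, as are the record layout and the function name.

```python
from collections import Counter
from datetime import datetime

def user_initiated_records(records, start, end, repeat_threshold=3):
    """records is a list of (timestamp, message); returns (user records, repeating messages)."""
    counts = Counter(message for _, message in records)
    # Assumed heuristic: frequently repeated messages are software-generated.
    repeating = {m for m, c in counts.items() if c >= repeat_threshold}
    in_window = [(ts, m) for ts, m in records if start <= ts <= end]
    return [(ts, m) for ts, m in in_window if m not in repeating], repeating

log = [
    (datetime(2012, 4, 26, 10, 0), "heartbeat"),
    (datetime(2012, 4, 26, 10, 1), "user login"),
    (datetime(2012, 4, 26, 10, 5), "heartbeat"),
    (datetime(2012, 4, 26, 10, 10), "heartbeat"),
    (datetime(2012, 4, 26, 10, 12), "user export"),
    (datetime(2012, 4, 26, 11, 0), "user logout"),
]
hits, pattern = user_initiated_records(
    log, datetime(2012, 4, 26, 10, 0), datetime(2012, 4, 26, 10, 30)
)
print([message for _, message in hits], pattern)
```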
-
Publication number: 20120209881
Abstract: Methods, systems and software for searching electronic documents allow a user to enter a single or multiple letters of a word group, name, phrase, or the like, and see hits that include the data inputted. The search tool solves the problem of finding the full words and ultimately the meaning of an acronym when reading a web page, word processing document, or other electronic searchable material. The search tool also solves the problem of searching through a document for particular word groups, phrases, and names, and may be especially useful where the exact spelling is unknown. The search tool may allow consumers the ability to search based on one or more characters of each name or word independently of the remaining characters in the name, phrase or word search.
Type: Application
Filed: February 10, 2012
Publication date: August 16, 2012
Inventor: Bruce Eliot Ross
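Illustration (not taken from the application): searching a document for runs of consecutive words whose leading characters match what the user typed, which is one way to expand an acronym. The tokenization rule and the function name `find_word_groups` are assumptions for this sketch.

```python
import re

def find_word_groups(text, prefixes):
    """Return runs of consecutive words where word i starts with prefixes[i] (case-insensitive)."""
    if not prefixes:
        return []
    words = re.findall(r"[A-Za-z0-9']+", text)
    k = len(prefixes)
    hits = []
    for i in range(len(words) - k + 1):
        window = words[i:i + k]
        if all(w.lower().startswith(p.lower()) for w, p in zip(window, prefixes)):
            hits.append(" ".join(window))
    return hits

page = "The Portable Document Format, or PDF, was published as an open standard."
print(find_word_groups(page, ["p", "d", "f"]))
print(find_word_groups(page, ["port", "doc"]))
```

Accepting more than one character per word, as in the second call, narrows the hits when single initials are too ambiguous.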
-
Patent number: 8239535
Abstract: A network architecture with load balancing, fault tolerance and distributed querying comprises a plurality of front-end servers, a plurality of back-end servers, and a database. The front-end servers are coupled to a network to receive data requests from client devices. The front-end servers are each coupled to the plurality of back-end servers. The front-end servers handle data requests at a macro level and divide the request into sub-requests that are sent to the plurality of back-end servers. The back-end servers are coupled to the database to retrieve data. Each data request is distributed across the plurality of back-end servers according to workload. The front-end servers are fault tolerant in that they can respond to a request for data without all of the back-end servers being responsive or providing data.
Type: Grant
Filed: December 20, 2005
Date of Patent: August 7, 2012
Assignee: Adobe Systems Incorporated
Inventors: Christopher Reid Error, Michael Paul Bailey
-
Patent number: 8234298
Abstract: Method and system for determining a driving factor for a data value of interest in a multidimensional database, by collecting a context for the data value of interest in the multidimensional database. The data value of interest has dimensional levels with dimensional members outside the drill path of the data value of interest. The dimensional levels are enumerated in a list. A query using the dimensional members of the dimensional level is executed. A variance is calculated for the set of query results. A driving factor for the data value of interest is determined based on the variance. The driving factor is added to the context of the data value of interest.
Type: Grant
Filed: July 25, 2007
Date of Patent: July 31, 2012
Assignee: International Business Machines Corporation
Inventors: Stewart James Winter, Randy Mark Westman, Murray John Reid, Andrew Alexander Leikucs, William Todd MacCulloch
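The variance-based selection in this abstract can be sketched in Python as follows. The abstract says only that the driving factor is "determined based on the variance"; picking the dimension with the largest population variance of per-member totals is an assumption, as are the in-memory row format and the function name `driving_factor`.

```python
from statistics import pvariance


def driving_factor(rows, measure, candidate_dimensions):
    """Return (dimension with the highest variance, all variances).

    For each candidate dimension, the measure is totalled per member
    (standing in for the query) and the variance of those totals is computed.
    """
    variances = {}
    for dim in candidate_dimensions:
        totals = {}
        for row in rows:
            totals[row[dim]] = totals.get(row[dim], 0) + row[measure]
        variances[dim] = pvariance(list(totals.values()))
    best = max(variances, key=variances.get)
    return best, variances


# Hypothetical fact rows; "region" and "product" are the candidate dimensions.
facts = [
    {"region": "East", "product": "A", "sales": 100},
    {"region": "East", "product": "B", "sales": 110},
    {"region": "West", "product": "A", "sales": 90},
    {"region": "West", "product": "B", "sales": 20},
]

best, variances = driving_factor(facts, "sales", ["region", "product"])
print(best, variances)
```

In this toy data the regional totals (210 and 110) spread further apart than the product totals (190 and 130), so "region" is reported as the driving factor.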
-
Patent number: 8229929
Abstract: A system and associated method for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.
Type: Grant
Filed: January 6, 2010
Date of Patent: July 24, 2012
Assignee: International Business Machines Corporation
Inventors: Jeffrey M. Achtermann, Indrajit Bhattacharya, Kevin W. English, Jr., Shantanu R. Godbole, Sachindra Joshi, Ashwin Srinivasan, Ashish Verma
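The combination described in this abstract can be written out as a short Python sketch. The abstract specifies a linear combination governed by a trade-off parameter and an average of the two matchabilities; the convex form `tradeoff * target + (1 - tradeoff) * pair` and the restriction of `tradeoff` to [0, 1] are assumptions, and how the individual clusterability and matchability scores are computed is outside this sketch.

```python
def cross_domain_clusterability(target_clusterability,
                                target_side_matchability,
                                source_side_matchability,
                                tradeoff=0.5):
    """Combine target clusterability with source-target pair matchability.

    Assumption: the linear combination is convex, weighted by `tradeoff`.
    """
    if not 0.0 <= tradeoff <= 1.0:
        raise ValueError("tradeoff must lie in [0, 1]")
    # Pair matchability is the average of the two one-sided matchabilities.
    pair_matchability = (target_side_matchability + source_side_matchability) / 2.0
    return (tradeoff * target_clusterability
            + (1.0 - tradeoff) * pair_matchability)


print(cross_domain_clusterability(0.8, 0.6, 0.4, tradeoff=0.5))
```

Setting `tradeoff` to 1.0 ignores the source domain entirely, while 0.0 scores the pair purely on how well the two sets of centroids align.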
-
Patent number: 8229956
Abstract: With respect to each part at which a word included in a characteristic condition defining a characteristic text set designated by a user through the input device appears in text, the characteristic condition assurance degree calculating unit of the text mining device obtains a reliability of the word from the word reliability storage unit to operate a value of a characteristic condition assurance degree for each text by predetermined operation based on all the obtained reliabilities. The characteristic condition assurance degree calculating unit executes operation such that when a value of each reliability is large, a value of a degree of assurance becomes large. The representative text output unit outputs text whose characteristic condition assurance degree is the highest among texts whose characteristic condition assurance degrees are calculated together with its characteristic condition assurance degree.
Type: Grant
Filed: November 30, 2006
Date of Patent: July 24, 2012
Assignee: NEC Corporation
Inventors: Takahiro Ikeda, Yoshihiro Ikeda, legal representative, Satoshi Nakazawa, Yousuke Sakao, Kenji Satoh
-
Patent number: 8224622
Abstract: The present invention relates to an iterative method and an apparatus for distribution-independent detection of intermediate outliers and outliers in the distribution tail of streamed data. A considerable sequence of streamed data is sequentially read and subsequently assigned to matching bins. The bins are adaptively allocated when, where and if they are needed. Each bin range expands concurrently with the distribution range of the accumulating items assigned to the bin, adding a margin. For every N'th read item, overlapping or adjoining bins are merged, whereupon the bins are assessed for insider preclusion. Information regarding outliers is extracted from the remaining outlier bins when the entire data sequence has been processed.
Type: Grant
Filed: July 27, 2009
Date of Patent: July 17, 2012
Assignee: Telefonaktiebolaget L M Ericsson (Publ)
Inventors: N Hari Kumar, J Mohamed Zahoor
-
Patent number: 8219582
Abstract: A system, a method, and a computer readable media for identifying a user-initiated log file record in a log file are provided. The log file has a user-initiated log file record and a repeating pattern of log file records automatically generated by a software program. The system allows a user to identify first and second timestamp values corresponding to first and second times which identify a time interval of interest in the log file. The system further analyzes the log file to identify the user-initiated log file record having a timestamp value between the first and second timestamp values. The system further identifies the repeating pattern of log file records in the log file.
Type: Grant
Filed: April 25, 2008
Date of Patent: July 10, 2012
Assignee: International Business Machines Corporation
Inventors: Danny Yen-Fu Chen, David A. Cox, Sheryl S. Kinstler, Fabian F. Morgan
-
Patent number: 8219583
Abstract: Mining of websites that in one embodiment includes obtaining web usage data of user sessions of a website, wherein the website has a hierarchical structure with granular levels and has mapping from each webpage of the website into the hierarchical structure, mapping the user sessions to the hierarchical structure of the website resulting in hierarchical user sessions, initiating an edit distance metrics to determine similarity in the hierarchical user sessions, and clustering similar hierarchical user sessions into groups.
Type: Grant
Filed: November 10, 2008
Date of Patent: July 10, 2012
Assignee: NBCUniversal Media, LLC
Inventors: Abha Moitra, Steven Matt Gustafson, Feng Xue
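The three steps in this abstract (map sessions onto the site hierarchy, compare them with an edit distance, group similar ones) can be sketched in Python. Several choices here are assumptions rather than the patented method: URL path segments stand in for the hierarchy, plain Levenshtein distance stands in for the edit distance metric, and a simple greedy "first member as leader" scheme stands in for the clustering.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def to_level(session, depth):
    """Map each page URL to its first `depth` path segments (the hierarchy level)."""
    return [tuple(page.strip("/").split("/")[:depth]) for page in session]


def cluster_sessions(sessions, depth=1, max_distance=1):
    """Greedy clustering: join the first cluster whose leader is close enough."""
    clusters = []  # each cluster is a list of session indices
    for idx, session in enumerate(sessions):
        mapped = to_level(session, depth)
        for members in clusters:
            leader = to_level(sessions[members[0]], depth)
            if edit_distance(mapped, leader) <= max_distance:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters


# Hypothetical sessions on a site whose URLs mirror its hierarchy.
sessions = [
    ["/news/world/a1", "/news/politics/b2", "/sports/nba/c3"],
    ["/news/us/x9", "/news/world/a7", "/sports/nfl/z1"],
    ["/shop/cart", "/shop/checkout"],
]
print(cluster_sessions(sessions, depth=1))
```

Raising `depth` compares sessions at a finer level of the hierarchy, so the first two sessions above, identical at depth 1, would drift apart at depth 2.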
-
Patent number: 8219566
Abstract: A system and method are provided for comparing portions of document text with potential citation components, determining if individual portions correspond to a citation component, and determining if a set of portions correspond to a valid citation pattern. A set of valid citation patterns is provided. Each citation pattern may include a specified combination of citation components. The invention further relates to identifying potential citation components from text in a document, analyzing a pattern of the identified citation components by comparing the pattern to a set of stored citation patterns to determine if the potential citation is a type of citation, and if so, is it a valid (and/or invalid) citation pattern. Once citation patterns have been determined in the document, annotations may be inserted into the document, and subsequent action may be taken, for example, generating a list of citations, providing research services, error-handling, and/or providing other options related to the citations.
Type: Grant
Filed: August 30, 2011
Date of Patent: July 10, 2012
Assignee: Litera Corp.
Inventor: Tony Rolle
-
Patent number: 8214392
Abstract: A computer automated method and system of presenting data. The method may include the steps of inputting a set of user-defined instructions into a remotely located computer database system via a public network connection, inputting a user query into the computer database system via the public network connection, mining the computer database system for data relevant to the user query, creating a data set comprising the data relevant to the user query, and aggregating data in the data set using domain metrics selected based on any of predefined and configurable rules and past user usage. The aggregation may further include tagging all data attributes in the data set based on database metadata and inputs from a user, wherein the data attributes comprise any of data identifications (IDs), data grouping attributes, and data measure attributes.
Type: Grant
Filed: December 22, 2010
Date of Patent: July 3, 2012
Assignee: Semantifi, Inc.
Inventors: Sreenivasa R Pragada, Viswanath Dasari
-
Patent number: 8214391
Abstract: In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.
Type: Grant
Filed: May 8, 2002
Date of Patent: July 3, 2012
Assignee: International Business Machines Corporation
Inventors: Matthew Denesuk, Daniel Frederick Gruhl, Kevin Snow McCurley, Sridhar Rajagopalan, Andrew S. Tomkins
-
Publication number: 20120166484
Abstract: The present invention provides a system, method and computer program for multi-dimensional temporal abstraction and data mining. The invention comprises collecting and optionally cleaning multi-dimensional data, the multi-dimensional data including a plurality of data streams; temporally abstracting the multi-dimensional data; and relatively aligning the temporally abstracted multi-dimensional data based on a shared time point of interest.
Type: Application
Filed: July 22, 2010
Publication date: June 28, 2012
Inventor: Carlolyn Patricia McGregor
-
Publication number: 20120158783
Abstract: Large-scale event processing systems are often designed to perform data mining operations by storing a large set of events in a massive database, applying complex queries to the records of the events, and generating reports and notifications. However, because such queries are performed on very large data sets, the processing of the queries often introduces a significant delay between the occurrence of the events and the reporting or notification thereof. Instead, a large-scale event processing system may be devised as a large state machine organized according to an evaluation plan, comprising a graph of event processors that, in realtime, evaluate each event in an event stream to update an internal state of the event processor, and to perform responses when response conditions are met. The continuous monitoring and evaluation of the stream of events may therefore enable the event processing system to provide realtime responses and notifications of complex queries.
Type: Application
Filed: December 20, 2010
Publication date: June 21, 2012
Applicant: Microsoft Corporation
Inventors: Nir Nice, Daniel Sitton, Dror Kremer, Michael Feldman
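The contrast this abstract draws, updating state per event rather than querying a stored event database, can be illustrated with a small Python sketch. The class `ThresholdProcessor`, the event dictionary fields (`type`, `user`), and the flat list of processors are hypothetical simplifications; the application describes a graph of processors organized by an evaluation plan, which this sketch does not model.

```python
class ThresholdProcessor:
    """Counts matching events per user and responds when a threshold is reached."""

    def __init__(self, event_type, threshold, on_response):
        self.event_type = event_type
        self.threshold = threshold
        self.on_response = on_response
        self.counts = {}  # internal state, updated on every matching event

    def process(self, event):
        if event["type"] != self.event_type:
            return
        user = event["user"]
        self.counts[user] = self.counts.get(user, 0) + 1
        # Response condition: fire once, at the moment the threshold is hit.
        if self.counts[user] == self.threshold:
            self.on_response(user, self.counts[user])


def run_stream(stream, processors):
    """Feed each event to every processor as it arrives; nothing is stored."""
    for event in stream:
        for processor in processors:
            processor.process(event)


alerts = []
watcher = ThresholdProcessor("failed_login", 3,
                             lambda user, n: alerts.append((user, n)))

events = [
    {"type": "failed_login", "user": "alice"},
    {"type": "failed_login", "user": "bob"},
    {"type": "page_view", "user": "alice"},
    {"type": "failed_login", "user": "alice"},
    {"type": "failed_login", "user": "alice"},
]
run_stream(events, [watcher])
print(alerts)
```

The alert is raised while the third failed login is being processed, with no query over historical events, which is the latency advantage the abstract points to.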
-
Patent number: 8204904
Abstract: A network's evolution is characterized by graph evolution rules. A graph, formed by merging multiple graphs representing the multiple snapshots of the network, that represents an evolutionary network is mined to identify evolutional patterns of the network. A pattern is selected from the identified patterns. Graph evolution rules are generated using identified evolutional patterns. The generated graph evolution rules represent the evolutional patterns of the network, the rules indicating that any occurrence of a child pattern of the selected pattern implies a corresponding occurrence of the selected pattern.
Type: Grant
Filed: September 30, 2009
Date of Patent: June 19, 2012
Assignee: Yahoo! Inc.
Inventors: Francesco Bonchi, Aristides Gionis, Michele Berlingerio, Björn Bringmann
-
Publication number: 20120143913
Abstract: Data stored in a column-oriented manner is encoded using a data mining algorithm for finding column patterns among a set of data tuples, where each data tuple contains a set of columns, and the data mining algorithm treats all columns and all column combinations and column ordering similarly or in the same manner when looking for column patterns. Column values are ordered occurring in the column patterns based on their frequencies into a prefix tree, where the prefix tree defines a pattern order. The data tuples are sorted according to the pattern order, resulting in sorted data tuples, and columns of the sorted data tuples are encoded using run-length encoding.
Type: Application
Filed: August 10, 2011
Publication date: June 7, 2012
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Felix Beier, Oliver Draese, Knut Stolze
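The payoff of sorting tuples before run-length encoding can be shown with a short Python sketch. It deliberately swaps in a much simpler ordering than the one in the application: instead of mining column patterns and building a prefix tree, tuples are sorted column by column with more frequent values first. The function names and the sample data are hypothetical.

```python
from collections import Counter
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def encode_columns(tuples):
    """Sort tuples so frequent values cluster together, then RLE each column.

    Assumption: a per-column frequency sort stands in for the pattern order
    that the application derives from a prefix tree.
    """
    if not tuples:
        return []
    ncols = len(tuples[0])
    freq = [Counter(t[c] for t in tuples) for c in range(ncols)]
    ordered = sorted(
        tuples,
        key=lambda t: tuple((-freq[c][t[c]], str(t[c])) for c in range(ncols)),
    )
    return [run_length_encode([t[c] for t in ordered]) for c in range(ncols)]


rows = [("DE", "red"), ("US", "blue"), ("DE", "blue"),
        ("DE", "red"), ("US", "blue")]

unsorted_runs = [run_length_encode([t[c] for t in rows]) for c in range(2)]
sorted_runs = encode_columns(rows)
print(unsorted_runs)
print(sorted_runs)
```

On this sample the first column shrinks from four runs to two after sorting, and the second from four runs to three; a pattern-aware order of the kind the abstract describes aims to choose the sort so that such gains hold across many columns at once.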