Data Mining Patents (Class 707/776)
  • Patent number: 8311792
    Abstract: A method for training a ranking application. The method includes ranking the help postings to create an initial ranking using initial parameter values, and storing user interactions with the help postings to obtain stored interactions. Simulations are performed using the stored interactions to generate revised parameter values for the ranking application. Performing the simulations includes calculating relevance values from the stored interactions, creating a test posting, assigning, to the test posting, an initial score and a relevance value randomly selected from the relevance values to generate a test ranking, and simulating user interactions with the test ranking to generate simulated rankings. The simulated rankings are analyzed to obtain revised parameter values. The method further includes ranking, using the revised parameter values, the help postings to generate a revised ranking, and displaying the help postings in the forum according to the revised ranking.
    Type: Grant
    Filed: December 23, 2009
    Date of Patent: November 13, 2012
    Assignee: Intuit Inc.
    Inventors: Igor A. Podgorny, Floyd J. Morgan, Derek Szydlowski
  • Patent number: 8312041
    Abstract: RDF network construction device and method using an ontology schema having class dictionaries and mining rules. The RDF network construction device includes an ontology schema storing module, a class managing module, a mining rule managing module, a mining pattern creating module, and an RDF triple creating module.
    Type: Grant
    Filed: October 5, 2010
    Date of Patent: November 13, 2012
    Assignee: Korea Institute of Science and Technology Information
    Inventors: Han Min Jung, Pyung Kim, Seung Woo Lee, Mi Kyung Lee, Dong Min Seo, Won Kyung Sung
  • Publication number: 20120284198
    Abstract: Systems, methods and computer program products are provided for identifying patents of value for acquisition. Through analysis of actions by patent holders, accused infringers, competitors to patent holders, non-practicing entities and challengers of patents, patents of potential economic value can be identified. The acquisition of these patent rights can facilitate the development of a valuable patent portfolio.
    Type: Application
    Filed: May 2, 2011
    Publication date: November 8, 2012
    Applicant: ARTICLE ONE PARTNERS HOLDINGS
    Inventor: Cheryl Milone
  • Patent number: 8306997
    Abstract: Sources of operational problems in business transactions often show themselves in relatively small pockets of data, which are called trouble hot spots. Identifying these hot spots from internal company transaction data is generally a fundamental step in the problem's resolution, but this analysis process is greatly complicated by huge numbers of transactions and large numbers of transaction variables to analyze. A suite of practical modifications is provided to data mining techniques and logistic regressions to tailor them for finding trouble hot spots. This approach thus allows the use of efficient automated data mining tools to quickly screen large numbers of candidate variables for their ability to characterize hot spots. One application is the screening of variables which distinguish a suspected hot spot from a reference set.
    Type: Grant
    Filed: May 27, 2011
    Date of Patent: November 6, 2012
    Assignee: Verizon Services Corp.
    Inventor: James Howard Drew
  • Patent number: 8306945
    Abstract: A method, system and medium for organizing and associating log records into logically related groups is described. One or more input sources from, possibly, different systems/subsystems are input to a log correlation method. As the log records are processed the fields are interrogated to determine which log records are related to each other. As further log records are processed more information about previously unidentifiable relationships is determined. After this later information is known, log records that could previously not be associated with any other log records are added to the existing association. The system engineer is therefore presented with the pertinent information for monitoring, administrating and diagnosing system activities.
    Type: Grant
    Filed: October 8, 2007
    Date of Patent: November 6, 2012
    Assignee: BMC Software, Inc.
    Inventors: Larry Morris, Dale G. Wood
  • Publication number: 20120278361
    Abstract: A system and method are provided for augmenting information on business directory databases and communicating with businesses. Using the enriched business directory database and Web mining technology, customized email messages are sent inviting businesses to enter their enriched business information into the directory or even subscribe to other paid services provided by the directory service.
    Type: Application
    Filed: July 8, 2012
    Publication date: November 1, 2012
    Inventors: Narendra Gupta, Mazin Gilbert, Benjamin J. Stern
  • Patent number: 8301653
    Abstract: The present invention discloses a computer system for reporting online sessions and a computer enabled method utilizing the same. The computer system is made up of an icon that preferably appears on a user screen. The icon is capable of capturing a screen session on the user screen and saving it within a recording. The recording may then be communicated to a database server that is capable of extracting a plurality of target components from said recording, and is capable of storing them in a database. The database may contain a benchmark content of the plurality of target components. Target components may then be compared against the benchmark content in a variety of ways to determine whether the level of target components is above or below reasonable and socially accepted levels.
    Type: Grant
    Filed: January 25, 2010
    Date of Patent: October 30, 2012
    Inventors: Glenn Adamousky, Dennis Nagy
  • Patent number: 8301585
    Abstract: A system includes reception of metadata associated with one or more measures, determination of a compatibility of the two or more measures based on the metadata, and determination of a first score associated with a first visualization based on the compatibility. Some aspects include determination of a second score associated with a second visualization based on the compatibility, comparison of the first score and the second score, and recommendation of one of the first visualization or the second visualization based on the comparison.
    Type: Grant
    Filed: December 7, 2009
    Date of Patent: October 30, 2012
    Assignee: Business Objects Software Limited
    Inventors: Nicolas Mourey, Aurélien Theraud
  • Patent number: 8296300
    Abstract: The present invention relates to a method for reconstructing protein database for identifying a protein and a method for screening a protein using the same, more precisely a method for reconstructing protein database, and a method for identifying a protein using the same. The method for reconstructing protein database and the method for identifying the protein of the invention are very useful for the investigation of endogenous proteins and their functions and interactions, and are further effectively used for the development of diagnostic and therapeutic agents for various diseases.
    Type: Grant
    Filed: August 18, 2006
    Date of Patent: October 23, 2012
    Assignee: Korea Basic Science Institute
    Inventors: Kyung-Hoon Kwon, Jong Shin Yoo
  • Patent number: 8291058
    Abstract: The present invention describes a system and method of extracting and storing data elements from network packets, thus performing the task of data mining. In one embodiment of the present invention incoming packets are decomposed one protocol layer at a time to extract data elements contained in the protocol headers. Layer-specific parsers perform deep packet inspection in order to extract data elements from upper-level protocols. Extracted data is arranged in rows, which are subsequently stored into a memory-based accumulator. After some length of time the accumulator is flushed to disk files. Another process reads the flushed disk files row-by-row, inserting each row into a relational database. Standard SQL operations are performed on the relational database in order to generate and display reports of the collected data.
    Type: Grant
    Filed: February 19, 2010
    Date of Patent: October 16, 2012
    Assignee: Intrusion, Inc.
    Inventors: Tommy Joe Head, Daris A. Nevil
  • Publication number: 20120260188
    Abstract: One or more techniques and/or systems are provided for identifying potential recipients for a communication (e.g., email, instant message, content sharing platform, etc.) a user is presently preparing based at least in part upon a user's communication history. That is, information about the user's interactions with past recipients of his/her communications is compared with information known about the present communication and potential recipients of the present communication are identified based upon this comparison. Moreover, in one embodiment, based upon the past interactions of the user with others, one or more communication groups can be identified and presented to the user. In this way, a user may select a communication group and recipients included in the communication group can be added as recipients for the communication the user is presently preparing without the user having to manually create such groups, for example.
    Type: Application
    Filed: April 6, 2011
    Publication date: October 11, 2012
    Applicant: Microsoft Corporation
    Inventors: Seung-Hae Park, Suraj Samaranayake, Arcadiy Gregory Kantor, Sarah Filman, Omar H. Shahine, Piero Sierra, Stephen Liffick, Anthony Frey, Siddhartha Parmar
  • Publication number: 20120259890
    Abstract: In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.
    Type: Application
    Filed: June 18, 2012
    Publication date: October 11, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Matthew Denesuk, Daniel Frederick Gruhl, Sridhar Rajagopalan, Andrew S. Tomkins
  • Patent number: 8285728
    Abstract: The present invention enables the construction and deployment of semantic search engines, which learn. Here, the learning necessarily occurs in unsupervised mode given the huge amount of text being searched and the resultant impracticality of providing human feedback. That is, it acquires a lexicon of synonyms and the context for their proper application by searching a moving window of text and compiling the results. No human operator is involved. This allows for the practical search, using machine learning techniques, of very large textual bases.
    Type: Grant
    Filed: August 24, 2010
    Date of Patent: October 9, 2012
    Assignee: The United States of America as represented by the Secretary of the Navy
    Inventor: Stuart Rubin
  • Patent number: 8285744
    Abstract: Indexing agents and/or data brokers are leveraged to provide search query results related to manufacturing processes. The indexing agents allow different manufacturing configuration data types to be “sub-indexed,” allowing them to be easily searched. In one instance, the sub-indices can be aggregated together to create an overall index to facilitate in query searches of the configuration data. Separate indexing agents can be utilized for indexing contents of the configuration components for the human-machine interface (HMI) and control system and the like. Data brokers can be employed to facilitate in responding to query searches by indexing/searching real-time process variables (tags) and historical data in persistent storage. A search engine can then be employed to aggregate the search results and present them to a user in a selectable fashion. User selected results are then rendered in the proper format and displayed to the user.
    Type: Grant
    Filed: September 30, 2005
    Date of Patent: October 9, 2012
    Assignee: Rockwell Automation Technologies, Inc.
    Inventors: Eric G. Dorgelo, Kevin G. Gordon, Clifton H. Bromley, Douglas J. Reichard, Marc D. Semkow, Shafin A. Virji
  • Patent number: 8285715
    Abstract: A system and method for displaying items by receiving a query, identifying query feature values for similarity features associated with the query, identifying items each having item feature values for similarity features thereof, for each of the identified items and for each of the similarity features, determining a feature distance between the query feature value and the item feature value, and presenting the identified items in a two dimensional array of cells. The array defines reference vectors corresponding to the similarity features. The identified items are positioned within the array relative to the origin cell, for any of the cells, by determining difference function values for the identified items each based on the feature distances of the item, the reference vectors and the coordinates of the cell, and placing one of the items in the cell based upon the determined difference function values.
    Type: Grant
    Filed: August 17, 2009
    Date of Patent: October 9, 2012
    Assignee: Ugmode, Inc.
    Inventors: Arlo Mukai Faria, Ajeet Ganesh Shankar
  • Patent number: 8285745
    Abstract: Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user and segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results and displaying one or more highly semantically relevant keyword sets after being filtered by a threshold.
    Type: Grant
    Filed: August 31, 2007
    Date of Patent: October 9, 2012
    Assignee: Microsoft Corporation
    Inventors: Hua Li, HuaJun Zeng, Jian Hu, Zheng Chen, Jian Wang
  • Patent number: 8285748
    Abstract: A method and apparatus for proactive information security management is described. In one embodiment, for example, a computer-implemented method for controlling access to sensitive information, the method comprising: maintaining access constraint data that can be used to control access to the sensitive information, wherein the access constraint data includes match pattern data and apply pattern data; receiving a semantic query from a querier requesting access to the sensitive information; based on the match pattern data, determining whether the semantic query should be constrained according to the apply pattern data; where said semantic query should be constrained according to the apply pattern data, rewriting the semantic query according to the apply pattern data to produce a rewritten query; executing the rewritten query against a database that contains the sensitive information; and returning any results of executing the rewritten query.
    Type: Grant
    Filed: May 26, 2009
    Date of Patent: October 9, 2012
    Assignee: Oracle International Corporation
    Inventors: John S. Thomas, Aravind Yalamanchi, Idriss Mekrez, Matt Topper
  • Publication number: 20120254243
    Abstract: Systems, methods, and media for generating fused risk scores for determining fraud in call data are provided herein. Some exemplary methods include generating a fused risk score used to determine fraud from call data by generating a fused risk score for a leg of call data, via a fuser module of an analysis system, the fused risk score being generated by fusing together two or more uniquely calculated fraud risk scores, each of the uniquely calculated fraud risk scores being generated by a sub-module of the analysis system; and storing the fused risk score in a storage device that is communicatively couplable with the fuser module.
    Type: Application
    Filed: March 8, 2012
    Publication date: October 4, 2012
    Inventors: Torsten Zeppenfeld, N. Nikki Mirghafori, Lisa Guerra, Richard Gutierrez, Anthony Rajakumar
  • Publication number: 20120254242
    Abstract: Systems, methods, and computer-readable code stored on a non-transitory medium for mining association rules include determining a minimum support threshold and a minimum confidence threshold for association rule mining; determining a sampling model; sampling transactions from a transaction dataset; mining association rules from the sampled transactions; and transmitting mined association rules.
    Type: Application
    Filed: May 19, 2011
    Publication date: October 4, 2012
    Applicant: INFOSYS TECHNOLOGIES LIMITED
    Inventors: Balasubramanian Kanagasabapathi, K. Antony Arokia Durai Raj
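
The steps named in this abstract (thresholds, sampling, mining) can be sketched roughly as follows. This is only an illustration, not the applicant's method: simple random sampling stands in for the unspecified "sampling model", rules are limited to one item on each side, and the function name `mine_rules` and its parameters are hypothetical.

```python
import random
from itertools import combinations


def mine_rules(transactions, min_support, min_confidence, sample_size, seed=0):
    """Mine one-item -> one-item association rules from a random sample."""
    rng = random.Random(seed)
    # Assumed sampling model: simple random sampling without replacement.
    sample = rng.sample(transactions, min(sample_size, len(transactions)))
    n = len(sample)

    # Count how many sampled transactions contain each item and each item pair.
    item_count, pair_count = {}, {}
    for transaction in sample:
        items = sorted(set(transaction))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue  # the pair is not frequent enough in the sample
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules
```

Each returned tuple is `(antecedent, consequent, support, confidence)`, with support and confidence measured on the sample rather than on the full dataset.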
  • Publication number: 20120254241
    Abstract: Embodiments of the present disclosure set forth a method for selecting a preferred data set. The method includes generating a candidate data set based on a first data set having a first join attribute, and a first aggregate attribute and a second data set having a second join attribute compatible with the first join attribute, and a second aggregate attribute, wherein the candidate data set includes a total attribute having a value that is based on aggregating a value associated with the first aggregate attribute and a value associated with the second aggregate attribute; and selecting the preferred data set from the candidate data set based on the total attribute.
    Type: Application
    Filed: March 28, 2011
    Publication date: October 4, 2012
    Applicant: INDIAN INSTITUTE OF TECHNOLOGY KANPUR
    Inventors: ARNAB BHATTACHARYA, PALVALI TEJA B
  • Patent number: 8281246
    Abstract: A map user interface control provides functionality for displaying a map in conjunction with the display of a Web page. The map control operates in combination with a location extraction component that analyzes the contents of the Web page to identify locations mentioned therein. Once the location extraction component has identified the locations mentioned in the Web page, a map is generated that encompasses the locations identified in the Web page. Once the map has been generated, the map control displays the map in conjunction with the display of the Web page. The map might include visual indicators corresponding to the locations mentioned in the Web page. The map might also include visual indicators corresponding to other locations near the locations identified in the Web page that have been identified using co-occurrence values generated through an analysis of a set of travelogues.
    Type: Grant
    Filed: September 29, 2009
    Date of Patent: October 2, 2012
    Assignee: Microsoft Corporation
    Inventors: Rong Xiao, Jiangming Yang, Lei Zhang, Xingrong Chen
  • Patent number: 8280846
    Abstract: A swarm can develop around a piece of content. The swarm can include the original content, changes to the original content, the persons contributing the changes, and metadata, such as comments contributed by members of the swarm. A swarm can also include statistics generated about the content, such as the size of the swarm, the growth and/or death rates of the swarm, the longevity of the swarm, the intensity of the swarm, the persistence of the swarm, and the direction of the swarm. Swarms and their behaviors can be used to validate or invalidate content.
    Type: Grant
    Filed: January 19, 2010
    Date of Patent: October 2, 2012
    Assignee: Novell, Inc.
    Inventors: Andrew Fox, David Marshall LaPalomento, Ian Edward Roughley, Scott A. Isaacson
  • Patent number: 8280864
    Abstract: A system, method, and computer program product are provided for retrieving presentation settings from a database. In use, presentation capabilities information associated with media hardware is received. Further, a plurality of presentation settings is retrieved from a database, utilizing the presentation capabilities information.
    Type: Grant
    Filed: December 17, 2007
    Date of Patent: October 2, 2012
    Assignee: NVIDIA Corporation
    Inventors: William S. Herz, Alexander E. Soohoo
  • Patent number: 8280899
    Abstract: An event is described herein as being representable by a quantified abstraction of the event. The event includes at least one predicate, and the at least one predicate has at least one constant symbol corresponding thereto. An instance of the constant symbol corresponding to the event is identified, and the instance of the constant symbol is replaced by a free variable to obtain an abstracted predicate. Thus, a quantified abstraction of the event is composed as a pair: the abstracted predicate and a mapping between the free variable and an instance of the constant symbol that corresponds to the predicate. A data mining algorithm is executed over abstracted, quantified events to ascertain a correlation between the event and another event.
    Type: Grant
    Filed: October 14, 2009
    Date of Patent: October 2, 2012
    Assignee: Microsoft Corporation
    Inventors: David Lo, Ganesan Ramalingam, Venkatesh-Prasad Ranganath, Kapil Vaswani
  • Publication number: 20120246196
    Abstract: A network's evolution is characterized by graph evolution rules. A graph that represents an evolutionary network is mined to identify evolutional patterns of the network, and graph evolution rules are generated using identified evolutional patterns. The generated graph evolution rules represent the evolutional patterns of the network.
    Type: Application
    Filed: June 6, 2012
    Publication date: September 27, 2012
    Inventors: Francesco Bonchi, Aristides Gionis, Michele Berlingerio, Björn Bringmann
  • Patent number: 8275608
    Abstract: A soft clustering method comprises (i) grouping items into non-exclusive cliques based on features associated with the items, and (ii) clustering the non-exclusive cliques using a hard clustering algorithm to generate item groups on the basis of mutual similarity of the features of the items constituting the cliques. In some named entity recognition embodiments illustrated herein as examples, named entities together with contexts are grouped into cliques based on mutual context similarity. Each clique includes a plurality of different named entities having mutual context similarity. The cliques are clustered to generate named entity groups on the basis of mutual similarity of the contexts of the named entities constituting the cliques.
    Type: Grant
    Filed: July 3, 2008
    Date of Patent: September 25, 2012
    Assignee: Xerox Corporation
    Inventors: Julien Ah-Pine, Guillaume Jacquet
  • Publication number: 20120233213
    Abstract: A mining rule database update apparatus using a named entity dictionary and a mining rule combined with an ontology schema includes: a named entity dictionary and mining rule database storage module storing the named entity dictionary and a mining rule database; a mining pattern generation module recognizing terminology from text and converting the terminology into the mining pattern; a named entity and mining rule search module searching for a corresponding named entity and a mining rule from the named entity dictionary and the mining rule database using the recognized terminology and the mining pattern; and a mining rule database update module estimating a relationship name using a named entity of the recognized terminology and the ontology schema, generating a corresponding mining rule, and storing the generated mining rule in the mining rule database depending on a user's selection.
    Type: Application
    Filed: May 22, 2012
    Publication date: September 13, 2012
    Applicant: KOREA INSTITUTE OF SCIENCE & TECHNOLOGY INFORMATION
    Inventors: Han Min JUNG, Pyung KIM, Seung Woo LEE, Mi Kyung LEE, Dong Min SEO, Won Kyung SUNG
  • Publication number: 20120233215
    Abstract: Processing medical records may be provided. First, medical records may be received from a plurality of sources. The medical records may then be converted into a computer-readable form. Once converted, the medical records may be searched for certain key words, phrases, or symbols. These searched key words, phrases and symbols may correspond to data of interest within the medical records. Once located, the searched key words, phrases and symbols may be extracted from the medical records, as well as an area of the records surrounding the located key words, phrases and symbols. Finally, the extracted data may be used to generate a summary report.
    Type: Application
    Filed: March 10, 2011
    Publication date: September 13, 2012
    Inventor: Everett Darryl Walker
  • Publication number: 20120233214
    Abstract: A mining rule database update apparatus using a named entity dictionary and a mining rule combined with an ontology schema includes: a named entity dictionary and mining rule database storage module storing the named entity dictionary and a mining rule database; a named entity and mining rule search module searching for a corresponding mining rule and a named entity from the mining rule database and the named entity dictionary using a terminology included in an inputted mining pattern and the mining pattern; and a mining rule database update module estimating a relationship name using a named entity of the terminology and the ontology schema, generating a corresponding mining rule, and storing the generated mining rule in the mining rule database depending on a user's selection.
    Type: Application
    Filed: May 22, 2012
    Publication date: September 13, 2012
    Applicant: KOREA INSTITUTE OF SCIENCE & TECHNOLOGY INFORMATION
    Inventors: Han Min Jung, Pyung Kim, Seung Woo Lee, Mi Kyung Lee, Dong Min Seo, Won Kyung Sung
  • Publication number: 20120221602
    Abstract: A method and an apparatus for word quality mining and evaluating are disclosed. The method includes: calculating a Document Frequency (DF) of a word in mass categorized data; evaluating the word in multiple single aspects according to the DF of the word; and evaluating the word in multiple aspects according to the multiple single-aspect evaluations to obtain an importance weight of the word. According to the solution of the present invention, the importance of the word in the mass categorized data may be evaluated, and words with high quality may be obtained through an integrated evaluation.
    Type: Application
    Filed: May 7, 2012
    Publication date: August 30, 2012
    Applicant: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Huaijun Liu, Zhongbo Jiang, Gaolin Fang
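
As a rough illustration of the DF-based evaluation described in this abstract, the sketch below counts per-category document frequency and combines two single-aspect scores into one weight. The abstract does not name the single-aspect measures or how they are integrated, so the two scores used here (a normalized inverse document frequency and a category concentration) and the plain average are assumptions, as are the function names.

```python
import math
from collections import defaultdict


def document_frequency(docs):
    """docs: list of (category, words). Returns {word: {category: DF}}."""
    df = defaultdict(lambda: defaultdict(int))
    for category, words in docs:
        for word in set(words):  # count a word at most once per document
            df[word][category] += 1
    return df


def importance_weights(docs):
    """Combine assumed single-aspect scores into one weight per word."""
    df = document_frequency(docs)
    n_docs = len(docs)
    weights = {}
    for word, per_category in df.items():
        total = sum(per_category.values())
        # Assumed aspect 1: inverse document frequency, scaled to [0, 1].
        idf = math.log(n_docs / total) / math.log(n_docs) if n_docs > 1 else 0.0
        # Assumed aspect 2: share of the word's DF that falls in its top category.
        concentration = max(per_category.values()) / total
        # Assumed integration step: plain average of the single-aspect scores.
        weights[word] = (idf + concentration) / 2
    return weights
```

Under these assumptions, a word that appears in every document of every category scores low, while a word confined to a few documents of one category scores high.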
  • Patent number: 8250024
    Abstract: Method and system for optimizing search results in a business intelligence system. A member is selected in the business intelligence system having a user space, a content space, a data space, a master-data space and a metadata space. A relationship is determined between the member and a plurality of objects in the user space, the content space, the data space, the master-data space, or the metadata space. A ranking of the member is calculated based on the relationship. A relevance of the member in the business intelligence system is calculated using the ranking, thereby optimizing search results of the business intelligence system using the relevance of the object.
    Type: Grant
    Filed: May 12, 2009
    Date of Patent: August 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Graham Douglas Mackintosh, John Andrew Kowal
  • Publication number: 20120209879
    Abstract: Embodiments of the invention are related to identifying a user's intent dynamically from at least a set of metadata associated with the user, wherein the set of metadata is associated with a user input, and providing to the user a set of labeled instances on determination of a user's intent, the set of labeled instances being directly related to the user's intent, where the set of labeled instances are obtained in real-time from a set of information repositories.
    Type: Application
    Filed: February 11, 2011
    Publication date: August 16, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Nilanjan Banerjee, Dipanjan Chakraborty, Anupam Joshi, Sumit Mittal, Seema Nagar, Angshu Rai, Koustuv Dasgupta
  • Publication number: 20120209880
    Abstract: A method of constructing a general mixture model of a dataset includes partitioning the dataset into at least two subsets according to predefined criteria, generating a subset mixture model for each of the at least two subsets, and then combining the mixture models from each subset to generate a general mixture model.
    Type: Application
    Filed: February 15, 2011
    Publication date: August 16, 2012
    Applicant: GENERAL ELECTRIC COMPANY
    Inventors: Robert Edward Callan, Brian Larder
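
The partition, fit, and combine steps in this abstract can be pictured with the minimal sketch below. It is a deliberate simplification, not the claimed method: each subset is modeled by a single Gaussian component, and subset models are combined by rescaling component weights with the subset's share of the data. The record layout (a `"value"` field) and the names `fit_subset_model` and `general_mixture` are hypothetical.

```python
from statistics import mean, pstdev


def fit_subset_model(values):
    """Fit a one-component Gaussian 'mixture' to one subset (a simplification)."""
    return [{"weight": 1.0, "mean": mean(values), "std": pstdev(values)}]


def general_mixture(dataset, partition_key):
    """Partition the dataset, model each subset, and merge the subset models."""
    # Partition the records according to a predefined criterion.
    subsets = {}
    for row in dataset:
        subsets.setdefault(partition_key(row), []).append(row["value"])

    total = sum(len(values) for values in subsets.values())
    combined = []
    for values in subsets.values():
        share = len(values) / total
        for component in fit_subset_model(values):
            # Rescale each component weight by the subset's share of the data
            # so that the combined weights still sum to one.
            combined.append({**component, "weight": component["weight"] * share})
    return combined
```

For example, `general_mixture(rows, lambda r: r["mode"])` would partition hypothetical sensor rows by an operating-mode field before fitting.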
  • Publication number: 20120209835
    Abstract: Computer-readable media and computerized methods for automatically organizing search results according to task groups are provided. The methods involve aggregating a gallery of entities (e.g., search queries that share a common categorization) into a query class and assigning a dictionary (e.g., list of terms that are drawn from various sources) to the query class. The task groups are identified from the list of terms within the dictionary. The process of identification includes analyzing patterns of user search behavior to select terms from the list of terms, which reflect popular user search intents, and ranking the selected terms based on predetermined parameters to produce an ordering. Based on the ordering, a set of the selected terms that are highest ranked are declared the task groups. The task groups are employed to arrange the search results on a UI display and to provide a consistent and intuitive format for refining a search.
    Type: Application
    Filed: April 24, 2012
    Publication date: August 16, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Sanaz Ahari, Xiaoxin Yin, Farid Hosseini, Sarthak Shah, Adam Troy, Dan Fain, Brian MacDonald, Nikhil Dandekar, Michael Cameron
  • Publication number: 20120209882
    Abstract: A system, a method, and a computer readable medium for identifying a user-initiated log file record in a log file are provided. The log file has a user-initiated log file record and a repeating pattern of log file records automatically generated by a software program. The system allows a user to identify first and second timestamp values corresponding to first and second times which identify a time interval of interest in the log file. The system further analyzes the log file to identify the user-initiated log file record having a timestamp value between the first and second timestamp values. The system further identifies the repeating pattern of log file records in the log file.
    Type: Application
    Filed: April 26, 2012
    Publication date: August 16, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Danny Yen-Fu Chen, David A. Cox, Sheryl S. Kinstler, Fabian F. Morgan
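
A minimal sketch of the idea in this abstract: restrict the log to a time interval and separate repeating, automatically generated records from the rest. The abstract does not say how the repeating pattern is detected, so the rule used here (a message that occurs at least `min_repeats` times is treated as automatic) is an assumption, as are the record layout, the inclusive interval bounds, and the function name.

```python
from collections import Counter
from datetime import datetime


def find_user_initiated(records, start, end, min_repeats=3):
    """records: list of (timestamp, message) pairs.

    Returns (user_records, repeating_messages).
    """
    # Assumed pattern detection: messages seen at least min_repeats times
    # anywhere in the log are treated as automatically generated.
    counts = Counter(message for _, message in records)
    repeating = {m for m, c in counts.items() if c >= min_repeats}

    # Keep records inside the interval of interest that are not repeating.
    user_records = [
        (ts, message)
        for ts, message in records
        if start <= ts <= end and message not in repeating
    ]
    return user_records, repeating
```

The `datetime` import is only needed to build the timestamps passed in; any comparable timestamp type would work with this sketch.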
  • Publication number: 20120209881
    Abstract: Methods, systems and software for searching electronic documents allow a user to enter a single or multiple letters of a word group, name, phrase, or the like, and see hits that include the data inputted. The search tool solves the problem of finding the full words and ultimately the meaning of an acronym when reading a web page, word processing document, or other electronic searchable material. The search tool also solves the problem of searching through a document for particular word groups, phrases, and names, and may be especially useful where the exact spelling is unknown. The search tool may allow consumers the ability to search based on one or more characters of each name or word independently of the remaining characters in the name, phrase or word search.
    Type: Application
    Filed: February 10, 2012
    Publication date: August 16, 2012
    Inventor: Bruce Eliot Ross
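    Illustration: One way to read the search described above is as a match on the leading characters of consecutive words. The sketch below is an assumption about how such a search could work, not the applicant's implementation. The function name and the simple word tokenizer are hypothetical.

```python
import re


def prefix_phrase_search(text, prefixes):
    """Return word groups whose consecutive words start with the given prefixes.

    prefixes: one entry per word, e.g. ["I", "B", "M"] or ["Int", "Bus"].
    Matching is case-insensitive and ignores the rest of each word.
    """
    if not prefixes:
        return []
    words = re.findall(r"[A-Za-z0-9']+", text)
    lowered = [p.lower() for p in prefixes]
    n = len(lowered)
    hits = []
    # Slide a window of n words over the text and test each word's prefix.
    for i in range(len(words) - n + 1):
        window = words[i:i + n]
        if all(w.lower().startswith(p) for w, p in zip(window, lowered)):
            hits.append(" ".join(window))
    return hits
```

    Entering the letters of an acronym returns every word group with matching initials, so the reader still has to pick the intended expansion from the hits.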
  • Patent number: 8239535
    Abstract: A network architecture with load balancing, fault tolerance and distributed querying comprises a plurality of front-end servers, a plurality of back-end servers, and a database. The front-end servers are coupled to a network to receive data requests from client devices. The front-end servers are each coupled to the plurality of back-end servers. The front-end servers handle data requests at a macro level and divide the request into sub-requests that are sent to the plurality of back-end servers. The back-end servers are coupled to the database to retrieve data. Each data request is distributed across the plurality of back-end servers according to workload. The front-end servers are fault tolerant in that they can respond to a request for data without all of the back-end servers being responsive or providing data.
    Type: Grant
    Filed: December 20, 2005
    Date of Patent: August 7, 2012
    Assignee: Adobe Systems Incorporated
    Inventors: Christopher Reid Error, Michael Paul Bailey
  • Patent number: 8234298
    Abstract: Method and system for determining a driving factor for a data value of interest in a multidimensional database, by collecting a context for the data value of interest in the multidimensional database. The data value of interest has dimensional levels with dimensional members outside the drill path of the data value of interest. The dimensional levels are enumerated in a list. A query using the dimensional members of the dimensional level is executed. A variance is calculated for the set of query results. A driving factor for the data value of interest is determined based on the variance. The driving factor is added to the context of the data value of interest.
    Type: Grant
    Filed: July 25, 2007
    Date of Patent: July 31, 2012
    Assignee: International Business Machines Corporation
    Inventors: Stewart James Winter, Randy Mark Westman, Murray John Reid, Andrew Alexander Leikucs, William Todd MacCulloch
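    Illustration: The variance step above can be sketched as follows. The abstract only says the driving factor is determined "based on the variance"; picking the dimension with the largest variance is an assumption made here, as are the row layout, the function name, and the use of population variance over per-member totals.

```python
from statistics import pvariance


def driving_factor(rows, measure, candidate_dims):
    """Pick the candidate dimension whose members vary most on `measure`.

    rows: list of dicts, one per fact row.
    candidate_dims: dimensions outside the drill path of the value of interest.
    """
    variances = {}
    for dim in candidate_dims:
        # "Query": total the measure for each member of this dimension.
        totals = {}
        for row in rows:
            totals[row[dim]] = totals.get(row[dim], 0.0) + row[measure]
        variances[dim] = pvariance(list(totals.values()))
    # Assumption: the dimension with the highest variance is the driving factor.
    best = max(variances, key=variances.get)
    return best, variances
```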
  • Patent number: 8229929
    Abstract: A system and associated method for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.
    Type: Grant
    Filed: January 6, 2010
    Date of Patent: July 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Jeffrey M. Achtermann, Indrajit Bhattacharya, Kevin W. English, Jr., Shantanu R. Godbole, Sachindra Joshi, Ashwin Srinivasan, Ashish Verma
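    Illustration: The combination described above can be written out directly. The abstract does not give the exact weights, so the convex form (1 - t) * clusterability + t * matchability used below is an assumption, and the function and parameter names are hypothetical.

```python
def cross_domain_clusterability(target_clusterability, target_side, source_side, tradeoff):
    """Combine target clusterability with source-target pair matchability.

    tradeoff: value in [0, 1]; 0 uses only target clusterability,
    1 uses only matchability (assumed convex weighting).
    """
    if not 0.0 <= tradeoff <= 1.0:
        raise ValueError("tradeoff must be between 0 and 1")
    # Pair matchability is the average of the two one-sided matchabilities.
    matchability = (target_side + source_side) / 2.0
    return (1.0 - tradeoff) * target_clusterability + tradeoff * matchability
```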
  • Patent number: 8229956
    Abstract: With respect to each part at which a word included in a characteristic condition defining a characteristic text set designated by a user through the input device appears in text, the characteristic condition assurance degree calculating unit of the text mining device obtains a reliability of the word from the word reliability storage unit to operate a value of a characteristic condition assurance degree for each text by predetermined operation based on all the obtained reliabilities. The characteristic condition assurance degree calculating unit executes operation such that when a value of each reliability is large, a value of a degree of assurance becomes large. The representative text output unit outputs text whose characteristic condition assurance degree is the highest among texts whose characteristic condition assurance degrees are calculated together with its characteristic condition assurance degree.
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: July 24, 2012
    Assignee: NEC Corporation
    Inventors: Takahiro Ikeda, Yoshihiro Ikeda, legal representative, Satoshi Nakazawa, Yousuke Sakao, Kenji Satoh
  • Patent number: 8224622
    Abstract: The present invention relates to an iterative method and an apparatus for distribution-independent detection of intermediate outliers and outliers in the distribution tail of streamed data. A considerable sequence of streamed data is sequentially read and subsequently assigned to matching bins. The bins are adaptively allocated when, where and if they are needed. Each bin range expands concurrently with the distribution range of the accumulating items assigned to the bin, adding a margin. For every N'th read item, overlapping or adjoining bins are merged, whereupon the bins are assessed for insider preclusion. Information regarding outliers is extracted from the remaining outlier bins when the entire data sequence has been processed.
    Type: Grant
    Filed: July 27, 2009
    Date of Patent: July 17, 2012
    Assignee: Telefonaktiebolaget L M Ericsson (Publ)
    Inventors: N Hari Kumar, J Mohamed Zahoor
  • Patent number: 8219582
    Abstract: A system, a method, and a computer readable media for identifying a user-initiated log file record in a log file are provided. The log file has a user-initiated log file record and a repeating pattern of log file records automatically generated by a software program. The system allows a user to identify first and second timestamp values corresponding to first and second times which identify a time interval of interest in the log file. The system further analyzes the log file to identify the user-initiated log file record having a timestamp value between the first and second timestamp values. The system further identifies the repeating pattern of log file records in the log file.
    Type: Grant
    Filed: April 25, 2008
    Date of Patent: July 10, 2012
    Assignee: International Business Machines Corporation
    Inventors: Danny Yen-Fu Chen, David A. Cox, Sheryl S. Kinstler, Fabian F. Morgan
  • Patent number: 8219583
    Abstract: Mining of websites that in one embodiment includes obtaining web usage data of user sessions of a website, wherein the website has a hierarchical structure with granular levels and has mapping from each webpage of the website into the hierarchical structure, mapping the user sessions to the hierarchical structure of the website resulting in hierarchical user sessions, initiating an edit distance metrics to determine similarity in the hierarchical user sessions, and clustering similar hierarchical user sessions into groups.
    Type: Grant
    Filed: November 10, 2008
    Date of Patent: July 10, 2012
    Assignee: NBCUniversal Media, LLC
    Inventors: Abha Moitra, Steven Matt Gustafson, Feng Xue
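    Illustration: The steps above (map pages to a hierarchy level, compare sessions by edit distance, group similar sessions) can be sketched as follows. This is a simplified sketch, not the patented method. It assumes the URL path itself encodes the hierarchy, uses plain Levenshtein distance over page sequences, and groups sessions greedily against the first member of each cluster. All names and the `depth` and `max_dist` parameters are hypothetical.

```python
def to_level(url_path, depth):
    """Map a page path to its ancestor at the given hierarchy depth."""
    parts = [p for p in url_path.split("/") if p]
    return "/" + "/".join(parts[:depth])


def edit_distance(a, b):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]


def cluster_sessions(sessions, depth=1, max_dist=1):
    """Group sessions whose hierarchical forms are within max_dist edits."""
    mapped = [[to_level(p, depth) for p in s] for s in sessions]
    clusters = []  # each cluster is a list of session indices
    for idx, seq in enumerate(mapped):
        for cluster in clusters:
            if edit_distance(seq, mapped[cluster[0]]) <= max_dist:
                cluster.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters
```

    Choosing a coarser `depth` makes sessions that visit different pages in the same site section look identical, which is the point of mapping sessions onto the hierarchy before comparing them.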
  • Patent number: 8219566
    Abstract: A system and method are provided for comparing portions of document text with potential citation components, determining if individual portions correspond to a citation component, and determining if a set of portions correspond to a valid citation pattern. A set of valid citation patterns is provided. Each citation pattern may include a specified combination of citation components. The invention further relates to identifying potential citation components from text in a document, analyzing a pattern of the identified citation components by comparing the pattern to a set of stored citation patterns to determine if the potential citation is a type of citation, and if so, is it a valid (and/or invalid) citation pattern. Once citation patterns have been determined in the document, annotations may be inserted into the document, and subsequent action may be taken, for example, generating a list of citations, providing research services, error-handling, and/or providing other options related to the citations.
    Type: Grant
    Filed: August 30, 2011
    Date of Patent: July 10, 2012
    Assignee: Litera Corp.
    Inventor: Tony Rolle
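    Illustration: A toy version of the component-and-pattern matching described above. The component types, the short reporter list, and the single valid pattern are assumptions chosen for the example and do not come from the patent. The whitespace tokenizer is deliberately naive and will miss numbers followed by punctuation.

```python
import re

# Hypothetical citation components: each token is tested against these in order.
COMPONENTS = [
    ("REPORTER", re.compile(r"^(U\.S\.|F\.2d|F\.3d|S\.Ct\.)$")),
    ("NUMBER", re.compile(r"^\d+$")),
]

# Hypothetical set of valid patterns: volume, reporter, first page.
VALID_PATTERNS = {("NUMBER", "REPORTER", "NUMBER")}


def classify(token):
    """Return the component type of a token, or None if it is not a component."""
    for name, pattern in COMPONENTS:
        if pattern.match(token):
            return name
    return None


def find_citations(text):
    """Return token triples whose component types form a valid citation pattern."""
    tokens = text.replace(",", " ").split()
    labels = [classify(t) for t in tokens]
    found = []
    for i in range(len(tokens) - 2):
        if tuple(labels[i:i + 3]) in VALID_PATTERNS:
            found.append(" ".join(tokens[i:i + 3]))
    return found
```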
  • Patent number: 8214392
    Abstract: A computer automated method and system of presenting data. The method may include the steps of inputting a set of user-defined instructions into a remotely located computer database system via a public network connection, inputting a user query into the computer database system via the public network connection, mining the computer database system for data relevant to the user query, creating a data set comprising the data relevant to the user query, and aggregating data in the data set using domain metrics selected based on any of predefined and configurable rules and past user usage. The aggregation may further include tagging all data attributes in the data set based on database metadata and inputs from a user, wherein the data attributes comprise any of data identifications (IDs), data grouping attributes, and data measure attributes.
    Type: Grant
    Filed: December 22, 2010
    Date of Patent: July 3, 2012
    Assignee: Semantifi, Inc.
    Inventors: Sreenivasa R Pragada, Viswanath Dasari
  • Patent number: 8214391
    Abstract: In a data mining system, data is gathered into a data store using, e.g., a Web crawler. The data is classified into entities. Data miners use rules to process the entities and append respective keys to the entities representing characteristics of the entities as derived from rules embodied in the miners. With these keys, characteristics of entities as defined by disparate expert authors of the data miners are identified for use in responding to complex data requests from customers.
    Type: Grant
    Filed: May 8, 2002
    Date of Patent: July 3, 2012
    Assignee: International Business Machines Corporation
    Inventors: Matthew Denesuk, Daniel Frederick Gruhl, Kevin Snow McCurley, Sridhar Rajagopalan, Andrew S. Tomkins
  • Publication number: 20120166484
    Abstract: The present invention provides a system, method and computer program for multi-dimensional temporal abstraction and data mining. The invention comprises collecting and optionally cleaning multi-dimensional data, the multi-dimensional data including a plurality of data streams; temporally abstracting the multi-dimensional data; and relatively aligning the temporally abstracted multi-dimensional data based on a shared time point of interest.
    Type: Application
    Filed: July 22, 2010
    Publication date: June 28, 2012
    Inventor: Carolyn Patricia McGregor
  • Publication number: 20120158783
    Abstract: Large-scale event processing systems are often designed to perform data mining operations by storing a large set of events in a massive database, applying complex queries to the records of the events, and generating reports and notifications. However, because such queries are performed on very large data sets, the processing of the queries often introduces a significant delay between the occurrence of the events and the reporting or notification thereof. Instead, a large-scale event processing system may be devised as a large state machine organized according to an evaluation plan, comprising a graph of event processors that, in realtime, evaluate each event in an event stream to update an internal state of the event processor, and to perform responses when response conditions are met. The continuous monitoring and evaluation of the stream of events may therefore enable the event processing system to provide realtime responses and notifications of complex queries.
    Type: Application
    Filed: December 20, 2010
    Publication date: June 21, 2012
    Applicant: Microsoft Corporation
    Inventors: Nir Nice, Daniel Sitton, Dror Kremer, Michael Feldman
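    Illustration: The contrast drawn above, evaluating each event as it arrives instead of querying a stored data set later, can be sketched with a small stateful processor. This is a minimal sketch under stated assumptions: events are dicts, the processors form a flat list rather than the graph the abstract describes, and the class and function names are hypothetical.

```python
class ThresholdProcessor:
    """Counts events per key and responds once when a key reaches the threshold."""

    def __init__(self, key, threshold, on_match):
        self.key = key
        self.threshold = threshold
        self.on_match = on_match
        self.counts = {}  # internal state, updated on every event

    def process(self, event):
        k = event[self.key]
        self.counts[k] = self.counts.get(k, 0) + 1
        # Response condition: fire at the moment the threshold is reached.
        if self.counts[k] == self.threshold:
            self.on_match(k, self.counts[k])


def run(stream, processors):
    """Feed every event in the stream to every processor, in arrival order."""
    for event in stream:
        for processor in processors:
            processor.process(event)
```

    Because the state is updated per event, the response fires as soon as the condition holds, with no query over the accumulated history.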
  • Patent number: 8204904
    Abstract: A network's evolution is characterized by graph evolution rules. A graph, formed by merging multiple graphs representing the multiple snapshots of the network, that represents an evolutionary network is mined to identify evolutional patterns of the network. A pattern is selected from the identified patterns. Graph evolution rules are generated using identified evolutional patterns. The generated graph evolution rules represent the evolutional patterns of the network, the rules indicating that any occurrence of a child pattern of the selected pattern implies a corresponding occurrence of the selected pattern.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: June 19, 2012
    Assignee: Yahoo! Inc.
    Inventors: Francesco Bonchi, Aristides Gionis, Michele Berlingerio, Björn Bringmann
  • Publication number: 20120143913
    Abstract: Data stored in a column-oriented manner is encoded using a data mining algorithm for finding column patterns among a set of data tuples, where each data tuple contains a set of columns, and the data mining algorithm treats all columns, column combinations, and column orderings in the same manner when looking for column patterns. Column values occurring in the column patterns are ordered by frequency into a prefix tree, where the prefix tree defines a pattern order. The data tuples are sorted according to the pattern order, resulting in sorted data tuples, and columns of the sorted data tuples are encoded using run-length encoding.
    Type: Application
    Filed: August 10, 2011
    Publication date: June 7, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Felix Beier, Oliver Draese, Knut Stolze
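    Illustration: The benefit of sorting tuples before run-length encoding can be shown with a short sketch. It does not reproduce the pattern mining or prefix tree from the abstract. A plain lexicographic sort stands in for the pattern order, which is a simplifying assumption, and the function names are hypothetical.

```python
from itertools import groupby


def rle(values):
    """Run-length encode a sequence as (value, run_length) pairs."""
    return [(v, sum(1 for _ in group)) for v, group in groupby(values)]


def encode_columns(tuples):
    """Run-length encode each column of a list of equal-length tuples."""
    return [rle(column) for column in zip(*tuples)]


def encode_sorted(tuples):
    """Sort tuples first (stand-in for the pattern order), then encode columns."""
    return encode_columns(sorted(tuples))
```

    Sorting places equal column values next to each other, so each column collapses into fewer and longer runs than in the original tuple order.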