Data Mining Patents (Class 707/776)
  • Patent number: 8805877
    Abstract: A method, device, and computer program product are provided for regular expression learning. An initial regular expression may be received from a user. The initial regular expression is executed over a database. Positive matches and negative matches are labeled. The initial regular expression and the labeled positive and negative matches are input to a transformation process. The transformation process may iteratively execute character class restrictions, quantifier restrictions, and negative lookaheads on the initial regular expression to transform the initial regular expression into a pool of candidate regular expressions. The transformation process may execute, one at a time, the character class restrictions, the quantifier restrictions, and the negative lookaheads. A candidate regular expression is selected from the pool of candidate regular expressions, where the selected candidate regular expression has the best F-Measure out of the pool of candidate regular expressions.
    Type: Grant
    Filed: February 11, 2009
    Date of Patent: August 12, 2014
    Assignee: International Business Machines Corporation
    Inventors: Rajasekar Krishnamurthy, Yunyao Li, Sriram Raghavan, Shivakumar Vaithyanathan
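The selection step in the abstract above (picking the candidate with the best F-Measure) can be sketched as follows. This is a minimal illustration, not the patented procedure: the sample strings, the candidate list, and the helper names `f_measure` and `select_best` are all hypothetical, and whole-string matching stands in for executing the expression over a database.

```python
import re

def f_measure(pattern, positives, negatives):
    """F1 score of a regex treated as a classifier over labeled strings."""
    rx = re.compile(pattern)
    tp = sum(1 for s in positives if rx.fullmatch(s))  # labeled positives matched
    fp = sum(1 for s in negatives if rx.fullmatch(s))  # labeled negatives matched
    fn = len(positives) - tp
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def select_best(candidates, positives, negatives):
    """Return the candidate expression with the highest F-measure."""
    return max(candidates, key=lambda p: f_measure(p, positives, negatives))

# Hypothetical labeled matches for a phone-number-like pattern.
positives = ["555-1234", "867-5309"]
negatives = ["555-12345", "abc-defg", "000-0000"]

# Hypothetical candidate pool, one transformation applied per step.
candidates = [
    r"\w+-\w+",             # initial expression
    r"\d+-\d+",             # character class restriction
    r"\d{3}-\d{4}",         # quantifier restriction
    r"(?!000)\d{3}-\d{4}",  # negative lookahead
]

best = select_best(candidates, positives, negatives)
```

On this toy data each restriction removes some false positives while keeping both true positives, so the most restricted candidate is selected.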
  • Patent number: 8799301
    Abstract: A method for processing a data object for a database, the database containing data representing a first data model and a set of one or more mapping rules, includes receiving a data object that conforms to a second data model. The method then selects one or more of the mapping rules. The mapping rules provide a mapping between a set of elements of the second data model and a corresponding set of elements of the first data model. The method applies the selected mapping rules to transform a set of elements of the received data object into a corresponding set of elements of a target data object conforming to the first data model. The method then searches the database for the set of elements of the target data object to identify instances of the target data object in the database. A corresponding computer program product and apparatus are also disclosed.
    Type: Grant
    Filed: March 28, 2012
    Date of Patent: August 5, 2014
    Assignee: International Business Machines Corporation
    Inventors: Bin Jia, James Robert Magowan
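As a rough illustration of the flow in the abstract above (select mapping rules, transform the incoming object, then search the database), here is a minimal sketch. The rule table, the field names, and the functions `transform` and `find_instances` are assumptions made for the example; the abstract does not specify how rules are represented or selected.

```python
# Hypothetical mapping rules: second-model element name -> first-model element name.
MAPPING_RULES = {
    "person": {"fullName": "name", "mail": "email"},
}

def transform(object_type, data_object, rules=MAPPING_RULES):
    """Apply the rules selected for object_type; unmapped elements are dropped."""
    selected = rules[object_type]
    return {selected[k]: v for k, v in data_object.items() if k in selected}

def find_instances(database, target):
    """Return rows whose elements equal every element of the target object."""
    return [row for row in database
            if all(row.get(k) == v for k, v in target.items())]

# Toy database conforming to the first data model.
database = [
    {"id": 1, "name": "Ada Lovelace", "email": "ada@example.org"},
    {"id": 2, "name": "Alan Turing", "email": "alan@example.org"},
]

# Incoming object conforming to the second data model.
incoming = {"fullName": "Ada Lovelace", "mail": "ada@example.org", "nickname": "AL"}

target = transform("person", incoming)
matches = find_instances(database, target)
```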
  • Patent number: 8793263
    Abstract: A method for processing a data object for a database, the database containing data representing a first data model and a set of one or more mapping rules, includes receiving a data object that conforms to a second data model. The method then selects one or more of the mapping rules. The mapping rules provide a mapping between a set of elements of the second data model and a corresponding set of elements of the first data model. The method applies the selected mapping rules to transform a set of elements of the received data object into a corresponding set of elements of a target data object conforming to the first data model. The method then searches the database for the set of elements of the target data object to identify instances of the target data object in the database. A corresponding computer program product and apparatus are also disclosed.
    Type: Grant
    Filed: July 13, 2011
    Date of Patent: July 29, 2014
    Assignee: International Business Machines Corporation
    Inventors: Bin Jia, James Robert Magowan
  • Publication number: 20140207820
    Abstract: Disclosed herein is a method for parallel mining of temporal relations in a large event file using a MapReduce model. In the method for parallel mining of temporal relations in a large event file according to the present invention, an event file is sorted based on customer identification (ID) and event time at which each event has occurred. A set of large event types satisfying a preset support or more is generated from the event file. The event file is converted into a large event sequence including the large event type set. The large event sequence is summarized and then a time interval data file is created. Candidate temporal relations are generated from the time interval data file, and frequent temporal relations satisfying a preset support or more are derived from the candidate temporal relations. A temporal relation rule is generated from the derived frequent temporal relations.
    Type: Application
    Filed: October 9, 2013
    Publication date: July 24, 2014
    Applicant: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventor: Yong-Joon LEE
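The first steps of the abstract above (sorting by customer ID and event time, finding large event types, and reducing the file to a large event sequence) might look like the single-process sketch below. It does not use MapReduce, and it assumes that "support" means the fraction of customers in whose history an event type appears; the abstract does not define it. The function name and sample events are hypothetical.

```python
from collections import defaultdict

def large_event_sequences(events, min_support):
    """events: iterable of (customer_id, time, event_type) tuples."""
    # Sort by customer ID, then by event time.
    events = sorted(events, key=lambda e: (e[0], e[1]))
    customers = {customer for customer, _, _ in events}

    # Count the distinct customers in which each event type occurs.
    seen = defaultdict(set)
    for customer, _, event_type in events:
        seen[event_type].add(customer)
    large = {t for t, cs in seen.items()
             if len(cs) / len(customers) >= min_support}

    # Keep only large event types, preserving the sorted order per customer.
    sequences = defaultdict(list)
    for customer, time, event_type in events:
        if event_type in large:
            sequences[customer].append((time, event_type))
    return large, dict(sequences)

events = [
    ("c2", 3, "buy"), ("c1", 1, "view"), ("c1", 2, "buy"),
    ("c2", 1, "view"), ("c1", 5, "refund"),
]
large, sequences = large_event_sequences(events, min_support=1.0)
```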
  • Patent number: 8788436
    Abstract: Features automatically extracted from semi-structured web pages are utilized by a search engine to rank documents that include semi-structured web pages. These features include, but are not limited to, a number of reviews, a number of positive reviews, and/or a number of negative reviews from a web page that includes user reviews. These features also include a number of views of a video that is viewable by way of a semi-structured web page. The features also include a number of subscribers to broadcasts of an individual from a social networking web page and a number of contacts of an individual listed on a social networking web page.
    Type: Grant
    Filed: July 27, 2011
    Date of Patent: July 22, 2014
    Assignee: Microsoft Corporation
    Inventors: Rupesh Rasiklal Mehta, Sree Hari Nagaralu, Anjana Das, Bhaskar Mitra
  • Patent number: 8788494
    Abstract: A method for processing an electronic document and its corresponding device, a method for browsing an electronic document and its corresponding browser, as well as a method for searching an electronic document and its corresponding searching system are disclosed in the present invention. The method comprises at least the following steps: generating one or more queries according to the content of said document when an author is composing the electronic document; and correspondingly storing information about said one or more queries with said electronic document. The query comprises keywords, keyword strings, or questions, and the query has passed verification in order to ensure its reliability.
    Type: Grant
    Filed: August 19, 2009
    Date of Patent: July 22, 2014
    Assignee: International Business Machines Corporation
    Inventors: Shi Xia Liu, Li Ping Yang
  • Patent number: 8781815
    Abstract: A non-standard and standard clause detection system imports raw input data or contractual documents, and extracts non-standard and standard clauses that are semantically linked. One embodiment of a disclosed configuration is a system and a method for identifying non-standard and standard clauses in contractual documents. The system and the method comprise generating a primary policy and a secondary policy, obtaining a first feature data set by applying the primary policy to a semantic language evaluator, and obtaining a second feature data set by applying the secondary policy to the semantic language evaluator. The first feature data set obtained is the aggregation of the standard clauses used in the document. Furthermore, the second feature data set encompasses the first feature data set; thus the difference between the first feature data set and the second feature data set is the aggregation of the non-standard clauses.
    Type: Grant
    Filed: December 5, 2013
    Date of Patent: July 15, 2014
    Assignee: Seal Software Ltd.
    Inventor: Kevin Gidney
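The last two sentences of the abstract above reduce to a set difference. The sketch below shows only that step; the clause labels and the function `classify_clauses` are hypothetical, and the semantic language evaluator that produces the feature sets is not modeled.

```python
def classify_clauses(primary_features, secondary_features):
    """Split clauses into standard and non-standard sets.

    primary_features: output of the primary policy (standard clauses).
    secondary_features: output of the secondary policy, assumed to be a
    superset of primary_features as the abstract states.
    """
    standard = set(primary_features)
    non_standard = set(secondary_features) - standard
    return standard, non_standard

# Hypothetical clause labels.
first_set = {"governing law", "confidentiality"}
second_set = {"governing law", "confidentiality", "unlimited liability"}

standard, non_standard = classify_clauses(first_set, second_set)
```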
  • Patent number: 8782219
    Abstract: Described herein are methods for determining patterns based on requests received by a server. Based on the determined patterns, insight into the types of requests received by the server can be gained. Additionally, performance statistics and query statistics can be aggregated in a useful way. For example, performance statistics may be summarized for each determined pattern. One technique for determining patterns includes determining a sequence of template identifiers identifying templates that correspond to sub-sequences of requests in a sequence of server requests. A model may be created based on the sequence of template identifiers. Based on the model, template patterns may be determined. Template patterns may further be grouped into pattern clusters.
    Type: Grant
    Filed: May 18, 2012
    Date of Patent: July 15, 2014
    Assignee: Oracle International Corporation
    Inventors: Konstantinos Morfonios, Leonidas Galanis, Neoklis Polyzotis, Karl Dias
  • Publication number: 20140195562
    Abstract: The propensity and intent of a user to make a purchase is predicted based on product search queries and chat streams. The contents of the data sources, including search queries and chat streams, are analyzed for product names and product attributes. The results of the analyses are used to predict user needs. Product names and attributes are extracted from the data sources. The extracted information is mapped onto abstract product categories. Based on the abstract product categories, offers for products and services are made to the user.
    Type: Application
    Filed: December 31, 2013
    Publication date: July 10, 2014
    Inventors: Nitin Kumar HARDENIYA, R. Mathangi SRI, Ravi VIJAYARAGHAVAN
  • Patent number: 8775423
    Abstract: One or more devices store in a memory, customer tags originating on an external social platform and employee tags originating on an internal social platform. The one or more devices provide to a user device, keyword suggestions for new content to be published on the internal social platform. The keyword suggestions include selections from both the customer tags and the employee tags. The one or more devices receive employee metadata, for content published on the internal social platform, that includes tags selected from the keyword suggestions, and associates the tags in the employee metadata as customer-originated tags or employee-originated tags based on the stored customer tags and employee tags. The one or more devices perform data correlation to determine relationships between use of the customer-originated tags and use of the employee-originated tags in the employee metadata.
    Type: Grant
    Filed: September 15, 2011
    Date of Patent: July 8, 2014
    Assignee: Verizon Argentina S.R.L.
    Inventor: Alejandro Pereyra-Rozas
  • Patent number: 8775010
    Abstract: A method of conducting vehicle usage data analysis is provided. The method includes providing usage data about at least one vehicle to a database. The usage data may be analyzed and compared to a member of a set of vehicle development models to determine whether to update a vehicle development model. The usage data may also be analyzed to determine whether to transmit a communication to a vehicle.
    Type: Grant
    Filed: May 16, 2011
    Date of Patent: July 8, 2014
    Assignee: Ford Motor Company
    Inventors: Raja Shekar Sohmshetty, Zhiyong Cedric Xia, Krishnaswamy Venkatesh Prasad, Matthew John Zaluzec
  • Patent number: 8775466
    Abstract: A method for projection mining comprises performing a first projection on a first data object of a first type comprising a plurality of data entries and a second data object of a second type comprising a plurality of data entries to create definitions of attributes of the first data object and definitions of attributes of the second data object, performing a second projection of the definitions of the attributes of the first data object and the definitions of the attributes of the second data object into a space of meta-attributes based on semantic relationships among the attributes of the first data object and the second data object, learning relationships between the space of meta-attributes formed by the projections of the first data object and the second data object and a space of meta-attributes relating to new data not included in the first data object and the second data object, and generating at least one new data object of the first or second type based on the new data using the learned relationships.
    Type: Grant
    Filed: May 1, 2013
    Date of Patent: July 8, 2014
    Assignee: Oracle International Corporation
    Inventors: Pablo Tamayo, Mark Hornick, Marcos C. Campos, Boriana Milenova
  • Patent number: 8768892
    Abstract: Aspects of the subject matter described herein relate to analyzing data and providing recommendations regarding computing assets. In aspects, data is collected from computing assets and aggregated in a data repository. A data analyzer analyzes the data to determine problems associated with the computing assets. Work done to identify a problem with one computing asset may be used to identify problems with or provide recommendations for other computing assets controlled by the same or different entities. When a problem is identified in a computing asset, a recommendation may be proactively provided to an entity associated with the computing asset.
    Type: Grant
    Filed: September 29, 2008
    Date of Patent: July 1, 2014
    Assignee: Microsoft Corporation
    Inventors: Neal Robert Myerson, Darren C. Justus, Brian David Connolly, Vladimir Holostov
  • Patent number: 8768960
    Abstract: Disclosed are systems and methods for extracting semantic-based keywords through mining word semantics using an online encyclopedia's taxonomy. Described is the use of a semantic bipartite graph that relates candidate keywords and topics.
    Type: Grant
    Filed: January 20, 2009
    Date of Patent: July 1, 2014
    Assignee: Microsoft Corporation
    Inventors: Jian Hu, Jian-Tao Sun, Zheng Chen
  • Publication number: 20140181145
    Abstract: A modular software system for use with an integration software technology that facilitates an integration project by providing valuable information resources that assist in the development, deployment, and management of an integrated system as well as the resolution of any integration-related issues. The modular software system accomplishes this through the use of a plurality of modules and a method of use. The plurality of modules save the time and effort involved in collecting the technical and non-technical information required to make key decisions during the integration project implementation. By providing information in a packaged and organized way, the plurality of modules saves valuable skilled-resource man-hours and effort, reducing the cost of project implementation. The method of use allows project managers, technical teams, and business users to spend less time collecting and organizing information before and during the implementation of an integration project.
    Type: Application
    Filed: December 23, 2013
    Publication date: June 26, 2014
    Inventor: Jafer S. KAMSAMOHIDEEN
  • Publication number: 20140181128
    Abstract: Described herein are systems and methods for processing data. In some embodiments, a system may include a natural language processing (NLP) engine configured to transform a data set into a plurality of concepts within a plurality of distinct contexts, an ontology configured to structure the plurality of concepts by annotating relationships between and creating aggregations of the concepts, and a data mining engine configured to process the relationships of the concepts and to identify associations and correlations in the data set. In some embodiments, the method may include the steps of receiving a data set, scanning the data set with a natural language processing (NLP) engine to identify a plurality of concepts within a plurality of distinct contexts, structuring the data set with an ontology by creating aggregations of the concepts and annotating relationships between the concepts, and identifying patterns in the relationships between the plurality of concepts.
    Type: Application
    Filed: March 5, 2012
    Publication date: June 26, 2014
    Inventors: Daniel J. Riskin, Anand Shroff
  • Patent number: 8762396
    Abstract: A system may include an address manager configured to map a data item including a plurality of attributes to a blocked Bloom filter (BBF) of a plurality of blocked Bloom filters. The system also may include a blocked Bloom filter (BBF) generator configured to map each attribute of the plurality of attributes to a corresponding block of the blocked Bloom filter.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: June 24, 2014
    Assignee: SAP AG
    Inventors: Benoit Hudzia, Eoghan O'Neill
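The two components named in the abstract above can be sketched as follows: an address manager that picks one blocked Bloom filter per data item, and a generator that writes each attribute into its own block. This is an illustrative sketch only. The attribute schema, the use of an `id` key to select the filter, the block size, and the SHA-256-based hashing are assumptions, not details from the patent.

```python
import hashlib

class BlockedBloomFilter:
    """A Bloom filter split into fixed-size blocks, one integer bitmask per block."""

    def __init__(self, num_blocks, block_bits=64, num_hashes=3):
        self.block_bits = block_bits
        self.num_hashes = num_hashes
        self.blocks = [0] * num_blocks

    def _positions(self, value):
        # Derive num_hashes bit positions inside a block from salted digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.block_bits

    def add(self, block_index, value):
        for pos in self._positions(value):
            self.blocks[block_index] |= 1 << pos

    def might_contain(self, block_index, value):
        block = self.blocks[block_index]
        return all((block >> pos) & 1 for pos in self._positions(value))

ATTRIBUTES = ["name", "city", "year"]  # hypothetical schema: attribute i -> block i

def select_filter(filters, item_id):
    """Address manager: map an item to one of several filters by hashing its id."""
    digest = hashlib.sha256(str(item_id).encode()).digest()
    return filters[int.from_bytes(digest[:8], "big") % len(filters)]

def insert(filters, item):
    bbf = select_filter(filters, item["id"])
    for block_index, attr in enumerate(ATTRIBUTES):
        bbf.add(block_index, item[attr])

def might_match(filters, item_id, attr, value):
    bbf = select_filter(filters, item_id)
    return bbf.might_contain(ATTRIBUTES.index(attr), value)

filters = [BlockedBloomFilter(num_blocks=len(ATTRIBUTES)) for _ in range(4)]
insert(filters, {"id": 42, "name": "Ada", "city": "London", "year": 1843})
```

As with any Bloom filter, `might_match` never reports a stored value as absent, but it can report a value that was never stored as possibly present.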
  • Patent number: 8762411
    Abstract: Parental dependency information for various data fields may be analyzed to create a data field hierarchy. Each of the data fields may be presented in a selectable list through an interface. Once a data field is selected, any immediate parent(s) and/or child(ren) field(s) of the active data element may be demarcated in the list according to the hierarchy. Additional data entry fields relating to the selected data field and its familial fields may also be displayed. Data in each of the data fields may also be analyzed to identify fields with incomplete data. Systems and methods are provided.
    Type: Grant
    Filed: December 1, 2010
    Date of Patent: June 24, 2014
    Assignee: SAP AG
    Inventors: Madison Poon, Ryan Hanna, Ashley Gadd, Chun Pong Chan, Julian Gosper, Sylvie Denis
  • Patent number: 8762364
    Abstract: Embodiments of the invention relate to methods of presenting personalized search results pages to users, and to search engine systems and servers configured to implement such methods. For example, a method of presenting such a page to a user of a search engine includes steps of computing an engagement index of the user based on the distribution in time of that user's interactions with the search engine, and then presenting, in response to a query by the user, a personalized search results page to the user.
    Type: Grant
    Filed: March 18, 2008
    Date of Patent: June 24, 2014
    Assignee: Yahoo! Inc.
    Inventors: Rajesh Parekh, Jignesh Parmar, Pavel Berkhin
  • Patent number: 8756233
    Abstract: In accordance with the embodiments of the present invention, a method and engine are provided for assigning semantic tags to segments within media. The invention receives media and extracts textual information related to the media's content. It processes the textual information and creates a list of topics related to the content. The invention segments the media and intelligently assigns topical tags to the segments. The semantically segmented media data is outputted for storage or analysis.
    Type: Grant
    Filed: April 18, 2011
    Date of Patent: June 17, 2014
    Assignee: Video Semantics
    Inventors: Wael AbdAlmageed, Mohamed Hefeeda, Bassem Abdelaziz
  • Patent number: 8756210
    Abstract: Search results are generated using aggregated context data from two or more contexts. When two or more programmable search engines relate to a similar topic, context data associated with the programmable search engines are aggregated. The context is then applied to a query in order to present, in an integrated manner, relevant search results that make use of context intelligence from more than one programmable search engine.
    Type: Grant
    Filed: October 5, 2011
    Date of Patent: June 17, 2014
    Assignee: Google Inc.
    Inventor: Ramanathan V. Guha
  • Publication number: 20140164433
    Abstract: A method for automatically acquiring a set of data opens a searchable Internet database; initiates an automated timed search of each one of a plurality of records, each record in the plurality of records includes common criteria with the other records; retrieves information associated with the searched record; and provides the retrieved information in a desired format.
    Type: Application
    Filed: November 27, 2013
    Publication date: June 12, 2014
    Inventor: Hal Kravcik
  • Publication number: 20140164432
    Abstract: An exemplary embodiment of the present disclosure provides an ontology enhancement method. Firstly, at least an input information request is received. Then, based on an ontology, each input information request is expanded to produce at least an expanded information request of each corresponding input information request. Based on a searching model, according to each expanded information request, a file collection is searched to obtain searching results of each corresponding expanded information request. Then, according to each searching result, a plurality of candidate knowledge concepts of each corresponding searching result are extracted. Next, the candidate knowledge concepts of each searching result are selectively added into the ontology.
    Type: Application
    Filed: May 31, 2013
    Publication date: June 12, 2014
    Inventors: SHANG-HSIEN HSIEH, YU-HUEI JIN
  • Patent number: 8751531
    Abstract: A text mining apparatus, a text mining method, and a program are provided that accurately discriminate inherent portions of each of a plurality of text data pieces including a text data piece generated by computer processing. A text mining apparatus 1 to be used performs text mining using, as targets, a plurality of text data pieces including a text data piece generated by computer processing. Confidence is set for each of the text data pieces. The text mining apparatus 1 includes an inherent portion extraction unit 6 that extracts an inherent portion of each text data piece relative to another of the text data pieces, using the confidence set for each of the text data pieces.
    Type: Grant
    Filed: August 28, 2009
    Date of Patent: June 10, 2014
    Assignee: NEC Corporation
    Inventors: Kai Ishikawa, Akihiro Tamura, Shinichi Ando
  • Patent number: 8751517
    Abstract: In general information processing apparatuses, determination on what sort of operational interface is appropriate to be provided for which application (AP, hereinafter) is left up to the user. An information processing apparatus in an exemplary embodiment of the present invention comprises: an operation-definition storage means for storing a record including a function name, a function ontology and an operation type, with respect to each of a plurality of applications (APs); and an operation modification means for acquiring from the above-mentioned operation-definition storage means a set of records having an identical function ontology (identical ontology set) or a set of identifiers of the records belonging to the identical ontology set, and for replacing an operation type of a record including the above-mentioned identical function ontology of a designated AP with an operation type having a high appearance frequency (high frequency type) in the identical ontology set.
    Type: Grant
    Filed: June 23, 2010
    Date of Patent: June 10, 2014
    Assignee: NEC Corporation
    Inventor: Keisuke Umezu
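The operation modification described in the abstract above can be sketched as a majority vote within each identical ontology set. The record layout follows the abstract (function name, function ontology, operation type, per application), but the sample records and the function `harmonize` are hypothetical.

```python
from collections import Counter

# Hypothetical operation-definition records for three applications (APs).
records = [
    {"ap": "viewer", "function": "zoom_in", "ontology": "enlarge", "operation": "pinch"},
    {"ap": "maps", "function": "magnify", "ontology": "enlarge", "operation": "pinch"},
    {"ap": "editor", "function": "scale_up", "ontology": "enlarge", "operation": "double_tap"},
    {"ap": "editor", "function": "undo", "ontology": "revert", "operation": "shake"},
]

def harmonize(records, designated_ap):
    """Return new records in which the designated AP uses, for each function
    ontology, the operation type that appears most often across all APs."""
    # Group operation types by function ontology (the identical ontology sets).
    by_ontology = {}
    for record in records:
        by_ontology.setdefault(record["ontology"], []).append(record["operation"])

    updated = []
    for record in records:
        record = dict(record)  # copy so the input list is left unchanged
        if record["ap"] == designated_ap:
            counts = Counter(by_ontology[record["ontology"]])
            record["operation"] = counts.most_common(1)[0][0]
        updated.append(record)
    return updated

result = harmonize(records, designated_ap="editor")
```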
  • Patent number: 8745085
    Abstract: A system and method is provided for automatically generating explanations for individual records in an access log.
    Type: Grant
    Filed: August 17, 2012
    Date of Patent: June 3, 2014
    Assignee: The Regents of the University of Michigan
    Inventors: Daniel Fabbri, Kristen Lefevre
  • Publication number: 20140149436
    Abstract: A system including a context-entity factory configured to build a data model defining an ontology of data objects that are context-aware, the model further defining metadata tags for the data objects. The system further includes a storage device storing the data objects as stored data objects, the device further storing associated contexts for corresponding ones of the stored objects. The system further includes a reduction component configured to capture a current context value of a first data object defined in the ontology, the component further configured to compare the current context value of the first data object with stored values of the associated contexts, and wherein when the current context value does not match a particular stored value of a particular associated context, the component is further configured to remove a corresponding particular stored data object and the particular associated context from the stored data objects.
    Type: Application
    Filed: November 26, 2012
    Publication date: May 29, 2014
    Applicant: THE BOEING COMPANY
    Inventor: The Boeing Company
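The reduction component in the abstract above compares a current context value with the stored contexts and removes stored objects whose context does not match. A minimal sketch, assuming contexts are plain strings and the store is a dictionary (both assumptions; the names below are hypothetical):

```python
def reduce_store(store, current_context):
    """Keep only stored objects whose saved context equals the current context.

    store maps a key to an (object, context) pair. Returns the reduced store
    and the list of keys that were removed.
    """
    kept, removed = {}, []
    for key, (obj, context) in store.items():
        if context == current_context:
            kept[key] = (obj, context)
        else:
            removed.append(key)
    return kept, removed

# Hypothetical stored data objects with their associated contexts.
store = {
    "checklist-takeoff": ({"steps": 12}, "phase=takeoff"),
    "checklist-cruise": ({"steps": 5}, "phase=cruise"),
    "checklist-landing": ({"steps": 9}, "phase=landing"),
}

kept, removed = reduce_store(store, "phase=cruise")
```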
  • Publication number: 20140149458
    Abstract: Embodiments of the disclosure include a method for data mining shape based data, the method includes receiving shape data for each of a plurality of data entries and creating a first abstract from the shape data for each of the plurality of data entries. The method also includes organizing the first abstracts into a plurality of groups based on a criterion and creating a second abstract for each data entry in the plurality of groups based on the criterion and information derived from the first abstract.
    Type: Application
    Filed: February 15, 2013
    Publication date: May 29, 2014
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Maroun M. Kassab, Leah M. Pastel, Adam E. Trojanowski
  • Patent number: 8738651
    Abstract: A technique for cataloging documents based on user activity includes assigning documents to a relevant document list based on activity of a user of a device. In this case, at least two of the documents are associated with different applications. The technique then provides the relevant document list to the user.
    Type: Grant
    Filed: March 6, 2008
    Date of Patent: May 27, 2014
    Assignee: Lenovo (Singapore) Pte Ltd
    Inventors: Jennifer G. Zawacki, David C. Challener, Justin T. Dubs, James J. Thrasher
  • Patent number: 8738652
    Abstract: Methods and systems for detecting anomalies in sets of data are disclosed, including: computing components of one or more types of feature vectors at a plurality of values of one or more independent variables, each type of the feature vectors characterizing a set of input data being dependent on the one or more independent variables; computing one or more types of output values corresponding to each type of feature vectors as a function of the one or more independent variables using a nonlinear sequence analysis method; and detecting anomalies in how the one or more types of output values change as functions of the one or more independent variables.
    Type: Grant
    Filed: March 11, 2008
    Date of Patent: May 27, 2014
    Assignee: Paragon Science, Inc.
    Inventor: Stephen Patrick Kramer
  • Publication number: 20140143277
    Abstract: Disclosed are a method and device for matching a friend relationship chain in an instant messaging tool. The method includes: performing data analysis on data information of a user; performing data mining on data information of other mass users according to an analysis result; and performing data matching between a mining result and the analysis result of the user. The device includes: a data analyzing module, a data mining module, and a data matching module. According to the technical solutions provided in the present disclosure, other users desired by the user are automatically matched for the user. The whole matching process requires no manual operation, which reduces the usage threshold for a friend relationship chain matching system. In addition, because matching is based on the user information, the matched users have a strong correlation with the user, and the matching quality is high.
    Type: Application
    Filed: January 24, 2014
    Publication date: May 22, 2014
    Applicant: Tencent Technology (Shenzhen) Company Limited
    Inventors: Yu Yang, Yiping Chen, Tingting An, Rongjun Feng, Zhiyong Lai
  • Publication number: 20140143276
    Abstract: An enterprise software system connected to multi-tenant hosted software offered in a cloud computing environment having the capacity to serve a large number of users with a small number of servers, and means for collecting and reporting statistically relevant information based on an aggregation of the data within the multi-tenant database. The integrated software modules include modules for IT management, financial operations, portfolio management, project management, project budget management, resource management, and operations management.
    Type: Application
    Filed: November 21, 2012
    Publication date: May 22, 2014
    Applicant: COUNTERPART TECHNOLOGIES INC.
    Inventors: Michael Anthony ROGERS, Claudio SILVESTRI
  • Publication number: 20140143240
    Abstract: An information search server and an information search method thereof are provided. The information search server includes a transceiver and a processor. The transceiver receives a search message having an original store phone number of an original store from a user device. The processor performs a data mining procedure, according to the original store phone number, to obtain an original store name and an original store address associated with the original store phone number, a category associated with the original store name, an original store latitude and longitude associated with the original store address, and a recommended store information associated with the category and the original store latitude and longitude, and generates a result message having the recommended store information. The transceiver further transmits the result message to the user device.
    Type: Application
    Filed: February 17, 2013
    Publication date: May 22, 2014
    Applicant: INSTITUTE FOR INFORMATION INDUSTRY
    Inventors: Yu-Chee TSENG, Chih-Yu LIN, Jia-Ming LIANG, You-Ren CHU, Chien-Yi LI, Hsin-Yu CHANG, Chih-Yang CHUANG, Chih-Chiang HSIEH
  • Patent number: 8732197
    Abstract: A document taxonomy alignment system and method, relying on document glosses and utilizing a soft ontology expansion. An all-new hierarchical leaf node can be created expressly for the purpose of better aligning the plurality of document taxonomies in question. A small but valuable subset of the nodes created by soft ontology expansion turn out to capture some otherwise unmappable taxonomy nodes, and thereby have the effect of classifying the documents better than would any pre-existing node in any one of those taxonomies.
    Type: Grant
    Filed: February 4, 2008
    Date of Patent: May 20, 2014
    Assignee: Musgrove Technology Enterprises LLC (MTE)
    Inventor: Timothy A. Musgrove
  • Patent number: 8732199
    Abstract: A system, a method, and a computer readable media for identifying a user-initiated log file record in a log file are provided. The log file has a user-initiated log file record and a repeating pattern of log file records automatically generated by a software program. The system allows a user to identify first and second timestamp values corresponding to first and second times which identify a time interval of interest in the log file. The system further analyzes the log file to identify the user-initiated log file record having a timestamp value between the first and second timestamp values. The system further identifies the repeating pattern of log file records in the log file.
    Type: Grant
    Filed: April 26, 2012
    Date of Patent: May 20, 2014
    Assignee: International Business Machines Corporation
    Inventors: Danny Yen-Fu Chen, David A. Cox, Sheryl S. Kinstler, Fabian F. Morgan
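One simple way to separate user-initiated records from a repeating automated pattern is to treat any message that recurs at least a threshold number of times as automated. That heuristic is an assumption made for this sketch, not the method claimed in the patent, and the log contents and the function `user_initiated` are hypothetical.

```python
from collections import Counter
from datetime import datetime

def user_initiated(records, start, end, min_repeats=3):
    """records: list of (timestamp, message) pairs.

    Messages seen at least min_repeats times are treated as the repeating,
    automatically generated pattern. Returns the remaining records inside
    [start, end] and the set of repeating messages.
    """
    counts = Counter(message for _, message in records)
    repeating = {message for message, n in counts.items() if n >= min_repeats}
    hits = [(ts, message) for ts, message in records
            if start <= ts <= end and message not in repeating]
    return hits, repeating

log = [
    (datetime(2012, 4, 26, 9, 0), "heartbeat ok"),
    (datetime(2012, 4, 26, 9, 1), "heartbeat ok"),
    (datetime(2012, 4, 26, 9, 1, 30), "user alice exported report"),
    (datetime(2012, 4, 26, 9, 2), "heartbeat ok"),
    (datetime(2012, 4, 26, 9, 3), "heartbeat ok"),
]

hits, repeating = user_initiated(
    log, start=datetime(2012, 4, 26, 9, 0), end=datetime(2012, 4, 26, 9, 2)
)
```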
  • Patent number: 8732198
    Abstract: Methods and systems of defining product attributes may involve receiving a search query and extracting a user expectation from the search query. In addition, an attribute may be defined for a product based on the user expectation. In one example, consumer generated content such as forum content, review content, blog content and social networking content, is used to define the attribute.
    Type: Grant
    Filed: March 15, 2012
    Date of Patent: May 20, 2014
    Assignee: International Business Machines Corporation
    Inventors: Madhu K. Chetuparambil, George T. Jacob Sushil, Kalapriya Kannan
  • Patent number: 8725737
    Abstract: Systems and methods for managing electronic data are disclosed. Various data management operations can be performed based on a metabase formed from metadata. Such metadata can be identified from an index of data interactions generated by a journaling module, and obtained from their associated data objects stored in one or more storage devices. In various embodiments, such processing of the index and storing of the metadata can facilitate, for example, enhanced data management operations, enhanced data identification operations, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data. In various embodiments, the metabase can be configured in different ways. For example, the metabase can be stored separately from the data objects so as to allow obtaining of information about the data objects without accessing the data objects or a data structure used by a file system.
    Type: Grant
    Filed: September 11, 2012
    Date of Patent: May 13, 2014
    Assignee: CommVault Systems, Inc.
    Inventors: Anand Prahlad, Jeremy Alan Schwartz, David Ngo, Brian Brockway, Marcus S. Muller
  • Patent number: 8719257
    Abstract: In various embodiments, a semantic space associated with a corpus of electronically stored information (ESI) may be created and used for concept searches. Documents (and any other objects in the ESI, in general) may be represented as vectors in the semantic space. Vectors may correspond to identifiers, such as, for example, indexed terms. The semantic space for a corpus of ESI can be used in information filtering, information retrieval, indexing, and relevancy rankings.
    Type: Grant
    Filed: February 16, 2011
    Date of Patent: May 6, 2014
    Assignee: Symantec Corporation
    Inventor: Venkat Rangan
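    Illustration (not part of the patent record): a rough sketch of ranking documents by similarity in a vector space, as the abstract above describes in general terms. Plain term-count vectors and cosine similarity are stand-ins chosen here; the abstract does not specify how its semantic space is built. The names `term_vector`, `cosine`, and `rank` are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a text as a sparse vector of term counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents):
    """Return (score, document) pairs, most similar to the query first."""
    q = term_vector(query)
    scored = [(cosine(q, term_vector(d)), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


docs = [
    "contract breach damages",
    "quarterly sales report",
    "breach of contract claim",
]
ranking = rank("contract breach", docs)
```

    A document that shares no terms with the query scores 0.0 and sorts last, which is the basic behavior a relevancy ranking over such vectors needs.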
  • Patent number: 8718950
    Abstract: In some embodiments, a non-transitory processor-readable medium includes code to cause a processor to receive a set of variants identified by a comparison of a test DNA sequence with a reference DNA sequence and associate at least one of the set of variants with at least one of a set of annotations each indicative of at least one criterion. The code includes code to cause the processor to filter, based on the set of annotations, the set of variants to identify a subset of variants from the set of variants. Each variant from the subset of variants is associated with at least one common annotation from the set of annotations. The code further includes code to cause the processor to present the subset of variants such that the subset of variants can be used to render a clinical diagnosis.
    Type: Grant
    Filed: July 8, 2011
    Date of Patent: May 6, 2014
    Assignee: The Medical College of Wisconsin, Inc.
    Inventors: Elizabeth Anabel Worthey, David Paul Dimmock
  • Patent number: 8719299
    Abstract: In one embodiment, an approach to automated recurring concept extraction from a plurality of input data models (schemas) is presented. The approach converts input data models to graphs, with typed elements. The graphs are mined for closed subgraphs that have a defined minimum support. The identified subgraphs can be filtered with a relevance metric. These subgraphs are converted to schemas or an appropriate representation, and stored for reuse in a repository. The repository can be used to automate further transformation or mapping of schemas presented to a system that uses the repository. In one example, the repository is used in a schema covering process to perform schema transformation.
    Type: Grant
    Filed: December 2, 2011
    Date of Patent: May 6, 2014
    Assignee: SAP AG
    Inventors: Konrad Voigt, Peter Mucha
  • Patent number: 8712956
    Abstract: Some aspects include reception of a selection of a part of a report, the selected report part associated with queries of a semantic layer, and creation of a description of a Web Service call to return contents of the selected report part. In some aspects, a Web Service call associated with a part of a report is received, the report part associated with queries of a semantic layer, and a query is determined based on the Web Service call to return contents of the report part.
    Type: Grant
    Filed: March 30, 2009
    Date of Patent: April 29, 2014
    Assignee: SAP AG
    Inventors: David J. M. Brunner, Nadir Loutfi
  • Publication number: 20140115002
    Abstract: The present disclosure is related to a method for monitoring at least one event data generating machine, including a data logging device for providing event data. The method comprises transferring logged event data from at least one of the event data generating machines to a central processor, mining a multi-dimensional sequential pattern within said transferred event data wherein at least one dimensional attribute holds information indicating said event data generating machine or the at least one event data generating machine property, and matching said mined multi-dimensional sequential pattern with patterns stored in a central pattern database.
    Type: Application
    Filed: October 23, 2012
    Publication date: April 24, 2014
    Applicant: Liebherr-Werk Nenzing GmbH
    Inventors: Michael Kocher, Martin Rajek
  • Patent number: 8706758
    Abstract: Disclosed are improvements to a method for account reconciliation comprising improved, extended, and more flexible algorithms for (1) automatically determining what transaction features are best candidates for matching diverse datasets; (2) automatically determining how logically to subdivide accounting datasets prior to reconciliation; (3) matching groups of transactions (allowing one-to-many, many-to-one, and many-to-many matches instead of just one-to-one matches); (4) making use of more types of transaction feature, including transaction dates (where proximity of two transactions in date may be significant even if the dates do not exactly match). The improved method is, therefore, better able to perform its intended function of identifying matching transactions. It is applicable to a wider class of problems while still saving significant costs and labor, and still retaining flexibility in not requiring source data in a particular format, and not being domain-dependent or requiring extensive user setup.
    Type: Grant
    Filed: February 20, 2012
    Date of Patent: April 22, 2014
    Assignee: Galisteo Consulting Group, Inc.
    Inventor: Peter A. Chew
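    Illustration (not part of the patent record): a small sketch of item (4) in the abstract above, matching transactions whose dates are close but not identical. It handles only one-to-one matches on equal amounts; the group matching and automatic feature selection the abstract mentions are not attempted. The function name, the tuple layout, and the `max_days` tolerance are assumptions made for this example.

```python
from datetime import date


def match_by_amount_and_date(ledger, bank, max_days=3):
    """Greedily pair transactions with equal amounts and nearby dates.

    ledger, bank: lists of (id, amount_in_cents, date) tuples.
    Returns (matches, unmatched_ledger_ids, unmatched_bank_ids).
    """
    remaining = list(bank)
    matches, unmatched_ledger = [], []
    for ledger_id, amount, ledger_date in ledger:
        candidates = [
            b for b in remaining
            if b[1] == amount and abs((b[2] - ledger_date).days) <= max_days
        ]
        if candidates:
            # Prefer the candidate whose date is closest to the ledger entry.
            best = min(candidates, key=lambda b: abs((b[2] - ledger_date).days))
            matches.append((ledger_id, best[0]))
            remaining.remove(best)
        else:
            unmatched_ledger.append(ledger_id)
    return matches, unmatched_ledger, [b[0] for b in remaining]


ledger = [("L1", 12500, date(2012, 2, 20)), ("L2", 4000, date(2012, 2, 21))]
bank = [
    ("B1", 12500, date(2012, 2, 22)),
    ("B2", 4000, date(2012, 3, 5)),
    ("B3", 12500, date(2012, 2, 20)),
]
matches, open_ledger, open_bank = match_by_amount_and_date(ledger, bank)
```

    Amounts are kept as integer cents so that equality comparison is exact. "L2" stays unmatched because the only bank entry with the same amount is 13 days away, outside the assumed tolerance.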
  • Publication number: 20140108455
    Abstract: A method of capturing intentions within online text comprises, with a data mining device (105), identifying (block 305) a number of statements of intention within an online forum (110), and with the data mining device (105), extracting (block 310) a number of attributes (240, 245, 250, 255, 260) from the statements of intention. A system (100) for extracting intentions expressed within an online forum comprises a data mining device (105), a forum server (115) comprising a number of online forums (110) communicatively coupled to the data mining device (105), in which the data mining device identifies a number of statements of intention within the online forums (110) and extracts a number of attributes (240, 245, 250, 255, 260) from the statements of intention.
    Type: Application
    Filed: June 28, 2011
    Publication date: April 17, 2014
    Inventors: Maria G. Castellanos, Riddhiman Ghosh, Mohamed E. Dekhil, Umeshwar Dayal, Meichun Hsu
  • Publication number: 20140101201
    Abstract: Methods and data structures are provided for allowing data mining with improved efficiency. During processing of a usage log (or multiple logs) for an activity, such as a usage logfile of network search activity, a common fact table is generated. The common fact table allows a plurality of auxiliary data structures to be formed from the common fact table. These auxiliary data structures are designed to allow users to submit queries against the contents of the data structure in order to investigate the data. The efficiency of access of the common fact table is improved by allowing users to access auxiliary data structures other than the auxiliary data structures that are associated with a user. Optionally, the common fact table and/or the auxiliary data structures can include dimension values that correspond to both pre-identified dimension values as well as dimension values that are identified during processing of the activity logfiles.
    Type: Application
    Filed: October 10, 2012
    Publication date: April 10, 2014
    Applicant: MICROSOFT CORPORATION
    Inventors: An Yan, Mingyang Zhao, Robert C. Wang, Yu Chen, Blake Anderson
  • Patent number: 8694515
    Abstract: Provided is an image search device which relatively easily searches a large amount of stored images for images that a user wishes to use for interpolation, and which includes: an interpolation range computing unit (103) which computes, as an interpolation range, a range of an area including a first photographing location where a first interpolation target image was taken and a second photographing location where a second interpolation target image was taken; an interpolation image candidate obtaining unit (104) which obtains, as candidate images, images whose photographing locations are included in the interpolation range from the plurality of images; and an interpolation image selecting unit (106) which selects, from the candidate images, an image having a greater subject distance, which is a distance between a subject and an imaging device when the image was taken, as a traveling speed between the first photographing location and the second photographing location increases.
    Type: Grant
    Filed: November 5, 2009
    Date of Patent: April 8, 2014
    Assignee: Panasonic Corporation
    Inventors: Kazutoyo Takata, Kenji Mizutani
  • Publication number: 20140095542
    Abstract: A data mining system receives a data set that includes a plurality of columns of data. The system determines correlations between columns of data of the data set and displays an interactive listing of a plurality of pairs of columns based on the correlations. The listing includes preview information based on the correlations for each pair. The system receives a selection of a value from the interactive listing from a user and refines the data set in response to the selection.
    Type: Application
    Filed: May 31, 2013
    Publication date: April 3, 2014
    Inventor: Vladimir ZELEVINSKY
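    Illustration (not part of the patent record): a brief sketch of the first steps in the abstract above, computing a correlation for every pair of columns and listing the pairs by strength. Pearson correlation on numeric columns is an assumption made here; the abstract does not name a correlation measure. The names `pearson` and `ranked_column_pairs` are hypothetical.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (spread_x * spread_y)


def ranked_column_pairs(columns):
    """Return (col_a, col_b, r) for every column pair, strongest |r| first."""
    pairs = [
        (a, b, pearson(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


columns = {
    "price": [1, 2, 3, 4],
    "tax": [2, 4, 6, 8],
    "noise": [5, 1, 4, 2],
}
listing = ranked_column_pairs(columns)
```

    The sorted listing could back an interactive view in which the user picks a pair and the data set is then narrowed; that interactive refinement step is not sketched here.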
  • Patent number: 8688692
    Abstract: A computer searching technique identifies high quantitative patterns in data. A spatial indexing technique, such as an R-tree, is used to represent the data. Then a pattern searching algorithm is used to identify anchor points that define the componentwise minimum patterns. High quantitative patterns are found responsive to the componentwise minimum patterns. The search strategy is demonstrated relevant to the problem of finding suitable locations for a retail business with reference to environments of prior successful retail businesses.
    Type: Grant
    Filed: September 27, 2010
    Date of Patent: April 1, 2014
    Assignee: International Business Machines Corporation
    Inventors: Jin Dong, Ta-Hsin Li, Hua Liang, Ming Xie, Wen Jun Yin, Bin Z. Zhang
  • Patent number: 8688732
    Abstract: Comparative decision systems and methods are disclosed for gathering and mining data representative of purchase decisions. One disclosed comparative decision system detects when a user is comparing items and provides the user with the ability to create a research note storing comparative information for the alternative items. The system displays information about items according to a variety of factors. The user can customize the factors and enter information for each alternative item according to the various factors. Some information may be pre-populated by the system. The research note may be made visible to other users, and may be suggested to another user based on the note's expected helpfulness and relevance to that user. One disclosed method for mining data stored within research notes identifies which factors are given higher relative priorities by users considering a purchase. Another mining method analyzes the effects of price changes on item popularity.
    Type: Grant
    Filed: November 19, 2010
    Date of Patent: April 1, 2014
    Assignee: Amazon Technologies, Inc.
    Inventors: Sameer R. Rajyaguru, Terrence R. Nightingale, Marvin M. Theimer
  • Publication number: 20140089345
    Abstract: An apparatus, method and article of manufacture of the present invention detects the presence of references to the same concept in separate sections of text, and, with no input required from the reader, presents the reader with information concerning the detected references to the concept. The information provided may comprise information related to the location of the reference to the concept in other sections of text, and the reader also is provided the ability to move from one reference to a concept directly to another reference to the same concept.
    Type: Application
    Filed: September 24, 2012
    Publication date: March 27, 2014
    Inventor: Philip R. Krause