Clustering Or Classification (EPO) Patents (Class 707/E17.046)
  • Publication number: 20120005210
    Abstract: A method of structuring a database of objects, the objects each comprising one or more attributes, the attributes being ordered, the method being executed by at least one computer processor connected to a memory, the method classifying in memory the objects in a structure composed of a list CL of sets of formal concepts Ci, includes at least the following steps: create several groups of attributes SAi; for each of said groups SAi, construct a closed set Pi composed of all the attributes common to the objects comprising at least the attributes of said group SAi; determine the list CL of formal concepts Ci ordered in the lexicographic order, by successively determining the formal concepts in order of increasing intent, the intent F of a formal concept Ci being formed by a set of closed sets Pi.
    Type: Application
    Filed: November 18, 2009
    Publication date: January 5, 2012
    Applicant: THALES
    Inventors: Cédric Tavernier, Jean-Luc Rogier
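
The closed-set step in the abstract above (all attributes common to the objects that contain a given attribute group) can be sketched as follows. This is only an illustrative sketch, not the claimed method: the function name `closure`, the dictionary layout, and the sample objects are assumptions made for the example, and the lexicographic enumeration of concepts is not shown.

```python
def closure(objects, attrs):
    """Return the attributes shared by every object that contains all of attrs.

    objects: dict mapping an object name to its set of attributes (hypothetical layout).
    attrs:   a group of attributes (SAi in the abstract's notation).
    """
    matching = [o for o in objects.values() if attrs <= o]
    if not matching:
        # Assumed convention: if no object contains attrs, the closure is
        # the full attribute set.
        return set().union(*objects.values())
    return set.intersection(*matching)


# Hypothetical sample data.
objects = {
    "o1": {"a", "b", "c"},
    "o2": {"a", "b"},
    "o3": {"b", "c"},
}
closed = closure(objects, {"a"})  # attributes common to o1 and o2
```

A set is closed when applying `closure` to it returns the same set, which is the property that lets closed sets serve as concept intents.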
  • Publication number: 20110320454
    Abstract: A system and method for constructing a hierarchical multi-faceted classification structure includes organizing a plurality of visual categories into a multi-relational reference ontology that accounts for a plurality of different types of relationships. Media artifacts are categorized into the plurality of visual categories. The categories of artifacts are refined based on faceted ontology relationships or constraints from the multi-relational reference ontology. The multi-relational reference ontology and the one or more media artifacts with relationships are stored as the hierarchical multi-faceted classification structure in computer readable memory storage.
    Type: Application
    Filed: June 29, 2010
    Publication date: December 29, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: MATTHEW HILL, JOHN R. KENDER, APOSTOL NATSEV, QUOC-BAO NGUYEN, JOHN R. SMITH, JELENA TESIC, LEXING XIE, RONG YAN
  • Publication number: 20110320395
    Abstract: Content provided by a decision engine system is described. Content, stored in a server system, is provided to a plurality of display units at a plurality of touch point devices. One or more features are determined to optimize the content provided to the plurality of display units. The content is updated and syndicated across the plurality of display units at the plurality of touch point devices based on the determination.
    Type: Application
    Filed: June 29, 2010
    Publication date: December 29, 2011
    Inventors: Uzair Dada, Jason Kobilka, Michael Krol, Adeeb Ashraf, Abe Mammen, Omer Saeed
  • Publication number: 20110320447
    Abstract: In one aspect, a processing device of an information processing system is operative to perform high-dimensional stratified sampling of a database comprising a plurality of records arranged in overlapping sub-groups. For a given record, the processing device determines which of the sub-groups the given record is associated with, and for each of the sub-groups associated with the given record, checks if a sampling rate of the sub-group is less than a specified sampling rate. If the sampling rate of each of the sub-groups is less than the specified sampling rate, the processing device samples the given record, and otherwise does not sample the given record. The determine, check and sample operations are repeated for additional records, and samples resulting from the sample operations are processed to generate information characterizing the database.
    Type: Application
    Filed: June 28, 2010
    Publication date: December 29, 2011
    Inventors: Aiyou Chen, Ming Xiong
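
The determine, check and sample loop described in the abstract above might look like the sketch below. It is a minimal illustration under stated assumptions: a sub-group's sampling rate is taken to be (records sampled so far) / (records seen so far, including the current one), and the names `stratified_sample`, `subgroups_of` and `target_rate` are hypothetical.

```python
from collections import defaultdict


def stratified_sample(records, subgroups_of, target_rate):
    """Sample a record only if every sub-group it belongs to is under target_rate."""
    seen = defaultdict(int)     # records observed per sub-group
    sampled = defaultdict(int)  # records sampled per sub-group
    out = []
    for rec in records:
        groups = subgroups_of(rec)          # determine the record's sub-groups
        for g in groups:
            seen[g] += 1
        # Check: the current rate of each associated sub-group must be below the target.
        if all(sampled[g] / seen[g] < target_rate for g in groups):
            out.append(rec)                 # sample the record
            for g in groups:
                sampled[g] += 1
    return out


# Hypothetical overlapping sub-groups: parity and magnitude of an integer record.
def subgroups_of(r):
    return ("even" if r % 2 == 0 else "odd", "low" if r < 5 else "high")


sample = stratified_sample(range(10), subgroups_of, target_rate=0.5)
```

Because a record is skipped as soon as any one of its sub-groups has reached the target rate, heavily overlapping sub-groups end up sampled less often than sparse ones.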
  • Publication number: 20110314018
    Abstract: Summaries of entities (e.g., people, places, things, concepts, etc.) may provide additional useful information to a user. For example, a search engine may provide a summary of an entity within search results. A category (e.g., “writer”, “politician”, etc.) of the entity that is short and concise may be advantageous to provide within a summary of the entity. The category may allow a user to quickly determine whether the information of the entity relates to the intended entity (e.g., search results of an entity as “a writer” vs. search results of an entity as “a politician”). Potential categories and summary text may be extracted from pre-labeled data. The potential categories and summary text may be intersected to determine a set of candidate categories that may be ranked. An entity category having a desired rank may be determined as the entity category that describes the entity in a desired way.
    Type: Application
    Filed: June 22, 2010
    Publication date: December 22, 2011
    Applicant: Microsoft Corporation
    Inventors: Michael Bieniosek, Franco Salvetti, Giovanni Lorenzo Thione
  • Publication number: 20110307487
    Abstract: A system for obtaining data from various sources. The data may be organized into cluster sets of related items. Elements of various kinds may be pulled from the data. The elements may be put together into sets of clusters for each kind of elements. The clusters may be refined relative to one another and in view of integrated properties of the cluster sets. Elements may be added or removed from the clusters during refinement. Examples of the elements may be people and events. Examples of cluster sets of such elements may be groups and goals, respectively.
    Type: Application
    Filed: June 15, 2010
    Publication date: December 15, 2011
    Inventors: Valerie Guralnik, Kirk Schloegel
  • Publication number: 20110302170
    Abstract: Methods for factoring search and browse policies and content preferences into Web search results are provided. Such search and browse policies and/or content preferences generally are provided by a parent, an employer, or other company representative and specify to whom they apply. Upon receiving a search query from a particular user, it is determined whether one or more search and browse policies and/or content preferences apply to the received search query. Upon determining that one or more search and browse policies and/or content preferences apply to the received search query, at least one of the received search query and any search results determined as satisfying the search query are analyzed in accordance with the one or more applicable search and browse policies and/or content preferences applying to the user. Any necessary modifications are made to the search results before the results are presented to the user.
    Type: Application
    Filed: June 3, 2010
    Publication date: December 8, 2011
    Applicant: MICROSOFT CORPORATION
    Inventor: VLADIMIR HOLOSTOV
  • Publication number: 20110302147
    Abstract: This disclosure describes systems and methods for identifying and correcting anomalies in web graphs. A web graph is transformed into a sequence of tokens via a walk algorithm. The sequence is fingerprinted to form a set of shingles. The shingles are compared to shingles for other web graphs in order to determine similarity between web graphs. Actions are then carried out to remove anomalous web graphs and modify parameters governing web mapping in order to decrease the likelihood of future anomalous web graphs being built.
    Type: Application
    Filed: May 2, 2011
    Publication date: December 8, 2011
    Applicant: Yahoo! Inc.
    Inventors: Ali Dasdan, Panagiotis Papadimitriou
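
The shingle comparison in the abstract above can be illustrated with a short sketch. Several things here are assumptions: the token sequence is taken as already produced by some walk over the graph, fingerprinting is simplified to keeping each k-token window as a tuple, and Jaccard overlap is used as the similarity measure (the abstract does not name one).

```python
def shingles(tokens, k=3):
    """Return the set of all k-token windows (shingles) in a token sequence."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two shingle sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical token sequences from walks over two web graphs.
walk_1 = ["a", "b", "c", "d"]
walk_2 = ["a", "b", "c", "e"]
similarity = jaccard(shingles(walk_1, k=2), shingles(walk_2, k=2))
```

A graph whose similarity to its predecessors drops well below the usual range would be a candidate anomaly under this kind of scheme.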
  • Publication number: 20110302166
    Abstract: The present invention provides a search system and a search method that make it easy to find the document that is truly required among the documents of a search result. This search system includes a division unit that divides a document to be searched into a plurality of blocks in accordance with designated division information, a calculation unit that calculates a hash value of each block by applying a hash function to a character string included in each block, a storage unit that stores the calculated hash value together with positional information on the block in the document, and a document grouping unit that fetches, for each document obtained by searching based on the search word, a corresponding hash value from the storage unit in accordance with positional information on a block including the search word to group documents having the same hash value into one group and output the grouped documents as the search result.
    Type: Application
    Filed: October 16, 2009
    Publication date: December 8, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Yutaka Moriya, Fumihiko Terui
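
The block-hash grouping described in the abstract above could be sketched roughly as follows. The blank-line delimiter, the use of SHA-256, and the function names are all assumptions for illustration; for brevity the index also keeps the block text, whereas the abstract describes storing only the hash value and the block position.

```python
import hashlib
from collections import defaultdict


def index_document(text, delimiter="\n\n"):
    """Split a document into blocks and return (position, hash, block) triples."""
    blocks = text.split(delimiter)
    return [
        (pos, hashlib.sha256(block.encode("utf-8")).hexdigest(), block)
        for pos, block in enumerate(blocks)
    ]


def group_by_matching_block(docs, word):
    """Group documents whose first block containing `word` has the same hash."""
    groups = defaultdict(list)
    for name, text in docs.items():
        for _pos, digest, block in index_document(text):
            if word in block:
                groups[digest].append(name)
                break
    return list(groups.values())


# Hypothetical documents: "a" and "b" share an identical boilerplate block.
docs = {
    "a": "header one\n\nshared disclaimer text",
    "b": "another header\n\nshared disclaimer text",
    "c": "unique disclaimer note",
}
grouped = group_by_matching_block(docs, "disclaimer")
```

Documents that match the search word only inside identical boilerplate collapse into one group, so a distinct hit such as "c" stands out in the result list.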
  • Publication number: 20110302168
    Abstract: In a method for representing a text document with a graphical model, a document including a plurality of ordered words is received and a graph data structure for the document is created. The graph data structure includes a plurality of nodes and edges, with each node representing a distinct word in the document and each edge identifying a number of times two nodes occur within a predetermined distance from each other. The graph data structure is stored in an information repository.
    Type: Application
    Filed: June 8, 2010
    Publication date: December 8, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Charu Aggarwal
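
The graph structure in the abstract above (one node per distinct word, edges counting how often two words fall within a set distance) can be sketched as below. The undirected edges, the `window` parameter name, and the sample sentence are assumptions for the example.

```python
from collections import Counter


def build_word_graph(words, window=2):
    """Return (nodes, edges) for an ordered list of words.

    nodes: set of distinct words.
    edges: Counter mapping a sorted word pair to the number of times the two
           words occur within `window` positions of each other.
    """
    nodes = set(words)
    edges = Counter()
    for i, w in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            v = words[j]
            if v != w:  # skip self-loops for repeated words
                edges[tuple(sorted((w, v)))] += 1
    return nodes, edges


nodes, edges = build_word_graph("the cat sat on the mat".split(), window=2)
```

Here the pair ("sat", "the") is counted twice, because "sat" lies within two positions of both occurrences of "the".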
  • Publication number: 20110295856
    Abstract: Techniques for grouping related objects such as documents and files using quantum clustering are disclosed. A method may include constructing a feature-object database of multiple objects. The feature-object database may have quantized selected features as keys. A connected objects database may be built. Clusters of connected objects may be identified in the connected objects database. The clusters of identified objects may be evaluated to determine groups of related objects. The method may be implemented on a computing device.
    Type: Application
    Filed: August 8, 2011
    Publication date: December 1, 2011
    Inventors: Herbert L. Roitblat, Brian Golbére
  • Publication number: 20110289086
    Abstract: A system and method for searching a database for multiple entries in the database that contain similar data, in which some embodiments of the method include collating data on physical sites from at least one database source to form a collation of site data, assigning a unique entry identifier to each entry of the site data in the collation, performing a lexical analysis of the site data and assigning a similarity metric(s) to each entry of the site data, sorting site data into at least one group with similar lexical content based on a metric threshold difference analysis of the similarity metric(s), to thereby provide at least one group, having at least one site data entry therein, and wherein where there are two or more site data entries in the at least one group, preferably they refer to the same site or to sites having a similar physical address.
    Type: Application
    Filed: May 20, 2011
    Publication date: November 24, 2011
    Inventors: Philip Martin Jordan, Vilosh Marion Brito
  • Publication number: 20110282872
    Abstract: Categorizing data in an on-demand database environment is provided. The categorized data is accessed to provide results based on statistical likelihood that records provide a desired result of a query. The categorization of the data includes organizing queries based on semantic terms, with categorization based on a multidimensional categorization of data in the database environment. The generating of results includes accessing relationship metadata both for individual records and for categories. Relationships along the same category, or among categories can provide records that may answer the query. The relationships and statistics are updated based on usage of the results data. Records and relationships identified as being used to solve the query, or being a desired solution to the query, can be weighted more heavily, thus increasing the likelihood of providing the most relevant data for subsequent queries.
    Type: Application
    Filed: May 11, 2011
    Publication date: November 17, 2011
    Applicant: salesforce.com, inc
    Inventors: Eugene Oksman, Alexandre Hersans
  • Publication number: 20110282875
    Abstract: A method, system, and computer program for processing records is disclosed. The records are associated with record sets. Record sets are associated with processor sets, which include one or more processors. Records are routed to associated processor sets for processing, based on the record set associated with the record. Records are processed on processors in the processor sets. Furthermore, various localized affinities can be established. Process affinity can link server processes with processor sets. Cache affinity can link database caches with processor sets. Data affinity can link incoming data to processor sets.
    Type: Application
    Filed: April 8, 2011
    Publication date: November 17, 2011
    Applicant: UNITED STATES POSTAL SERVICE
    Inventors: C. Scot Atkins, Joseph Conway
  • Publication number: 20110276552
    Abstract: In a dynamic information delivery context, a system collects data regarding transient information accessed by a user. The user can then query the stored data to reconstruct transient information. The system uses heuristics to help reconstruct transient information. The heuristics include user profile, time stamps, metadata, and indexing.
    Type: Application
    Filed: May 7, 2010
    Publication date: November 10, 2011
    Applicant: TELCORDIA TECHNOLOGIES, INC.
    Inventors: Shoshana K. Loeb, Euthimios Panagos
  • Publication number: 20110276553
    Abstract: One embodiment is a computer-implemented method for classifying documents in a collection of documents according to their intended readerships. The method comprises using a computer to select a document in the collection of documents; and using a computer to determine a characteristic of the selected document, the characteristic being: misleading when the document includes one or more features that are determined to be for a purpose other than reading the document; commercial when the document includes features that are presented for a commercial purpose; or personal when the document includes features of a personal opinion. The method further includes using a computer to classify the selected document as misleading, commercial, or personal according to its determined characteristic; and using a computer to repeat the steps of select document, determine a characteristic of the selected document, and classify the selected document for additional documents in the collection.
    Type: Application
    Filed: May 10, 2010
    Publication date: November 10, 2011
    Applicant: International Business Machines Corporation
    Inventors: Ying Chen, Bin He, W. Scott Spangler
  • Publication number: 20110270819
    Abstract: Query classification techniques attempt to classify user search queries in order to better understand user search intent. Understanding a user's search intent allows search engines to provide relevant content tailored to the user's interest. Unfortunately, current classification techniques do not take into account contextual information. Accordingly, as provided herein, a target query may be classified based upon contextual information. In particular, features may be extracted from contextual information and/or other sources. For example, features may be extracted from the target query, related queries, and/or invoked search results of the related queries. In this way, the target query may be classified based upon other queries performed by the user and/or search results of the queries the user found interesting. In addition, a CRF model may be utilized in classifying the target query by providing generalized parameters learned from labeled query sessions.
    Type: Application
    Filed: April 30, 2010
    Publication date: November 3, 2011
    Applicant: Microsoft Corporation
    Inventors: Dou Shen, Daxin Jiang, Jian-Tao Sun
  • Publication number: 20110270808
    Abstract: A clustering-based approach to data standardization is provided. Certain embodiments take as input a plurality of addresses, identify one or more features of the addresses, cluster the addresses based on the one or more features, utilize the cluster(s) to provide a data-based context useful in identifying one or more synonyms for elements contained in the address(es), and standardize the address(es) to an acceptable format, with one or more synonyms and/or other elements being added to or taken away from the input address(es) as part of the standardization process.
    Type: Application
    Filed: April 30, 2010
    Publication date: November 3, 2011
    Applicant: International Business Machines Corporation
    Inventors: Tanveer A. Faruquie, Sachindra Joshi, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, Angel Smith, L. V. Subramaniam, Girish Venkatachaliah
  • Publication number: 20110270826
    Abstract: A document analysis system includes a database that stores documents, a document evaluation module that evaluates the documents by using features of the documents, and a user interface (UI) output unit that provides an evaluation result of the documents, which is produced by the document evaluation module, upon call of the documents.
    Type: Application
    Filed: October 27, 2009
    Publication date: November 3, 2011
    Inventors: Wan-Kyu Cha, Mi-Kyung Jung, Han-Joon Ahn, Jeong-Joong Kim, Sung-Ho Choi
  • Patent number: 8051084
    Abstract: Systems and methods are described that calculate the interestingness of a set of one or more records in a database, either absolutely (i.e., compared to an overall collection of records) or relative to some other set of records. In one embodiment, the measure is a relative entropy value that has been normalized. Various applications of the measure are described in the context of an information retrieval system. These applications include, for example, guiding query interpretation, guiding view selection and summarization, intelligent ranges, event detection, concept triggers and interpreting user actions, hierarchy discovery, and adaptive data mining.
    Type: Grant
    Filed: June 25, 2008
    Date of Patent: November 1, 2011
    Assignee: Endeca Technologies, Inc.
    Inventors: Daniel Tunkelang, Joyce Jeanpin Wang, Vladimir Zelevinsky, Paul Alexander Wehner
  • Publication number: 20110258173
    Abstract: A computerized system and method of constructing and expanding search queries for conducting searches through information sources. The system enables retrieving a category options tree, allowing a user to define a category route by selecting a category-node, which defines a search-category. The system may further enable retrieving a query scenario tree, having a hierarchal structure comprising query nodes, where the retrieved query scenario tree is associated with an initial input query, inputted by a user. Each query node defines a query route enabling to construct the content and structure of an expanded search query. The system enables selecting a query node of the retrieved query scenario tree, according to an online decision making process, which analyses the search-category in relation to available query routes in order to allow selecting a query node from the retrieved scenario tree that is most compatible with the search-category.
    Type: Application
    Filed: June 28, 2011
    Publication date: October 20, 2011
    Inventors: Michael RATINER, Dmitry KUHARENKO, Alexander RUBINOV
  • Publication number: 20110258192
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for question and answer services. In one aspect, a method includes receiving a plurality of questions from a plurality of different servers according to a protocol that defines services for submitting questions and obtaining answers to questions. Each received question is analyzed and associated with one or more labels based on the analysis. A request from a server is received according to the protocol to obtain questions related to one or more labels. Questions associated with one or more of the labels are identified and provided in response to the request.
    Type: Application
    Filed: November 29, 2010
    Publication date: October 20, 2011
    Applicant: GOOGLE INC.
    Inventors: Jun Yao, Jinhui Du
  • Publication number: 20110258193
    Abstract: One embodiment of the present invention provides a system for estimating a similarity level between semantic entities. During operation, the system selects two or more semantic entities associated with a number of documents. The system subsequently parses the documents into sub-parts, and calculates the similarity level between the semantic entities based on occurrences of the semantic entities within the sub-parts of the documents.
    Type: Application
    Filed: April 15, 2010
    Publication date: October 20, 2011
    Applicant: PALO ALTO RESEARCH CENTER INCORPORATED
    Inventors: Oliver Brdiczka, Petro Hizalev
  • Publication number: 20110246462
    Abstract: A method and system for prompting changes of electronic document content. The method includes the steps of: determining a first relation information from a first document where the first relation information includes: a first named entity, a second named entity, and a first relationship between the first named entity and the second named entity, storing the first relation information in a database, determining a second relation information from a second document, where the second relation information includes: a third named entity, a fourth named entity, and a second relationship between the third named entity and the fourth named entity, retrieving the first relation information from a database, and sending the first relation information to a client, if the first relation information is different from the second relation information, where at least one step is performed using a computer device.
    Type: Application
    Filed: March 29, 2011
    Publication date: October 6, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Xian Wu, Quan Yuan, Xia Tian Zhang, Shiwan Zhao
  • Publication number: 20110231400
    Abstract: Disclosed herein are a document manipulating method, a document managerial system, and an electronic device using the same. The electronic device includes the system, an activating unit, a determining unit and a placing unit. The system includes at least one label of a searchable and classifiable format, a database accessible by the electronic device, and a searching and classifying engine. The method includes the steps of activating a document, determining a labeling location and a labeling size within the document, placing the label at the labeling location to record a document description, and saving the label and a part of the document in the database.
    Type: Application
    Filed: June 9, 2010
    Publication date: September 22, 2011
    Applicant: Compal Electronics, Inc.
    Inventors: Yi-Chen Sung, Chien-Yuan Chen, Fei Wu
  • Publication number: 20110231387
    Abstract: A model is created that, from seed trivia facts, creates a database of pruned and ranked trivia facts and associated trigger terms. Search, email, or other information provider systems are configured to detect usage of the trigger terms and provide relevant trivia facts in response to the usage.
    Type: Application
    Filed: March 22, 2010
    Publication date: September 22, 2011
    Applicant: YAHOO! INC.
    Inventors: Alpa Jain, Gilad Mishne
  • Publication number: 20110219002
    Abstract: A computer-implemented method for determining similarities between system executable objects includes the steps of determining with one or more computing systems a plurality of subsequences of operation codes in a plurality of disassembled system executable objects, for each subsequence, determining with the one or more computing systems a first set of system executable objects associated with the subsequence, with the computing systems, clustering the first set of system executable objects with a cluster. The cluster includes a set of system executable objects. The step of clustering the first set of system executable objects and the cluster includes the steps of determining with the computing systems the relative similarity between the first set of system executable objects and the cluster, and if the first set of system executable objects is similar to the cluster, adding with the computing systems the system executable objects to the cluster.
    Type: Application
    Filed: March 5, 2010
    Publication date: September 8, 2011
    Applicant: MCAFEE, INC.
    Inventors: Anthony Vaughan Bartram, Adrian M. Dunbar
  • Publication number: 20110219000
    Abstract: Provided is a search apparatus, a search method, and a program that can improve search speed for a document set even when an object to be searched is a large-scale document set.
    Type: Application
    Filed: November 6, 2009
    Publication date: September 8, 2011
    Inventor: Yukitaka Kusumura
  • Publication number: 20110219005
    Abstract: Methods and computer-readable media are provided for performing a federated search using a library description file to locate multiple data sources. For a federated search, a library description can be used to describe a set of data sources searched, and may further be used to describe how search results should be presented to a user. The format of such a library description file can include multiple elements, some of which provide information on how to display the library and others that define which data sources are included in the library. The library description file can be created according to a library description template.
    Type: Application
    Filed: May 12, 2011
    Publication date: September 8, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: Carlos Brito, Christopher Clayton McConnell, Shannon Scott Hysom, Paolo Marcucci, Tyler Kien Beam
  • Publication number: 20110218999
    Abstract: The index update unit analyses the information stored in a document repository to create an index for search and stores the index in a time-series divisional index storage unit and creates, from an ACL repository, an access control entry ACE in association with the index for search, which is correlation of information to be searched with access right of at least a group to which the user belongs. The ACL cache generation unit creates ACL cache data that correlates the user with access right to the information to be searched, from the ACE, and registers the ACL cache data created in an ACL cache. A search processing unit searches for an index for search in response to a request for search from said user. In case the ACL cache data correlating the user with the index for search is registered in the ACL cache, the search processing unit takes, from among the information searched, the information, reference to which is allowed for the user as a search result, based on information in the ACL cache.
    Type: Application
    Filed: November 13, 2009
    Publication date: September 8, 2011
    Inventors: Masaki Kan, Yoshihiro Kajiki
  • Publication number: 20110218947
    Abstract: Electronic documents are analyzed to identify assertions, which are inverted to generate questions that may be answered by the assertions. A document or a corpus of electronic documents may be analyzed to identify entities and relationships among entities within the text of the document(s). Assertions are identified based on the entities and relationships among the entities. Each assertion represents a fact about an entity, and a group of assertions represents a summary of the document or document corpus. The assertions are inverted to generate questions that may be answered by the assertions. The questions may be further analyzed to identify relevant concepts and topics and to cluster the questions around the concepts and topics. A combined graph may also be generated that facilitates traversal among topics, concepts, questions, assertions, document summaries, and documents.
    Type: Application
    Filed: March 8, 2010
    Publication date: September 8, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: VISWANATH VADLAMANI, ABHINAI SRIVASTAVA, TAREK NAJM, MUNIRATHNAM SRIKANTH, PHANI VADDADI, ARUNGUNRAM CHANDRASEKARAN SURENDRAN
  • Publication number: 20110218997
    Abstract: A method for determining a predictability of a media entity portion, the method includes: receiving or generating (a) reference media descriptors, and (b) probability estimations of descriptor space representatives given the reference media descriptors; wherein the descriptor space representatives are representative of a set of media entities; and calculating a predictability score of the media entity portion based on at least (a) the probability estimations of the descriptor space representatives given the reference media descriptors, and (b) relationships between the media entity portion descriptors and the descriptor space representatives. A method for processing media streams, the method may include: applying probabilistic non-parametric process on the media stream to locate media portions of interest; and generating metadata indicative of the media portions of interest.
    Type: Application
    Filed: March 7, 2011
    Publication date: September 8, 2011
    Inventors: Oren Boiman, Alex Rav-Acha
  • Publication number: 20110202528
    Abstract: A method of identifying a fresh document in a document set is provided. The method may include obtaining a query document that is included in a document set comprising a plurality of documents. The method may also include grouping the plurality of documents into a plurality of fine clusters based on a textual similarity between the plurality of documents. The method may also include identifying a target fine cluster within the plurality of fine clusters, the target fine cluster including the query document. The method may also include ordering the documents included in the target fine cluster by time to identify the fresh document. The method may also include generating a query response that includes the fresh document.
    Type: Application
    Filed: February 13, 2010
    Publication date: August 18, 2011
    Inventors: Vinay Deolalikar, Hernan Laffitte
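
The steps in the abstract above (cluster by textual similarity, find the query document's cluster, order by time) can be sketched as follows. This is a deliberately simple illustration, not the claimed method: greedy clustering against each cluster's first document, word-set Jaccard similarity, the 0.5 threshold, and treating the most recent document as the "fresh" one are all assumptions, as are the function names.

```python
def word_set(text):
    """Lower-cased set of whitespace-separated words."""
    return set(text.lower().split())


def fine_clusters(docs, threshold=0.5):
    """Greedily group documents whose word sets overlap with a cluster's first member.

    docs: dict mapping doc id -> (text, timestamp).
    """
    clusters = []
    for doc_id, (text, _ts) in docs.items():
        a = word_set(text)
        for cluster in clusters:
            b = word_set(docs[cluster[0]][0])
            union = a | b
            if union and len(a & b) / len(union) >= threshold:
                cluster.append(doc_id)
                break
        else:
            clusters.append([doc_id])
    return clusters


def freshest(docs, query_id, threshold=0.5):
    """Return the most recent document in the fine cluster containing query_id."""
    target = next(c for c in fine_clusters(docs, threshold) if query_id in c)
    return max(target, key=lambda d: docs[d][1])


# Hypothetical document set: two versions of a report and an unrelated memo.
docs = {
    "v1": ("quarterly sales report draft", 1),
    "v2": ("quarterly sales report final", 3),
    "memo": ("lunch menu for friday", 2),
}
fresh = freshest(docs, "v1")
```

Querying with the older draft "v1" returns "v2", the latest member of the same cluster; the unrelated memo never enters the comparison.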
  • Publication number: 20110202534
    Abstract: In an embodiment, a method is provided for storing information related to a decision making process. In this method, data items that are associated with a choice, a fact, and/or a decision are accessed. These data items are used in an application that provides a functionality associated with the decision making process. A relationship between the data items is then created based on a context in which the data items are used in the application. The data items and the relationship are stored in a common data structure that is accessible by a different application that provides a different functionality associated with the decision making process.
    Type: Application
    Filed: February 18, 2010
    Publication date: August 18, 2011
    Applicant: Business Objects Software Ltd.
    Inventor: Mark Allerton
  • Publication number: 20110196870
    Abstract: Systems, methods and computer program products for classifying documents are presented. Systems, methods and computer program products for analyzing documents, e.g., associated with legal discovery are also presented. Systems, methods and computer program products for cleaning up data are also presented. Systems, methods and computer program products for verifying an association of an invoice with an entity are also presented. Systems, methods and computer program products for managing medical records are presented. Systems, methods and computer program products for face recognition are presented.
    Type: Application
    Filed: April 19, 2011
    Publication date: August 11, 2011
    Applicant: KOFAX, INC.
    Inventors: Mauritius A.R. Schmidtler, Roland Borrey, Anthony Sarah
  • Publication number: 20110196871
    Abstract: A method and a system are provided for targeting online ads by grouping and mapping user properties. In one example, the system receives user data associated with one or more users. The system identifies user properties for a user. The system eliminates unacceptable user properties associated with the user. The system identifies permutations of the user properties associated with the user. The system eliminates unacceptable permutations of the user properties associated with the user. Valid permutations remain. The system attaches a weight of importance to each valid permutation. A weight quantifies a level of importance of a valid permutation for the user with respect to buckets. A bucket is an ad category. The system grades each valid permutation relative to a bucket. The system calculates a final grade for each bucket. The system then assigns the user to zero or more buckets based on the final grade for each bucket.
    Type: Application
    Filed: February 5, 2010
    Publication date: August 11, 2011
    Inventors: Jonathan Kilroy, Dale Nussel, Allie K. Watfa
  • Publication number: 20110191343
    Abstract: A computer research tool for inputting, searching, displaying, and analyzing metabolic-related clinical data utilizing a novel graphical user interface (GUI) for visual-statistical data analysis and insight generation and method thereof are disclosed.
    Type: Application
    Filed: November 18, 2010
    Publication date: August 4, 2011
    Applicant: ROCHE DIAGNOSTICS INTERNATIONAL LTD.
    Inventors: Kelly Heaton, Amy Killoren Clark, Luc Girardin, Dominik Brodbeck
  • Publication number: 20110184949
    Abstract: A method for recommending places to visit, including using a processor to provide the following steps: assembling a collection of images, wherein each image has first and second tags with the first tag corresponding to the location where the image was taken, and the second tag corresponding to subject matter of the image; clustering the images in response to the first tags into a plurality of locations; using the images in each location to produce at least one representative image of the location; using the second tags of images of each location to produce a list of representative keywords for each location; providing a query in the form of an image or subject matter, or both; and using the query in the form of an image to search among the representative images to recommend a location to visit, or using the query in the form of subject matter to search among the keywords to recommend a location to visit.
    Type: Application
    Filed: January 25, 2010
    Publication date: July 28, 2011
    Inventor: Jiebo Luo
  • Publication number: 20110184914
    Abstract: A technique for archiving a relational database having tables of rows may use clusters. Transaction identifiers may be assigned to each of the rows in each of the tables such that all rows belonging to the same application transaction share a unique transaction identifier. Plural hierarchies may be determined, each hierarchy having high level nodes corresponding to the rows in a single table and dependent nodes corresponding to rows in other tables to which the rows in the single table are related in the database. The plural hierarchies may be merged to form plural clusters, one cluster for each unique transaction identifier. Each cluster may have high level nodes corresponding to the plural hierarchies but only those dependent nodes from the plural hierarchies whose transaction identifiers correspond to that of the cluster. The clusters may be stored in one or more files to form an archive.
    Type: Application
    Filed: January 28, 2010
    Publication date: July 28, 2011
    Inventor: Jeff Gong
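    Illustration: the core grouping idea in the abstract above, one cluster per transaction identifier, can be sketched as follows. This is a simplified sketch under stated assumptions: it groups rows by a hypothetical `txn_id` field and omits the hierarchy construction and merging that the abstract describes. The function names and the one-JSON-line-per-cluster archive layout are also assumptions.

```python
import json
from collections import defaultdict


def build_clusters(tables):
    """Group rows from every table by transaction identifier.

    tables maps table_name -> list of row dicts, each carrying a 'txn_id'
    field (an assumed column name). Returns txn_id -> {table_name: [rows]}.
    """
    clusters = defaultdict(lambda: defaultdict(list))
    for table_name, rows in tables.items():
        for row in rows:
            clusters[row["txn_id"]][table_name].append(row)
    # Convert nested defaultdicts to plain dicts for predictable behaviour.
    return {txn: dict(by_table) for txn, by_table in clusters.items()}


def to_archive(clusters):
    """Serialize clusters as one JSON line per transaction (assumed layout)."""
    return "\n".join(
        json.dumps({"txn_id": txn, "rows": rows}, sort_keys=True)
        for txn, rows in sorted(clusters.items())
    )


# Hypothetical sample data: two transactions spread over two tables.
tables = {
    "orders": [{"id": 1, "txn_id": "t1"}, {"id": 2, "txn_id": "t2"}],
    "items": [{"id": 10, "order_id": 1, "txn_id": "t1"}],
}
clusters = build_clusters(tables)
archive = to_archive(clusters)
```

    Each cluster holds only the rows whose transaction identifier matches it, so the `items` row appears under "t1" and not under "t2".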
  • Publication number: 20110184955
    Abstract: Organizing video data [110] is described. Video data [110] comprising metadata is received [205], wherein the metadata [120] provides an intra-video tag of the video data [110]. The metadata [120] is compared [210] with a plurality of video profiles [130]. Based on the comparing [210], the video data [110] is associated [215] with a corresponding one of the plurality of video profiles [130].
    Type: Application
    Filed: October 31, 2008
    Publication date: July 28, 2011
    Inventors: April Sleyoen Mitchell, Mitchell Trott, W. Alex Vorbau
  • Publication number: 20110179028
    Abstract: One or more techniques and/or systems are disclosed herein for aggregating web-based data stored in a distributed data store so that it can be retrieved in a first-in, first-out (FIFO) manner. A unique aggregation key is generated for respective one or more data generated from a web-based event, where the one or more data are added to the distributed data store, and the aggregation key corresponds merely to the data generated from the web-based event. The one or more data from the web based event is aggregated in a FIFO queue and stored in a same partition of the distributed data store, based on the aggregation key.
    Type: Application
    Filed: January 15, 2010
    Publication date: July 21, 2011
    Applicant: Microsoft Corporation
    Inventors: Andrew Ness, Alexander Mallet, Bruce Copeland, Christopher Rickman, Rajesh Viswanathan
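    Illustration: the aggregation-key idea in the abstract above can be sketched in memory as follows. This is a minimal sketch, not the disclosed system. The class `PartitionedStore`, its method names, the use of a random UUID as the aggregation key, and CRC32-based partition selection are all assumptions made for the example.

```python
import uuid
import zlib
from collections import deque


class PartitionedStore:
    """Toy stand-in for a distributed store: a list of partitions, each a
    dict mapping aggregation key -> FIFO queue of data items."""

    def __init__(self, num_partitions=4):
        self.partitions = [dict() for _ in range(num_partitions)]

    def new_event_key(self):
        # One unique aggregation key per web-based event.
        return uuid.uuid4().hex

    def _partition(self, key):
        # crc32 is stable across runs, so a key always maps to one partition.
        index = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        return self.partitions[index]

    def add(self, key, item):
        # All data sharing a key is appended to the same queue and partition.
        self._partition(key).setdefault(key, deque()).append(item)

    def pop_oldest(self, key):
        # First-in, first-out retrieval for the event's data.
        return self._partition(key)[key].popleft()


store = PartitionedStore()
key = store.new_event_key()
for item in ("page_view", "click", "purchase"):
    store.add(key, item)
first = store.pop_oldest(key)  # the earliest item added for this event
```

    Deriving the partition from the key alone is what keeps every item for one event co-located, which is the property the abstract relies on for FIFO retrieval.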
  • Publication number: 20110179037
    Abstract: A data classifier system of the present invention selects a plurality of classifications correlated to data groups so as to output classification axes based on hierarchical classifications and data groups. The data classifier system includes a basic category accumulation means, a classification axis candidate creation means and a priority calculation means. The basic category accumulation means accumulates classifications serving as basic categories used for selecting desired classifications in advance. The classification axis candidate creation means creates classification axis candidates based on combinations of classifications each correlated to at least one data among descendant classifications of each basic category. The priority calculation means calculates priorities with respect to the classification axis candidates created by the classification axis candidate creation means based on hierarchical distances of classifications in the classified hierarchy.
    Type: Application
    Filed: July 29, 2009
    Publication date: July 21, 2011
    Inventors: Hironori Mizuguchi, Kenji Tateishi, Itaru Hosomi, Dai Kusui
  • Publication number: 20110178844
    Abstract: The present invention improves upon existing systems and methods by providing a passive profile creation method. The data accessible to a financial processor, such as spend level data, is leveraged using sophisticated data clustering and/or data appending techniques. Associations are established among entities (e.g., consumers), among merchants, and between entities and merchants. In one embodiment, a system and method for passively collecting spend level data for a transaction of a first entity, aggregating the collected spend level data for a plurality of entities; and clustering the first entity with a subset of the plurality of entities, based on aggregated spend level data of the first entity is provided.
    Type: Application
    Filed: January 20, 2010
    Publication date: July 21, 2011
    Applicant: American Express Travel Related Services Company, Inc.
    Inventors: Rajendra R. Rane, Melissa Schwartz
  • Publication number: 20110179033
    Abstract: A method and a system to organize a data set into groups of data subsets in multiple passes using different parameters and to automatically name the groups is disclosed. For example, a data set is retrieved in accordance with a search query submitted by a user. The data set is organized into clusters based on a statistic(s) of the data set. The data set is then organized into groups of data subsets based on an attribute(s) indicated by the data set. Each of the groups is automatically named based on a property shared by data units of the group. The name(s) of a group may be mined from the data units of the group, retrieved from a structure that maps to attribute values indicated by the data units of the group, etc.
    Type: Application
    Filed: March 14, 2011
    Publication date: July 21, 2011
    Applicant: eBay Inc.
    Inventors: John A. Mount, Badrul M. Sarwar
  • Publication number: 20110173201
    Abstract: This invention relates to a method and an apparatus for determining a reliability indicator for at least one set of signatures obtained from clinical data collected from a group of samples. The signatures are obtained by detecting characteristics in the clinical data from the group of samples, and each of the signatures generates a first set of stratification values that stratify the group of samples. At least one additional and parallel stratification source to the signatures obtained from the group of samples is provided, the at least one additional and parallel stratification source being independent from the signatures and generating a second set of stratification values. A comparison is done for each respective sample, where the first stratification values are compared with true reference stratification values, and where the second stratification values are compared with the true reference stratification values.
    Type: Application
    Filed: September 24, 2009
    Publication date: July 14, 2011
    Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V.
    Inventors: Angel Janevski, Nilanjana Banerjee, Yasser Alsafadi, Vinay Varadan
  • Publication number: 20110173202
    Abstract: Systems and methods for classifying a document are provided. In exemplary embodiments, an organization specific classification code (OSCC) is used to classify the document or data. The OSCC is a classification code based on an information type and an organization. In some embodiments, one or more policies may be associated with the OSCC.
    Type: Application
    Filed: August 16, 2006
    Publication date: July 14, 2011
    Inventors: Deidre Paknad, Puttappaiah Muniyappa
  • Publication number: 20110173197
    Abstract: Exemplary methods and apparatuses are provided which may be implemented using one or more computing devices to allow for super clustering of clusters of electronic documents based, at least in part, on structural and static content features.
    Type: Application
    Filed: January 12, 2010
    Publication date: July 14, 2011
    Applicant: Yahoo! Inc.
    Inventors: Rupesh R. Mehta, Srinivasan H. Sengamedu, Rajeev R. Rastogi
  • Publication number: 20110167065
    Abstract: A data generating apparatus includes an acquiring unit that acquires text data (name data) related to a name associated with position information; a classifying unit that, using the acquired position data, classifies the name data according to given regions; an integrating unit that integrates neighboring regions such that the total data size of the name data included in regions to be integrated does not exceed a predetermined given data size; a storage unit that groups the name data according to integrated regions and stores the grouped name data as a name dictionary to be used in both a facility search process and a map display process; and an extracting unit that, from the classified name data, extracts the name data common to regions of a given number or more, where the storage unit groups and stores the common name data as a common name dictionary different from the name dictionary.
    Type: Application
    Filed: June 17, 2008
    Publication date: July 7, 2011
    Applicants: Pioneer Corporation, Increment P Corporation
    Inventors: Shunsaku Toyoda, Takashi Hanyuda, Takashi Hashimoto, Ippei Nambata, Hajime Adachi
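    Illustration: the integrating step in the abstract above (merging neighboring regions while the combined name data stays within a size limit) can be sketched as follows. This is a minimal sketch, assuming the regions are already listed in neighbor order along one dimension and merged greedily. The function name, the one-dimensional adjacency, and the sample sizes are assumptions; the abstract does not specify the merging strategy.

```python
def integrate_regions(regions, max_size):
    """Greedily merge consecutive regions without exceeding max_size.

    regions is a list of (region_name, name_data_size) pairs in neighbor
    order. A region that alone exceeds max_size is kept as its own group.
    """
    groups = []
    current, current_size = [], 0
    for name, size in regions:
        # Close the current group if adding this region would exceed the cap.
        if current and current_size + size > max_size:
            groups.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        groups.append(current)
    return groups


# Hypothetical sizes in kilobytes.
regions = [("r1", 40), ("r2", 30), ("r3", 50), ("r4", 20), ("r5", 90)]
groups = integrate_regions(regions, max_size=100)
print(groups)  # prints [['r1', 'r2'], ['r3', 'r4'], ['r5']]
```

    Each resulting group would correspond to one block of the name dictionary in this reading, with the size cap bounding how much name data must be loaded at once.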
  • Publication number: 20110167064
    Abstract: A system and associated method for evaluating cross-domain clusterability upon a target domain and a source domain. The cross-domain clusterability is calculated as a linear combination of a target clusterability and a source-target pair matchability, by use of a trade-off parameter that determines relative contribution of the target clusterability and the source-target pair matchability. The target clusterability quantifies how clusterable the target domain is. The source-target pair matchability is calculated as an average of a target-side matchability and a source-side matchability, which quantifies how well target centroids of the target domain are aligned with the source centroids and how well source centroids of the source domain are aligned with the target centroids, respectively.
    Type: Application
    Filed: January 6, 2010
    Publication date: July 7, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: JEFFREY M. ACHTERMANN, INDRAJIT BHATTACHARYA, KEVIN W. ENGLISH, Jr., SHANTANU R. GODBOLE, SACHINDRA JOSHI, ASHWIN SRINIVASAN, ASHISH VERMA
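    Illustration: the arithmetic described in the abstract above can be sketched as follows. This is a minimal sketch under one assumed reading: the linear combination is taken to be convex, with the trade-off parameter in [0, 1] weighting the target clusterability and its complement weighting the pair matchability. The abstract does not give the exact form, and the function and parameter names are hypothetical. How the individual clusterability and matchability scores are computed is outside the scope of this sketch.

```python
def pair_matchability(target_side, source_side):
    """Average of the target-side and source-side matchability scores."""
    return (target_side + source_side) / 2.0


def cross_domain_clusterability(target_clusterability, target_side,
                                source_side, tradeoff=0.5):
    """Combine target clusterability and pair matchability linearly.

    Assumed form: tradeoff * clusterability + (1 - tradeoff) * matchability.
    """
    if not 0.0 <= tradeoff <= 1.0:
        raise ValueError("tradeoff must lie in [0, 1]")
    matchability = pair_matchability(target_side, source_side)
    return tradeoff * target_clusterability + (1.0 - tradeoff) * matchability


# Hypothetical scores: clusterability 0.8, matchabilities 0.6 and 0.4.
score = cross_domain_clusterability(0.8, 0.6, 0.4, tradeoff=0.5)
```

    Under this reading, a trade-off of 1 ignores the source domain entirely, and a trade-off of 0 scores only how well the two sets of centroids align.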
  • Publication number: 20110161312
    Abstract: Mechanisms are provided for integration of Web information architecture taxonomy and Web metrics taxonomy. When the author creates source content, the mechanism classifies the content using a rich taxonomy. The mechanism also adds unique identifiers into the source content pages as tags. The mechanism may then transform the source content into Web content that contains the identifiers in the tags. When users view the Web content, the tags generate usage data, which contain the identifiers. A Web metrics mechanism generates a Web metrics report from the usage data. The page tags are the identifiers from the source content. The Web metrics report associates each page of Web content with the rich taxonomy available in the source content.
    Type: Application
    Filed: December 28, 2009
    Publication date: June 30, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Tracy H. Wallman