Clustering Or Classification (epo) Patents (Class 707/E17.046)
  • Publication number: 20110161174
    Abstract: A system for locating, segmenting, annotating and retrieving multimedia files provides a database of metadata (68) relating to multimedia files (70), a database manager (64), and a database client (66) for accessing the data contained within the database. The database client (66), together with the metadata database (68) and database manager (64), provides a variety of different functionalities, namely a deep linking functionality, a segmentation functionality, a metadata annotation functionality, a retrieval functionality, and an access functionality. The user, through the database client (66), annotates the multimedia file or segment of a multimedia file with metadata, which is saved in the database (68). When the user desires to locate a multimedia file, the metadata is searched or browsed to locate the database entry associated with the multimedia file in question.
    Type: Application
    Filed: October 11, 2007
    Publication date: June 30, 2011
    Applicant: Tagmotion Pty Limited
    Inventors: Andrew Simms, John Vernon Polglase
  • Publication number: 20110161294
    Abstract: The disclosed embodiments provide a system that determines whether to dynamically replicate data segments on a node in a computing cluster that stores a collection of data segments. During operation, the system identifies a data segment from the collection that is predicted to be frequently accessed by future tasks executing in the cluster. The system then determines a slowdown that would result for the current workload of the node if the data segment were to be replicated to the node. The system also determines a predicted future benefit that would be associated with replicating the data segment to the node. If the predicted slowdown is less than the predicted future benefit, the replication system replicates the data segment to the node.
    Type: Application
    Filed: December 30, 2009
    Publication date: June 30, 2011
    Applicant: SUN MICROSYSTEMS, INC.
    Inventors: David Vengerov, George Porter
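    Sketch: the replication rule in the abstract above reduces to a single comparison. The following is a minimal illustration only, not the applicant's implementation; the function name, the segment labels, and the numeric values are hypothetical, and both predictions are assumed to be expressed in the same cost units.

```python
def should_replicate(predicted_slowdown: float, predicted_benefit: float) -> bool:
    """Replicate only when the predicted slowdown is less than the predicted benefit."""
    return predicted_slowdown < predicted_benefit


# Hypothetical predictions for two data segments, in matching cost units.
decisions = {
    "segment-a": should_replicate(predicted_slowdown=2.0, predicted_benefit=5.0),
    "segment-b": should_replicate(predicted_slowdown=4.0, predicted_benefit=4.0),
}
```

    With these values, segment-a would be replicated and segment-b would not, because the abstract requires the slowdown to be strictly less than the benefit.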
  • Publication number: 20110154054
    Abstract: The invention relates to a computer implemented method for generating a pseudonym for a user comprising entering a user-selected secret, storing the user-selected secret in memory, computing a private key by applying an embedding and randomizing function onto the secret, storing the private key in the memory, computing a public key using the private key, the public key and the private key forming an asymmetric cryptographic key, erasing the secret and the private key from the memory, and outputting the public key for providing the pseudonym.
    Type: Application
    Filed: January 20, 2010
    Publication date: June 23, 2011
    Applicant: CompuGROUP Holding AG
    Inventors: Adrian Spalka, Jan Lehnhardt
  • Publication number: 20110153604
    Abstract: Embodiments of techniques and systems for parallel XML parsing are described. An event-level XML parser may include a lightweight events partitioning stage, parallel events parsing stages, and a post-processing stage. The events partition may pick out event boundaries using single-instruction, multiple-data instructions to find occurrences of the “<” character, marking event boundaries. Subsequent checking may be performed to help identify other event boundaries, as well as non-boundary instances of the “<” character. During events parsing, unresolved items, such as namespace resolution or matching of start and end elements, may be recorded in structure metadata. This structure metadata may be used during the subsequent post-processing to perform a check of the XML data. If the XML data is well-formed, individual sub-event streams formed by the events parsing processes may be assembled into a flat result event stream structure. Other embodiments may be described and claimed.
    Type: Application
    Filed: December 17, 2009
    Publication date: June 23, 2011
    Inventors: Zhiqiang Yu, Yuejian Fang, Lei Zhai, Yun Wang, Zhonghai Wu, Mo Dai
  • Publication number: 20110153618
    Abstract: A system and method for managing media advertising enterprise data. An enterprise data management (EDM) module can be configured to include a set of rules at an enterprise level to manage disparate and disconnected records. A scoring function with respect to each record can be computed based on the rules to determine the highest priority. The rules in association with the scoring function can be stored locally in an EDM database. A matching process can then be performed to accurately match similar records regardless of manual input, location, and format of the records in a distributed system. Each record can then be assigned with a parent enterprise advertiser. Such an optimization mechanism can interactively manage and report records at the enterprise level in a simple and efficient manner.
    Type: Application
    Filed: December 20, 2009
    Publication date: June 23, 2011
    Inventors: KOHINOOR BASU, ANGEL BARNACHEA CHUA, MATTHEW M. FERRY, SCOTT ARTHUR ROBERTS
  • Publication number: 20110153607
    Abstract: In order to provide an improved tagging-based search method, the method includes one-click actions for searching as well as a “one view” indicator telling the user which search would be the most effective or most important search, by displaying a “search cloud”, called a search bag.
    Type: Application
    Filed: December 21, 2010
    Publication date: June 23, 2011
    Applicant: International Business Machines Corporation
    Inventors: Andreas Nauerz, Michael Junginger, Mareike Lattermann, Thomas Steinheber
  • Publication number: 20110153680
    Abstract: A system for classifying documents receives parsable data that defines an information object associated with a document. The information object defines an ID and document characterization information. The system determines a database record associated with the ID by searching through a database for a record associated with the ID. The system stores at least some of the document characterization information in the record.
    Type: Application
    Filed: December 23, 2009
    Publication date: June 23, 2011
    Applicant: Brinks Hofer Gilson & Lione
    Inventors: Mark S. Rolla, Travis H. Tran
  • Publication number: 20110153528
    Abstract: Computer-readable media, computer systems, and computing devices facilitate providing a comparison experience to a user in response to a search query. Upon receiving a search query from the user, entities are extracted from the query. The entities are associated with entity classes. The entities, entity classes, previous user behavior, and other information are used to infer whether the user likely is engaging in a comparison task. If the inference indicates that the user likely is engaging in a comparison task, a comparison experience is generated and access to the comparison experience is provided to the user.
    Type: Application
    Filed: December 18, 2009
    Publication date: June 23, 2011
    Applicant: MICROSOFT CORPORATION
    Inventors: PETER BAILEY, LIWEI CHEN, SANAZ AHARI, NIPOON MALHOTRA
  • Publication number: 20110145249
    Abstract: A method of grouping a plurality of media content is provided. The method includes converting at least a portion of the media content into at least one document object model (“DOM”) using a processor. The DOM can include a plurality of block elements, each comprising at least one content object. The method includes apportioning the content objects into a relevant portion and an irrelevant portion and extracting a set of keywords, the set comprising at least one keyword, within the relevant portion of the content objects. The method includes apportioning the relevant portion of the content objects into a related portion and an unrelated portion using at least a portion of the set of keywords and grouping the related portion of the content to provide a group of related content.
    Type: Application
    Filed: December 16, 2009
    Publication date: June 16, 2011
    Applicant: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.
    Inventors: Parag M. Joshi, Jian-Ming Jin, Sheng-Wen Yang, Samson J. Liu, Nina Bhatti, Suk Hwan Lim
  • Publication number: 20110145241
    Abstract: A method, a system and a computer program of reducing overheads in multiple applications processing are disclosed. The method includes identifying resources interacting with each of the applications from a set of applications and grouping the applications from the set of applications, resulting in at least one application cluster, in response to the identified resources. The method further includes assigning an agent corresponding to each of the identified resources and initializing the agent corresponding to each of the identified resources. The method further includes identifying parameters associated with the identified resources, pre-processing the identified parameters for each of the identified resources, and also includes selecting a clustering means for the clustering.
    Type: Application
    Filed: December 15, 2009
    Publication date: June 16, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gaurav Mehrotra, Abhinay R. Nagpal
  • Publication number: 20110145243
    Abstract: Methods and devices are provided for sharing data across two or more different clusters. An operating system (OS) in a cluster checks a metadata record of a file system of a shared device to retrieve path group identifiers (PGIDs). A control unit list of the shared device is checked to retrieve PGIDs that are active on the shared device. An input/output supervisor (IOS) record in a couple dataset is checked to retrieve PGIDs in the cluster. The metadata record, control unit list, and IOS record are compared, and if a PGID is found in the metadata record that is not in the IOS record and if the found PGID is not in the control unit list, the found PGID is not active on the shared device. The found PGID of the different cluster is removed from the metadata record, and members of the cluster can read and write to the file system.
    Type: Application
    Filed: December 16, 2009
    Publication date: June 16, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Harry M. Yudenfriend
  • Publication number: 20110145247
    Abstract: A search query may be interpreted as a number of possible interpretations, and each interpretation may be explored before the results of the search are sent to a user. In one embodiment, a device may split the search query into partitions. Each of the partitions may be submitted, as a search query, to search repositories. Confidence scores based on the results returned from the repositories may be used to determine a measure of confidence of the repository in the search query interpretation.
    Type: Application
    Filed: February 17, 2011
    Publication date: June 16, 2011
    Applicant: GOOGLE INC.
    Inventors: James NORRIS, Gregory John DONAKER, Nina Weiyu KANG
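    Sketch: one way to picture "splitting the search query into partitions" is to enumerate every contiguous segmentation of the query's words. This is an assumption made for illustration; the abstract does not say how partitions are formed, and the function name and sample query below are hypothetical.

```python
from itertools import combinations


def query_partitions(query: str) -> list[list[str]]:
    """Return every way to split the query's words into contiguous segments."""
    words = query.split()
    if not words:
        return []
    results = []
    gaps = range(1, len(words))  # cut positions between adjacent words
    for k in range(len(words)):  # k = number of cut points used
        for cuts in combinations(gaps, k):
            bounds = [0, *cuts, len(words)]
            results.append(
                [" ".join(words[a:b]) for a, b in zip(bounds, bounds[1:])]
            )
    return results


partitions = query_partitions("new york pizza")
```

    Each segment of each partition could then be submitted to a repository as its own query, and the returned results scored to judge how plausible that interpretation is.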
  • Publication number: 20110145223
    Abstract: Methods and apparatus for representing probabilistic data using a probabilistic histogram are disclosed. An example method comprises partitioning a plurality of ordered data items into a plurality of buckets, each of the data items capable of having a data value from a plurality of possible data values with a probability characterized by a respective individual probability distribution function (PDF), each bucket associated with a respective subset of the ordered data items bounded by a respective beginning data item and a respective ending data item, and determining a first representative PDF for a first bucket associated with a first subset of the ordered data items by partitioning the plurality of possible data values into a first plurality of representative data ranges and respective representative probabilities based on an error between the first representative PDF and a first plurality of individual PDFs characterizing the first subset of the ordered data items.
    Type: Application
    Filed: December 11, 2009
    Publication date: June 16, 2011
    Inventors: Graham Cormode, Antonios Deligiannakis, Minos Garofalakis, Andrew Iain Shaw McGregor
  • Publication number: 20110145240
    Abstract: A method, a system and a computer program of organizing annotations are disclosed. The method includes receiving an annotation, accessing an annotation repository and accessing a reference repository. The annotation repository includes stored annotation units. The reference repository includes stored references corresponding to the stored annotation units. The method further includes generating a reference corresponding to the annotation and initializing the reference. The method further includes recursively parsing the annotation into annotation units and comparing the parsed annotation units with the stored annotation units. The method further includes populating the reference with appropriate stored references and generating new reference in response to the comparison. The method also includes updating the annotation repository in response to the comparison. Also disclosed are a system and a computer program for organizing annotations.
    Type: Application
    Filed: December 15, 2009
    Publication date: June 16, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventor: Hariharan Sridharan
  • Publication number: 20110144900
    Abstract: A speed profile dictionary and associated lookup tables are disclosed. A set of distinct speed profiles is defined using a statistical analysis routine. Preferably, the statistical analysis routine uses clustering. The speed profiles are then matched to location codes identifying physical locations on a road network and days of the week. Applications using historic traffic data may use the speed profile dictionary and one or more lookup tables instead of a complete historic traffic database, thereby reducing the amount of memory needed to store historic traffic data.
    Type: Application
    Filed: December 15, 2009
    Publication date: June 16, 2011
    Applicant: NAVTEQ NORTH AMERICA, LLC
    Inventors: Toby S. Tennent, Michael L. Aper, Matthew G. Lindsay, Gary R. Rockwood
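    Sketch: the memory saving described above comes from storing each distinct speed profile once and letting a lookup table point to it. The following illustration assumes a hypothetical layout; the profile values, location codes, and time slots are invented, and the clustering step that would produce the profiles is not shown.

```python
# Dictionary of distinct speed profiles: profile id -> speeds per time slot.
# Values are hypothetical; in the abstract the profiles come from a
# statistical analysis routine, preferably one that uses clustering.
SPEED_PROFILES = {
    0: [100, 95, 60, 90],
    1: [50, 45, 20, 40],
}

# Lookup table: (location code, day of week) -> profile id.
PROFILE_LOOKUP = {
    ("LOC-001", "Mon"): 0,
    ("LOC-001", "Sat"): 0,
    ("LOC-002", "Mon"): 1,
}


def historic_speed(location_code: str, day: str, slot: int) -> int:
    """Resolve a historic speed via the lookup table and the profile dictionary."""
    profile_id = PROFILE_LOOKUP[(location_code, day)]
    return SPEED_PROFILES[profile_id][slot]
```

    Here two lookup entries share profile 0, so its speeds are stored once rather than once per location and day.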
  • Publication number: 20110137903
    Abstract: In accordance with certain embodiments of the present disclosure, a regionalization method is disclosed. The method includes inputting a data set into a computer. The method further includes utilizing the computer to perform contiguity-constrained hierarchical clustering on the data set to generate two regions and performing a fine-tuning procedure on the two regions with the computer to iteratively modify the boundaries between the two regions.
    Type: Application
    Filed: December 6, 2010
    Publication date: June 9, 2011
    Applicant: UNIVERSITY OF SOUTH CAROLINA
    Inventor: Diansheng Guo
  • Publication number: 20110137906
    Abstract: A method for analyzing sentiment comprising: collecting an object from an external content repository, the collected objects forming a content database; extracting a snippet related to the subject from the content database; calculating a sentiment score for the snippet; classifying the snippet into a sentiment category; creating a sentiment taxonomy using the sentiment categories, the sentiment taxonomy classifying the snippets as positive, negative or neutral; identifying topic words within the sentiment taxonomy; classifying the topic words as sentiment topic word candidates or non-sentiment topic word candidates; filtering the non-sentiment topic word candidates; identifying the frequency of the non-sentiment topic words in each of the sentiment categories; identifying the importance of the non-sentiment topic word for each of the sentiment categories; and ranking the topic word, wherein the rank is calculated by combining the frequency of the topic words in each of the categories with its importance.
    Type: Application
    Filed: December 9, 2009
    Publication date: June 9, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES, INC.
    Inventors: Keke Cai, Ying Chen, W. Scott Spangler, LI Zhang
  • Publication number: 20110137900
    Abstract: A computer implemented method, computer program product and data processing system, for identifying common structures shared across a plurality of formatted text documents. The common structure is presented as a sequence of landmarks, each of which has a starting and ending marker to describe the borders of text. The common structure is identified by counting the occurrences of repeating text segments across documents. Frequently co-occurred adjacent segments become candidates for markers of landmarks. In addition, styling information of textual content within a landmark is extracted and mapped to rules. The rules are used to merge and summarize content from multiple documents, which gives an advantage over current practice of content concatenation.
    Type: Application
    Filed: December 9, 2009
    Publication date: June 9, 2011
    Applicant: International Business Machines Corporation
    Inventors: Yuan-chi Chang, Debdoot Mukherjee, Vibha Singhal Sinha, Biplav Srivastava
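    Sketch: the counting step in the abstract above, finding text segments that repeat across documents, could be approximated as follows. This illustration assumes that a segment is a non-empty line and that "frequent" means appearing in at least a chosen number of documents; both choices, and all names and sample data, are hypothetical. Adjacency and landmark construction are not modeled.

```python
from collections import Counter


def frequent_segments(documents: list[str], min_docs: int) -> set[str]:
    """Return lines that occur in at least min_docs of the documents."""
    counts = Counter()
    for doc in documents:
        # A set ensures each segment is counted once per document.
        counts.update({line.strip() for line in doc.splitlines() if line.strip()})
    return {segment for segment, n in counts.items() if n >= min_docs}


docs = [
    "Title\nalpha\nSummary\nfoo",
    "Title\nbeta\nSummary\nbar",
    "Title\ngamma\nNotes\nbaz",
]
markers = frequent_segments(docs, min_docs=2)
```

    The surviving segments ("Title" and "Summary" in this sample) are the kind of repeated text that the abstract treats as candidate markers for landmarks.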
  • Publication number: 20110137905
    Abstract: Methods, systems, and articles for receiving, by a monitor server, change data associated with a change captured on a target host, are described herein. In various embodiments, the target host may have provided the change data in response to detecting the change, and the change data may include one or more rules, settings, and/or parameters. Further, in some embodiments, the monitor server may analyze the change data in order to group the change data into clusters. Once the change data have been classified as clusters, a report may be generated providing classification or categorization and cluster information for the various changes. In various embodiments, the generating may comprise generating a report to the target host and/or to an administrative user.
    Type: Application
    Filed: December 8, 2009
    Publication date: June 9, 2011
    Applicant: TRIPWIRE, INC.
    Inventors: Tom Good, Gene Kim, David Whitlock
  • Publication number: 20110137684
    Abstract: A method includes a computer which receives telematics data relating to a vehicle operated by a driver. The telematics data is associated with a match index. The match index indicates that the telematics data is pertinent to the driver without indicating the driver's identity. The computer receives other data relating to the driver. The other data is associated with the match index. The computer uses the match index to associate the telematics data with the other data. The computer uses the associated telematics data and the other data to generate a driver classification for the driver.
    Type: Application
    Filed: December 8, 2009
    Publication date: June 9, 2011
    Inventors: David F. Peak, Andrew J. Amigo
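    Sketch: the match index acts as a join key that links records about the same driver without exposing who the driver is. The following illustration assumes records are plain dictionaries with a "match_index" field; the field names and sample values are hypothetical, and the classification step itself is not shown.

```python
from collections import defaultdict


def associate_by_match_index(telematics: list[dict], other: list[dict]) -> dict:
    """Group telematics records and other records under their shared match index."""
    merged = defaultdict(lambda: {"telematics": [], "other": []})
    for record in telematics:
        merged[record["match_index"]]["telematics"].append(record)
    for record in other:
        merged[record["match_index"]]["other"].append(record)
    return dict(merged)


# Hypothetical records; neither data set carries the driver's identity.
telematics_data = [
    {"match_index": "m1", "hard_brakes": 3},
    {"match_index": "m2", "hard_brakes": 0},
]
other_data = [{"match_index": "m1", "years_licensed": 12}]
associated = associate_by_match_index(telematics_data, other_data)
```

    A classifier could then be applied to each grouped entry, still keyed only by the match index.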
  • Publication number: 20110131453
    Abstract: A set of log entries is automatically inspected to determine a bug. A training set is utilized to determine clustering of log identifications. Log entries are examined in real-time or retroactively and matched to clusters. A timeframe may also be matched to a cluster based on log entries associated with the timeframe. Error indications may be outputted to a user of the system with respect to a log entry or a timeframe.
    Type: Application
    Filed: December 2, 2009
    Publication date: June 2, 2011
    Applicant: International Business Machines Corporation
    Inventors: Yaacov Fernandess, Ohad Rodeh, Lavi Shpigelman
  • Patent number: 7953594
    Abstract: A method and an apparatus for selecting a vocabulary closest to an input speech from among lexicons stored in memory, wherein a centroid lexicon representing lexicons belonging to a predetermined lexicon group is generated. Two lexicons, having a longest distance therebetween in the lexicon group, are selected using the centroid lexicon from the lexicon group, and a node indicating the lexicon group branches based on the two selected lexicons. A node having low group similarity is selected from among current terminal nodes, including branch nodes, and the above procedure is repeatedly performed on a lexicon group indicated by the selected node.
    Type: Grant
    Filed: January 31, 2006
    Date of Patent: May 31, 2011
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Sang-bae Jeong, In-jeong Choi, Ick-sang Han, Jeong-su Kim
  • Publication number: 20110125752
    Abstract: A system for compiling security data from an information network includes at least two network components (30,34), each providing data. A data parser (48,52) is coupled to the network components (30,34). The data parser (48,52) has access to two parser scripts that correspond to the network components' data. Categorized data can be produced by applying the parser scripts to the data received from the network components (30,34).
    Type: Application
    Filed: November 21, 2001
    Publication date: May 26, 2011
    Applicant: METASECURE CORPORATION
    Inventors: Kathy Maida-Smith, Steven W. Engle
  • Publication number: 20110125747
    Abstract: Data classification is used to classify input items by associating the input items with one or more classes from a set of one or more classes in a data classification system, including identifying relevant features in an input item to form a feature vector for the input item, receiving at the data classification system an indication of a point-of-view, adjusting the feature vector according to the point-of-view indication or modifying a pattern discriminator (e.g., trainer and classifier) to inline-process feature vectors depending on the provided point-of-view (e.g., SVM custom kernels), and classifying the input item into the set of classes according to the point-of-view. The point-of-view data can be introduced either as a pre-process step prior to passing it off to the pattern discrimination algorithm, or can be incorporated directly into the pattern discrimination algorithm if applicable.
    Type: Application
    Filed: June 24, 2010
    Publication date: May 26, 2011
    Applicant: BIZ360, Inc.
    Inventors: Daniel Gartung, Philip Chan, John Rotherham
  • Publication number: 20110125745
    Abstract: A balancing technique allows a database administrator to perform a mass data load into a relational database employing partitioned tablespaces. The technique automatically balances the usage of the partitions in a tablespace as the data is loaded. Previous definitions of the partitions are modified after the loading of the data into the tablespace to conform with the data loaded into the tablespace.
    Type: Application
    Filed: November 25, 2009
    Publication date: May 26, 2011
    Applicant: BMC Software, Inc.
    Inventor: Randol K. Bright
  • Publication number: 20110125743
    Abstract: An approach is provided for providing a contextual model based upon user context data. A context modeling platform collects context data of a user from a plurality of sources, and the sources include at least online activities of the user. The context modeling platform maps the collected user context data as context data points into a multidimensional contextual model.
    Type: Application
    Filed: November 23, 2009
    Publication date: May 26, 2011
    Applicant: Nokia Corporation
    Inventors: Pekka IMMONEN, Jarkko Heinonen
  • Publication number: 20110125749
    Abstract: Storing and indexing of high-speed network traffic data is disclosed. In one embodiment, a method of network database maintenance includes sequentially recording in real-time packet header and/or packet content attributes derived from network packets captured and stored in one of a packet capture repository and a file system in database units ordered by arrival of the network packet data. In addition, the method includes indexing each database unit to point to a memory location of the network packet data in one of the packet capture repository and the file system. The method also includes computing a hash value on certain input data and creating index bitmaps on each database unit to facilitate grouping of similar attributes associated with the network packet data recorded in the database units. The resulting data may then be stored in compressed and/or encrypted formats on a file system for efficiency and security.
    Type: Application
    Filed: November 15, 2010
    Publication date: May 26, 2011
    Applicant: Solera Networks, Inc.
    Inventors: Matthew S. Wood, Joseph H. Levy, Paal Tveit
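    Sketch: one plausible reading of "computing a hash value ... and creating index bitmaps on each database unit" is a per-unit bitmap in which each attribute value sets the bit chosen by its hash. This is an assumption for illustration, not the disclosed design; the bitmap width, the use of CRC-32, and all names are hypothetical.

```python
import zlib

BITMAP_BITS = 64  # hypothetical bitmap width per database unit


def attribute_bit(value: str) -> int:
    """Map an attribute value to a bit position using a deterministic hash."""
    return zlib.crc32(value.encode("utf-8")) % BITMAP_BITS


def build_bitmap(attributes: list[str]) -> int:
    """Set one bit per attribute value recorded in a database unit."""
    bitmap = 0
    for value in attributes:
        bitmap |= 1 << attribute_bit(value)
    return bitmap


def may_contain(bitmap: int, value: str) -> bool:
    """False means the value is absent; True means the unit should be scanned."""
    return bool(bitmap & (1 << attribute_bit(value)))


unit_bitmap = build_bitmap(["10.0.0.1", "10.0.0.2", "tcp"])
```

    Because different values can hash to the same bit, a set bit only indicates that the unit may contain the value. A clear bit rules the unit out without reading its packet data.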
  • Publication number: 20110125748
    Abstract: A method, a system, and an apparatus for real-time identification and recording of artifacts are disclosed. In one embodiment, a method of network database maintenance includes designating a network packet data to be stored in one of a packet capture repository and a file system resident database to indicate an artifact type, a protocol type, an application, a user-definable attribute, and a temporal session duration based on a real-time packet inspection. The method includes grouping the designated packet data in a database including packet data having a similar one of the artifact type, the protocol type, the application, the user-definable attribute and the temporal session duration. In addition, the method of network database maintenance includes indexing the database to point to a memory location of the designated packet data grouped in the database in the packet capture repository.
    Type: Application
    Filed: November 15, 2010
    Publication date: May 26, 2011
    Applicant: Solera Networks, Inc.
    Inventors: Matthew S. Wood, Joseph H. Levy, Paal Tveit
  • Publication number: 20110119269
    Abstract: Described is a search (e.g., web search) technology in which concepts are returned in response to a query in addition to (or instead of) search results in the form of traditional links. Each concept generally corresponds to a set of links to content that are more directed towards a possible user intention, or information need, with respect to that query. If a user selects a concept, that concept's links are exposed to facilitate selection of a document the user finds relevant. In this manner, much more than the top ten ranked links may be provided for a query, each set of other links arranged by the concepts. Also described is processing a query log or other data store to optionally find related queries and find the concepts, e.g., by clustering a relationship graph built from the query log to find dense subgraphs representative of the concepts.
    Type: Application
    Filed: November 18, 2009
    Publication date: May 19, 2011
    Inventors: Rakesh Agrawal, Sreenivas Gollapudi, Nina Mishra
  • Publication number: 20110119267
    Abstract: The present disclosure provides a computer-implemented method of processing Web activity data. The method includes obtaining a collection of Web activity data generated by a plurality of users at a plurality of Webpages, wherein the Webpages are from a plurality of unaffiliated Websites. The method also includes extracting a plurality of search terms from the Web activity data and associating each of the plurality of search terms with a corresponding Webpage. The method also includes generating statistical data from the Web activity data based, at least in part, on the search terms, the statistical data corresponding to the online activity at one or more Webpages.
    Type: Application
    Filed: November 13, 2009
    Publication date: May 19, 2011
    Inventors: George Forman, Evan R. Kirshenbaum, Shyam Sundar Rajaram
  • Publication number: 20110119273
    Abstract: A computer readable storage medium includes executable instructions to receive a relevancy parameter. The relevancy parameter is searched in a category ensemble including a set of categories, where the category ensemble overlies a dataset. A first order of the set of categories is created based on the relevancy of the relevancy parameter to each category in the set of categories, where the relevancy is a degree of match to the relevancy parameter. A second order of the set of categories is created based on the merit of each category to partition the dataset. The first order and the second order are combined into a final order, which is returned. A measure in the category ensemble is searched based on the relevancy parameter. The measure is returned as a selected measure, where the selected measure is a codomain of a visualization depicting a portion of the dataset.
    Type: Application
    Filed: January 26, 2011
    Publication date: May 19, 2011
    Applicant: BUSINESS OBJECTS SOFTWARE LTD.
    Inventors: SAURABH ABHYANKAR, JEAN-LUC AGATHOS, VIRGILE CHONGVILAY, DAVOR CUBRANIC, JULIAN LARS GOSPER
  • Publication number: 20110119161
    Abstract: A system for automating comparisons and ratings of new product models and services that is timely, efficient, objective and consistent where an expert system extracts model information from a plurality of web pages and compares and rates each model against comparable competing models gathered from one or more web sites and stored in a database to be displayed when requested by a user.
    Type: Application
    Filed: November 18, 2009
    Publication date: May 19, 2011
    Inventor: George M. Van Treeck
  • Publication number: 20110113032
    Abstract: A method for generating a conceptual association graph from structured content includes grouping content nodes into one or more topically biased clusters, the content nodes comprising structured digital content and unstructured digital content, the grouping based at least in part on the connectedness of each content node member to other content node members in the same cluster. The method also includes, responsive to the grouping, tagging the content nodes with one or more descriptive concepts. The method also includes, responsive to the tagging, establishing one or more associations between the one or more concepts, the one or more associations indicating a relevance of the one or more associations, the indicating based at least in part on patterns of co-occurrence of concepts in the tagged content nodes.
    Type: Application
    Filed: October 15, 2010
    Publication date: May 12, 2011
    Inventors: RICCARDO BOSCOLO, BEHNAM ATTARAN REZAEI, VWANI P. ROYCHOWDHURY, ALBERT CHERN, NIMA KHAJEHNOURI
  • Publication number: 20110112886
    Abstract: A critical parameter/requirements management process model for managing a development program for a product and an associated product structure-driven critical parameter/requirements management tool and environment is provided. In one embodiment, the process includes a product structure classification scheme, a parameter/requirements classification scheme, a parameter/requirements process and maturity model, and in-process and requirements conformance views. In one embodiment, the tool includes a user interface layer, a business layer, a data layer, and a database. The user interface layer may include a product structure feature group, an add/edit/link feature group, a manage maturity feature group, and a manage conformance feature group. The tool may be implemented as a web server accessible to user workstations operating as thin clients.
    Type: Application
    Filed: January 18, 2011
    Publication date: May 12, 2011
    Applicant: XEROX CORPORATION
    Inventors: Charles D. Rizzolo, Ronald E. Stokes, Louis F. LaVallee, Charles M. Gardiner, William R. Smith, Kathy Cupo, Richard S. Pagano, Joel S. Cornell, Barry P. Mandel, Ralph E. Simpson, John T. Potter
  • Publication number: 20110106772
    Abstract: A data processing apparatus (100) includes: a temporary storage unit (5) that stores a cluster-element correspondence table showing correspondence between a cluster ID for identifying each of a plurality of clusters classified by the data processing apparatus and an element ID of element data belonging to the cluster identified by the cluster ID, and a group-cluster correspondence table showing correspondence between a group ID for identifying a group classified according to a user's subjective criterion and a cluster ID of a cluster belonging to the group identified by the group ID; a feature extraction unit (1) that extracts a feature value of newly added element data; an automatic classification processing unit (2) that determines a belonging cluster from the plurality of clusters, and updates a classification boundary condition defining a boundary of the belonging cluster according to a predetermined constraint; and a data management unit (6) that records an element ID of the newly added element data and
    Type: Application
    Filed: April 23, 2010
    Publication date: May 5, 2011
    Inventors: Takashi Kawamura, Kuniaki Isogai, Yazhou Liu
  • Publication number: 20110106807
    Abstract: Described within are systems and methods for disambiguating entities, by generating entity profiles and extracting information from multiple documents to generate a set of entity profiles, determining equivalence within the set of entity profiles using similarity matching algorithms, and integrating the information in the correlated entity profiles. Additionally, described within are systems and methods for representing entities in a document in a Resource Description Framework and leveraging the features to determine the similarity between a plurality of entities. An entity may include a person, place, location, or other entity type.
    Type: Application
    Filed: November 1, 2010
    Publication date: May 5, 2011
    Applicant: JANYA, INC
    Inventors: Rohini K. Srihari, Harish Srinivasan, Richard Smith, John Chen
  • Publication number: 20110093794
    Abstract: The present disclosure provides a system and method for electronic communications dialog between a plurality of users using digital images. The user selects a template for entering a plurality of words and associated images that constitute an initial electronic message. The user then enters a plurality of words into the template corresponding to the initial electronic message. A plurality of images is selected having a direct correspondence with the plurality of words entered into the template. Each image is inserted into the template in a sequence corresponding to the initial electronic message. When the initial template is complete, the initial electronic message containing the sequenced images is sent to at least one other user.
    Type: Application
    Filed: December 22, 2010
    Publication date: April 21, 2011
    Inventor: Mark Grace
  • Publication number: 20110093463
    Abstract: An approach is provided for managing projection and injection operations on information spaces with respect to their information content. An information space projection module receives a query to project a first information space from a second information space. In response to the query, the module extracts a subset of information content from the second information space by using a partitioning function. The module also extracts a subset of rules from the second information space by using the partitioning function. The module then creates the first information space using the extracted subset of information content and the extracted subset of rules while maintaining a link between the first and the second information spaces. An information space injection module enables further injection of the first information space back into the second information space.
    Type: Application
    Filed: October 21, 2009
    Publication date: April 21, 2011
    Applicant: Nokia Corporation
    Inventors: Ian Justin Oliver, Sergey Boldyrev
  • Publication number: 20110087668
    Abstract: Documents likely to be near-duplicates are clustered based on document vectors that represent word-occurrence patterns in a relatively low-dimensional space. Edit distance between documents is defined based on comparing their document vectors. In one process, initial clusters are formed by applying a first edit-distance constraint relative to a root document of each cluster. The initial clusters can be merged subject to a second edit-distance constraint that limits the maximum edit distance between any two documents in the cluster. The second edit-distance constraint can be defined such that whether it is satisfied can be determined by comparing cluster structures rather than individual documents.
    Type: Application
    Filed: August 27, 2010
    Publication date: April 14, 2011
    Applicant: Stratify, Inc.
    Inventors: Joy Thomas, Sauraj Goswami, Vamsi Salaka
  • Publication number: 20110087709
    Abstract: According to one embodiment of the invention, a method for composing information into a generic information cell structure, which includes an information vacuole and a cell, is provided. In another embodiment, attaching generic tags, which correspond to the generic information cell structure, is provided. In another embodiment, generating structural and positional identification, fetching information characteristics, decomposing an information element into an atom class, processing the information element, and forming a native data manipulation statement, is provided. In another embodiment, a data repository, which includes an information element name and an atom type is provided. In yet another embodiment, a data directory, which includes a cell structure storage location identification, is provided. In one embodiment, a method of routing data by receiving a data store location identification for information, is provided. The data store identification may be externally defined and/or run-time defined.
    Type: Application
    Filed: October 12, 2009
    Publication date: April 14, 2011
    Inventor: Ramani Sriram
  • Publication number: 20110078146
    Abstract: Systems and methods for managing electronic data are disclosed. Various data management operations can be performed based on a metabase formed from metadata. Such metadata can be identified from an index of data interactions generated by a journaling module, and obtained from their associated data objects stored in one or more storage devices. In various embodiments, such processing of the index and storing of the metadata can facilitate, for example, enhanced data management operations, enhanced data identification operations, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data. In various embodiments, the metabase can be configured in different ways. For example, the metabase can be stored separately from the data objects so as to allow obtaining of information about the data objects without accessing the data objects or a data structure used by a file system.
    Type: Application
    Filed: September 20, 2010
    Publication date: March 31, 2011
    Applicant: CommVault Systems, Inc.
    Inventors: Anand Prahlad, Jeremy Alan Schwartz, David Ngo, Brian Brockway, Marcus S. Muller
  • Publication number: 20110078206
    Abstract: A tagging method and apparatus, including computer program products, based on a structured data set are provided, the tagging method comprising: creating classification models for respective nodes in the structured data set of an event; acquiring public opinions on the event; and tagging the opinions to corresponding nodes of the structured data set using the created classification models. The tagging method and apparatus of the present disclosure are able to provide well-ordered, focused public opinions for each event to users, and to exhibit the evolution of the public opinions along with time.
    Type: Application
    Filed: August 20, 2010
    Publication date: March 31, 2011
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Jian Chen, Ben Fei, Rui Ma, Zhong Su, Xian Wu
  • Publication number: 20110078145
    Abstract: A method, including receiving a data source selection from a user or software application, the data source including medical information of a plurality of patients, receiving, from the user or software application, a data pattern that is related to a concept to be explored in the data source, querying the data source to find information that approximately matches the data pattern; and receiving the information from the data source, wherein the information includes unstructured data, assigning a classification to individual parts of the information based on the part's relationship to the data pattern, and outputting the classified information to the user or software application.
    Type: Application
    Filed: September 28, 2010
    Publication date: March 31, 2011
    Applicant: SIEMENS MEDICAL SOLUTIONS USA INC.
    Inventors: Stanley Chung, Faisal Farooq, Glenn Fung, Balaji Krishnapuram, R. Bharat Rao, Romer E. Rosales, John Weis, Shipeng Yu
  • Publication number: 20110077048
    Abstract: The invention relates to a system for data correlation, having: a receiving device 1 having an image acquisition element 10 and a data set generator 12 for generating at least one object data set from at least one acquired first image, which represents a physical object, and an identification label, which uniquely determines an object-related acquisition procedure, and at least one information data set from at least one acquired second image, which represents coded information related to the physical object, and the identification label; a correlation device 2 for the extraction 20 of the coded information from the information data set, for the semantic analysis 22 of the extracted information, and for the generation of at least one combination data set from the results of the semantic analysis, the extracted information, and the at least one object data set with the same identification label as the extracted information data set; and a user device 3 for the storage and further use of the combination data
    Type: Application
    Filed: March 3, 2009
    Publication date: March 31, 2011
    Applicant: Linguatec Sprachtechnologien GmbH
    Inventor: Reinhard Busch
  • Publication number: 20110072024
    Abstract: In one embodiment the present invention provides a novel method for probabilistically quantifying a degree of relevance between two or more citationally or contextually related data objects, such as patent documents, non-patent documents, web pages, personal and corporate contacts information, product information, consumer behavior, technical or scientific information, address information, and the like. In another embodiment the present invention provides a novel method for visualizing and displaying relevance between two or more citationally or contextually related data objects. In another embodiment the present invention provides a novel search input/output interface that utilizes an iterative self-organizing mapping (“SOM”) technique to automatically generate a visual map of relevant patents and/or other related documents desired to be explored, searched or analyzed.
    Type: Application
    Filed: March 29, 2010
    Publication date: March 24, 2011
    Applicant: PatentRatings, LLC
    Inventor: Jonathan A. Barney
  • Publication number: 20110072016
    Abstract: A density-based data clustering method, comprising a parameter-setting step, a first retrieving step, a first determination step, a second determination step, a second retrieving step, a third determination step and first and second termination determination steps. The parameter-setting step sets parameters. The first retrieving step retrieves one data point and defines neighboring points. The first determination step determines whether the number of the data points exceeds the minimum threshold value. The second determination step arranges a plurality of first border symbols. The second retrieving step retrieves one seed data point from the seed list, arranges a plurality of second border symbols and defines seed neighboring points. The third determination step determines whether a data point density of searching ranges of the seed neighboring points is the same. The first termination determination step determines whether the clustering is finished.
    Type: Application
    Filed: January 6, 2010
    Publication date: March 24, 2011
    Inventors: Cheng-Fa Tsai, Yi-Ching Huang
  • Publication number: 20110066451
    Abstract: A method and apparatus for providing health management information to a patient. Patients are grouped according to their personal health records, an order is determined among the patients in each group, and health management information of a patient ranked higher in each group is automatically provided to a patient ranked lower in the same group.
    Type: Application
    Filed: May 14, 2010
    Publication date: March 17, 2011
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Woo-young Jang, Kyu-tae Yoo
  • Publication number: 20110066564
    Abstract: Methods, devices, and systems for analyzing negotiated negotiable instruments for unlawful activity are described. A computer system, including a computer readable storage device and a processor may be provided. A plurality of electronic files may be received. Each of these electronic files of the plurality of electronic files may include an electronic image of at least a portion of a negotiable instrument and include a plurality of data fields. The plurality of electronic files may be divided into subsets based on whether data is available in particular data fields of the electronic files. Based upon the subset an electronic file is made a member of, various selection criteria may be applied to determine if the electronic file is a candidate for suspicious and/or illegal activity. Also, a listing of candidates for suspicious and/or illegal activity may be presented to a user.
    Type: Application
    Filed: September 11, 2009
    Publication date: March 17, 2011
    Applicant: The Western Union Company
    Inventors: Jeannie Larsen, Allen Hofmann, Eyal Tzarfati
  • Publication number: 20110066475
    Abstract: Systems and methods for providing information relating to professional growth are provided. A client provides client data regarding professional growth. The current level of professional growth of the client is determined, and the next level of professional growth is identified. The client is provided with information regarding the next level, based on the identification of the next level.
    Type: Application
    Filed: September 16, 2009
    Publication date: March 17, 2011
    Inventors: Daniel J. Sullivan, Barbara Sue Smith
  • Publication number: 20110060603
    Abstract: Articles of manufacture including electronic machines including but not limited to computers, computer stations, computing devices, computer systems, software, computer readable memory and other electronic devices adapted to provide an index for a performance characteristic or measure for groups of people or institutions. The index values are risk adjusted to the varying population compositions for groups of people or institutions by comparison to a reference portfolio and are updated in real time to account for the changing constitution of the clusters in the portfolios. Methods of using the disclosed devices include the ability to provide a continuously updated benchmark for the comparison of medical, business or educational performance by providers or practitioners of such services that affect populations of individuals, and that is based on measurable outcomes. Additionally, the methods of using the disclosed devices include the ability to compare the effectiveness of different therapies.
    Type: Application
    Filed: September 9, 2009
    Publication date: March 10, 2011
    Inventors: Christopher C. Capelli, William T. Little