Clustering Or Classification (EPO) Patents (Class 707/E17.089)
  • Publication number: 20100175024
    Abstract: Embodiments of the invention provide a system and method of automatically generating weights for matching data records. Each field of a record may be compared by an exact match and/or close matches and each comparison can result in a mathematical score which is the sum of the field comparisons. To sum up the field scores accurately, the automatic weight generation process comprises an iterative process. In one embodiment, initial weights are computed based upon unmatched-set probabilities and default discrepancy weights associated with attributes in the comparison algorithm. A bulk cross-match is performed across the records using the initial weights and a candidate matched set is computed for updating the discrepancy probabilities. New weights are computed based upon the unmatched probabilities and the updated discrepancy probabilities. The new weights are then tested for convergence against the old weights, and the process is repeated with the new weight table until the weights converge to their final values.
    Type: Application
    Filed: March 19, 2010
    Publication date: July 8, 2010
    Inventors: Scott Schumacher, Scott Ellard, Norman S. Adams
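    Illustrative sketch (not from the application): the iterative loop in the abstract above might look roughly like the following. It assumes a simplified scoring in which each agreeing field contributes log2(m/u), where u is the unmatched-set agreement probability and m is re-estimated from the candidate matched set. The function names, the fixed score threshold, and the add-one smoothing are all hypothetical choices made for this example.

```python
import math


def agreement_weights(m, u):
    """Per-field weight log2(m/u) (assumed scoring form, not the patent's)."""
    return {field: math.log2(m[field] / u[field]) for field in u}


def iterate_weights(pairs, u, m_init, threshold, tol=1e-6, max_iter=50):
    """Re-estimate weights until they stop changing.

    pairs: list of dicts mapping field name -> True/False (fields agree or not).
    u:     unmatched-set agreement probability per field (held fixed).
    """
    weights = agreement_weights(m_init, u)
    for _ in range(max_iter):
        # Bulk cross-match: score every pair as the sum of its field weights.
        matched = [
            p for p in pairs
            if sum(weights[f] for f in u if p[f]) >= threshold
        ]
        if not matched:
            break
        # Update matched-set probabilities (add-one smoothing keeps 0 < m < 1).
        m = {
            f: (sum(p[f] for p in matched) + 1) / (len(matched) + 2)
            for f in u
        }
        new_weights = agreement_weights(m, u)
        # Convergence test between the new weights and the old weights.
        if max(abs(new_weights[f] - weights[f]) for f in u) < tol:
            return new_weights
        weights = new_weights
    return weights


pairs = (
    [{"name": True, "dob": True}] * 3
    + [{"name": True, "dob": False}]
    + [{"name": False, "dob": False}] * 4
)
u = {"name": 0.1, "dob": 0.05}
final = iterate_weights(pairs, u, {"name": 0.9, "dob": 0.9}, threshold=3.0)
```

    In this toy run the candidate matched set stabilises after one update, so the second pass produces identical weights and the loop stops.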
  • Publication number: 20100174714
    Abstract: The present invention relates to a method and computer program product for data mining with constant search time. The method and computer program product comprise the steps of: traversing a search tree to a leaf, retrieving one or more data store identifiers from said leaf, reading data pointed to by said data store identifiers, locating one or more values in said data, referencing one or more data descriptors, retrieving the n-nearest data descriptor neighbors, and terminating said search.
    Type: Application
    Filed: June 6, 2007
    Publication date: July 8, 2010
    Applicant: HASKOLINN I REYKJAVIK
    Inventors: Fridrik Heidar Asmundsson, Herwig Lejsek, Bjorn Thor Jonsson
  • Publication number: 20100174680
    Abstract: [Object] To provide a network system capable of efficiently carrying out synchronous processing by positively notifying a home network appliance of an update content of a content or metadata. [Solving Means] A subscriber in a home network appliance acquires via a network an update notification message that stores update information of a content or metadata and in which a filter attribute for categorizing the update notification message is set. The appliance includes a service client that updates, by an application corresponding to a specific service, the content or the metadata in a local content/metadata database using the update information within the update notification message. The subscriber manages a correspondence between the service client and the filter attribute and specifies the service client that provides the update information within the update notification message based on the filter attribute set in the update notification message and the correspondence.
    Type: Application
    Filed: October 17, 2008
    Publication date: July 8, 2010
    Inventors: Yasuaki Yamagishi, Yasuhiro Yukawa
  • Publication number: 20100174715
    Abstract: A template or wrapper tree for a document such as a web page is generalized from the bottom up (from leaf toward root of a logical tree structure of the template). At a given level in the tree, sub-trees are clustered and the clustered sub-trees are generalized, and the process is repeated at a next higher level in the tree, resulting in a generalized template or wrapper tree. This can be done by generating a nested pattern regular expression based on the sub-tree clusters, merging sub-trees based on the nested pattern regular expression, and then replacing sub-trees in a tree-based regular expression of the template or wrapper at the given level with the merged sub-trees. This process is repeated at a next higher level of the tree (progressing from leaf towards root) until the wrapper or tree-based regular expression that represents the template is fully generalized.
    Type: Application
    Filed: February 22, 2010
    Publication date: July 8, 2010
    Applicant: YAHOO! INC.
    Inventors: Charu Tiwari, V.G. Vinod Vydiswaran
  • Publication number: 20100169163
    Abstract: An Internet business transaction processor of the present invention has a distributed processing architecture which allows the processing load to be distributed among multiple parallel servers. The transaction processor of the present invention provides a virtual store front utilizing an “other people's warehouse” approach by using a dynamic distributor selection processing system to select among a plurality of distributors based on a flexible rule-based algorithm.
    Type: Application
    Filed: October 26, 2009
    Publication date: July 1, 2010
    Inventor: Robert S. Alvin
  • Publication number: 20100169320
    Abstract: A method and system for performing email search, the method comprising enabling the user to find relations between emails, build network relations, and retrieve groups based on the relations (and intersections of relations) as per the user's choice; the system comprising presenting predetermined options for the user to select for a search, with a further ability to “drill down” into the results with the aid of filters to view further mails/results, to search within search results, and to store user searches.
    Type: Application
    Filed: December 22, 2009
    Publication date: July 1, 2010
    Inventors: Prashanth PATNAM, Ajay Deshpande, Jitendra Gokhale, Vinod Kulkarni
  • Publication number: 20100161609
    Abstract: One aspect of the invention is a method for assigning categorical data to a plurality of clusters. An example of the method includes identifying a plurality of categories associated with the data. This example also includes, for each category in the plurality of categories, identifying at least one element associated with the category. This example also includes specifying a number of clusters to which the data may be assigned. This example additionally includes assigning at least some of the data, wherein each assigned datum is assigned to a respective one of the clusters. This example further includes, for at least one of the clusters, determining, for at least one category, the frequency in data assigned to the cluster of at least one element associated with the category. Further, some examples of the invention provide for detecting outliers, anomalies, and exemplars in the categorical data.
    Type: Application
    Filed: February 27, 2010
    Publication date: June 24, 2010
    Inventor: David B. Fogel
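    Illustrative sketch (not from the application): the per-cluster frequency step in the abstract above could be computed as below. It assumes each datum is a dict mapping a category to one element and that cluster assignments are already given; the function name and data layout are hypothetical.

```python
from collections import Counter, defaultdict


def element_frequencies(records, assignments):
    """Return {cluster: {category: {element: relative frequency}}}."""
    sizes = Counter(assignments)  # number of records per cluster
    counts = defaultdict(lambda: defaultdict(Counter))
    for record, cluster in zip(records, assignments):
        for category, element in record.items():
            counts[cluster][category][element] += 1
    # Convert raw counts into frequencies relative to the cluster size.
    return {
        cluster: {
            category: {el: n / sizes[cluster] for el, n in ctr.items()}
            for category, ctr in cats.items()
        }
        for cluster, cats in counts.items()
    }


records = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": "L"},
    {"color": "blue", "size": "L"},
]
freq = element_frequencies(records, [0, 0, 1])
```

    An element whose frequency in its cluster is very low is one plausible outlier signal, although the abstract does not say how outliers are actually detected.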
  • Publication number: 20100161605
    Abstract: A computer-implemented method is disclosed for determining a type of landing page to which to transfer web searchers that enter a particular query, the method comprising: classifying a landing page as one of a plurality of landing page classes with a trained classifier of a computer based on textual content of the landing page; determining, by the computer, characteristics of one or more query to be associated with the landing page; and choosing, with the computer, whether to retain or to change classification of the landing page to be associated with the one or more query based on relative average conversion rates of advertisements on a plurality of manually-classified landing pages when associated with the characteristics of the one or more query.
    Type: Application
    Filed: December 23, 2008
    Publication date: June 24, 2010
    Applicant: Yahoo! Inc.
    Inventors: Evgeniy Gabrilovich, Andrei Broder, Bo Pang, Vanja Josifovski, Hila Becker
  • Publication number: 20100161411
    Abstract: There is provided a system and method for generating display advertisements from search based keyword advertisements. The system includes a keyword generation unit for generating one or more advertising keywords from a received category profile defining a classification hierarchy, for use in selecting one or more candidate advertisement messages from a plurality of advertisement messages, an advertisement selection unit for receiving one or more candidate advertisement messages comprising a text message selected from the plurality of advertisement messages and selecting one advertisement message from the one or more received candidate advertisement messages based upon one or more characteristics associated with the received one or more candidate advertisement messages and a creative advertisement assembly unit for generating an advertisement image based on the text advertisement of the selected one advertisement message for display in network based content.
    Type: Application
    Filed: December 22, 2009
    Publication date: June 24, 2010
    Applicant: KINDSIGHT
    Inventors: Wang Wu, Dorothy Tse, Michael Gassewitz, Sitao Yang
  • Publication number: 20100153395
    Abstract: A method comprises storing real-time multimedia data in a plurality of tracks and/or track subsets; and identifying one or more multi-track groups, each multi-track group being associated with a relationship among one or more of the plurality of tracks and/or track subsets.
    Type: Application
    Filed: July 16, 2009
    Publication date: June 17, 2010
    Applicant: NOKIA CORPORATION
    Inventors: Miska Matias Hannuksela, Ye-Kui Wang
  • Publication number: 20100153396
    Abstract: Methods, systems and computer software program code products enabling the matching of a large number of names across any of a range of different languages comprise: receiving incoming names in any of a set of languages or scripts; generating high-recall keys based on the received incoming names; executing a full-text index process based on the generated high-recall keys; and looking up candidates for matching.
    Type: Application
    Filed: February 26, 2008
    Publication date: June 17, 2010
    Inventors: Benson Margulies, David Murgatroyd, Bernard Greenberg, Zhaohui Li
  • Publication number: 20100153397
    Abstract: Data is stored persistently. At least two different items of the data are stored in two different non-conflicting regions or two different physical clusters. A relationship is maintained between the two different items of data. The relationship enables a process to reach any one of the data items from the other data item. Consistency of the relationship is maintained notwithstanding updates of either or both of the items.
    Type: Application
    Filed: August 5, 2009
    Publication date: June 17, 2010
    Applicant: Miosoft Corporation
    Inventors: Albert B. Barabas, Ernst M. Siepmann, Mark D.A. van Gulik
  • Publication number: 20100153399
    Abstract: A framework is provided for obtaining window information. The window information can be applied to different assignment models to assign windows to different groups. A group may correspond to a task being performed by a user. The window information can be semantic or temporal information captured as window events and properties of windows whose events are captured. Temporal information can be information about switches between windows. Semantic information can be window titles. Temporal information, semantic information, or both, can be used to assign windows to groups.
    Type: Application
    Filed: February 26, 2010
    Publication date: June 17, 2010
    Applicant: MICROSOFT CORPORATION
    Inventors: Nuria M. Oliver, Arungunram C. Surendran, Chintan S. Thakkar, Gregory R. Smith
  • Publication number: 20100153400
    Abstract: Provided are systems and methods for rational selection of context sequences and sequence templates including a computer implemented method for obtaining a repository of attributes sets where the attributes sets are statistically associated with a sequence template representing two or more context sequences.
    Type: Application
    Filed: August 20, 2008
    Publication date: June 17, 2010
    Inventor: Yoav Namir
  • Publication number: 20100145949
    Abstract: Systems and methods for managing data, such as metadata or indexes of content of files. In one exemplary method, notifications to update a metadata database or an index database are combined into a combined notification. According to other aspects, an order among logical locations on a storage device is determined in order to specify a sequence for scanning for files to be indexed. According to another aspect, a method includes determining whether to index a file based on a path name of the file relative to a plurality of predetermined path names.
    Type: Application
    Filed: December 11, 2009
    Publication date: June 10, 2010
    Inventors: Yan Arrouye, Dominic Giampaolo, Andrew Carol
  • Publication number: 20100145945
    Abstract: A method, system and program product for classifying data elements into different levels of a business hierarchy. The method includes identifying data elements to be classified into one or more levels of a business hierarchy, selecting a first logic decision tree for evaluating the data elements identified for classification into the hierarchy and executing the first tree for recursively evaluating each data element identified until the first tree has been traversed. Further, the method includes dynamically creating configurable anchor point classifications for the data elements evaluated through the first tree and assigning a respective anchor point classification to each data element evaluated, such that, a respective anchor point classification assigned to a data element evaluated links the data element to a lowest level of the hierarchy, and where the anchor point classification conveys classification information as to each higher level of the hierarchy that the data element belongs to.
    Type: Application
    Filed: December 10, 2008
    Publication date: June 10, 2010
    Applicant: International Business Machines Corporation
    Inventors: James D. Episale, Mark A. Musa, David G. Ruest
  • Publication number: 20100138422
    Abstract: An image processing system particularly for use with diagnostic images, comprises at least one processing unit, which receives digital images acquired by one or more imaging apparatus and provides output images, processed by means of an image processing program loaded in the memory of the processing unit and executed thereby, characterized in that it consists of a central service unit, comprising an interface to be accessed by remote users, which connect to said central unit by remote communication means.
    Type: Application
    Filed: October 24, 2006
    Publication date: June 3, 2010
    Applicant: BRACCO IMAGING S.P.A.
    Inventor: Marco Mattiuzzi
  • Publication number: 20100138421
    Abstract: Systems and methods for identifying inadequate search content are provided. Inadequate search content, for example, can be identified based on statistics associated with the search queries related to the content.
    Type: Application
    Filed: February 3, 2010
    Publication date: June 3, 2010
    Applicant: GOOGLE INC.
    Inventors: Jeffrey David Oldham, Hal R. Varian, Matthew D. Cutts, Matt Rosencrantz
  • Publication number: 20100131467
    Abstract: Systems and methods for data classification to facilitate and improve data management within an enterprise are described. The disclosed systems and methods evaluate and define data management operations based on data characteristics rather than data location, among other things. Also provided are methods for generating a data structure of metadata that describes system data and storage operations. This data structure may be consulted to determine changes in system data rather than scanning the data files themselves.
    Type: Application
    Filed: January 28, 2010
    Publication date: May 27, 2010
    Applicant: CommVault Systems, Inc.
    Inventors: Anand Prahlad, Jeremy A. Schwartz, David Ngo, Brian Brockway, Marcus S. Muller
  • Publication number: 20100131507
    Abstract: A dynamic classification dictionary is built for use in profiling and targeting users for additional relevant content. Behavioral data is gathered from user activity, and user documents and actions are categorized. Author-generated document classification information is analyzed and assigned a first taxonomic noun to characterize the document. User-generated tags characterizing a portion of the document are assigned a second taxonomic noun. Search terms that resulted in the user accessing the document are identified and assigned a third taxonomic noun. Attributes related to the manner in which the document was accessed are evaluated and assigned a fourth taxonomic noun. The document is processed using pattern rules to extract a fifth taxonomic noun. The taxonomic nouns are aggregated into a composite set of taxonomic nouns, and the dynamic classification dictionary is built by storing the composite set of taxonomic nouns.
    Type: Application
    Filed: January 29, 2010
    Publication date: May 27, 2010
    Applicant: CBS INTERACTIVE, INC.
    Inventors: Tushar PRADHAN, Thomas OSBORNE, John POTTER
  • Publication number: 20100121711
    Abstract: The present invention relates to an advertising system and profit creation method using a metablog web page, in which the personal website and the display target entity of a member are posted on a personal metablog web page, managed by a metablog server 200, when the member subscribes to the metablog server as a member while operating the personal website managed by a web server 100, and which provide some of advertising fees, incurred by posting the display target entity and paid by an advertiser client 300, to the member. Therefore, the present invention is advantageous in that, since all members, who operate personal websites through different web servers, use a metablog web page, new profit can be created in such a way that the members can be paid some of the advertising fees paid by an advertiser client.
    Type: Application
    Filed: May 30, 2007
    Publication date: May 13, 2010
    Applicant: NR SYSTEMS, INC.
    Inventor: Yon-Ho Park
  • Publication number: 20100121853
    Abstract: A document accessible over a network can be registered. A registered document, and the content contained therein, is not transmitted undetected over and off of the network. In one embodiment, the invention includes a manager agent to maintain signatures of registered documents and a match agent to detect the unauthorized transmission of the content of registered documents.
    Type: Application
    Filed: January 20, 2010
    Publication date: May 13, 2010
    Inventors: Erik de la Iglesia, William Deninger, Ratinder Paul Singh Ahuja
  • Publication number: 20100121826
    Abstract: It is an object to provide a data collection system that is configured to reduce a communication amount, etc. at the time when data are collected from a plurality of devices, so as to reduce a communication amount attended by the collection of data without increasing processing loads imposed on devices. A symbol classifying unit of a data relay device classifies received data that have been already compressed. A data recompressing unit replaces codes contained in the classified already compressed data with other codes, so as to recompress the already compressed data. A symbol set clustering unit sends a transfer destination renewal device a communication speed at the time when the recompressed data are transferred to other devices, a processing speed at the recompressing time, etc. The transfer destination renewal device generates transfer destination information on the basis of the communication speed, the processing speed, etc.
    Type: Application
    Filed: February 26, 2008
    Publication date: May 13, 2010
    Inventor: Akitake Mitsuhashi
  • Publication number: 20100114893
    Abstract: Events can be searched by identifying a query that includes a time interval and a search component, determining a time increment associated with the time interval, and partitioning the time interval into partitions based on the time increment. For each partition, a relevance of each event in a collection of events that occur at a time in the partition is determined based on the query. A pre-determined number of the relevant events are displayed.
    Type: Application
    Filed: January 11, 2010
    Publication date: May 6, 2010
    Applicant: GOOGLE INC.
    Inventors: Nikhil Chandhok, Peter Solderitsch, Michael Gordon, Philo Juang
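    Illustrative sketch (not from the application): the partition-then-rank flow in the abstract above might be outlined as follows. Times are plain numbers here, the relevance function is supplied by the caller, and dropping zero-relevance events is an assumption of this example; all names are hypothetical.

```python
def partition_interval(start, end, increment):
    """Split [start, end) into consecutive partitions of length `increment`."""
    parts = []
    t = start
    while t < end:
        parts.append((t, min(t + increment, end)))
        t += increment
    return parts


def top_events_per_partition(events, start, end, increment, relevance, k):
    """events: list of (time, description). Returns [(partition, top-k events)]."""
    result = []
    for lo, hi in partition_interval(start, end, increment):
        # Keep events in this partition that are relevant to the query.
        hits = [e for e in events if lo <= e[0] < hi and relevance(e) > 0]
        hits.sort(key=relevance, reverse=True)
        result.append(((lo, hi), hits[:k]))
    return result


events = [(1, "a concert"), (2, "concert concert"), (5, "lecture"), (9, "concert")]
ranked = top_events_per_partition(
    events, 0, 10, 4, relevance=lambda e: e[1].count("concert"), k=1
)
```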
  • Publication number: 20100109938
    Abstract: A method of classifying items from reflected signals returned from said items is disclosed, the method comprising: processing said return signals to discriminate between a first set of signals indicative of items of interest and a further set of signals indicative of clutter; identifying items from said first set of signals and classifying them as a first class of item; processing said further set of signals to identify a second set of signals indicative of further items of interest; identifying items from said second set of signals and classifying them as a second class of item.
    Type: Application
    Filed: January 31, 2008
    Publication date: May 6, 2010
    Inventors: Gordon Kenneth Andrew Oswald, Edwin Christopher Carter, Per Arne Vincent Utsi, Samuel Julius Pumphrey, Desmond Keith Phillips, Michael Hugh Burchett, Allan Geoffrey Smithson, Jonathan Peter Edgecombe
  • Publication number: 20100114892
    Abstract: Provided is an introducing system in which a server device introduces users of terminal devices to each other while motivating the users to join the system by presenting appropriate information to them during a wait time before they receive introduction. When a terminal device requests introduction of another terminal device during a time slot (303) shifted from a time period (304) between times ti−1 and ti by a margin time period (305), the server device assigns an introduction time ti to the terminal device. The terminal device displays the difference between the assigned introduction time and the current time on a screen as a remaining wait time and the number of terminal devices in an introduction waiting list. When the time ti comes, the server device groups terminal devices that are assigned the introduction time ti to match an introduction target, and notifies the introduction target to each terminal device.
    Type: Application
    Filed: April 17, 2008
    Publication date: May 6, 2010
    Inventors: Hiromasa Kaneko, Hideo Ueda
  • Publication number: 20100114587
    Abstract: A patent evaluating device comprises a data acquiring section (105) for acquiring items of patent data and patent attribute information on each item of patent data in a predetermined technical field from a patent database, a data classifying section (115) for classifying the acquired items of patent data into groups within a predetermined period of time, and an evaluation value calculating section (120) for calculating the evaluation value of each item of the patent data by using the patent attribute information on each of the patent data belonging to each group and by using the value determined for each group. With this, the value of a patent application or a patent right is adequately evaluated according to numerical information objectively determined and according to the progress information of the patent application or the patent right or the content information.
    Type: Application
    Filed: November 2, 2007
    Publication date: May 6, 2010
    Inventors: Hiroaki Masuyama, Toshiro Ohsaki, Kazumi Hasuko
  • Publication number: 20100114899
    Abstract: Various embodiments of the present invention disclose a method for Business Intelligence (BI) metrics on unstructured data. Unstructured data is collected from numerous data sources that include unstructured data as ingested data. The ingested data is indexed and represents hyperlink and extracted data and metadata for each document. Thereafter, the ingested data is automatically classified into one or more relevance classes. Further, numerous analytics are performed on the classified data to generate business intelligence metrics that may be presented on an access device operated by a user.
    Type: Application
    Filed: October 7, 2009
    Publication date: May 6, 2010
    Inventors: Aloke Guha, Joan Wrabetz, Shumin Wu, Venky Madireddi
  • Publication number: 20100115615
    Abstract: A system and method for categorizing content on a webpage is disclosed. The method comprises receiving a request for a webpage from a user's computer. Next, the system determines whether there is dynamic content on the webpage by analyzing the address, links, reputation, type, style and other indicators of being able to easily change the webpage. If the webpage contains content that can be changed, then the webpage is analyzed to determine a current categorization thereof. If the webpage does not have dynamic content then the categorization of the webpage will remain the same thereby freeing system resources by only analyzing dynamic webpages.
    Type: Application
    Filed: June 29, 2009
    Publication date: May 6, 2010
    Applicant: WEBSENSE, INC.
    Inventors: Daniel Lyle Hubbard, Dan Ruskin
  • Patent number: 7693683
    Abstract: An information classifying device calculates, for a plurality of populations containing pieces of sample information, evaluation distance between a center of gravity of the pieces of sample information belonging to each population and a piece of sample information as an object of classification (object sample), calculates statistical information such as mean, variance and standard deviation of the evaluation distance for each population, evaluates the evaluation distance of the sample information to the population based on the evaluation distance and the statistical information and evaluates degree of assignment relevancy of the object sample to the population, determines to which population the object sample is to be assigned in accordance with the degree of assignment relevancy, and assigns the object sample to the population. Evaluation distance between the center of gravity of each updated population and the object sample belonging to each population is calculated.
    Type: Grant
    Filed: November 17, 2005
    Date of Patent: April 6, 2010
    Assignee: Sharp Kabushiki Kaisha
    Inventor: Masayoshi Ihara
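    Illustrative sketch (not from the patent): one way to read the abstract above is that the raw distance to each population's center of gravity is normalised by that population's own distance statistics before populations are compared. The code below assumes Euclidean distance and a z-score as the "degree of assignment relevancy"; both choices, and all names, are hypothetical.

```python
import math
from statistics import mean, pstdev


def centroid(points):
    """Center of gravity of a list of equal-length coordinate tuples."""
    return tuple(mean(coords) for coords in zip(*points))


def assign(sample, populations):
    """Return the name of the population with the lowest distance z-score."""
    best_name, best_z = None, math.inf
    for name, points in populations.items():
        c = centroid(points)
        # Distance statistics of the population's own members.
        dists = [math.dist(p, c) for p in points]
        mu, sigma = mean(dists), pstdev(dists)
        if sigma == 0:
            continue  # no spread, so a z-score is undefined in this sketch
        z = (math.dist(sample, c) - mu) / sigma
        if z < best_z:
            best_name, best_z = name, z
    return best_name


populations = {
    "A": [(0, 0), (2, 0), (0, 2), (2, 2), (1, 1)],        # tight population
    "B": [(10, 10), (14, 10), (10, 14), (14, 14), (12, 12)],  # wider population
}
choice = assign((6, 6), populations)
```

    In this example the point (6, 6) is closer to the centroid of A in raw distance, but it goes to B because B's members are more widely spread, which is the effect the statistical evaluation is meant to capture.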
  • Publication number: 20100082618
    Abstract: Methods and apparatus for searching data and grouping search results into clusters that are ordered according to search relevance. Each cluster comprises one or more data type, such as images, web pages, local information, news, advertisements, and the like. In one embodiment, a search term is evaluated for related concepts indicating categories of data sources to search. Data sources may also be identified by context information such as a location of a client device, a currently running application, and the like. Search results in each cluster are ordered by relevance and each cluster is given a score based on an aggregate of the relevance within the cluster. Each cluster score may be modified based on one or more corresponding concepts and/or context information. The clusters are ordered based on the modified scores. Content, including advertisements, may also be added to the ordered list to appear as another cluster.
    Type: Application
    Filed: October 30, 2009
    Publication date: April 1, 2010
    Applicant: Yahoo! Inc.
    Inventors: Edward Stanley Ott, IV, Keith David Saft, Marco Boerries, Meher Tendjoukian, Paul Yiu
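    Illustrative sketch (not from the application): the cluster ordering in the abstract above could be outlined as below. It assumes the aggregate is the mean relevance within each cluster and that concept or context information is reduced to a multiplicative boost per cluster; both are assumptions of this example, and the names are hypothetical.

```python
def order_clusters(clusters, boosts=None):
    """clusters: {name: [(result, relevance), ...]}.

    Returns [(name, score, results sorted by relevance)], best cluster first.
    """
    boosts = boosts or {}
    scored = []
    for name, results in clusters.items():
        # Order results inside the cluster by relevance.
        ranked = sorted(results, key=lambda r: r[1], reverse=True)
        # Aggregate score: mean relevance (assumed), then context boost.
        score = sum(rel for _, rel in ranked) / len(ranked) if ranked else 0.0
        score *= boosts.get(name, 1.0)
        scored.append((name, score, ranked))
    scored.sort(key=lambda c: c[1], reverse=True)
    return scored


clusters = {
    "images": [("i1", 0.2), ("i2", 0.6)],
    "news": [("n1", 0.5)],
    "local": [("l1", 0.3), ("l2", 0.3)],
}
plain = [name for name, _, _ in order_clusters(clusters)]
boosted = [name for name, _, _ in order_clusters(clusters, {"local": 2.0})]
```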
  • Publication number: 20090254578
    Abstract: A method of synchronising a multimedia content file with an associated text file includes subdividing the text file into one or more samples, where each sample includes zero or more consecutive characters of the text file. The samples are associated with a corresponding contiguous time interval of the multimedia content file. For each sample, a corresponding consumption rate value is determined, which represents a use ratio of characters of the sample within the associated time interval of the multimedia content file. The consumption rate values are then stored, so that they may subsequently be used to compute time positions within the multimedia content file associated with corresponding text characters within the text file. Additional information, such as time cues and interlude intervals, may also be recorded in order to improve the accuracy of synchronisation.
    Type: Application
    Filed: April 1, 2009
    Publication date: October 8, 2009
    Inventor: Michael Andrew Hall
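    Illustrative sketch (not from the application): assuming the "consumption rate" is simply characters per unit of time within a sample's interval, the stored rates can be used to map a character index back to a time position as below. The data layout and function names are hypothetical.

```python
def consumption_rates(samples):
    """samples: list of (text, start_time, end_time) with end_time > start_time."""
    return [len(text) / (end - start) for text, start, end in samples]


def time_of_char(samples, rates, index):
    """Time position of the character at `index` in the concatenated text."""
    offset = 0
    for (text, start, _end), rate in zip(samples, rates):
        if index < offset + len(text):
            # Characters are assumed to be consumed at a constant rate.
            return start + (index - offset) / rate
        offset += len(text)
    raise IndexError("character index beyond end of text")


samples = [
    ("Hello", 0.0, 2.5),
    ("", 2.5, 4.0),        # interlude: zero characters, rate 0
    ("world!", 4.0, 7.0),
]
rates = consumption_rates(samples)
position = time_of_char(samples, rates, 8)
```

    Empty samples never satisfy the index test, so their zero rate is never used as a divisor.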
  • Publication number: 20090234688
    Abstract: A company technical document group analysis supporting device comprises index term extracting means for extracting an index term from a group of documents of a subject company including a technical document group, clustering means for classifying the document group of the subject company under given conditions to acquire multiple clusters, number-of-documents determination means for determining the number of documents belonging to each cluster, appearance frequency calculating means for calculating a function value of an appearance frequency of each extracted index term in each cluster, per-cluster keyword point calculating means for calculating the keyword point in each cluster by dividing the function value of the appearance frequency of each index term in each cluster by the number of documents belonging to each cluster, and entire-cluster keyword point calculating means for calculating the total for the entire clusters of the results of the calculation by the per-cluster keyword point calculating means for e
    Type: Application
    Filed: October 11, 2006
    Publication date: September 17, 2009
    Inventors: Hiroaki Masuyama, Makoto Asada, Kazumi Hasuko, Hideaki Hotta, Norio Araki
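    Illustrative sketch (not from the application): the keyword-point arithmetic in the abstract above can be written out as follows. The abstract leaves the "function value" of the appearance frequency unspecified, so this example assumes it is the raw term count; the names and data layout are hypothetical.

```python
from collections import Counter


def keyword_points(clusters):
    """clusters: list of clusters; each cluster is a list of documents;
    each document is a list of index terms.

    Returns (per-cluster points, totals across all clusters).
    """
    per_cluster = []
    totals = Counter()
    for docs in clusters:
        if not docs:
            per_cluster.append({})
            continue
        # Appearance frequency of each index term in this cluster (raw count).
        freq = Counter(term for doc in docs for term in doc)
        # Keyword point: frequency divided by the number of documents.
        points = {term: n / len(docs) for term, n in freq.items()}
        per_cluster.append(points)
        totals.update(points)
    return per_cluster, dict(totals)


clusters = [
    [["laser", "lens"], ["laser"]],
    [["lens"], ["lens", "motor"], ["motor"], ["lens"]],
]
per_cluster, totals = keyword_points(clusters)
```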
  • Publication number: 20090210406
    Abstract: A method is provided for organizing a plurality of documents that include forms. An initial set of clusters is defined for the plurality of documents. The initial set of clusters is reclustered based on similarity values calculated in multiple feature spaces. For example, a first feature space may be associated with a content of a document while a second feature space may be associated with a content of a form associated with the document. Each cluster has an associated centroid vector in each feature space that is used to represent the cluster. The similarity between the document and each cluster is calculated in both feature spaces. Each document is assigned to the cluster whose centroid is most similar. The cluster centroids may be recalculated and the process repeated until the cluster assignments become stable.
    Type: Application
    Filed: February 15, 2008
    Publication date: August 20, 2009
    Inventors: Juliana Freire, Luciano Barbosa
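    Illustrative sketch (not from the application): the reclustering loop in the abstract above resembles k-means run over two feature spaces at once. The code below assumes cosine similarity and an unweighted sum of the two similarities; how the application actually combines the spaces is not stated, and all names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def recluster(docs, assignments, max_iter=20):
    """docs: list of (page_vector, form_vector); assignments: initial cluster ids."""
    assignments = list(assignments)
    for _ in range(max_iter):
        ids = sorted(set(assignments))
        # One centroid per cluster in each of the two feature spaces.
        cents = {
            c: tuple(
                centroid([d[s] for d, a in zip(docs, assignments) if a == c])
                for s in (0, 1)
            )
            for c in ids
        }
        new = [
            max(ids, key=lambda c: cosine(d[0], cents[c][0])
                + cosine(d[1], cents[c][1]))
            for d in docs
        ]
        if new == assignments:  # assignments are stable
            break
        assignments = new
    return assignments


docs = [
    ([1.0, 0.0], [1.0, 0.0]),
    ([0.9, 0.1], [1.0, 0.0]),
    ([0.0, 1.0], [0.0, 1.0]),
    ([0.1, 0.9], [0.0, 1.0]),
]
final = recluster(docs, [0, 0, 1, 0])
```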
  • Publication number: 20090164416
    Abstract: A method and system for adaptive classification during information retrieval from unstructured data are provided. The method includes receiving input from a user defining a classification. A sample set of unstructured data based on the user defined classification defined is determined. The sample set of unstructured data is analyzed to determine a classification mapping that maps attributes of the sample set of unstructured data to class labels for the classification. The attributes of a set of data objects in a second set of unstructured data are indexed and one or more data objects in the set of data objects are mapped to the class label based on the classification mapping. Feedback based on the user's response to an interaction with results is determined using the class label. Finally, adaptive classification mapping is performed based on analysis of feedback by adjusting the sample set of data objects.
    Type: Application
    Filed: December 9, 2008
    Publication date: June 25, 2009
    Applicant: Aumni Data Inc.
    Inventor: Aloke Guha
  • Publication number: 20090150436
    Abstract: The embodiments of the invention provide a method for the automatic identification of changing subtopics within topics. The method begins by receiving customer satisfaction data having unstructured data objects. Next, the data objects are automatically categorized into pre-defined topics, wherein the pre-defined topics do not change throughout the customer satisfaction analysis. The pre-defined topics can be automatically defined based on a history of customer satisfaction data. Following this, a clustering analysis is automatically performed to identify subtopics of the data objects within the pre-defined topics. The subtopics are more specific than the pre-defined topics, and the subtopics can change. Further, the clustering analysis can include extracting features from the data objects and grouping the features into the subtopics. Each of the subtopics includes features having a predetermined degree of similarity.
    Type: Application
    Filed: December 10, 2007
    Publication date: June 11, 2009
    Applicant: International Business Machines Corporation
    Inventors: Shantanu Godbole, Raghuram Krishnapuram, Shourya Roy
  • Publication number: 20090132561
    Abstract: A method of labeling unlabeled nodes in a graph that represents objects that have an explicit structure between them. A computing device can use a labeling engine to identify labeled nodes in a graph and can identify an unlabeled node in the graph that is structurally associated with the labeled nodes. The labeling engine can label the unlabeled node with the label of the labeled node based on the structural association between the unlabeled node and the labeled node.
    Type: Application
    Filed: November 21, 2007
    Publication date: May 21, 2009
    Applicant: AT&T LABS, INC.
    Inventors: Graham Cormode, Smriti Bhagat, Irina Rozenbaum
  • Patent number: 7533093
    Abstract: A method and apparatus are disclosed for recommending items of interest to a user, such as television program recommendations, before a viewing history or purchase history of the user is available. A third party viewing or purchase history is processed to generate stereotype profiles that reflect the typical patterns of items selected by representative viewers. A user can select the most relevant stereotype(s) from the generated stereotype profiles and thereby initialize his or her profile with the items that are closest to his or her own interests. A clustering routine partitions the third party viewing or purchase history (the data set) into clusters, such that points (e.g., television programs) in one cluster are closer to the mean of that cluster than to the mean of any other cluster. A distance computation routine evaluates the closeness of a television program to each cluster based on the distance between a given television program and the mean of a given cluster.
    Type: Grant
    Filed: November 13, 2001
    Date of Patent: May 12, 2009
    Assignee: Koninklijke Philips Electronics N.V.
    Inventors: Srinivas Gutta, Kaushal Kurapati
  • Publication number: 20090094207
    Abstract: In one embodiment, identifying clusters of words includes accessing a record that records affinities. An affinity between a first and second word describes a quantitative relationship between the first and second word. Clusters of words are identified according to the affinities. A cluster comprises words that are sufficiently affine with each other. A first word is sufficiently affine with a second word if the affinity between the first and second word satisfies one or more affinity criteria. A clustering analysis is performed using the clusters.
    Type: Application
    Filed: October 1, 2008
    Publication date: April 9, 2009
    Applicant: Fujitsu Limited
    Inventors: David L. Marvit, Jawahar Jain, Stergios Stergiou, Alex Gilman, B. Thomas Adler, John J. Sidorowich
  • Publication number: 20090077132
    Abstract: The present invention relates to an information processing device, an information processing method, and a program that make it possible to prevent recommendation in a CF method from concentrating on a part of contents, and recommend a content to a user with little history information. In step S11, another user X having most similar history information to that of a user A to whom to recommend a musical piece is detected. In step S12, a musical piece a that the user X has and which the user A does not have is detected. In step S13, a cluster in each cluster layer to which the musical piece a belongs is identified. Then, in step S14, common musical pieces classified into all the clusters identified are extracted and set as recommendation candidates. Further, in step S15, one musical piece that has most similar cluster information to that of the musical piece a is selected among the musical pieces as recommendation candidates. The musical piece selected in this step is recommended to the user A.
    Type: Application
    Filed: September 15, 2006
    Publication date: March 19, 2009
    Applicant: SONY CORPORATION
    Inventors: Noriyuki Yamamoto, Kei Tateno, Mari Saito, Tomohiro Tsunoda, Mitsuhiro Miyazaki
  • Publication number: 20090055412
    Abstract: A Bayesian spam filter determines an amount of content in incoming email messages that it knows from training. If the filter is familiar with a threshold amount of the content, then the filter proceeds to classify the email message as being spam or legitimate. On the other hand, if not enough of the words in the email are known to the filter from training, then the filter cannot accurately determine whether or not the message is spam. In this case, the filter classifies the message as being of type unknown. Different threshold metrics can be used, such as the percentage of known words, and the percentage of maximum correction value used during processing. This greatly improves the processing of emails in languages on which the filter was not trained.
    Type: Application
    Filed: August 24, 2007
    Publication date: February 26, 2009
    Inventor: Shaun Cooley
  • Patent number: 7474790
    Abstract: A method and apparatus for the detection of local image structures represented as clusters in a joint-spatial range domain where the method comprises receiving an input image made having one or more clusters in a joint-spatial range domain, and each of the one or more clusters having a corresponding mode. Receiving a set of analysis matrices and selecting through each one of the analysis matrices. Using the selected analysis matrix to partition the input image into the one or more clusters and their corresponding modes, and computing a mean, μ, and a local covariance matrix Σ for each of the corresponding modes of each of the one or more clusters. Selecting at least one of the one or more clusters, where each selected cluster has a stable mean and stable covariance matrix across the set of analysis matrices, whereby each of the selected clusters is indicative of a local image structure.
    Type: Grant
    Filed: September 29, 2004
    Date of Patent: January 6, 2009
    Assignee: Siemens Medical Solutions USA, Inc.
    Inventors: Navneet Dalal, Dorin Comaniciu
  • Publication number: 20080270462
    Abstract: Described are a system and method for classifying information objects with metadata across heterogeneous data stores. A metadata model includes a plurality of interconnected nodes. At least one of the nodes corresponds to a metadata instance and at least one of the nodes corresponds to a metadata category. Information related to an information object maintained in a data store is acquired. A look up of the metadata model finds one or more metadata instances and metadata categories based on the acquired information related to the information object. One or more of the found metadata instances and metadata categories are associated with the information object maintained in the data store.
    Type: Application
    Filed: November 6, 2007
    Publication date: October 30, 2008
    Applicant: INTERSE A/S
    Inventor: Dan Thomsen
  • Publication number: 20080201371
    Abstract: The present invention is to provide for example a node device that receives content catalog information having attribute information and delivered from a delivery device in an information delivery system, the nodes being mutually communicable through a network and divided into a plurality of groups, the node device including: a new content catalog receiving means for receiving new content catalog information, delivered from the delivery device and having attribute information; a new content catalog saving means; a condition information saving means for saving the grouping condition and presentation time information; a group judgment means for judging on the basis of the grouping condition; a presentation time judgment means; and a content catalog presentation setting means for presenting the new content catalog information after the presentation time arrives.
    Type: Application
    Filed: January 22, 2008
    Publication date: August 21, 2008
    Applicant: BROTHER KOGYO KABUSHIKI KAISHA
    Inventor: Atsushi Murakami
  • Publication number: 20080172403
    Abstract: A method for updating a catalog of hardware device and software object identifiers by identifying unknown identifiers and categorizing each of the unknown identifiers. The method further provides the categorized identifiers to a community of users for review and receives comments from the community of users on the provided categorization. The method further determines if the categorized identifiers should be recategorized based upon the received comments. Another method performs a search for an entity associated with an unknown identifier, determines a likely entity associated with the unknown identifier, and verifies the correctness of such determined likely entity. Another method generates a catalog of computer system components, receives information regarding the identity of a computer system component from at least two different sources, and determines the identity of the computer system component based upon the reputation of the sources of the received information.
    Type: Application
    Filed: January 15, 2007
    Publication date: July 17, 2008
    Applicant: MICROSOFT CORPORATION
    Inventors: Ram P. Papatla, John Leo Ellis, Mario Hewardt, David James Armour
  • Publication number: 20080168054
    Abstract: A method for searching information and displaying search results is disclosed. The method includes: receiving one or more keywords; obtaining search results according to the one or more keywords, the search results comprising one or more documents; confirming at least one cluster name according to the search results, and clustering each of the one or more documents in the search results into a corresponding cluster name; classifying each document in the search results into a corresponding field, and thus obtaining classified search results; generating a cluster diagram according to the at least one cluster name and the clustered documents, and generating a cluster-classification diagram according to the classified search results and the generated cluster diagram; and outputting the generated cluster-classification diagram. A related system is also disclosed.
    Type: Application
    Filed: August 21, 2007
    Publication date: July 10, 2008
    Applicant: HON HAI PRECISION INDUSTRY CO., LTD.
    Inventors: Chung-I Lee, Chien-Fa Yeh, Yao-Huei Sie
  • Publication number: 20080059498
    Abstract: A system and method for facilitating the processing and the use of documents by providing a system for categorizing document section headings under a set of canonical section headings. In the method for categorizing section headings, there may be a process of training a database and matching methods to categorize different but equivalent document section headings under canonical headings and categories. Once trained, the system may match and categorize the document sections with little to no supervision of the categorization for large sets of documents.
    Type: Application
    Filed: September 7, 2007
    Publication date: March 6, 2008
    Applicant: Nuance Communications, Inc.
    Inventors: Alwin CARUS, Melissa MACPHERSON, Stefaan Heyvaert, Cornelia Parkes
  • Publication number: 20080010311
    Abstract: Methods and systems for processing a body of reference material to generate a directory for accessing information from a database.
    Type: Application
    Filed: February 16, 2007
    Publication date: January 10, 2008
    Inventors: Henry Kon, George Burch