Latent Semantic Index or Analysis (LSI or LSA) Patents (Class 707/739)
-
Patent number: 8019765
Abstract: To determine files associated with one or more workflows, a trace of accesses of files in at least one server is received. The files are grouped into at least one set of files, where the files in the set are accessed together more than a predetermined number of times in the trace. Files associated with the particular workflow are identified based on the at least one set.
Type: Grant
Filed: October 29, 2008
Date of Patent: September 13, 2011
Assignee: Hewlett-Packard Development Company, L.P.
Inventors: Anna Povzner, Kimberly Keeton, Marcos K. Aguilera, Arif A. Merchant, Charles B. Morrey, III, Mustafa Uysal
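The grouping step in this abstract can be illustrated with a minimal sketch. It is not the patented method: the abstract does not say what "accessed together" means, so the sketch assumes the trace has already been split into sessions (lists of file names) and that two files are accessed together when they appear in the same session. The name `group_files` and the union-find merging of qualifying pairs are illustrative choices.

```python
from collections import Counter
from itertools import combinations


def group_files(sessions, threshold):
    """Group files that co-occur in more than `threshold` sessions.

    Assumption: `sessions` is a list of lists of file names, each list
    standing for one window of the access trace.
    """
    # Count how many sessions contain each unordered pair of files.
    pair_counts = Counter()
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            pair_counts[(a, b)] += 1

    # Union-find over the files that belong to a qualifying pair.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), count in pair_counts.items():
        if count > threshold:
            parent[find(a)] = find(b)

    # Collect the connected components as sorted lists.
    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), set()).add(name)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])
```

Files that never exceed the threshold with any other file are left out of the result, which matches the abstract's wording that only sets accessed together often enough are formed.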
-
Publication number: 20110219003
Abstract: A method for retrieving information from a document includes a process of grouping paragraphs in the document to form passages, and forming indexes relating to a number of words in the passages. The number of paragraphs in a passage is determined based on the number of paragraphs considered optimum for a writer to cover a particular topic. Passages are formed by merging each N consecutive paragraphs in the document, where N is an integer greater than 1. Thus, individual passages may include paragraphs that are identical to other passages.
Type: Application
Filed: May 16, 2011
Publication date: September 8, 2011
Inventor: Jiandong BI
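The passage-forming step can be read as a sliding window over paragraphs. The sketch below is one such reading, assuming a window that advances one paragraph at a time so that neighbouring passages share paragraphs; the function name `build_passages` and the handling of documents shorter than N are assumptions, not details from the application.

```python
def build_passages(paragraphs, n=2):
    """Merge each run of `n` consecutive paragraphs into one passage.

    Assumption: the window advances by one paragraph, so adjacent
    passages overlap in n - 1 paragraphs.
    """
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    if not paragraphs:
        return []
    if len(paragraphs) < n:
        # Too short for a full window: treat the document as one passage.
        return [" ".join(paragraphs)]
    return [
        " ".join(paragraphs[i:i + n])
        for i in range(len(paragraphs) - n + 1)
    ]
```

With three paragraphs and N = 2, the middle paragraph appears in both resulting passages, which is the overlap the abstract describes.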
-
Publication number: 20110211677
Abstract: An online address book system having sufficient hardware and software to operate an address book user interface and to perform intelligent interpretations of voice and text inputs from users. The system includes at least one server software module that includes software to perform a plurality of functions. These include the ability to receive voice input data and separate user voice queries, wherein the software can arrange the data so as to create a data base that includes at least three access dimensions, including contact access, contact-relationship access and contact-time frame access, and so as to create a connectivity matrix based on a plurality of contact pair relationships applying connective recognition logic. The system provides a voice operated user interface that permits access to address book stored data based on user input selected from the group consisting of contact, a contact-relationship pair, a contact-time frame pair, and combinations thereof.
Type: Application
Filed: May 13, 2011
Publication date: September 1, 2011
Inventor: CHARLES M. BASNER
-
Publication number: 20110208708
Abstract: Systems and methods for finding related terms based on three different sources are disclosed. Generally, a first plurality of distances is determined based on one or more received terms and a first plurality of terms derived from an algorithmic search list. A second plurality of distances is determined based on the one or more received terms and a second plurality of terms derived from a sponsored search list. A third plurality of distances is determined based on the one or more received terms and a third plurality of terms derived from search logs. The first, second, and third pluralities of distances are combined to derive a fourth plurality of distances. Finally, a plurality of related terms related to the one or more received terms is generated based on the fourth plurality of distances.
Type: Application
Filed: February 25, 2010
Publication date: August 25, 2011
Applicant: Yahoo! Inc.
Inventors: Weiguo Liu, Qiong Zhang
-
Patent number: 8005841
Abstract: Methods, systems, and products are disclosed for classifying content segments. A set of annotations is received that occur within a segment of time-varying content. Each annotation is scored to each node in an ontology. The segment is classified based on at least one of the scores.
Type: Grant
Filed: April 28, 2006
Date of Patent: August 23, 2011
Assignee: Qurio Holdings, Inc.
Inventors: Richard J. Walsh, Alfredo C. Issa
-
Publication number: 20110202535
Abstract: A method of identifying a provenance of a document is provided. The method may include obtaining a query document that is included in a document set comprising a plurality of documents. The method may also include grouping the plurality of documents into a plurality of fine clusters based on a textual similarity between the plurality of documents. The method may also include identifying a target fine cluster within the plurality of fine clusters, the target fine cluster including the query document. The method may also include ordering the documents included in the target fine cluster based, at least in part, on metadata associated with each of the documents to identify a source document. The method may also include generating a query response that includes the source document.
Type: Application
Filed: February 13, 2010
Publication date: August 18, 2011
Inventors: Vinay Deolalikar, Hernan Laffitte
-
Patent number: 7996406
Abstract: Method and apparatus for detecting web-based electronic mail in network traffic is described. In some examples, web pages are extracted from the network traffic. Fields in each page of a group of the web pages that share a document structure are identified. A statistical analysis of the fields of each page in the group of web pages is performed to identify any electronic mail (e-mail) fields. The group of web pages is indicated to include web-based e-mail messages if the fields of each page in the group of web pages include at least one e-mail field.
Type: Grant
Filed: September 30, 2008
Date of Patent: August 9, 2011
Assignee: Symantec Corporation
Inventors: Basant Rajan, Chirag Deepak Dalal, Navin Kabra
-
Publication number: 20110191344
Abstract: An automatic organization into topics for a browsing history. In one embodiment, a system identifies groups of browsing actions as related, and clusters the browsing history (e.g. a web browsing history) into sessions based on heuristics used to determine relationships. Latent semantic analysis can be used to determine the relationships which can be considered topics. User interfaces for displaying or otherwise presenting these sessions can include icons representative of topics, and these icons can have different sizes depending on a frequency of web page visits within a topic. The topics can be displayed in time ranges or in a cover flow view or both time ranges and cover flow view.
Type: Application
Filed: February 3, 2010
Publication date: August 4, 2011
Inventors: Jing Jin, Kevin Decker, Timothy Hatcher, Raymond Sepulveda, Michael Thole
-
Publication number: 20110191345
Abstract: An information processing apparatus (5) is provided comprising: a lexicon generation module (22) operable to process a set of documents (1) to identify key words (2) present in the documents; a link generation module (24) operable to generate network data (3) linking documents which share the same or semantically related key words identified by the lexicon generation module; and a network analysis module (26) operable to associate documents with metric values based upon the patterns of connectivity of the network data generated by the link generation module. The metric values associated with documents in the set can be utilized to select documents or groups of associated documents for further processing or indexing.
Type: Application
Filed: January 28, 2011
Publication date: August 4, 2011
Applicant: E-THERAPEUTICS PLC
Inventor: Malcolm P. Young
-
Publication number: 20110184926
Abstract: An expert list recommendation system is provided, including: a domain modeler for establishing an expert knowledge database according to a plurality of expert publications in different domains, receiving an inquired proposal, determining the academic field of the inquired proposal according to keywords of the inquired proposal and keyword sets of the expert publications in different domains stored in the expert knowledge database, and outputting a first domain expert list corresponding to the inquired proposal, wherein the first domain expert list comprises a first group of expert publications and a first group of expert names; and an expertise matcher for receiving the first domain expert list, comparing semantic relatedness between keywords of the inquired proposal and keywords corresponding to the first group of the expert publications of the first domain expert list to output a first expert list to a display device.
Type: Application
Filed: June 25, 2010
Publication date: July 28, 2011
Applicant: NATIONAL TAIWAN UNIVERSITY OF SCIENCE & TECHNOLOGY
Inventors: Hahn-Ming LEE, Jan-Ming HO, Jerome YEH, Kai-Hsiang YANG, Tai-Liang KUO, Chun-Han CHEN
-
Patent number: 7987188
Abstract: A domain-specific sentiment classifier that can be used to score the polarity and magnitude of sentiment expressed by domain-specific documents is created. A domain-independent sentiment lexicon is established and a classifier uses the lexicon to score sentiment of domain-specific documents. Sets of high-sentiment documents having positive and negative polarities are identified. The n-grams within the high-sentiment documents are filtered to remove extremely common n-grams. The filtered n-grams are saved as a domain-specific sentiment lexicon and are used as features in a model. The model is trained using a set of training documents which may be manually or automatically labeled as to their overall sentiment to produce sentiment scores for the n-grams in the domain-specific sentiment lexicon. This lexicon is used by the domain-specific sentiment classifier.
Type: Grant
Filed: August 23, 2007
Date of Patent: July 26, 2011
Assignee: Google Inc.
Inventors: Tyler J. Neylon, Kerry L. Hannan, Ryan T. McDonald, Michael Wells, Jeffrey C. Reynar
-
Publication number: 20110179036
Abstract: Systems and methods are provided for creating abstracted, normalized, and reusable and combinable representations of information contained in multiple documents and information of any supported format, and allowing for exporting of information in any other desired and supported format. Further the system and methods provide for uploading documents based on a known template, where the data members can be automatically recognized and the document stored in normalized format without end-user or developer intervention. Normalization of data is achieved transparently on upload and denormalization performed transparently on download. Further, embodiments provide for the reuse and recombination of data members to create entirely new representations.
Type: Application
Filed: December 16, 2010
Publication date: July 21, 2011
Inventors: Jason Townes French, Auston John Stewart
-
Publication number: 20110179035
Abstract: A visualization-based interactive legal research tool that generates from a multi-dimensional citation network a semantics-constrained citation sub-network that focuses on one individual issue in which a user is interested, and puts the sub-network on an interactive user interface (“UI”), which allows the researcher to browse, navigate, and jump over to start new sub-networks on different issues that are relevant to original issues.
Type: Application
Filed: June 1, 2010
Publication date: July 21, 2011
Applicant: LEXISNEXIS, A DIVISION OF REED ELSEVIER INC.
Inventors: Paul Zhang, Lavanya Koppaka
-
Patent number: 7984041
Abstract: Methods and apparatus provide for a local search indexer to allow for an optimized search within a web server that returns accurate search results while maintaining independent control as to defining search patterns, search prioritization, and updated content available for search. Specifically, the local search indexer organizes content according to a hierarchical directory structure at a web server. The hierarchical directory structure includes at least one directory level that provides at least one directory for storing the content. The local search indexer builds a search index associated with the directory and stores the search index at the web server. The search index is populated with indexed content based on an update of the content stored in the directory. The local search indexer employs a search engine, at the web server, to process search queries against the indexed content to provide a search result that includes the update of the content.
Type: Grant
Filed: July 9, 2007
Date of Patent: July 19, 2011
Assignee: Oracle America, Inc.
Inventor: Yogesh Y Patil
-
Publication number: 20110173201
Abstract: This invention relates to a method and an apparatus for determining a reliability indicator for at least one set of signatures obtained from clinical data collected from a group of samples. The signatures are obtained by detecting characteristics in the clinical data from the group of samples and each of the signatures generates a first set of stratification values that stratify the group of samples. At least one additional and parallel stratification source to the signatures obtained from the group of samples is provided, the at least one additional and parallel stratification source to the signatures being independent from the signatures and generating a second set of stratification values. A comparison is done for each respective sample, where the first stratification values are compared with true reference stratification values, and where the second stratification values are compared with the true reference stratification values.
Type: Application
Filed: September 24, 2009
Publication date: July 14, 2011
Applicant: KONINKLIJKE PHILIPS ELECTRONICS N.V.
Inventors: Angel Janevski, Nilanjana Banerjee, Yasser Alsafadi, Vinay Varadan
-
Publication number: 20110173200
Abstract: An apparatus for authoring data in a communication system includes: an extraction unit configured to receive media corresponding to contents and extract contents information regarding the contents from the received media; a generation unit configured to generate a DMB ECG XML-based metadata comprising the extracted contents information; and a processing unit configured to visualize particulars of the DMB ECG XML-based metadata through a user interface and process the user interface so that the DMB ECG XML-based metadata is generated and edited on a template.
Type: Application
Filed: November 12, 2010
Publication date: July 14, 2011
Applicant: Electronics and Telecommunications Research Institute
Inventors: Seung-Jun YANG, Min-Sik Park, Han-Kyu Lee, Jin-Woo Hong
-
Publication number: 20110167053
Abstract: A system that can analyze a multi-dimensional input thereafter establishing a search query based upon extracted features from the input. In a particular example, an image can be used as an input to a search mechanism. Pattern recognition and image analysis can be applied to the image thereafter establishing a search query that corresponds to features extracted from the image input. The system can also facilitate indexing multi-dimensional searchable items thereby making them available to be retrieved as results to a search query. More particularly, the system can employ text analysis, pattern and/or speech recognition mechanisms to extract features from searchable items. These extracted features can be employed to index the searchable items.
Type: Application
Filed: March 15, 2011
Publication date: July 7, 2011
Applicant: Microsoft Corporation
Inventors: Stephen Lawler, Eric J. Horvitz, Joshua T. Goodman, Anoop Gupta, Christopher A. Meek, Eric D. Brill, Gary W. Flake, Ramez Naam, Surajit Chaudhuri, Oliver Hurst-Hiller
-
Patent number: 7971150
Abstract: A document categorization system, including a clusterer for generating clusters of related electronic documents based on features extracted from the documents, and a filter module for generating a filter on the basis of the clusters to categorize further documents received by the system. The system may include an editor for manually browsing and modifying the clusters. The categorization of the documents is based on n-grams, which are used to determine significant features of the documents. The system includes a trend analyzer for determining trends of changing document categories over time, and for identifying novel clusters. The system may be implemented as a plug-in module for a spreadsheet application for permitting one-off or ongoing analysis of text entries in a worksheet.
Type: Grant
Filed: September 25, 2001
Date of Patent: June 28, 2011
Assignee: Telstra New Wave Pty Ltd.
Inventors: Bhavani Raskutti, Adam Kowalczyk
-
Patent number: 7970767
Abstract: One or more classification algorithms are applied to at least one natural language document in order to extract both attributes and values of a given product. Supervised classification algorithms, semi-supervised classification algorithms, unsupervised classification algorithms or combinations of such classification algorithms may be employed for this purpose. The at least one natural language document may be obtained via a public communication network. Two or more attributes (or two or more values) thus identified may be merged to form one or more attribute phrases or value phrases. Once attributes and values have been extracted in this manner, association or linking operations may be performed to establish attribute-value pairs that are descriptive of the product. In a presently preferred embodiment, an (unsupervised) algorithm is used to generate seed attributes and values which can then support a supervised or semi-supervised classification algorithm.
Type: Grant
Filed: April 30, 2007
Date of Patent: June 28, 2011
Assignee: Accenture Global Services Limited
Inventors: Katharina Probst, Rayid Ghani, Andrew E. Fano, Marko Krema, Yan Liu
-
Publication number: 20110119272
Abstract: Determining a semantic relationship is disclosed. Source content is received. Cluster analysis is performed at least in part by using at least a portion of the source content. At least a portion of a result of the cluster analysis is used to determine the semantic relationship between two or more content elements comprising the source content.
Type: Application
Filed: January 25, 2011
Publication date: May 19, 2011
Applicant: APPLE INC.
Inventors: Philip Andrew Mansfield, Michael Robert Levy, Yuri Khramov, Darryl Will Fuller
-
Patent number: 7945864
Abstract: An operation assisting apparatus includes: an option-function distance storage unit that stores a semantic distance between each of the options displayed on a menu screen and each of functions positioned at an end in the hierarchical structure; an operation history storage unit that stores the operation history of the options sequentially selected by the user; an estimation unit that estimates, based on a semantic distance between a selection option selected by the user and each of the functions, and a semantic distance between an unselected selection option that has been selectable but not selected and each of the functions, a degree of probability that the function is the function desired by the user; and an operational assistance determination unit that determines, based on the result of the estimation, a detail of an output such that functions with higher probability will be presented with higher precedence in selectability.
Type: Grant
Filed: October 29, 2007
Date of Patent: May 17, 2011
Assignee: Panasonic Corporation
Inventors: Tsuyoshi Inoue, Makoto Nishizaki, Satoshi Matsuura
-
Patent number: 7945564
Abstract: A computing system and method receive a query; separate a plurality of information sources into individual elements of content (EOC); tag each EOC with metadata that indicate source, date, and other relevant information; pattern match each EOC; calculate the respective distance function from every EOC to every other EOC; and output EOC to a set of virtual buffers (404) containing appropriately related EOC less than a given distance value. The method further creates virtual summary buffers (406); then concatenates the EOC in each virtual buffer (404); applies a comparative analysis filter (318) to remove redundant sub-elements; and presents the results as summary digests (408).
Type: Grant
Filed: August 14, 2008
Date of Patent: May 17, 2011
Assignee: International Business Machines Corporation
Inventors: Amon Amir, Gal Ashour, Brian K. Blanchard, Matthew Denesuk, Reiner Kraft
-
Patent number: 7945555
Abstract: The present invention provides a method and system for categorizing content published on the Internet. The method comprises gathering one or more feeds associated with the content. The method further comprises extracting contextual information from the one or more feeds. Thereafter, the content is categorized into one or more general web-based categories belonging to a set of general web-based categories. The categorizing step further comprises performing a semantic analysis of the contextual information that yields a keyword string. The content is classified into the one or more general web-based categories based on the keyword string. Finally, the set of general web-based categories is translated to a set of pre-defined categories, such that one or more general web-based categories is translated to a pre-defined category that is relevant to an end user.
Type: Grant
Filed: December 27, 2007
Date of Patent: May 17, 2011
Assignee: Yume, Inc.
Inventors: Ayyappan Sankaran, Jayant Kadambi, Matthew D Shaver
-
Patent number: 7941418
Abstract: A computer-implemented method of generating a dynamic corpus includes generating web threads, based upon corresponding sets of words dequeued from a word queue, to obtain web thread resulting URLs. The web thread resulting URLs are enqueued in a URL queue. Multiple text extraction threads are generated, based upon documents downloaded using URLs dequeued from the URL queue, to obtain text files. New words are randomly obtained from the text files, and the randomly obtained words from the text files are enqueued in the word queue. This process is iteratively performed, resulting in a dynamic corpus.
Type: Grant
Filed: November 9, 2005
Date of Patent: May 10, 2011
Assignee: Microsoft Corporation
Inventor: Carlos Alejandro Arguelles
-
Publication number: 20110106807
Abstract: Described within are systems and methods for disambiguating entities, by generating entity profiles and extracting information from multiple documents to generate a set of entity profiles, determining equivalence within the set of entity profiles using similarity matching algorithms, and integrating the information in the correlated entity profiles. Additionally, described within are systems and methods for representing entities in a document in a Resource Description Framework and leveraging the features to determine the similarity between a plurality of entities. An entity may include a person, place, location, or other entity type.
Type: Application
Filed: November 1, 2010
Publication date: May 5, 2011
Applicant: JANYA, INC
Inventors: Rohini K. Srihari, Harish Srinivasan, Richard Smith, John Chen
-
Publication number: 20110093343
Abstract: Methods and systems are given for representing and generating contents from pre-existing and pre-built contents for a given content. Methods are given for transforming information representation from one medium, type, or language to another medium, type and language. An exemplary embodiment is given for transforming the semantics of a given text or spoken language to a visual representation or combination of them. The systems and methods generate new contents in general and multimedia contents in particular in response to or for representing an input composition utilizing pre-existing and pre-built contents of various types, languages, and forms. The associated client server systems over the communication network are also given for generating contents for the contents given by the clients.
Type: Application
Filed: October 20, 2010
Publication date: April 21, 2011
Inventor: Hamid Hatami-Hanza
-
Publication number: 20110082863
Abstract: A method, apparatus and computer program product provides for a semantic analyzer to produce and rank semantic terms to reflect their relationship to the theme and topics of a document. The text and the document can have no relationship to any pre-selected keywords before the semantic analyzer performs text extraction. The semantic analyzer extracts text from a document and performs semantic analysis on the extracted text. The semantic analyzer provides a plurality of ranked semantic terms as a result of the semantic analysis and associates semantic terms with the document as semantic keywords. The semantic terms define content to be presented with the document where the content is an advertisement, a link to a remote information resource or a second document.
Type: Application
Filed: December 15, 2010
Publication date: April 7, 2011
Applicant: ADOBE SYSTEMS INCORPORATED
Inventors: WALTER CHANG, NADIA GHAMRAWI
-
Publication number: 20110078146
Abstract: Systems and methods for managing electronic data are disclosed. Various data management operations can be performed based on a metabase formed from metadata. Such metadata can be identified from an index of data interactions generated by a journaling module, and obtained from their associated data objects stored in one or more storage devices. In various embodiments, such processing of the index and storing of the metadata can facilitate, for example, enhanced data management operations, enhanced data identification operations, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data. In various embodiments, the metabase can be configured in different ways. For example, the metabase can be stored separately from the data objects so as to allow obtaining of information about the data objects without accessing the data objects or a data structure used by a file system.
Type: Application
Filed: September 20, 2010
Publication date: March 31, 2011
Applicant: CommVault Systems, Inc.
Inventors: Anand Prahlad, Jeremy Alan Schwartz, David Ngo, Brian Brockway, Marcus S. Muller
-
Publication number: 20110077048
Abstract: The invention relates to a system for data correlation, having: a receiving device 1 having an image acquisition element 10 and a data set generator 12 for generating at least one object data set from at least one acquired first image, which represents a physical object, and an identification label, which uniquely determines an object-related acquisition procedure, and at least one information data set from at least one acquired second image, which represents coded information related to the physical object, and the identification label; a correlation device 2 for the extraction 20 of the coded information from the information data set, for the semantic analysis 22 of the extracted information, and for the generation of at least one combination data set from the results of the semantic analysis, the extracted information, and the at least one object data set with the same identification label as the extracted information data set; and a user device 3 for the storage and further use of the combination data
Type: Application
Filed: March 3, 2009
Publication date: March 31, 2011
Applicant: Linguatec Sprachtechnologien GmbH
Inventor: Reinhard Busch
-
Patent number: 7917514
Abstract: A system that can analyze a multi-dimensional input thereafter establishing a search query based upon extracted features from the input. In a particular example, an image can be used as an input to a search mechanism. Pattern recognition and image analysis can be applied to the image thereafter establishing a search query that corresponds to features extracted from the image input. The system can also facilitate indexing multi-dimensional searchable items thereby making them available to be retrieved as results to a search query. More particularly, the system can employ text analysis, pattern and/or speech recognition mechanisms to extract features from searchable items. These extracted features can be employed to index the searchable items.
Type: Grant
Filed: June 28, 2006
Date of Patent: March 29, 2011
Assignee: Microsoft Corporation
Inventors: Stephen Lawler, Eric J. Horvitz, Joshua T. Goodman, Anoop Gupta, Christopher A. Meek, Eric D. Brill, Gary W. Flake, Ramez Naam, Surajit Chaudhuri, Oliver Hurst-Hiller
-
Publication number: 20110072020
Abstract: A method for reverse geocoding location information obtained by a wireless communications device comprises determining the location information for a location, communicating the location information to a reverse geocoding server that reverse-geocodes the location information to generate location description data for a bounding region that geographically surrounds the location, receiving the location description data from the reverse geocoding server for the bounding region containing the location, and caching the location description data for the bounding region in a memory cache on the device. When the current location remains within one or more bounding regions cached on the device, location description data is fetched from the cache, thus improving application responsiveness. Only when the current location is no longer within the bounding region(s) does the device communicate a new request to the reverse geocoding server.
Type: Application
Filed: September 18, 2009
Publication date: March 24, 2011
Applicant: RESEARCH IN MOTION LIMITED
Inventors: Ngoc Bich Ngo, Russell Norman Owen
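The caching behaviour described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: bounding regions are taken to be latitude/longitude rectangles, the server is represented by a caller-supplied function `fetch(lat, lon)` returning a `(BoundingRegion, description)` pair, and the class and attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingRegion:
    """Assumed rectangular region in latitude/longitude degrees."""
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


class ReverseGeocodeCache:
    """Serve descriptions from cached regions; ask the server on a miss."""

    def __init__(self, fetch):
        self._fetch = fetch        # hypothetical stand-in for the server call
        self._entries = []         # list of (BoundingRegion, description)
        self.server_calls = 0

    def describe(self, lat, lon):
        # Cache hit: the location is still inside a cached bounding region.
        for region, description in self._entries:
            if region.contains(lat, lon):
                return description
        # Cache miss: request a new region from the server and cache it.
        region, description = self._fetch(lat, lon)
        self.server_calls += 1
        self._entries.append((region, description))
        return description
```

The `server_calls` counter is only there to make the saving visible: repeated lookups inside one region should leave it unchanged.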
-
Publication number: 20110072021
Abstract: In one embodiment, access a search query comprising one or more query words, at least one of the query words representing one or more query concepts; access a network document identified for a search query by a search engine, the network document comprising one or more document words, at least one of the document words representing one or more document concepts; semantic-text match the search query and the network document to determine one or more negative semantic-text matches; and construct one or more negative features based on the negative semantic-text matches.
Type: Application
Filed: September 21, 2009
Publication date: March 24, 2011
Applicant: YAHOO! INC.
Inventors: Yumao Lu, Lei Duan, Fan Li, Benoit Dumoulin, Xing Wei
-
Patent number: 7912816
Abstract: In one embodiment, input is received from a user defining a classification and an analytic for the classification. Multiple classifications and analytics may be defined by a user. A definition of relevance parameters is determined that characterize the classification and a set of analytics measures associated with the analytic. The definition may be for the classification. Unstructured data and structured data are analyzed based on the definition of the relevance parameters to determine relevant data in the unstructured data and the structured data. The relevant data is data that is determined to be relevant to the classification defined by the user. An index of the terms from the relevant data is determined. The index is useable by an analytics tool to provide results for queries of the unstructured data and structured data. The query may be used within the classification such that targeted results are provided using the index and the relevant data to the classification.
Type: Grant
Filed: April 18, 2008
Date of Patent: March 22, 2011
Assignee: Alumni Data Inc.
Inventors: Aloke Guha, Joan Wrabetz
-
Publication number: 20110066618
Abstract: Methods, apparatuses, and systems are provided to determine a response to a user submitted query based, at least in part, on a relationship between and/or among a plurality of terms of the query.
Type: Application
Filed: September 14, 2009
Publication date: March 17, 2011
Applicant: Yahoo! Inc.
Inventors: Borkur Sigurbjornsson, Vanessa Murdock, Roelof van Zwol, Maarten Clements
-
Publication number: 20110066619
Abstract: Architecture for enabling a user to automatically recover documents and other information associated with work contexts and recover documents and other information artifacts associated with a specific project. The architecture enables monitoring and recording of activity information related to user interactions with information artifacts pertaining to a particular work context. The user can select a document having a portion of work content (e.g., a term or other type of reference item in a document) related to the work context. A lexical analysis is performed on the activity information and the reference item to identify lexical similarities. A list of candidate items (e.g., related documents) is inferred from the information artifacts based on the lexical similarities. The candidate items related to the work context are presented to the user, who can select specific items to reestablish the work context.
Type: Application
Filed: September 16, 2009
Publication date: March 17, 2011
Applicant: Microsoft Corporation
Inventors: George Perantatos, Kuldeep Karnawat, John S. Wana
-
Patent number: 7908253Abstract: Data indexing using polyarchical indexing codes and automatically generated expansion paths. For a piece of data, an indexing code is received relating to a particular categorization or other indexing parameter. Based upon the indexing code, one or more expansion sets of codes are retrieved and applied to the piece of data. The expansion sets of codes may include indexing codes that relate to hierarchical levels of indexing. The expansion sets of codes may also include different expansion paths through the hierarchical levels of indexing. The polyarchical codes may include multiple cross-categorization of the data across the same or different levels of categories. They may also include multiple expansion paths in different directions across hierarchical levels of categories or indexing.Type: GrantFiled: August 7, 2008Date of Patent: March 15, 2011Assignee: Factiva, Inc.Inventors: Jonathan Guy Grenside Cooke, Andrew Richard Young
-
Patent number: 7908171Abstract: The present invention provides an information providing system including an information registration unit capable of registering a front keyword for use in relation to content or content information to be provided to a user terminal and back keywords set in relation to the front keyword, an advertisement registration unit capable of registering advertisement information for use in relation to the back keyword and an information providing unit capable of providing the advertisement information to the user terminal. The advertisement registration unit is capable of selecting specific advertisement information through an auction transaction. The information providing unit is capable of displaying keyword buttons enabling keyword selection in a display screen at the user terminal.Type: GrantFiled: November 13, 2007Date of Patent: March 15, 2011Assignees: Sony Corporation, Plat-Ease CorporationInventors: Kazuhiro Fukuda, Tetsuo Maruyama, Tetsu Sumita
-
Patent number: 7904457Abstract: Improved techniques for flow analysis in messaging systems are disclosed. For example, a method for finding correlations between messages of a system based on content includes the following steps. For one or more executions of the system, obtaining the messages of the system, wherein each message has a schema associated therewith. The messages are categorized into groups, wherein each group has a common schema. Pairs of messages from disparate groups are found wherein, for the messages of a pair, there is a feature in common in their contents.Type: GrantFiled: May 30, 2007Date of Patent: March 8, 2011Assignee: International Business Machines CorporationInventors: Wim De Pauw, Robert L. Hoch, Yi Huang
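The steps in the abstract above (group messages by schema, then look for cross-group pairs whose contents share a feature) can be illustrated with a minimal sketch. This is not the patented method: it assumes, purely for illustration, that a message is a flat dictionary, that its "schema" is its set of field names, and that a "feature in common" means an identical field value. The function name `correlate` is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def correlate(messages):
    """Return (i, j, shared_values) for message pairs from different schema groups."""
    # Assumption: a message's schema is simply its set of field names.
    groups = defaultdict(list)
    for idx, msg in enumerate(messages):
        groups[frozenset(msg)].append(idx)

    pairs = []
    # Compare only messages that belong to different schema groups.
    for (_, g1), (_, g2) in combinations(groups.items(), 2):
        for i in g1:
            for j in g2:
                # Assumption: a "common feature" is an identical field value.
                shared = set(messages[i].values()) & set(messages[j].values())
                if shared:
                    pairs.append((i, j, shared))
    return pairs


messages = [
    {"order_id": "A17", "amount": 30},
    {"ref": "A17", "status": "shipped"},
    {"order_id": "B02", "amount": 12},
]
result = correlate(messages)
print(result)
```

Here the first and second messages fall into different schema groups but share the value "A17", so they are reported as a correlated pair; the third message shares no value with the second and is not paired.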
-
Patent number: 7904458Abstract: The present invention relates to a method and apparatus for optimizing queries. The present invention discloses an efficient method for providing answers to queries under parametric aggregation constraints.Type: GrantFiled: December 26, 2009Date of Patent: March 8, 2011Assignee: AT&T Intellectual Property II, L.P.Inventors: Nikolaos Koudas, Divesh Srivastava, Sudipto Guha, Dimitrios Gunopulos, Michail Vlachos
-
Publication number: 20110047148Abstract: A semantically integrated knowledge retrieval, management, delivery and presentation system.Type: ApplicationFiled: March 26, 2010Publication date: February 24, 2011Inventor: Nosa Omoigui
-
Patent number: 7885973Abstract: A computer method and system generates inquiries. The method and system provide a plurality of templates. Each template outlines a respective inquiry and is associated with one or more semantic types or contexts. Each template has one or more parameters for defining a query instance of the respective inquiry. User input selects a template from the plurality and specifies values for the parameters of the user selected template. Using the user selected template and the user-specified parameter values, an instance of a query is produced. Each template is associated with semantic types during template construction. The semantic types may be based on classes in an ontology. Template construction may include templatizing prior existing or other queries to create respective templates. In application or use of a template, query generation may be during modeling of a certain domain, and the produced query is for information about the certain domain.Type: GrantFiled: February 22, 2008Date of Patent: February 8, 2011Assignee: International Business Machines CorporationInventors: Nishanth R. Sastry, Steven I. Ross, Daniel M. Gruen, Susanne C. Hupfer
-
Publication number: 20110022598Abstract: The disclosed embodiments of computer systems and techniques utilize an ensemble semantics framework to combine knowledge acquisition systems that yield significantly higher quality resources than each system in isolation. Gains in entity extraction are achieved by combining state-of-the-art distributional and pattern-based systems with a large set of features from, for example, a webcrawl, query logs, and wisdom of the crowd sources. This results in improved query interpretation and greater relevancy in providing search results and advertising, for example.Type: ApplicationFiled: July 24, 2009Publication date: January 27, 2011Applicant: YAHOO! INC.Inventors: Marco Pennacchiotti, Patrick Pantel
-
Patent number: 7877349Abstract: Methods and apparatus to evaluate the semantic proximity between a reference free-form text entry and a candidate free-form text request.Type: GrantFiled: April 1, 2010Date of Patent: January 25, 2011Assignee: Microsoft CorporationInventors: Francois Huet, Gray Salmon Norton
-
Patent number: 7877388Abstract: A method (and system) for clustering a plurality of items. Each of the items includes information. The method includes inputting a plurality of items. The items are provided into a clustering process. The method also inputs an initial organization structure into the clustering process. The initial organization structure includes one or more categories, at least one of the categories being associated with one of the items. The method processes the plurality of items based upon at least the initial organization structure and the information in each of the items; and determines a resulting organization structure based upon the processing. The resulting organization structure relates to the initial organization structure.Type: GrantFiled: October 31, 2007Date of Patent: January 25, 2011Assignee: Stratify, Inc.Inventors: John O. Lamping, Ramana Venkata, Shashidhar Thakur, Samdeer Siruguri
-
Patent number: 7873640Abstract: A method, apparatus and computer program product provides for a semantic analyzer to produce and rank semantic terms to reflect their relationship to the theme and topics of a document. The text and the document can have no relationship to any pre-selected keywords before the semantic analyzer performs text extraction. The semantic analyzer extracts text from a document and performs semantic analysis on the extracted text. The semantic analyzer provides a plurality of ranked semantic terms as a result of the semantic analysis and associates semantic terms with the document as semantic keywords. The semantic terms define content to be presented with the document where the content is an advertisement, a link to a remote information resource or a second document.Type: GrantFiled: March 27, 2007Date of Patent: January 18, 2011Assignee: Adobe Systems IncorporatedInventors: Walter Chang, Nadia Ghamrawi
-
Patent number: 7870124Abstract: Techniques for processing reference-based SQL/XML operators are provided. Instead of extracting copies of one or more nodes from XML data, a reference-based operator returns a reference to a node. Such a reference is used to determine, for example, whether the corresponding node comes logically before, after, or is the same as another node. An SQL/XML query that includes a reference-based operator may be the original query, or may be generated (e.g., rewritten) from a non-SQL/XML query, such as an XQuery query. One or more physical rewrites may be performed on the SQL/XML query, depending on how the XML data is stored and/or whether an XML index exists for the XML data.Type: GrantFiled: December 13, 2007Date of Patent: January 11, 2011Assignee: Oracle International CorporationInventors: Zhen Hua Liu, Hui Joe Chang, James W. Warner
-
Publication number: 20110004595Abstract: According to embodiments, a diagnostic report search supporting apparatus and a diagnostic report searching apparatus each have a report registering part, a structuring processing part, a related-term analyzing part, a counting part, and a keyword extracting part. The structuring processing part extracts terms from a sentence written in a diagnostic report, and classifies the terms into predetermined kinds. The related-term analyzing part generates combinations each composed of two or more terms based on the plurality of terms having been extracted. The counting part counts the number of occurrences of identical combinations among the plurality of combinations, and extracts combinations whose occurrence counts are a predetermined number or more. The keyword extracting part extracts a combination including a desired keyword, and extracts a term other than the desired keyword as a related keyword.Type: ApplicationFiled: June 23, 2010Publication date: January 6, 2011Applicants: Kabushiki Kaisha Toshiba, TOSHIBA MEDICAL SYSTEMS CORPORATIONInventors: Hiromasa YAMAGISHI, Hikaru Futami, Kenichi Niwa
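The counting and keyword-extraction steps described in the abstract above can be sketched roughly as follows. This is an illustrative approximation, not the apparatus itself: it assumes terms have already been extracted into one list per report, restricts combinations to pairs, and uses a hypothetical function name `related_keywords` and a hypothetical threshold parameter `min_count`.

```python
from collections import Counter
from itertools import combinations


def related_keywords(reports, keyword, min_count=2):
    """Return terms that co-occur with `keyword` in at least `min_count` reports."""
    counts = Counter()
    for terms in reports:
        # Count each unordered term pair at most once per report.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1

    # Keep only combinations that reach the predetermined count.
    frequent = [pair for pair, n in counts.items() if n >= min_count]

    # From combinations containing the keyword, collect the other term.
    related = set()
    for pair in frequent:
        if keyword in pair:
            related.update(t for t in pair if t != keyword)
    return sorted(related)


# Hypothetical, already-extracted terms from four reports.
reports = [
    ["lung", "nodule", "ct"],
    ["lung", "nodule"],
    ["liver", "cyst", "ct"],
    ["lung", "ct"],
]
print(related_keywords(reports, "lung"))  # ['ct', 'nodule']
```

With the sample data, "lung" co-occurs twice with both "ct" and "nodule", so both are returned as related keywords, while "liver" has no pair that reaches the threshold.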
-
Patent number: 7860867Abstract: An information managing system includes a parameter setting unit for setting a parameter representative of an attribute of a user and information to be retrieved, and an information relevance space generator for generating an information relevance space representative of information indicating a relevance between the user and the information to be retrieved, based on the parameter set by the parameter setting unit.Type: GrantFiled: December 20, 2006Date of Patent: December 28, 2010Assignee: NEC CorporationInventors: Masaki Kan, Junichi Yamato, Yuji Kaneko, Yoshihiro Kajiki
-
Publication number: 20100312767Abstract: Provided is an information process apparatus including: an extraction unit which is configured to extract words in a predetermined word class from comments which predetermined users write about a predetermined item; a grouping unit which is configured to group the predetermined users by performing a multivariate analysis using the words extracted by the extraction unit; a storage unit which is configured to store the groups, the predetermined item, and the words in association with each other; a determination unit which is configured to determine which group a user who is to write a comment belongs to when the user is to write the comment about the predetermined item; and a reading unit which is configured to read from the storage unit words which are associated with the group determined by the determination unit and the predetermined item which the comment is to be written about.Type: ApplicationFiled: May 14, 2010Publication date: December 9, 2010Inventor: Mari SAITO
-
Publication number: 20100293166Abstract: The present invention discloses methods, systems, and tools for unified semantic ranking of compositions of ontological subjects. The method breaks a composition into a plurality of partitions as well as its constituent ontological subjects of different orders and builds a participation matrix indicating the participation of ontological subjects of the composition in other ontological subjects, i.e. the partitions, of the composition. Using the participation information of the ontological subjects (OSs) in each other, a similarity matrix is built from which the semantic importance ranks of the partitions of the composition are calculated. The method systematically enables the calculation of the semantic ranks of ontological subjects of different orders of the composition. Various systems for implementing the method and numerous applications and services are disclosed.Type: ApplicationFiled: April 7, 2010Publication date: November 18, 2010Applicant: Hamid Hatami-HanzaInventor: Hamid Hatami-Hanza