Latent Semantic Index or Analysis (LSI or LSA) Patents (Class 707/739)

ONTOLOGICAL INFORMATION RETRIEVAL SYSTEM

Publication number: 20120124051

Abstract: An ontological information retrieval system is provided. According to an embodiment, the subject ontological information retrieval system can be utilized for computer-aided clinical Traditional Chinese Medicine (TCM) practice. In one implementation, a graphical user interface (GUI) is provided, enabling a user to input a query with symptoms determined from a patient, and the system's parser can find instances of the symptoms in a document object model (DOM) tree of the TCM ontological information. Diagnosis based upon the symptoms can be communicated to the user through the GUI. A relevance index (RI) and/or a frequency index (FI) can be further provided for evaluating a diagnosis by comparing the symptoms determined from a patient with the expected symptoms of the diagnosed illness and returning a value based on the number of matched symptoms, or a weighted index of matched symptoms.
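
The comparison of observed and expected symptoms described in this abstract can be illustrated with a minimal sketch. The function name `relevance_index`, the normalization by the expected symptom set, and the default weight of 1.0 are assumptions made for illustration; the abstract only states that the value is based on the number of matched symptoms or on a weighted index of them.

```python
def relevance_index(observed, expected, weights=None):
    """Score a diagnosis by how many of its expected symptoms were observed.

    Hypothetical helper: dividing by the size (or total weight) of the
    expected set is an assumption, not something the abstract specifies.
    """
    observed = set(observed)
    expected = set(expected)
    if not expected:
        return 0.0
    matched = observed & expected
    if weights is None:
        # Unweighted case: fraction of expected symptoms that matched.
        return len(matched) / len(expected)
    # Weighted case: symptoms missing from `weights` count as 1.0.
    total = sum(weights.get(s, 1.0) for s in expected)
    score = sum(weights.get(s, 1.0) for s in matched)
    return score / total if total else 0.0
```

Called with two observed symptoms out of four expected ones, the unweighted form returns 0.5; supplying a `weights` mapping lets more indicative symptoms count for more.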

Type: Application

Filed: July 29, 2010

Publication date: May 17, 2012

Inventors: Wilfred Wan Kei Lin, Allan Kang Ying Wong, Jackei Ho Kei Wong, Jewels Chun Wing Kong
Generating action trails from web history

Patent number: 8180778

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating action trails from web history are described. In one aspect, a method includes receiving a web content access history of a user, the content access history including one or more user actions, each user action being associated with a content item upon which the user action is performed, and identifying one or more action trails from the content access history, each action trail including a sequence of user actions performed on content items relating to a topic. Identifying a particular action trail includes clustering the user actions into a series of segments using temporal criteria; calculating semantic similarities between the content items; and adding a segment of the series of segments to the action trail when the semantic similarities between the segment and another segment satisfy a similarity threshold.
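
The two stages named in this abstract, temporal segmentation followed by similarity-based assembly of a trail, might be sketched as follows. Everything concrete here is an assumption for illustration: the `Action` record, the 300-second gap, the use of Jaccard overlap of term sets as the semantic similarity, and the 0.3 threshold. The abstract does not say which temporal criteria or similarity measure are used.

```python
from dataclasses import dataclass


@dataclass
class Action:
    timestamp: float   # seconds since some epoch
    terms: frozenset   # hypothetical bag of terms describing the content item


def segment_by_time(actions, max_gap=300.0):
    """Cluster actions into segments; a gap longer than max_gap starts a new one."""
    segments = []
    for action in sorted(actions, key=lambda a: a.timestamp):
        if segments and action.timestamp - segments[-1][-1].timestamp <= max_gap:
            segments[-1].append(action)
        else:
            segments.append([action])
    return segments


def segment_terms(segment):
    """Union of the term sets of every action in a segment."""
    return frozenset().union(*(a.terms for a in segment))


def jaccard(a, b):
    """Stand-in similarity measure: overlap of two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_trail(segments, seed_index=0, threshold=0.3):
    """Start from one segment and add every segment similar to one already in the trail."""
    trail = [segments[seed_index]]
    for i, segment in enumerate(segments):
        if i == seed_index:
            continue
        terms = segment_terms(segment)
        if any(jaccard(terms, segment_terms(t)) >= threshold for t in trail):
            trail.append(segment)
    return trail
```

A single forward pass is used for brevity; a segment that only becomes similar to the trail after a later segment is added would be missed, so a fuller version would repeat the pass until nothing changes.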

Type: Grant

Filed: February 5, 2010

Date of Patent: May 15, 2012

Assignee: Google Inc.

Inventors: Elin Pedersen, Karl A. Gyllstrom, Shengyin Gu, Peter Jin Hong
Method and system to compare data objects

Patent number: 8180777

Abstract: The present invention relates in general to methods and systems for comparing and maximizing the optimal selection of a first set of one or more data objects to a set of second data objects. In one embodiment, the first set of data objects represent one or more tasks to be fulfilled by a set of capabilities represented by the second data objects. In one embodiment, methods and systems are provided that apply topic modeling and similarity metrics to determine the optimal selection. In one embodiment, methods and systems are provided to determine the appropriateness of a set of second data objects to satisfy the requirements of a first data object given interaction attributes. Embodiments may be used to compare mission requirements with potential team members to determine the appropriateness of team members and teams for a given mission based on interaction attributes of the team members and teams.

Type: Grant

Filed: October 24, 2010

Date of Patent: May 15, 2012

Assignee: Aptima, Inc.

Inventors: Andrew Duchon, Kari Kelton, Pacey Foster, Kara Orvis, Robert McCormack
ADAPTIVE MULTIMEDIA SEMANTIC CONCEPT CLASSIFIER

Publication number: 20120109964

Abstract: A method of classifying a set of semantic concepts on a second multimedia collection based upon adapting a set of semantic concept classifiers and updating concept affinity relations that were developed to classify the set of semantic concepts for a first multimedia collection. The method comprises providing the second multimedia collection from a different domain and a processor automatically classifying the semantic concepts from the second multimedia collection by adapting the semantic concept classifiers and updating the concept affinity relations to the second multimedia collection based upon the local smoothness over the concept affinity relations and the local smoothness over data affinity relations.

Type: Application

Filed: October 27, 2010

Publication date: May 3, 2012

Inventors: Wei Jiang, Alexander C. Loui
Density-based data clustering method

Patent number: 8171025

Abstract: A density-based data clustering method, comprising a parameter-setting step for setting a scanning radius and a minimum threshold value, a dividing step for dividing a space of a plurality of data points according to the scanning radius, a data-retrieving step for retrieving one data point out of the plurality of data points as a core data point, a searching step for calculating a distance between the core data point and each of the query points, a grouping determination step for determining whether a number of the neighboring points is smaller than the minimum threshold value.

Type: Grant

Filed: January 6, 2010

Date of Patent: May 1, 2012

Assignee: National Pingtung University Of Science & Technology

Inventors: Cheng-Fa Tsai, Chien-Tsung Wu
System and method for matching and assembling records

Patent number: 8166033

Abstract: A system and method for matching and assembling records is provided. One embodiment of the invention assembles records by applying a method for grouping records based on matching fields, assembling a new record as a composite of the matched records, and then repeating the grouping, matching and assembling steps in a cascade where the matching grouping and assembling steps are modified as a function of the cascade step and the assembled records created in earlier steps.

Type: Grant

Filed: February 27, 2003

Date of Patent: April 24, 2012

Assignee: Parity Computing, Inc.

Inventors: Zunaid H. Kazi, Christopher D. Rosin, Ramamohan Paturi, Holden P. Robbins, Mark W. S. Land
Method and apparatus for processing metadata

Patent number: 8156123

Abstract: Methods and apparatuses for processing metadata are described herein. In one embodiment, when a file (e.g., a text, audio, and/or image files) having metadata is received, the metadata and optionally at least a portion of the content of the file are extracted from the file to generate a first set of metadata. An analysis is performed on the extracted metadata and the content to generate a second set of metadata, which may include metadata in addition to the first set of metadata. The second set of metadata may be stored in a database suitable to be searched to identify or locate the file. Other methods and apparatuses are also described.

Type: Grant

Filed: April 22, 2005

Date of Patent: April 10, 2012

Assignee: Apple Inc.

Inventors: Guy L. Tribble, Yan Arrouye, Dominic Giampaolo
Semantic Grouping for Program Performance Data Analysis

Publication number: 20120072423

Abstract: Particular portions of program execution data are specified and organized in semantic groups. A grouping expression written in a transformation syntax language specifies a pattern and a replacement, for grouping performance data samples. An exception to the pattern can also be specified. In response to the grouping expression, a cost accounting shows groups and their costs. The grouping expression may operate on names and/or name-associated characteristics such as private/public status, author, directory, and the like. Samples may represent nodes in a directed acyclic graph memorializing call stacks or memory allocation. Grouping expressions are used to group nodes and consolidate costs by various procedures when making modified sample stacks: clustering-by-name, entry-group-clustering, folding-by-name, and folding-by-cost. An entry group clustering shows at least one entry point name while avoiding unwanted detail.

Type: Application

Filed: September 20, 2010

Publication date: March 22, 2012

Applicant: MICROSOFT CORPORATION

Inventors: Vance Morrison, Joshua Ryan Williams
Personalization engine for building a user profile

Patent number: 8140515

Abstract: Users of electronic documents are classified for profiling and targeting of additional relevant content. Behavioral data is gathered from user registration information and user activity, and user documents and actions are categorized. Registration information is combined with collaborative and editorial data to provide user profile information. Author-generated document classification information is analyzed and assigned a first taxonomic noun to characterize the document. User-generated tags characterizing a portion of the document are assigned a second taxonomic noun. Search terms that resulted in the user accessing the document are identified and assigned a third taxonomic noun. Attributes related to how the document was accessed are evaluated and assigned a fourth taxonomic noun. The document is processed using pattern rules to extract a fifth taxonomic noun.

Type: Grant

Filed: October 28, 2009

Date of Patent: March 20, 2012

Assignee: CBS Interactive Inc.

Inventors: Tushar Pradhan, Thomas Osborne, John Potter
Systems and methods for using metadata to enhance data identification operations

Patent number: 8131725

Abstract: Systems and methods for managing electronic data are disclosed. Various data management operations can be performed based on a metabase formed from metadata. Such metadata can be identified from an index of data interactions generated by a journaling module, and obtained from their associated data objects stored in one or more storage devices. In various embodiments, such processing of the index and storing of the metadata can facilitate, for example, enhanced data management operations, enhanced data identification operations, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data. In various embodiments, the metabase can be configured in different ways. For example, the metabase can be stored separately from the data objects so as to allow obtaining of information about the data objects without accessing the data objects or a data structure used by a file system.

Type: Grant

Filed: September 20, 2010

Date of Patent: March 6, 2012

Assignee: CommVault Systems, Inc.

Inventors: Anand Prahlad, Jeremy Alan Schwartz, David Ngo, Brian Brockway, Marcus S. Muller
Adaptive archive data management

Patent number: 8131684

Abstract: In one embodiment, input is received from a user defining a classification and an analytic for the classification. Multiple classifications and analytics may be defined by a user. A definition of relevance parameters is determined that characterize the classification and a set of analytics measures associated with the analytic. The definition may be for the classification. Unstructured data and structured data are analyzed based on the definition of the relevance parameters to determine relevant data in the unstructured data and the structured data. The relevant data being data that is determined to be relevant to the classification defined by the user. An index of the terms from the relevant data is determined. The index is useable by an analytics tool to provide results for queries of the unstructured data and structured data. The query may be used within the classification such that targeted results are provided using the index and the relevant data to the classification.

Type: Grant

Filed: March 21, 2011

Date of Patent: March 6, 2012

Assignee: Aumni Data Inc.

Inventors: Joan Wrabetz, Aloke Guha
Managing Information

Publication number: 20120054185

Abstract: The different illustrative embodiments provide a method, a computer program product, and an apparatus for managing information. A request to store text in a table in a database is received. A determination is made as to whether a first collection of textual information having a first concept that is related to a second concept for the text is present in the database responsive to receiving the request containing the text. The text is associated with the first collection of textual information in the database responsive to a determination that the first collection of textual information in the database having the first concept that is related to the second concept for the text is present in the database. A second collection for the data with a third concept that is related to the second concept for the text within the degree of relatedness is created.

Type: Application

Filed: August 31, 2010

Publication date: March 1, 2012

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Sandra K. Johnson, Grant D. Miller, Robert F. Pryor
TEXT MINING OF MICROBLOGS USING LATENT TOPIC LABELS

Publication number: 20120041953

Abstract: A latent topic labels text mining system and method to mine and analyze the content of textual data. Embodiments of the system and method are particularly well suited for use on microblog data to help people identify posts they want to read and to find people that they want to follow. Embodiments of the system and method use a modified Labeled LDA technique (called an L+LDA technique) that analyzes content using a combination of labeled and latent topics. The resultant data is assigned one of four labels to generate a lower-dimensional representation of the data than the individual words in a microblog post. This learned topic representation is used to characterize, summarize, filter, find, suggest, and compare the content of microblog posts. Embodiments of the system and method also include visualization techniques such as a tag cloud visualization that is used to visualize microblogging data.

Type: Application

Filed: August 16, 2010

Publication date: February 16, 2012

Applicant: Microsoft Corporation

Inventors: Susan Theresa Dumais, Daniel Ramage, Daniel John Liebling, Steven Mark Drucker
Technique for enhancing a set of website bookmarks by finding related bookmarks based on a latent similarity metric

Patent number: 8117205

Abstract: A method and system for enhancing the quality of a bookmark or a set of bookmarks that have been organized by topic and contain information related to that topic. The method and system analyzes documents accessible by the bookmark or set of bookmarks and performs a search using key terms from that analysis in a vector called a latent similarity metric. The terms that result from this search are preferably ranked in a hierarchy or the like and utilized in a subsequent search to locate and rank additional related documents.

Type: Grant

Filed: July 8, 2008

Date of Patent: February 14, 2012

Assignee: International Business Machines Corporation

Inventor: Michael D. Rychener
Semantic and text matching techniques for network search

Patent number: 8112436

Abstract: In one embodiment, access a search query comprising one or more query words, at least one of the query words representing one or more query concepts; access a network document identified for a search query by a search engine, the network document comprising one or more document words, at least one of the document words representing one or more document concepts; semantic-text match the search query and the network document to determine one or more negative semantic-text matches; and construct one or more negative features based on the negative semantic-text matches.

Type: Grant

Filed: September 21, 2009

Date of Patent: February 7, 2012

Assignee: Yahoo! Inc.

Inventors: Yumao Lu, Lei Duan, Fan Li, Benoit Dumoulin, Xing Wei
Auto-summary generator and filter

Patent number: 8108398

Abstract: A system that facilitates data presentation and management includes at least one database to store a corpus of data relating to one or more topics. The system further includes a summarizer component to automatically determine a subset of the data over the corpus of data relating to at least one of the topic(s), wherein the subset forms a summary of at least one topic.

Type: Grant

Filed: June 29, 2007

Date of Patent: January 31, 2012

Assignee: Microsoft Corporation

Inventors: Shai Guday, Bret P. O'Rourke, John Mark Miller, James Morris Alkove, Andrew David Wilson
Information recommendation device and information recommendation method

Patent number: 8108376

Abstract: A document set, and history documents including documents, etc., browsed by a user are input. The document set and the history documents are each analyzed to obtain characteristic vectors. A plurality of topic clusters and a plurality of sub-topic clusters are obtained by clustering the document set. A transition structure showing transitions of topics among the sub-topic clusters is generated, and a characteristic attribute is extracted from each topic cluster and each sub-topic cluster. A cluster-of-interest is extracted in comparison among characteristic vectors of the history documents and a characteristic vector of each document included in the document set, a sub-topic cluster having transition relations with the cluster-of-interest is obtained on the basis of a transition structure owned by the cluster-of-interest, and a document included in the sub-topic cluster is extracted as a recommended document to be presented together with the characteristic attribute.

Type: Grant

Filed: March 20, 2009

Date of Patent: January 31, 2012

Assignee: Kabushiki Kaisha Toshiba

Inventors: Masayuki Okamoto, Masaaki Kikuchi
Generation of Annotation Tags Based on Multimodal Metadata and Structured Semantic Descriptors

Publication number: 20120023103

Abstract: In one embodiment, a method of generating annotation tags (28) for a digital image (22) includes maintaining a library (16) of human-meaningful words or phrases organized as category entries (72) according to a number of defined image description categories (70), and receiving context metadata (20) associated with the capture of a given digital image (22). The method further includes selecting particular category entries (72-1, 72-2) as vocabulary metadata (24) for the digital image (22) by mapping the context metadata (20) into the library (16), and generating annotation tags (28) for the digital image (22) by logically combining the vocabulary metadata (24) according to a defined set of deductive logic rules (30) that are predicated on the defined image description categories (70). In another embodiment, a processing apparatus (12), such as a digital processor (18, 26) and supporting memory (14), etc., is configured to carry out the above method, or to carry out variations of the above method.

Type: Application

Filed: January 21, 2009

Publication date: January 26, 2012

Applicant: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)

Inventors: Joakim Soderberg, Jonas Bjork, Andreas Fasbender
UNSUPERVISED DOCUMENT CLUSTERING USING LATENT SEMANTIC DENSITY ANALYSIS

Publication number: 20120011124

Abstract: According to one embodiment, a latent semantic mapping (LSM) space is generated from a collection of a plurality of documents, where the LSM space includes a plurality of document vectors, each representing one of the documents in the collection. For each of the document vectors considered as a centroid document vector, a group of document vectors is identified in the LSM space that are within a predetermined hypersphere diameter from the centroid document vector. As a result, multiple groups of document vectors are formed. The predetermined hypersphere diameter represents a predetermined closeness measure among the document vectors in the LSM space. Thereafter, a group from the plurality of groups is designated as a cluster of document vectors, where the designated group contains a maximum number of document vectors among the plurality of groups.
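
The group-selection step in this abstract can be sketched in a few lines, assuming the document vectors are already available as coordinate tuples. The function name `densest_group`, the use of Euclidean distance as the closeness measure, and treating the predetermined hypersphere size as a plain distance threshold are all assumptions; building the LSM space itself is not shown.

```python
import math


def densest_group(vectors, radius):
    """Treat each vector in turn as a centroid and keep the largest group.

    A group is every vector within `radius` of the centroid (the centroid
    itself included). Ties keep the first group found.
    """
    best = []
    for center in vectors:
        group = [v for v in vectors if math.dist(center, v) <= radius]
        if len(group) > len(best):
            best = group
    return best
```

This brute-force version compares every pair of vectors, so it is quadratic in the number of documents; it is meant only to show which group gets designated as the cluster.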

Type: Application

Filed: July 7, 2010

Publication date: January 12, 2012

Applicant: APPLE INC.

Inventor: Jerome R. Bellegarda
Methods for Enabling a Scalable Transformation of Diverse Data into Hypotheses, Models and Dynamic Simulations to Drive the Discovery of New Knowledge

Publication number: 20120004893

Abstract: The present invention relates to a method for the automatic identification of at least one informative data filter from a data set that can be used to identify at least one relevant data subset against a target feature for subsequent hypothesis generation, model building and model testing. The present invention describes methods, and an initial implementation, for efficiently linking relevant data both within and across multiple domains and identifying informative statistical relationships across this data that can be integrated into agent-based models. The relationships, encoded by the agents, can then drive emergent behavior across the global system that is described in the integrated data environment.

Type: Application

Filed: September 10, 2009

Publication date: January 5, 2012

Applicant: QUANTUM LEAP RESEARCH, INC.

Inventors: Akhileswar Ganesh VAIDYANATHAN, Stephen D. PRIOR, Jijun Wang, Bin Yu
Document management system and method

Patent number: 8090743

Abstract: Provided are a document management system and method. The document management system including a database storing documents and a document classification unit for automatically classifying the documents stored in the database, wherein the document classification unit comprises a feature extraction module extracting features based on a keyword included in the documents and vectorizing the extracted features, a similarity judgment module judging similarity among the documents using vectors formed by the feature extraction module, and a classification system module classifying the documents stored in the database according to a preset classification system, the document classification unit performing document classification according to the classification system with respect to documents provided to the database.

Type: Grant

Filed: January 10, 2007

Date of Patent: January 3, 2012

Assignee: LG Electronics Inc.

Inventors: Wan Kyu Cha, Jeong Joong Kim, Han Joon Ahn
MULTI-FACET CLASSIFICATION SCHEME FOR CATALOGING OF INFORMATION ARTIFACTS

Publication number: 20110320454

Abstract: A system and method for constructing a hierarchical multi-faceted classification structure includes organizing a plurality of visual categories into a multi-relational reference ontology that accounts for a plurality of different types of relationships. Media artifacts are categorized into the plurality of visual categories. The categories of artifacts are refined based on faceted ontology relationships or constraints from the multi-relational reference ontology. The multi-relational reference ontology and the one or more media artifacts with relationships are stored as the hierarchical multi-faceted classification structure in computer readable memory storage.

Type: Application

Filed: June 29, 2010

Publication date: December 29, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: MATTHEW HILL, JOHN R. KENDER, APOSTOL NATSEV, QUOC-BAO NGUYEN, JOHN R. SMITH, JELENA TESIC, LEXING XIE, RONG YAN
Graph caching

Patent number: 8086609

Abstract: In a method and apparatus for analyzing nodes of a Deterministic Finite Automata (DFA), an accessibility ranking, based on a DFA graph geometrical configuration, may be determined in order to determine cacheable portions of the DFA graph in order to reduce the number of external memory accesses. A walker process may be configured to walk the graph in a graph cache as well as main memory. The graph may be generated in a manner allowing each arc to include information if the node it is pointing to is stored in the graph cache or in main memory. The walker may use this information to determine whether or not to access the next arc in the graph cache or in main memory.

Type: Grant

Filed: November 1, 2007

Date of Patent: December 27, 2011

Assignee: Cavium, Inc.

Inventors: Rajan Goyal, Muhammad Raghib Hussain, Trent Parker
Tag suggestions based on item metadata

Patent number: 8086504

Abstract: Tag suggestions enable a hosting entity such as a website to determine one or more tags to suggest to a user for association with a particular item within an electronic catalog. After this determination, the hosting entity may suggest the determined tags to the user. To determine these tags, the hosting entity may employ techniques to determine items related to the particular item. The hosting entity then suggests some or all of the tags associated with the related items. Additionally or alternatively, the hosting entity may determine certain metadata associated with the particular item. The entity then may suggest this metadata, or some related phrase or tag, to the user for association with the particular item. However the tag suggestions are determined, the hosting entity may rank the tag suggestions to determine which tags to present to the user or to determine an order in which to present the tags.

Type: Grant

Filed: September 6, 2007

Date of Patent: December 27, 2011

Assignee: Amazon Technologies, Inc.

Inventors: Russell A. Dicker, Waqas Ahmed, Aaron D. Wilson, Scott Allen Mongrain, Florin V. Manolache, Valentin Radu Munteanu, Val Dan Dar Ion I. Rosca, Corneliu Gabriel Alexandru Rudeanu
K engine - process count after build in threads

Publication number: 20110314022

Abstract: In a KStore having a plurality of K nodes with count fields a method for updating count fields, receiving a particle to provide a received particle, updating selected node counts of the plurality of nodes counts in response to the received particle to provide first updated K node count fields, and saving selected K node count fields for later updating to provide second updated count fields are recited. The K nodes include elemental root nodes and the second updated K node count fields include elemental root nodes of the plurality of elemental root nodes. The second updated K node count fields include only elemental root nodes of the plurality of elemental root nodes. The first updated K node count fields include no elemental root nodes. The second updated K node count fields include K nodes pointed to by the Result pointers of the first updated K node count fields.

Type: Application

Filed: June 8, 2006

Publication date: December 22, 2011

Applicant: Unisys Corporation

Inventors: Jane Campbell Mazzagatti, Steven L. Rajcan, Robert R. Buckwalter
Method and system for document classification based on document structure and written style

Patent number: 8082248

Abstract: A classification method and system for documents containing text sentences and images having meta-data. The classification method and system categorizes document sentences into subjective and non-subjective sentences and categorizes document images into descriptive and non-descriptive. The categorization is further used to calculate subjectivity and descriptive-images classification of a document. This classification system can be used by a web search engine to filter, sort or tag a set of document references based on user selection.

Type: Grant

Filed: May 29, 2008

Date of Patent: December 20, 2011

Inventor: Rania Abouyounes
GRAPHICAL MODELS FOR REPRESENTING TEXT DOCUMENTS FOR COMPUTER ANALYSIS

Publication number: 20110302168

Abstract: In a method for representing a text document with a graphical model, a document including a plurality of ordered words is received and a graph data structure for the document is created. The graph data structure includes a plurality of nodes and edges, with each node representing a distinct word in the document and each edge identifying a number of times two nodes occur within a predetermined distance from each other. The graph data structure is stored in an information repository.
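
As a rough illustration of the graph described in this abstract, the sketch below builds one node per distinct word and counts, for each pair of distinct words, how often they occur within a fixed distance of each other. The function name `build_word_graph`, the default window of 2 positions, and the choice of undirected edges keyed by sorted word pairs are assumptions made for the example.

```python
from collections import Counter


def build_word_graph(words, window=2):
    """Return (nodes, edges) for an ordered list of words.

    nodes: set of distinct words.
    edges: Counter mapping a sorted word pair to the number of times the
           two words occur within `window` positions of each other.
    """
    nodes = set(words)
    edges = Counter()
    for i, word in enumerate(words):
        # Look ahead only, so each co-occurrence is counted once.
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != word:
                edges[tuple(sorted((word, words[j])))] += 1
    return nodes, edges
```

For the word sequence "the cat sat on the mat" with a window of 2, the pair ("sat", "the") receives a count of 2 because "sat" falls within two positions of both occurrences of "the".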

Type: Application

Filed: June 8, 2010

Publication date: December 8, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventor: Charu Aggarwal
PRESENTING SUPPLEMENTAL CONTENT IN CONTEXT

Publication number: 20110302152

Abstract: Techniques that may be used for detecting a primary content (e.g., a web page) that the user is viewing and presenting one or more pieces of supplemental content (e.g., social media data) together with the primary content. The supplemental content presented to the user together with the primary content may be content that is matched to the primary content and therefore detected to be relevant to the user. Detection of primary content and matching to supplemental content may be carried out based on a comparison of entities related to the primary and supplemental content. In some embodiments, an analysis of the primary content for entities may include ordering entities according to significance in the primary content and selecting top entities for comparison. Also, in some embodiments, multiple pieces of supplemental content may be displayed to a user categorized based on entity.

Type: Application

Filed: June 7, 2010

Publication date: December 8, 2011

Applicant: Microsoft Corporation

Inventors: danah boyd, Gilad Lotan, Paul Oka, Emre Mehmet Kiciman, Chun-Kai Wang
Content searching device and content searching method

Patent number: 8073851

Abstract: To provide a content searching device which can efficiently present to the user a topical related keyword.

Type: Grant

Filed: March 2, 2009

Date of Patent: December 6, 2011

Assignee: Panasonic Corporation

Inventors: Kazutoyo Takata, Takashi Tsuzuki, Satoshi Matsuura
SYSTEM AND METHOD FOR ALIGNING AND INDEXING MULTILINGUAL DOCUMENTS

Publication number: 20110295857

Abstract: A system and method for aligning multilingual content and indexing multilingual documents, to a computer readable data storage medium having stored thereon computer code means for indexing multilingual documents, to a system for presenting multilingual content. The method for aligning multilingual content and indexing multilingual documents comprises the steps of generating multiple bilingual terminology databases, wherein each bilingual terminology database associates respective terms in a pivot language with one or more terms in another language; and combining the multiple bilingual terminology databases to form a multilingual terminology database, wherein the multilingual terminology database associates terms in different languages via the pivot language terms.
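
The combining step in this abstract, where several bilingual terminology databases are joined through shared pivot-language terms, might look like the following sketch. The input layout (a mapping from language code to a dictionary of pivot term to translations) and the function name `merge_via_pivot` are assumptions chosen for illustration.

```python
from collections import defaultdict


def merge_via_pivot(bilingual_dbs):
    """Combine bilingual term databases into one multilingual database.

    bilingual_dbs: {language: {pivot_term: [terms in that language]}}
    returns:       {pivot_term: {language: [terms in that language]}}
    """
    multilingual = defaultdict(dict)
    for language, database in bilingual_dbs.items():
        for pivot_term, terms in database.items():
            multilingual[pivot_term][language] = list(terms)
    return dict(multilingual)
```

Terms in two non-pivot languages become associated whenever they share a pivot term, which is the sense in which the pivot language links the databases.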

Type: Application

Filed: June 20, 2008

Publication date: December 1, 2011

Inventors: Ai Ti Aw, Min Zhang, Lian Hau Lee, Thuy Vu, Fon Lin Lai
Self-compacting pattern indexer: storing, indexing and accessing information in a graph-like data structure

Patent number: 8065293

Abstract: An indexing system uses a graph-like data structure that clusters features indexes together. The minimum atomic value in the data structure is represented as a leaf node which is either a single feature index or a sequence of two or more feature indexes when a minimum sequence length is imposed. Root nodes are formed as clustered collections of leaf nodes and/or other root nodes. Context nodes are formed from root nodes that are associated with content that is being indexed. Links between a root node and other nodes each include a sequence order value that is used to maintain the sequencing order for feature indexes relative to the root node. The collection of nodes forms a graph-like data structure, where each context node is indexed according to the sequenced pattern of feature indexes. Clusters can be split, merged, and promoted to increase the efficiency in searching the data structure.

Type: Grant

Filed: October 24, 2007

Date of Patent: November 22, 2011

Assignee: Microsoft Corporation

Inventors: Kunal Mukerjee, R. Donald Thompson, III, Jeffrey Cole, Brendan Meeder
Using asymmetric memory

Patent number: 8065304

Abstract: In one illustrative embodiment, a computer implemented method using asymmetric memory management is provided. The computer implemented method receives a request, containing a search key, to access an array of records in the asymmetric memory, wherein the array has a sorted prefix portion and an unsorted append portion, the append portion alternatively comprising a linked-list, and responsive to a determination that the request is an insert request, inserts the record in the request in arrival order in the unsorted append portion to form a newly inserted record. Responsive to a determination that the newly inserted record completes a group of records, the method stores an index, in sorted order, for the group of records.

Type: Grant

Filed: June 11, 2008

Date of Patent: November 22, 2011

Assignee: International Business Machines Corporation

Inventor: Kenneth Andrew Ross
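
The array layout in the abstract above (a sorted prefix, an unsorted append portion, and a sorted index stored whenever a group of appended records is complete) might be sketched as follows. This is an illustrative reading only: the class name, the fixed `group_size`, and the use of bare keys as records are assumptions, and nothing here models the actual characteristics of asymmetric memory.

```python
import bisect

class AppendIndexedArray:
    def __init__(self, initial_keys, group_size=3):
        self.prefix = sorted(initial_keys)  # sorted prefix portion
        self.appended = []                  # unsorted append portion, arrival order
        self.group_indexes = []             # one sorted (key, position) list per completed group
        self.group_size = group_size        # assumed fixed group size

    def insert(self, key):
        # Inserts never reorder existing records; they only append.
        self.appended.append(key)
        if len(self.appended) % self.group_size == 0:
            # The new record completes a group: store a sorted index for it.
            start = len(self.appended) - self.group_size
            group = [(self.appended[i], i) for i in range(start, len(self.appended))]
            self.group_indexes.append(sorted(group))

    def contains(self, key):
        # 1. Binary search in the sorted prefix.
        i = bisect.bisect_left(self.prefix, key)
        if i < len(self.prefix) and self.prefix[i] == key:
            return True
        # 2. Binary search in each completed group's sorted index.
        #    (key,) sorts before any (key, position) pair with the same key.
        for index in self.group_indexes:
            j = bisect.bisect_left(index, (key,))
            if j < len(index) and index[j][0] == key:
                return True
        # 3. Linear scan of the records that do not yet form a full group.
        tail_start = len(self.group_indexes) * self.group_size
        return key in self.appended[tail_start:]

arr = AppendIndexedArray([5, 1, 9], group_size=2)
for k in [7, 3, 8]:
    arr.insert(k)
print(arr.contains(3), arr.contains(4))
```
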
Information processing with integrated semantic contexts

Patent number: 8060513

Abstract: A system and method for generating a frame of reference for a plurality of information, the plurality of information containing text data and obtained by a user through interaction with one or more information sources. The method and system include receiving selected information for analysis, the information including a plurality of text data and identifying a plurality of logical units of the text data. Also included are identifying a plurality of individual textual portions in each of the logical units and calculating the number of logical units associated with each of the individual textual portions of the plurality of textual portions for use in identifying a plurality of patterns including a respective pattern for each of the individual textual portions.

Type: Grant

Filed: July 1, 2008

Date of Patent: November 15, 2011

Assignee: Dossierview Inc.

Inventors: Stephen Basco, Nick Foisy, Bruce Scanlan, Harsch Khandelwal
Generating a user-specific search index of content within a virtual environment

Patent number: 8055656

Abstract: Embodiments of the invention provide techniques for searching for virtual objects of an immersive virtual environment based on user interactions within the virtual environment. Generally, embodiments provide an attribute index storing data describing attributes of virtual objects, and an interaction index storing data describing user interactions with virtual objects. Search queries may be evaluated using both the attribute index and interactions index. Thus, virtual objects may be searched in terms of object attributes as well as user interactions with the virtual objects.

Type: Grant

Filed: October 10, 2007

Date of Patent: November 8, 2011

Assignee: International Business Machines Corporation

Inventors: Ryan Kirk Cradick, Zachary Adam Garbow, Ryan Robert Pendergast
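
As a rough illustration of evaluating a query against both indexes named in the abstract above, the sketch below keeps two in-memory inverted indexes and intersects their results. The index layout, function names, and sample objects are hypothetical; the patent does not specify them.

```python
from collections import defaultdict

attribute_index = defaultdict(set)    # attribute value -> ids of virtual objects
interaction_index = defaultdict(set)  # (user, action) -> ids of virtual objects

def add_object(obj_id, attributes):
    for attribute in attributes:
        attribute_index[attribute].add(obj_id)

def record_interaction(user, action, obj_id):
    interaction_index[(user, action)].add(obj_id)

def search(attribute, user, action):
    """Objects that have the attribute AND that the user interacted with."""
    by_attribute = attribute_index.get(attribute, set())
    by_interaction = interaction_index.get((user, action), set())
    return by_attribute & by_interaction

add_object("chair-1", ["red", "furniture"])
add_object("chair-2", ["blue", "furniture"])
record_interaction("alice", "sat_on", "chair-2")
print(search("furniture", "alice", "sat_on"))
```
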
Systems and Methods for Discovering Synonymous Elements Using Context Over Multiple Similar Addresses

Publication number: 20110270808

Abstract: A clustering-based approach to data standardization is provided. Certain embodiments take as input a plurality of addresses, identify one or more features of the addresses, cluster the addresses based on the one or more features, utilize the cluster(s) to provide a data-based context useful in identifying one or more synonyms for elements contained in the address(es), and standardize the address(es) to an acceptable format, with one or more synonyms and/or other elements being added to or taken away from the input address(es) as part of the standardization process.

Type: Application

Filed: April 30, 2010

Publication date: November 3, 2011

Applicant: International Business Machines Corporation

Inventors: Tanveer A. Faruquie, Sachindra Joshi, Hima P. Karanam, Marvin Mendelssohn, Mukesh K. Mohania, Angel Smith, L. V. Subramaniam, Girish Venkatachaliah
Adaptive Knowledge Platform

Publication number: 20110264649

Abstract: Methods, systems, and apparatus, including medium-encoded computer program products, for providing an adaptive knowledge platform. In one or more aspects, a system can include a knowledge management component to acquire, classify and disseminate information of a dataset; a human-computer interaction component to visualize multiple perspectives of the dataset and to model user interactions with the multiple perspectives; and an adaptivity component to modify one or more of the multiple perspectives of the dataset based on a user-interaction model.

Type: Application

Filed: April 28, 2009

Publication date: October 27, 2011

Inventors: Ruey-Lung Hsiao, Eugene B. Shirley, Jr.
Computer aided validation of patent disclosures

Patent number: 8046364

Abstract: A method and system for analyzing a patent disclosure is disclosed. The method and system comprise a disclosure analysis and a separate claims analysis, such that each analysis may be performed independently. Missing and incorrect reference labels are identified within the disclosure. Antecedent basis and specification support are checked for the claim elements. Terms within the specification that do not have a reference number, but may require one, are identified, provided that they fit the profile of one of a set of particular lexical patterns.

Type: Grant

Filed: December 2, 2007

Date of Patent: October 25, 2011

Assignee: Veripat, LLC

Inventor: Michael Robert Kahn
System and method for clustering documents

Patent number: 8046363

Abstract: Provided are a system and method of clustering documents. The system includes a document DB, a document feature writing unit, a document retrieving unit, a clustering unit, and a cluster DB. The document DB stores documents. The document feature writing unit extracts attribute information of documents stored in the document DB, and writes indexes with respect to the respective documents on the basis of the attribute information. The document retrieving unit retrieves documents including a query input by a user, using the indexes. The clustering unit includes a representative vector calculator calculating feature vectors and a representative vector of the retrieved documents, and a similarity calculator calculating similarities between the documents using the feature vectors and the representative vector. The cluster DB stores documents clustered by the clustering unit.

Type: Grant

Filed: January 10, 2007

Date of Patent: October 25, 2011

Assignee: LG Electronics Inc.

Inventors: Wan Kyu Cha, Jeong Joong Kim, Han Joon Ahn
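
The feature-vector and representative-vector calculation mentioned in the abstract above can be sketched under simple assumptions: term-frequency vectors as features, the mean of those vectors as the representative vector, and cosine similarity as the similarity measure. None of these choices is confirmed by the patent; they are common defaults used here for illustration.

```python
import math
from collections import Counter

def feature_vector(text):
    """Term-frequency vector of a document (assumed feature choice)."""
    return Counter(text.lower().split())

def representative_vector(vectors):
    """Mean of the feature vectors (assumed definition of 'representative')."""
    total = Counter()
    for vector in vectors:
        total.update(vector)
    n = len(vectors)
    return {term: count / n for term, count in total.items()}

def cosine(a, b):
    dot = sum(a[t] * b.get(t, 0.0) for t in a)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical retrieved documents.
docs = ["phone battery life", "phone screen size", "garden hose length"]
vectors = [feature_vector(d) for d in docs]
centroid = representative_vector(vectors)
for doc, vector in zip(docs, vectors):
    print(doc, round(cosine(vector, centroid), 3))
```

Documents whose similarity to the representative vector falls below some threshold could then be split into a separate cluster; the threshold itself would be a tuning choice.
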
METHOD FOR CALCULATING ENTITY SIMILARITIES

Publication number: 20110258193

Abstract: One embodiment of the present invention provides a system for estimating a similarity level between semantic entities. During operation, the system selects two or more semantic entities associated with a number of documents. The system subsequently parses the documents into sub-parts, and calculates the similarity level between the semantic entities based on occurrences of the semantic entities within the sub-parts of the documents.

Type: Application

Filed: April 15, 2010

Publication date: October 20, 2011

Applicant: PALO ALTO RESEARCH CENTER INCORPORATED

Inventors: Oliver Brdiczka, Petro Hizalev
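
One way to picture a similarity based on occurrences within sub-parts, as the abstract above describes, is the sketch below. It assumes paragraphs (split on blank lines) as the sub-parts, case-insensitive substring matching as the occurrence test, and the Jaccard coefficient as the similarity level; the application does not state any of these, so treat them as placeholders.

```python
def entity_similarity(documents, entity_a, entity_b):
    """Jaccard overlap of the sub-parts in which each entity occurs."""
    # Parse every document into sub-parts (assumed: blank-line paragraphs).
    parts = [p.lower() for doc in documents for p in doc.split("\n\n")]
    with_a = {i for i, part in enumerate(parts) if entity_a.lower() in part}
    with_b = {i for i, part in enumerate(parts) if entity_b.lower() in part}
    union = with_a | with_b
    if not union:
        return 0.0
    return len(with_a & with_b) / len(union)

# Hypothetical documents; each has two paragraphs.
documents = [
    "Alice met Bob.\n\nAlice went home.",
    "Bob called Alice.\n\nCarol slept.",
]
print(entity_similarity(documents, "Alice", "Bob"))
print(entity_similarity(documents, "Carol", "Bob"))
```
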
Domain-Specific Sentiment Classification

Publication number: 20110252036

Abstract: A domain-specific sentiment classifier that can be used to score the polarity and magnitude of sentiment expressed by domain-specific documents is created. A domain-independent sentiment lexicon is established and a classifier uses the lexicon to score sentiment of domain-specific documents. Sets of high-sentiment documents having positive and negative polarities are identified. The n-grams within the high-sentiment documents are filtered to remove extremely common n-grams. The filtered n-grams are saved as a domain-specific sentiment lexicon and are used as features in a model. The model is trained using a set of training documents which may be manually or automatically labeled as to their overall sentiment to produce sentiment scores for the n-grams in the domain-specific sentiment lexicon. This lexicon is used by the domain-specific sentiment classifier.

Type: Application

Filed: June 17, 2011

Publication date: October 13, 2011

Inventors: Tyler J. Neylon, Kerry L. Hannan, Ryan T. McDonald, Michael Wells, Jeffrey C. Reynar
Membership checking of digital text

Patent number: 8037069

Abstract: The described implementations relate to data analysis, such as membership checking. One technique identifies candidate matches between document sub-strings and database members utilizing signatures. The technique further verifies that the candidate matches are true matches.

Type: Grant

Filed: June 3, 2008

Date of Patent: October 11, 2011

Assignee: Microsoft Corporation

Inventors: Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti, Dong Xin
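
The two-phase pattern in the abstract above (signatures to find candidate matches, then verification of true matches) can be sketched with a deliberately simple signature. Here the signature of a token sequence is assumed to be its first token plus its length; the patent does not disclose this scheme, and the names and sample data are hypothetical. `members` is assumed to be non-empty.

```python
from collections import defaultdict

def signature(tokens):
    """Cheap filter key: (first token, token count). An assumed scheme."""
    return (tokens[0], len(tokens))

def build_signature_table(members):
    table = defaultdict(list)
    for member in members:
        tokens = member.lower().split()
        table[signature(tokens)].append(tokens)
    return table

def find_members(document, members):
    table = build_signature_table(members)
    max_len = max(len(m.split()) for m in members)
    tokens = document.lower().split()
    matches = []
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            sub = tokens[start:start + length]
            if len(sub) < length:
                break  # ran past the end of the document
            # Phase 1: candidates are members sharing the sub-string's signature.
            for candidate in table.get(signature(sub), []):
                # Phase 2: verify the candidate is a true match.
                if candidate == sub:
                    matches.append(" ".join(sub))
    return matches

members = ["canon eos 350d", "canon powershot"]
text = "I bought a Canon EOS 350D and a Canon lens"
print(find_members(text, members))
```

In this example "canon eos" and "canon lens" share a signature with "canon powershot" and become candidates, but verification rejects them.
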
Systems and methods for linking an issue with an entry in a knowledgebase

Patent number: 8037009

Abstract: An embodiment relates generally to a method of linking. The method includes receiving a message associated with at least one technical issue being resolved in a first system and containing non-confidential information and searching a knowledgebase in a second system based on the message to obtain at least one related entry. The method also includes associating at least one related entry with the non-confidential information of the message, updating at least one related entry with the non-confidential information, or creating a new entry with the non-confidential information, in the knowledgebase.

Type: Grant

Filed: August 27, 2007

Date of Patent: October 11, 2011

Assignee: Red Hat, Inc.

Inventor: Jason S. Hibbets
EXTRACTION OF ATTRIBUTES AND VALUES FROM NATURAL LANGUAGE DOCUMENTS

Publication number: 20110246467

Abstract: One or more classification algorithms are applied to at least one natural language document in order to extract both attributes and values of a given product. Supervised classification algorithms, semi-supervised classification algorithms, unsupervised classification algorithms or combinations of such classification algorithms may be employed for this purpose. The at least one natural language document may be obtained via a public communication network. Two or more attributes (or two or more values) thus identified may be merged to form one or more attribute phrases or value phrases. Once attributes and values have been extracted in this manner, association or linking operations may be performed to establish attribute-value pairs that are descriptive of the product. In a presently preferred embodiment, an (unsupervised) algorithm is used to generate seed attributes and values which can then support a supervised or semi-supervised classification algorithm.

Type: Application

Filed: June 13, 2011

Publication date: October 6, 2011

Applicant: Accenture Global Services Limited

Inventors: Katharina PROBST, Rayid GHANI, Andrew E. FANO, Marko KREMA, Yan LIU
Managing structured content stored as a binary large object (BLOB)

Patent number: 8032521

Abstract: Embodiments of the present invention address deficiencies of the art in respect to structured content storage and provide a novel and non-obvious method, system and computer program product for managing structured content stored in a BLOB. In an embodiment of the invention, a performance optimized structured content management system can include a content repository, a content manager configured to provide access to structured content in the content repository and multiple different performance optimized containers disposed in the content repository. Each of the containers can store a portion of the structured content, and each of the containers can include a flattened form of original structured content in a primary binary large object (BLOB) and a parsed form of the original structured content in a secondary BLOB, the parsed form of the original structured content in the secondary BLOB indexing the flattened form of the original structured content in the primary BLOB.

Type: Grant

Filed: August 8, 2007

Date of Patent: October 4, 2011

Assignee: International Business Machines Corporation

Inventors: Stephen J. Garward, Mark C. Hampton, Eric Martinez de Morentin, Kenneth Sabir
Named Entity Recognition in Query

Publication number: 20110231347

Abstract: Named Entity Recognition in Query (NERQ) involves detection of a named entity in a given query and classification of the named entity into one or more predefined classes. The predefined classes may be based on a predefined taxonomy. A probabilistic approach may be taken to detecting and classifying named entities in queries, the approach using either query log data or click through data and Weakly Supervised Latent Dirichlet Allocation (WS-LDA) to construct and train a topic model.

Type: Application

Filed: March 16, 2010

Publication date: September 22, 2011

Applicant: Microsoft Corporation

Inventors: Gu Xu, Hang Li, Jiafeng Guo
Vector space method for secure information sharing

Patent number: 8024344

Abstract: Presented are systems and methods for securely sharing confidential information. In such a method, term vectors corresponding to ones of a plurality of confidential terms included in a plurality of confidential documents are received. Each of the received term vectors is mapped into a vector space. Non-confidential documents are mapped into the vector space to generate a document vector corresponding to each non-confidential document, wherein the generation of each document vector is based on a subset of the received term vectors. At least one of the non-confidential documents is identified in response to a query mapped into the vector space.

Type: Grant

Filed: June 5, 2008

Date of Patent: September 20, 2011

Assignee: Content Analyst Company, LLC

Inventor: Roger Bradford
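
The flow in the abstract above (receive term vectors, build document vectors for non-confidential documents from the subset of received terms they contain, then match a query in the same space) might look like the following sketch. Summing term vectors and ranking by cosine similarity are assumptions made for illustration, and the two-dimensional vectors are invented sample values.

```python
import math

def document_vector(text, term_vectors):
    """Sum the received term vectors for the terms that occur in the text."""
    dims = len(next(iter(term_vectors.values())))
    vector = [0.0] * dims
    for token in text.lower().split():
        if token in term_vectors:
            vector = [x + y for x, y in zip(vector, term_vectors[token])]
    return vector

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match(query, documents, term_vectors):
    q = document_vector(query, term_vectors)
    return max(documents, key=lambda d: cosine(q, document_vector(d, term_vectors)))

# Hypothetical term vectors received from the party holding confidential documents.
term_vectors = {"reactor": [1.0, 0.0], "fuel": [0.8, 0.2], "budget": [0.0, 1.0]}
public_docs = ["fuel supply report", "annual budget summary"]
print(best_match("reactor", public_docs, term_vectors))
```

Only the term vectors cross the boundary in this sketch; the confidential documents themselves are never needed to place the public documents or the query in the space.
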
Query expansion

Patent number: 8024341

Abstract: An expanded queries data structure is described. The data structure is produced on the basis of a set of seed queries, and consists of entries each specifying an expanded query submitted by a user that has been determined to have a high degree of relatedness to at least a plurality of the seed queries of the set. The expanded queries specified by the entries of the expanded queries data structure can be used to define a segment of users expected to have interests characterized by the seed queries.

Type: Grant

Filed: July 10, 2008

Date of Patent: September 20, 2011

Assignee: AudienceScience Inc.

Inventors: Yair Even-Zohar, Basem Nayfeh
SYSTEM AND METHOD OF STRUCTURING DATA FOR SEARCH USING LATENT SEMANTIC ANALYSIS TECHNIQUES

Publication number: 20110225159

Abstract: The disclosed embodiments provide a system and method for using modified Latent Semantic Analysis techniques to structure data for efficient search and display. The present invention creates a hierarchy of clustered documents, representing the topics of a domain corpus, through a process of optimal agglomerative clustering. The output from a search query is displayed in a fisheye view corresponding to the hierarchy of clustered documents. The fisheye view may link to a two-dimensional self-organizing map that represents semantic relationships between documents.

Type: Application

Filed: January 27, 2011

Publication date: September 15, 2011

Inventor: Jonathan Murray
COMPUTER PRODUCT, OPERATION AND MANAGEMENT SUPPORT APPARATUS AND METHOD

Publication number: 20110225160

Abstract: A computer-readable, non-transitory medium stores therein an operation management support program that causes a computer to execute a process that includes acquiring execution history information recording for each element group included in activity diagrams expressing work procedures for operation processes executed by a system, correlations between elements and access destinations thereof; searching among elements not yet selected from among all element groups, for a second element having an access destination coinciding with that of a first element selected from among all element groups, the searching performed by referring to the acquired execution history information; setting the first and the second elements as synonymous elements, if a second element is retrieved at the searching; extracting from among the element groups included in the activity diagrams including synonymous elements, a common element string of elements common among the activity diagrams that include the synonymous elements; and output

Type: Application

Filed: May 24, 2011

Publication date: September 15, 2011

Applicant: Fujitsu Limited

Inventor: Masataka Sonoda
System and method for internet endpoint profiling

Patent number: 8019764

Abstract: The present invention relates to a method of profiling an Internet endpoint associated with an Internet Protocol (IP) address. The method includes generating a profiling rule using an Internet search engine, obtaining a search result by inputting the IP address to the Internet search engine, and classifying the Internet endpoint based on the search result using the profiling rule.

Type: Grant

Filed: April 17, 2008

Date of Patent: September 13, 2011

Assignee: Narus Inc.

Inventors: Antonio Nucci, Supranamaya Ranjan, Aleksandar Kuzmanovic
