Latent Semantic Index or Analysis (LSI or LSA) Patents (Class 707/739)

SUMMARIZATION OF SHORT COMMENTS

Publication number: 20140214842

Abstract: A method and a system for summarization of short comments are provided. The system comprises a memory to store a comments collection. The comments collection stores a plurality of comments for later access. The comments respectively include an overall rating and at least one phrase. The system also includes one or more processors to implement an aspect module to map a portion of the plurality of comments to a first aspect corresponding to an attribute of the entity. The one or more processors also implement a rating module to determine an aspect rating corresponding to the first aspect based on the respective overall rating of the portion of the plurality of comments.

Type: Application

Filed: April 1, 2014

Publication date: July 31, 2014

Applicant: eBay Inc.

Inventors: Yue Lu, Neelakantan Sundaresan
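The aspect-rating step in the abstract above (map comments to an aspect, then derive the aspect's rating from those comments' overall ratings) can be illustrated with a minimal Python sketch. It assumes a keyword lookup for the aspect mapping and a plain mean for the aggregation; the abstract specifies neither, and the names `ASPECT_KEYWORDS` and `aspect_ratings` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical keyword-to-aspect table; the abstract does not say how
# phrases are mapped to aspects.
ASPECT_KEYWORDS = {
    "shipping": {"shipping", "delivery", "arrived"},
    "communication": {"communication", "response", "replied"},
}


def aspect_ratings(comments):
    """Return {aspect: mean overall rating of the comments mapped to it}.

    comments: iterable of (overall_rating, phrase) pairs.
    The mean is an assumed aggregation, not the patented method.
    """
    ratings = defaultdict(list)
    for overall, phrase in comments:
        words = set(phrase.lower().split())
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if words & keywords:  # the comment mentions this aspect
                ratings[aspect].append(overall)
    return {aspect: mean(values) for aspect, values in ratings.items()}


comments = [
    (5, "fast shipping"),
    (2, "slow delivery"),
    (4, "great communication"),
]
print(aspect_ratings(comments))
```

A comment that matches no aspect keyword is simply ignored here; a real system would need a fallback for such comments.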
Semantic Product Classification

Publication number: 20140214841

Abstract: The present disclosure extends to methods, systems, and computer program products for updating a merchant database with new product items and placing the new product items within a hierarchy of existing merchant product offerings. In operation, the new product is represented by a title and description that can be semantically classified using a plurality of classification models and reviewed by users for accuracy.

Type: Application

Filed: January 31, 2013

Publication date: July 31, 2014

Applicant: Wal-Mart Stores, Inc.

Inventors: Nikesh Lucky Garera, Narasimhan Rampalli, Dintyala Venkata Subrahmanya Ravikant, Srikanth Subramaniam, Chong Sun, Heather Dawn Yalin
Unified semantic ranking of compositions of ontological subjects

Patent number: 8793253

Abstract: The present invention discloses methods, systems, and tools for unified semantic ranking of compositions of ontological subjects. The method breaks a composition into a plurality of partitions as well as its constituent ontological subjects of different orders and builds a participation matrix indicating the participation of ontological subjects of the composition in other ontological subjects, i.e. the partitions, of the composition. Using the participation information of the ontological subjects into each other, a similarity matrix is built from which the semantic importance ranks of the partitions of the composition are calculated. The method systematically enables the calculation of the semantic ranks of the ontological subjects of different orders of the composition. Various systems for implementing the method and numerous applications and services are disclosed.

Type: Grant

Filed: August 8, 2013

Date of Patent: July 29, 2014

Assignee: Hamid Hatami-Hanza

Inventor: Hamid Hatami-Hanza
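The participation-matrix idea in the abstract above can be sketched in Python under strong simplifying assumptions: words stand in for the lower-order ontological subjects, sentences for the partitions, shared-word counts for the similarity measure, and row sums for the importance score. None of these choices comes from the patent, and `partition_scores` is a hypothetical name.

```python
def partition_scores(partitions):
    """Score each partition by how many words it shares with the others."""
    word_sets = [set(p.lower().split()) for p in partitions]
    vocabulary = sorted(set().union(*word_sets))

    # participation[i][k] is 1 when word i occurs in partition k.
    participation = [
        [1 if word in ws else 0 for ws in word_sets] for word in vocabulary
    ]

    # similarity[a][b] counts the words that partitions a and b share.
    n = len(partitions)
    similarity = [[0] * n for _ in range(n)]
    for row in participation:
        for a in range(n):
            for b in range(n):
                if a != b:
                    similarity[a][b] += row[a] * row[b]

    # Assumed importance score: total similarity to all other partitions.
    return [sum(row) for row in similarity]


sentences = [
    "latent semantic analysis ranks text",
    "semantic analysis of text",
    "cats sleep",
]
print(partition_scores(sentences))
```

The third sentence shares no words with the other two, so it receives the lowest score.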
SYSTEM AND METHOD FOR COMPUTERIZED SEMANTIC PROCESSING OF ELECTRONIC DOCUMENTS INCLUDING THEMES

Publication number: 20140207782

Abstract: System and method for computerized identification of themes in a large data set, the system comprising reducing the number of data set members in a large data set, using at least one computerized data set member pruning technique other than random selection; and using a computerized theme identification technique for identifying a plurality of themes in the reduced data set.

Type: Application

Filed: January 22, 2014

Publication date: July 24, 2014

Inventor: Yiftach RAVID
SAMPLING OF EVENTS TO USE FOR DEVELOPING A FIELD-EXTRACTION RULE FOR A FIELD TO USE IN EVENT SEARCHING

Publication number: 20140207784

Abstract: Embodiments are directed towards generating a representative sampling as a subset from a larger dataset that includes unstructured data. A graphical user interface enables a user to provide various data selection parameters, including specifying a data source and one or more subset types desired, including one or more of latest records, earliest records, diverse records, outlier records, and/or random records. Diverse and/or outlier subset types may be obtained by generating clusters from an initial selection of records obtained from the larger dataset. An iteration analysis is performed to determine whether a sufficient number of clusters and/or cluster types have been generated that exceed at least one threshold and when not exceeded, additional clustering is performed on additional records. From the resultant clusters, and/or other subtype results, a subset of records is obtained as the representative sampling subset.

Type: Application

Filed: January 30, 2014

Publication date: July 24, 2014

Applicant: SPLUNK INC.

Inventors: R. David CARASSO, Micah James Delfino
SYSTEM AND METHOD FOR COMPUTERIZED IDENTIFICATION AND EFFECTIVE PRESENTATION OF SEMANTIC THEMES OCCURRING IN A SET OF ELECTRONIC DOCUMENTS

Publication number: 20140207783

Abstract: System and method for computerized identification and presentation of semantic themes occurring in a set of electronic documents, comprising performing topic modeling on the set of documents thereby to yield a set of topics and for each topic, a topic-modeling output list of words; and using a processor performing a matching algorithm to match only a subset of each topic-modeling output list of words, to the output list's corresponding topic, such that each word appears in no more than a predetermined number of subsets from among said subsets.

Type: Application

Filed: January 22, 2014

Publication date: July 24, 2014

Inventor: Yiftach RAVID
HYBRID METHOD OF BUILDING TOPIC ONTOLOGIES FOR PUBLISHER AND MARKETER CONTENT AND AD RECOMMENDATIONS

Publication number: 20140201185

Abstract: Systems and methods are discussed to automatically create a domain ontology that is a combination of ontologies. Some embodiments include systems and methods for developing a combined ontology for a website that includes extracting collocations for each webpage within the website, creating first and second ontologies from the collocations, and then aggregating the ontologies into a combined ontology. Some embodiments of the invention include unique ways to calculate collocations, to develop a smaller yet meaningful document sample from a large sample, to determine webpages of interest to users interacting with a website, and to determine topics of interest of users interacting with a website. Various other embodiments of the invention are disclosed.

Type: Application

Filed: January 17, 2013

Publication date: July 17, 2014

Applicant: Adobe Systems Incorporated

Inventors: Walter Chang, Minhoe Hur, Geoff Baum
Method and system for identifying entities

Patent number: 8782042

Abstract: Some embodiments provide a program that identifies an entity having an entity attribute. The program receives, from each method of several methods, a set of candidate identity attributes that are each for identifying a particular entity having the entity attribute specified in the document. Each method of the several methods generates the corresponding set of candidate identity attributes based on the entity attribute specified in a document. The program calculates a score for each candidate identity attribute in the sets of candidate identity attributes. The program identifies, based on the sets of scores, an identity attribute from the sets of candidate identity attributes that identifies the entity having the entity attribute specified in the document.

Type: Grant

Filed: October 14, 2011

Date of Patent: July 15, 2014

Assignee: Firstrain, Inc.

Inventors: David Cooke, Martin Betz, Ashutosh Joshi, Binay Mohanty
SYSTEM AND METHOD FOR AUTOMATICALLY GENERATING SYSTEMATIC REVIEWS OF A SCIENTIFIC FIELD

Publication number: 20140195539

Abstract: A system and method are provided for automatically generating systematic reviews of received information in a field of science and technology, such as scientific literature, where the systematic review includes a systematic review of a research field in the scientific literature. The method includes the steps of constructing time series networks of words, passages, documents, and citations and/or co-citations within received information into a synthesized network, decomposing the networks into clusters of fields or topics, performing part-of-speech tagging of text within the received information to provide tagged text, constructing semantic structures of concepts and/or assertions extracted from the source text, generating citation-based and content-based summaries of the clusters of fields or topics and the semantic structures, and generating structured narratives of the clusters of fields or topics and the summaries of the generated semantic structures.

Type: Application

Filed: October 1, 2013

Publication date: July 10, 2014

Applicant: Drexel University

Inventor: Chaomei Chen
Utilization and Power Efficient Hashing

Publication number: 20140188885

Abstract: Methods, systems, and computer readable storage medium embodiments for hashing with improved utilization and power efficiency are disclosed. Some embodiments include inserting a key in a selected bucket in accordance with a bucket identifier generated by a hash function, wherein the selected bucket is one of a plurality of buckets of a hash table configured in at least one memory, determining respective unique bit strings based upon corresponding bit positions for a plurality of keys in the selected bucket including the inserted key, inserting the respective unique bit strings in a table location corresponding to the bucket identifier, wherein the table location is one of a plurality of table locations in at least one control table configured in the at least one memory. Other embodiments include lookup operations in a hash table.

Type: Application

Filed: December 27, 2012

Publication date: July 3, 2014

Applicant: Broadcom Corporation

Inventors: Abhay Kulkarni, Bhupesh Ramchandani
Method and system for search structured data from a natural language search request

Patent number: 8762384

Abstract: A method and system for performing a semantic search on structured data. An unstructured search query is received from a requestor. The query is evaluated within a computer to identify a best structured request based on the unstructured search query. The selected structured request is applied to a set of structured data. The result of the application of the structured request is then returned to the requestor.

Type: Grant

Filed: August 19, 2010

Date of Patent: June 24, 2014

Assignee: SAP Aktiengesellschaft

Inventor: Robert Heidasch
Context-based filtering of search results

Patent number: 8762368

Abstract: A server is configured to receive, from a client, a query and context information associated with a document; obtain search results, based on the query, that identify documents relevant to the query; analyze the context information to identify content; generate first scores for a hierarchy of topics, that correspond to measures of relevance of the topics to the content; select a topic that is most relevant to the context information when the topic is associated with a greatest first score; generate second scores for the search results that correspond to measures of relevance, of the search results, to the topic; select one or more of the search results as being most relevant to the topic when the search results are associated with one or more greatest second scores; generate a search result document that includes the selected search results; and send, to a client, the search result document.

Type: Grant

Filed: April 30, 2012

Date of Patent: June 24, 2014

Assignee: Google Inc.

Inventors: Sarveshwar Duddu, Kuntal Loya, Minh Tue Vo Thanh, Thorsten Brants
HEADER-TOKEN DRIVEN AUTOMATIC TEXT SEGMENTATION

Publication number: 20140172858

Abstract: A method and a system to automatically segment text based on header tokens is described. A relevance value and an irrelevance value are determined for each token in a description, assuming no tokens are left out of computations. The irrelevance value is based on occurrences of a token in a sample set of descriptions. The relevance value is an estimated probability of relevance based on the header of the description being segmented.

Type: Application

Filed: December 9, 2013

Publication date: June 19, 2014

Applicant: eBay Inc.

Inventors: Badrul M. Sarwar, John A. Mount
Information theory entropy reduction program

Patent number: 8756234

Abstract: A system and process for automated text analysis which can be used to identify phrases in reports such as medical reports includes identifying a phrase contained within a text, extracting the phrase from the text, determining a value of the phrase and, in response to the phrase having at least a threshold value, reducing the phrase to a root meaning. In one embodiment, the value of the phrase is assigned via lexicon-based hierarchical decision trees.

Type: Grant

Filed: November 16, 2004

Date of Patent: June 17, 2014

Assignee: The General Hospital Corporation

Inventor: Keith J. Dreyer
Indexing documents

Patent number: 8756215

Abstract: A document to be indexed is initially indexed in dependence upon language-specific rules of a single language. A success metric is used to assess the effectiveness of the single language indexing. If a threshold level of success is not attained, the document is identified as multi-lingual. In response to identifying the document as multi-lingual, the document is queued for multi-lingual indexing. A document may be fragmented into a number of smaller documents, each of which is indexed separately.

Type: Grant

Filed: December 2, 2009

Date of Patent: June 17, 2014

Assignee: International Business Machines Corporation

Inventor: Deep Shikha
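The routing logic in the abstract above (index with one language's rules, measure success, queue for multilingual indexing if the result falls below a threshold) can be sketched in Python. The success metric used here, the fraction of tokens found in a single-language vocabulary, and the 0.6 threshold are assumptions for illustration; the function names are hypothetical.

```python
def index_single_language(text, vocabulary):
    """Index text against one language's vocabulary.

    Returns (recognized_terms, success), where success is the assumed
    metric: the fraction of tokens found in the vocabulary.
    """
    tokens = text.lower().split()
    known = [t for t in tokens if t in vocabulary]
    success = len(known) / len(tokens) if tokens else 0.0
    return known, success


def route_document(text, vocabulary, threshold=0.6):
    """Keep the single-language index or flag the document as multilingual."""
    terms, success = index_single_language(text, vocabulary)
    if success >= threshold:
        return "single", terms
    # Below the threshold: treat the document as multilingual and queue it.
    return "multilingual-queue", terms


english = {"the", "cat", "sat", "on", "mat"}
print(route_document("the cat sat on the mat", english))
print(route_document("the cat le chat noir dort", english))
```

The abstract also mentions fragmenting a document into smaller documents that are indexed separately; that step is not shown here.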
Computer product, operation and management support apparatus and method

Patent number: 8751503

Abstract: A computer-readable, non-transitory medium stores therein an operation management support program that causes a computer to execute a process that includes acquiring execution history information recording for each element group included in activity diagrams expressing work procedures for operation processes executed by a system, correlations between elements and access destinations thereof; searching among elements not yet selected from among all element groups, for a second element having an access destination coinciding with that of a first element selected from among all element groups, the searching performed by referring to the acquired execution history information; setting the first and the second elements as synonymous elements, if a second element is retrieved at the searching; extracting from among the element groups included in the activity diagrams including synonymous elements, a common element string of elements common among the activity diagrams that include the synonymous elements; and output

Type: Grant

Filed: May 24, 2011

Date of Patent: June 10, 2014

Assignee: Fujitsu Limited

Inventor: Masataka Sonoda
Visually-represented results to search queries in rich media content

Patent number: 8751502

Abstract: When executed, a computer program product generates a graphical user interface that renders results that are responsive to a search query of a rich media file. The graphical user interface includes a chronological representation of the rich media file, one or more occurrence markers along the chronological representation corresponding to actual occurrences of a desired term at an indicated chronological location in the rich media file, and an execution icon configured to launch a rich media application that renders a relevant portion that is responsive to the search query.

Type: Grant

Filed: December 30, 2005

Date of Patent: June 10, 2014

Assignee: AOL Inc.

Inventor: Rakesh Agrawal
AUTOMATIC DOCUMENT CLASSIFICATION VIA CONTENT ANALYSIS AT STORAGE TIME

Publication number: 20140156665

Abstract: Techniques are disclosed for efficiently and automatically classifying textual documents or files. In some embodiments, the classification process is integrated into or otherwise made part of the storage function, such that when the user initiates a save process for a given file, the file is processed through a classifier prior to (or contemporaneously with) completing the save function. In some such embodiments, textual content of the file is analyzed using natural language processing to identify a main or substantial concept discussed in the file, and one or more corresponding tags are then assigned to that file. Subsequently, the user can access that file based on the one or more tags, for instance, through a user interface that allows the user to select one or more content categories associated with the assigned tags. The files can be text-based, but may include other content as well, such as images, video, and audio.

Type: Application

Filed: December 3, 2012

Publication date: June 5, 2014

Applicant: ADOBE SYSTEMS INCORPORATED

Inventor: Michael Kraley
Determining reading levels of electronic books

Patent number: 8744855

Abstract: Architectures and techniques are described to determine a reading level of an electronic book. In particular, words, phrases, clauses, and parts of speech of an electronic book may be tagged and used to determine the reading level of the electronic book. In some cases, the reading level of the electronic book is based on a level of complexity of sentences of the electronic book and a level of complexity of words of the electronic book.

Type: Grant

Filed: August 9, 2010

Date of Patent: June 3, 2014

Assignee: Amazon Technologies, Inc.

Inventor: Daniel B. Rausch
STOCHASTIC DOCUMENT CLUSTERING USING RARE FEATURES

Publication number: 20140143253

Abstract: Systems, methods, and apparatus for clustering resources using rare features are provided. For example, an environment includes an extraction module, an index module, and a cluster module. The extraction module identifies a set of resources and extracts a plurality of features from the resources. The plurality of features may be rare features. The index module identifies and generates a rare features index. The cluster module identifies at least two resources that share rare features, creates one or more clusters based on the identified at least two resources, and associates resources that share similar features with the one or more clusters. Resources that do not share similar features are not associated with the one or more clusters. Identifying at least two resources that share rare features is based at least upon a threshold.

Type: Application

Filed: November 15, 2013

Publication date: May 22, 2014

Inventor: Joshua Powers
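The extraction, index, and cluster modules in the abstract above can be illustrated with a minimal Python sketch. It assumes that a feature is "rare" when it occurs in at most `max_doc_freq` resources and that two resources belong together when they share at least `min_shared` rare features. Both parameters, and the connected-components grouping, are illustrative assumptions rather than the claimed method.

```python
from collections import defaultdict
from itertools import combinations


def cluster_by_rare_features(resources, max_doc_freq=2, min_shared=1):
    """resources: {resource_id: set of features}. Returns sorted clusters."""
    # Count how many resources contain each feature.
    doc_freq = defaultdict(int)
    for features in resources.values():
        for feature in features:
            doc_freq[feature] += 1

    # Rare features index: rare feature -> resources containing it.
    rare_index = defaultdict(set)
    for rid, features in resources.items():
        for feature in features:
            if doc_freq[feature] <= max_doc_freq:
                rare_index[feature].add(rid)

    # Count rare features shared by each pair of resources.
    shared = defaultdict(int)
    for rids in rare_index.values():
        for a, b in combinations(sorted(rids), 2):
            shared[(a, b)] += 1

    # Merge pairs that meet the threshold (simple union-find).
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), count in shared.items():
        if count >= min_shared:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for rid in list(parent):
        clusters[find(rid)].add(rid)
    return sorted(sorted(members) for members in clusters.values())


resources = {
    "a": {"the", "zyx", "qq"},
    "b": {"the", "zyx"},
    "c": {"the", "foo"},
    "d": {"the", "bar"},
}
print(cluster_by_rare_features(resources))
```

Resources "c" and "d" share only the common feature "the", so they are left out of every cluster, which mirrors the abstract's statement that resources without shared features are not associated with a cluster.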
Method, system and computer software product for pre-selecting a folder for a message

Patent number: 8732245

Abstract: In a computer system, a system, method and computer program product for pre-selecting a folder for a current message. The system, method and computer program product involve (a) providing a folder pre-selection cache having n configurable entries, n being a predetermined positive integer greater than one, each configurable entry being configured to include an associated pre-selection criterion for matching with the current message, and an associated folder identification for identifying an associated folder in the plurality of folders; (b) for at least one entry in the folder pre-selection cache, comparing a comparison criterion, obtained from the current message, with the associated pre-selection criterion to determine a matching entry in the folder pre-selection cache; and, (c) pre-selecting the folder identified by the associated folder identification of the matching entry when the message comparison module determines the matching entry in the folder pre-selection cache.

Type: Grant

Filed: February 7, 2003

Date of Patent: May 20, 2014

Assignee: BlackBerry Limited

Inventors: Anthony F. Scian, David P. Yach, R. Scotte Zinn, Gerhard D. Klassen
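The folder pre-selection cache in the abstract above (n entries, each pairing a pre-selection criterion with a folder identification) can be sketched in Python. The sketch assumes the criterion is a sender address and that the oldest recorded entry is evicted when the cache is full; the abstract specifies neither, and the class and method names are hypothetical.

```python
from collections import OrderedDict


class FolderPreselectionCache:
    """Cache of up to n (criterion -> folder id) entries."""

    def __init__(self, n):
        if n <= 1:
            raise ValueError("n must be an integer greater than one")
        self.n = n
        self.entries = OrderedDict()

    def record(self, criterion, folder_id):
        """Remember that messages matching criterion were filed in folder_id."""
        self.entries[criterion] = folder_id
        self.entries.move_to_end(criterion)
        if len(self.entries) > self.n:
            # Assumed eviction policy: drop the least recently recorded entry.
            self.entries.popitem(last=False)

    def preselect(self, criterion):
        """Return the folder id of the matching entry, or None if none matches."""
        return self.entries.get(criterion)


cache = FolderPreselectionCache(2)
cache.record("alice@example.com", "Work")
cache.record("bob@example.com", "Friends")
print(cache.preselect("alice@example.com"))
```

Recording a third sender in this two-entry cache would evict the entry for alice@example.com, after which `preselect` returns `None` for that address.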
Systems and methods for using metadata to enhance data identification operations

Patent number: 8725737

Abstract: Systems and methods for managing electronic data are disclosed. Various data management operations can be performed based on a metabase formed from metadata. Such metadata can be identified from an index of data interactions generated by a journaling module, and obtained from their associated data objects stored in one or more storage devices. In various embodiments, such processing of the index and storing of the metadata can facilitate, for example, enhanced data management operations, enhanced data identification operations, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data. In various embodiments, the metabase can be configured in different ways. For example, the metabase can be stored separately from the data objects so as to allow obtaining of information about the data objects without accessing the data objects or a data structure used by a file system.

Type: Grant

Filed: September 11, 2012

Date of Patent: May 13, 2014

Assignee: CommVault Systems, Inc.

Inventors: Anand Prahlad, Jeremy Alan Schwartz, David Ngo, Brian Brockway, Marcus S. Muller
REAL-TIME DATA MANAGEMENT FOR A POWER GRID

Publication number: 20140129746

Abstract: The present disclosure relates to real-time data management for a power grid and presents a real-time data management system, a system, method, apparatus and tangible computer readable medium for accessing data in a power grid, a system, method, apparatus and tangible computer readable medium for controlling a transmission delay of real-time data delivered via a real-time bus, and a system, method, apparatus and tangible computer readable medium for delivering real-time data in a power grid. In the real-time data management system of the present disclosure, a unified data model covering various organizations and various data resources is designed and a management scheme for clustered data is used to provide a transparent and high speed data access. Besides, multi-bus collaboration and bus performance optimization approaches are utilized to improve efficiency and performance of the buses.

Type: Application

Filed: February 26, 2013

Publication date: May 8, 2014

Applicant: ACCENTURE GLOBAL SERVICES LIMITED

Inventors: Qin Zhou, Zhihui Yang, Xiaopei Cheng, Yan Gao, Guo Ma
Unsupervised document clustering using latent semantic density analysis

Patent number: 8713021

Abstract: According to one embodiment, a latent semantic mapping (LSM) space is generated from a collection of a plurality of documents, where the LSM space includes a plurality of document vectors, each representing one of the documents in the collection. For each of the document vectors considered as a centroid document vector, a group of document vectors is identified in the LSM space that are within a predetermined hypersphere diameter from the centroid document vector. As a result, multiple groups of document vectors are formed. The predetermined hypersphere diameter represents a predetermined closeness measure among the document vectors in the LSM space. Thereafter, a group from the plurality of groups is designated as a cluster of document vectors, where the designated group contains a maximum number of document vectors among the plurality of groups.

Type: Grant

Filed: July 7, 2010

Date of Patent: April 29, 2014

Assignee: Apple Inc.

Inventor: Jerome R. Bellegarda
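The density step in the abstract above (treat each document vector as a candidate centroid, collect the vectors within a fixed distance of it, and designate the largest group as a cluster) can be sketched in Python. The sketch assumes the vectors are already in the latent semantic mapping space, so the mapping step itself is omitted, and it uses Euclidean distance as the closeness measure; `densest_group` and `max_distance` are hypothetical names.

```python
import math


def densest_group(vectors, max_distance):
    """Return the indices of the largest group found around any centroid.

    Each vector is tried as a centroid; its group is every vector within
    max_distance of it (Euclidean distance is an assumption here).
    """
    best = []
    for center in vectors:
        group = [
            i for i, v in enumerate(vectors)
            if math.dist(center, v) <= max_distance
        ]
        if len(group) > len(best):
            best = group
    return best


vectors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
print(densest_group(vectors, max_distance=0.2))
```

This yields a single cluster per call; repeating the search on the remaining vectors is one possible way to find further clusters, though the abstract does not describe that step. The loop compares every pair of vectors, so the cost grows quadratically with the collection size.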
METHOD AND SYSTEM FOR SOCIAL MEDIA BURST CLASSIFICATIONS

Publication number: 20140114978

Abstract: The present invention is directed to a method, system, and article of manufacture for systematically and automatically identifying abnormal or collective behavior patterns in microblogging messages that produce burst phenomena, such as Twitter storms. A microblogging storm engine in a storm detection server is configured to detect and classify the volume, shape, and type of a Twitter storm when keying on topics such as, but not limited to, a brand, an event, a person, an entity, a country, or a controversial issue. The microblogging storm engine comprises a storm detection module, a storm classification module, a database interface module, and a sentiment process module. The storm detection module is configured to detect different patterns of microblogging storms by capturing the volume of a particular storm to assist in output statistical analysis. The storm classification module is configured to classify the storms into different types of a particular storm category.

Type: Application

Filed: October 24, 2013

Publication date: April 24, 2014

Applicant: METAVANA, INC.

Inventors: Manjirnath CHATTERJEE, Rabia TURAN, Brian LUE
Periodic ambient waveform analysis for enhanced social functions

Patent number: 8706499

Abstract: Client devices periodically capture ambient audio waveforms, generate waveform fingerprints, and upload the fingerprints to a server for analysis. The server compares the waveforms to a database of stored waveform fingerprints, and upon finding a match, pushes content or other information to the client device. The fingerprints in the database may be uploaded by other users, and compared to the received client waveform fingerprint based on common location or other social factors. Thus a client's location may be enhanced if the location of users whose fingerprints match the client's is known. In particular embodiments, the server may instruct clients whose fingerprints partially match to capture waveform data at a particular time and duration for further analysis and increased match confidence.

Type: Grant

Filed: August 16, 2011

Date of Patent: April 22, 2014

Assignee: Facebook, Inc.

Inventors: Matthew Nicholas Papakipos, David Harry Garcia
METHOD AND SYSTEM FOR RECOMMENDING SEMANTIC ANNOTATIONS

Publication number: 20140101162

Abstract: A method for recommending semantic annotations on a main document and sub documents is provided. The method includes: extracting a keyword of the main document; extracting a keyword or a set of keywords of each sub document; and generating a keyword similarity or a set of keyword similarities for each of the sub documents based on a degree of similarity between the keyword of the main document and the keyword of each of the sub documents. The method also includes: obtaining a plurality of words appearing in each of the sub documents and calculating a frequency of each of the words; generating a semantic capacity of each of the sub documents according to the frequencies; grouping the main document and at least one of the sub documents into a semantic document set based on the semantic capacities and the keyword similarities; and annotating the main document according to the semantic document set.

Type: Application

Filed: October 9, 2012

Publication date: April 10, 2014

Applicant: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE

Inventors: Hsiang-Yuan Hsueh, Ko-Li Kan, Chi-Chou Chiang
Methods and systems for technology analysis and mapping

Patent number: 8694504

Abstract: Systems and methods for cladistics-based content searching, analysis, and/or diagrammatic representation of results in graphical user interface format for viewing by at least one user on a computer-type device or network, in particular for technology and patent-related content stored in at least one database.

Type: Grant

Filed: March 4, 2004

Date of Patent: April 8, 2014

Assignee: Spore, Inc.

Inventors: Guy R. Beretich, Jr., JiNan Glasgow
System and method for probabilistic relational clustering

Patent number: 8676805

Abstract: Relational clustering has attracted more and more attention due to its phenomenal impact in various important applications which involve multi-type interrelated data objects, such as Web mining, search marketing, bioinformatics, citation analysis, and epidemiology. A probabilistic model is presented for relational clustering, which also provides a principal framework to unify various important clustering tasks including traditional attributes-based clustering, semi-supervised clustering, co-clustering and graph clustering. The model seeks to identify cluster structures for each type of data objects and interaction patterns between different types of objects. Under this model, parametric hard and soft relational clustering algorithms are provided under a large number of exponential family distributions.

Type: Grant

Filed: September 27, 2012

Date of Patent: March 18, 2014

Assignee: The Research Foundation for The State University of New York

Inventors: Bo Long, Zhongfei Zhang
METHODS, SYSTEMS, AND COMPUTER-READABLE MEDIA FOR SEMANTICALLY ENRICHING CONTENT AND FOR SEMANTIC NAVIGATION

Publication number: 20140074845

Abstract: Content of different formats may be sourced from various data sources such as content servers and ingested into a data integration server by an ingestion broker embodied on a non-transitory computer readable medium. The ingestion broker may normalize the content of different formats into a uniform representation that can be indexed and delivered across multiple digital channels for a variety of applications. The normalized content may be analyzed and semantic metadata may be determined from the normalized content. The normalized content can be semantically enriched by associating the semantic metadata and the like with the content. The semantic metadata can be stored in a semantic index that can be used for searching via the data integration server. During search, the semantic metadata can be instantiated as facets for user navigation and refinement of search criteria and additional semantic relationships can be assigned to the words in the normalized content.

Type: Application

Filed: November 13, 2013

Publication date: March 13, 2014

Applicant: Open Text Corporation

Inventors: Pascal Dimassimo, Steve Pettigrew, Martin Brousseau, Charles-Olivier Simard, Eric Williams, Francis Lacroix, Alex Dowgailenko, Agostino Deligia, Jean-Michel Texier
METHOD AND SYSTEM FOR IMPLEMENTING SEMANTIC ANALYSIS OF INTERNAL SOCIAL NETWORK CONTENT

Publication number: 20140074844

Abstract: Disclosed is a method, system, and computer program product for semantically analyzing the content within an internal social network. Using the results of the analysis, the executives can gain a better understanding of, and insight into, the organization and its employees. A dashboard tool may be used in some embodiments of the invention to visualize the results of the semantic analysis.

Type: Application

Filed: September 9, 2013

Publication date: March 13, 2014

Applicant: Oracle International Corporation

Inventors: Srividhya SUBRAMANIAN, Mary E.G. BEAR, Mehrshad SETAYESH, Noah HORTON
Sharing parts of a document using search framework

Patent number: 8671078

Abstract: Embodiments are configured to provide sharing of business logic items. A document may contain business logic items, for example, sets, members, or measures. Some business logic items may be created by a publisher who wants to make the business logic available to other users so that others can access the business logic. Embodiments provide for using an integrated server platform search component to automatically retrieve business logic items which exist in one or more documents stored in a document library. This may allow for a publisher to provide business logic to other users without having to rely on the other users to retrieve the business logic from a specific document, and without requiring the other users to know of the existence of the business logic. Restrictions may be placed so that a publisher can control what specific pieces of business logic may be made available.

Type: Grant

Filed: August 31, 2011

Date of Patent: March 11, 2014

Assignee: Microsoft Corporation

Inventors: Josh C. Zimmerman, David Scott Gustafson, Kurt Leonard Ziegler
Labeling Product Identifiers and Navigating Products

Publication number: 20140067815

Abstract: The present disclosure provides example methods and apparatuses of labeling product identifiers and methods of navigating products. Description information of one or more products is extracted. The description information of the products is clustered into a text. A subject analysis is applied to the text by using a text analysis method based on subject models to obtain one or more subjects and definition names for the subjects. A subject that is correlated to the description information of the product is used as an identifier of the product to label the product. The present techniques label the products with identifiers that have one or more user dimension attributes so that users may easily and intuitively find their desired products.

Type: Application

Filed: September 3, 2013

Publication date: March 6, 2014

Applicant: Alibaba Group Holding Limited

Inventors: Changlong Sun, Anxiang Zeng
Method and system for ontology candidate selection, comparison, and alignment

Patent number: 8655882

Abstract: A system for ontology candidate selection and comparison including a microprocessor and an ontology candidate selection component executing on the microprocessor and configured to compare at least a portion of a plurality of ontology candidates based on a candidate selection rule, and based on said comparison, select from the plurality of ontology candidates a pair of ontologies. The system further includes an ontology similarity component coupled to the ontology candidate selection component and configured to generate a similarity outcome related to the pair of ontologies based on a similarity rule and evaluate at least one of: the candidate selection rule or the similarity rule based on the similarity outcome.

Type: Grant

Filed: August 31, 2011

Date of Patent: February 18, 2014

Assignee: Raytheon Company

Inventors: Donald R. Kretz, William D. Phillips, Bruce E. Peoples, Justin W. Toennies
Computer-implemented system and method for generating a display of document clusters

Patent number: 8650190

Abstract: A computer-implemented system and method for generating a display of document clusters is described. Clusters of documents are presented in a multi-dimensional concept space. At least one document is selected from a collection of documents to be clustered. An angle θ of the document relative to a common origin of the multi-dimensional concept space is computed. The selected document is compared with each of the clusters. An angle σ from the common origin is determined for each cluster. A difference between the angle θ for the document and the angle σ for the cluster is determined. The difference is compared to the variance, and a new cluster is created when the difference exceeds the variance for all the clusters.

Type: Grant

Filed: March 14, 2013

Date of Patent: February 11, 2014

Assignee: FTI Technology LLC

Inventor: Dan Gallivan
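
The angle comparison described in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the vector representation of documents, the fixed reference direction, the use of a cluster's mean member angle as the cluster angle, and the names `angle_from_origin`, `assign_to_clusters`, `theta`, `sigma`, and `variance` are all assumptions made for illustration. Document vectors are assumed to be non-zero.

```python
import math

def angle_from_origin(vec, reference):
    """Angle in radians between vec and a reference direction, both from the origin."""
    dot = sum(a * b for a, b in zip(vec, reference))
    norm = math.hypot(*vec) * math.hypot(*reference)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def assign_to_clusters(docs, reference, variance):
    """Assign each document to the closest cluster by angle, or start a new cluster.

    A new cluster is created when the angle difference exceeds `variance`
    for every existing cluster (assumed reading of the abstract).
    """
    clusters = []  # each cluster is a list of member angles
    labels = []
    for vec in docs:
        theta = angle_from_origin(vec, reference)  # document angle
        best = None
        for idx, members in enumerate(clusters):
            sigma = sum(members) / len(members)  # cluster angle (assumed: mean of members)
            diff = abs(theta - sigma)
            if diff <= variance and (best is None or diff < best[0]):
                best = (diff, idx)
        if best is None:
            clusters.append([theta])
            labels.append(len(clusters) - 1)
        else:
            clusters[best[1]].append(theta)
            labels.append(best[1])
    return labels

labels = assign_to_clusters([(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)], (1.0, 0.0), 0.3)
print(labels)
```

In this example the first two vectors point in nearly the same direction and share a cluster, while the third is far enough away in angle to start a new one.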
METHOD AND APPARATUS FOR ANALYZING A DOCUMENT

Publication number: 20140040270

Abstract: Method, apparatus, and computer-readable medium are provided for analyzing a document including text. In one example, a method for identifying patterns in a document is described. The method includes identifying a plurality of candidate phrases in the document based on candidate identification criteria, grouping the candidate phrases of the plurality of candidate phrases with a phrase family based on family criteria and comparison between candidate phrases of the plurality of candidate phrases to obtain consistent phrases, and, for remaining phrases not meeting all of the candidate identification criteria, associating at least one of the remaining phrases with a phrase family based on inconsistent phrase criteria to obtain inconsistent phrases. Identified in this manner, the inconsistent phrase may be displayed via a user interface to permit a user the opportunity to determine whether an inconsistent phrase requires modification.

Type: Application

Filed: July 31, 2012

Publication date: February 6, 2014

Applicant: Freedom Solutions Group, LLC, d/b/a Microsystems

Inventors: Thomas O'Sullivan, Andrzej Jachowicz
ELECTRONIC CONTENT CHANGE TRACKING

Publication number: 20140033088

Abstract: Apparatus, systems, and methods may operate to transmit and receive information, such as between a client and a server, that enables the display of a plurality of version indicators corresponding to a plurality of versions of electronic content, the plurality of versions comprising a first version newer than a second version. Further activities may include detecting selection of, and then displaying, a first selection indicator to indicate selection of the first version and a second selection indicator to indicate selection of the second version. Further activity may include communicating information to enable displaying, at substantially the same time as the first and second selection indicators, at least a portion of a plurality of changes between the first version and the second version. Additional apparatus, systems, and methods are disclosed.

Type: Application

Filed: October 8, 2008

Publication date: January 30, 2014

Inventor: Robert Shaver
Fact-based indexing for natural language search

Patent number: 8639708

Abstract: Computer-readable media and a computer system for implementing a natural language search using fact-based structures and for generating such fact-based structures are provided. A fact-based structure is generated using a semantic structure, which represents information, such as text, from a document, such as a web page. Typically, a natural language parser is used to create a semantic structure of the information, and the parser identifies terms, as well as the relationship between the terms. A fact-based structure of a semantic structure allows for a linear structure of these terms and their relationships to be created, while also maintaining identifiers of the terms to convey the dependency of one fact-based structure on another fact-based structure. Additionally, synonyms and hypernyms are identified while generating the fact-based structure to improve the accuracy of the overall search.

Type: Grant

Filed: August 29, 2008

Date of Patent: January 28, 2014

Assignee: Microsoft Corporation

Inventors: Martin Henk Van Den Berg, Daniel Bobrow, Robert D. Cheslow, Barney Pell, Giovanni Lorenzo Thione, Chad Walters
System and method for identifying phrases in text

Patent number: 8639496

Abstract: A method includes accessing text that includes a plurality of words, tagging each of the plurality of words with one of a plurality of parts of speech (POS) tags, and creating a plurality of tokens, each token comprising one of the plurality of words and its associated POS tag. The method further includes clustering one or more of the created tokens into a chunk of tokens, the one or more tokens clustered into the chunk of tokens based on the POS tags of the one or more tokens, and forming a phrase based on the chunk of tokens, the phrase comprising the words of the one or more tokens clustered into the chunk of tokens.

Type: Grant

Filed: January 2, 2013

Date of Patent: January 28, 2014

Assignee: PureDiscovery Corporation

Inventor: Paul A. Jakubik
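
The tag, tokenize, chunk, and phrase steps in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented system: the tiny `TOY_LEXICON` stands in for a real part-of-speech tagger, unknown words default to a noun tag, and the rule that a chunk is a run of adjective and noun tags is a hypothetical choice. The function names are likewise made up for this example.

```python
# Hypothetical stand-in for a real POS tagger: a fixed word-to-tag lookup.
TOY_LEXICON = {
    "the": "DT", "quick": "JJ", "brown": "JJ", "fox": "NN",
    "jumps": "VBZ", "over": "IN", "lazy": "JJ", "dog": "NN",
}

def tag_words(words):
    """Create tokens, each pairing a word with its POS tag (unknown words default to NN)."""
    return [(word, TOY_LEXICON.get(word.lower(), "NN")) for word in words]

def chunk_phrases(tokens, chunk_tags=("JJ", "NN")):
    """Cluster consecutive tokens whose tags are in chunk_tags, then join each chunk into a phrase."""
    phrases, chunk = [], []
    for word, tag in tokens:
        if tag in chunk_tags:
            chunk.append(word)
        elif chunk:
            phrases.append(" ".join(chunk))
            chunk = []
    if chunk:  # flush a chunk that runs to the end of the text
        phrases.append(" ".join(chunk))
    return phrases

tokens = tag_words("The quick brown fox jumps over the lazy dog".split())
print(chunk_phrases(tokens))  # ['quick brown fox', 'lazy dog']
```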
Automatic expansion of an advertisement offer inventory

Patent number: 8635107

Abstract: An extensible offer inventory database of offers in a domain is established. Further, an offer ontology is generated based on the extensible offer inventory database. The offer ontology provides an extensible vocabulary that correlates to categories in the offer inventory database. In addition, offers are automatically located. The offers are also semantically analyzed to generate semantic analysis data. Further, user data is obtained. In addition, an optimal offer match is automatically determined based upon the semantic analysis data and the user data.

Type: Grant

Filed: June 3, 2011

Date of Patent: January 21, 2014

Assignee: Adobe Systems Incorporated

Inventors: Walter Chang, Geoff Baum
Contextual text interpretation

Patent number: 8620918

Abstract: Among other disclosed subject matter, a computer-implemented method includes receiving a plurality of electronic documents associated with a domain at a server. Each of the plurality of electronic documents includes meta-data and textual content. The method includes identifying one or more text strings in the textual content that are to be processed differently than an identical or similar text string in other electronic documents, and associating, with the electronic document, data indicating that each of the identified text strings is to be processed differently than an identical or similar text string in other electronic documents. The method also includes performing an analysis of the electronic documents to identify one or more subsets of the electronic documents that include related subject matter. A plurality of degrees of relatedness can be associated with text strings associated with data indicating that each of the text strings is to be processed differently.

Type: Grant

Filed: February 1, 2012

Date of Patent: December 31, 2013

Assignee: Google Inc.

Inventors: Aner Ben-Artzi, Kirill Buryak, Glenn M. Lewis, Jun Peng, Nadav Benbarak
System and method for a unified semantic ranking of compositions of ontological subjects and the applications thereof

Patent number: 8612445

Abstract: The present invention discloses methods, systems, and tools for unified semantic ranking of compositions of ontological subjects. The method breaks a composition into a plurality of partitions as well as its constituent ontological subjects of different orders and builds a participation matrix indicating the participation of ontological subjects of the composition in other ontological subjects, i.e. the partitions, of the composition. Using the participation information of the ontological subjects into each other a similarity matrix is built from which the semantic importance ranks of the partitions of the composition are calculated. The method, systematically, enables the calculation of semantic ranks of the ontological subjects of different orders of the composition. Various systems for implementing the method and numerous applications and services are disclosed.

Type: Grant

Filed: April 7, 2010

Date of Patent: December 17, 2013

Inventor: Hamid Hatami-Hanza
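
The participation matrix and similarity matrix mentioned in the abstract above can be illustrated with a small sketch. The abstract does not give the formulas, so everything specific here is an assumption: partitions are modeled as sets of words, the participation matrix is binary, similarity is the count of shared words, and a partition's rank score is the sum of its similarities to the other partitions. The names `participation_matrix` and `rank_partitions` are hypothetical.

```python
def participation_matrix(partitions):
    """Rows are words (lower-order subjects); columns are partitions; 1 means the word participates."""
    vocab = sorted({word for part in partitions for word in part})
    matrix = [[1 if word in part else 0 for part in partitions] for word in vocab]
    return vocab, matrix

def rank_partitions(partitions):
    """Rank partitions by summed similarity to the other partitions (assumed scoring rule)."""
    _, pm = participation_matrix(partitions)
    n = len(partitions)
    # similarity[i][j] = number of words shared by partitions i and j
    similarity = [[sum(row[i] * row[j] for row in pm) for j in range(n)] for i in range(n)]
    scores = [sum(similarity[i][j] for j in range(n) if j != i) for i in range(n)]
    order = sorted(range(n), key=lambda i: -scores[i])
    return order, scores

sentences = [
    {"semantic", "ranking"},
    {"semantic", "matrix", "ranking"},
    {"weather"},
]
order, scores = rank_partitions(sentences)
print(order, scores)
```

A partition that shares no words with the others, like the third one here, ends up with the lowest score under this rule.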
Clustering documents using citation patterns

Patent number: 8612411

Abstract: Systems and methods for clustering documents, such as for scientific documents, taking into account the citation patterns of the documents are disclosed. In one embodiment, the method includes locating citations to other documents, e.g., search result documents, comparing each pair of documents to be clustered for overlapping citations in a first, a more specific second, and an even more specific optional third citation generality, and determining clusters of related documents based on the comparisons. The levels of generalities may be, for example, document-, paragraph-, and/or citation-level generalities. The locating may locate only citations to the other documents to be clustered. The clusters may be determined based on a weighted score of the amount of overlapping citations in the various generalities and/or by performing factor analysis using the comparison results. The clusters may be ranked to determine the dominant clusters.

Type: Grant

Filed: December 31, 2003

Date of Patent: December 17, 2013

Assignee: Google Inc.

Inventor: Vibhu O. Mittal
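
A weighted citation-overlap score of the kind the abstract above describes might look like the sketch below. It is only an illustration: documents are modeled as lists of paragraphs, each paragraph a set of cited document IDs; the weights, the threshold, the paragraph-level measure (pairs of paragraphs sharing at least one citation), and the use of connected components to form clusters are all assumptions, as are the names `overlap_score` and `cluster_documents`.

```python
from itertools import combinations

def overlap_score(doc_a, doc_b, w_doc=1.0, w_par=2.0):
    """Weighted sum of document-level and paragraph-level citation overlap."""
    cites_a = set().union(*doc_a)
    cites_b = set().union(*doc_b)
    doc_level = len(cites_a & cites_b)  # citations shared anywhere in the two documents
    par_level = sum(1 for pa in doc_a for pb in doc_b if pa & pb)  # paragraph pairs sharing a citation
    return w_doc * doc_level + w_par * par_level

def cluster_documents(docs, threshold):
    """Link documents whose score meets the threshold; return clusters, largest first."""
    parent = {name: name for name in docs}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(docs, 2):
        if overlap_score(docs[a], docs[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in docs:
        groups.setdefault(find(name), set()).add(name)
    # Rank clusters by size so the dominant clusters come first.
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))

docs = {
    "A": [{"x", "y"}, {"z"}],
    "B": [{"x"}, {"z", "q"}],
    "C": [{"m"}],
}
print(cluster_documents(docs, threshold=3))
```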
LEXICAL ENRICHMENT OF STRUCTURED AND SEMI-STRUCTURED DATA

Publication number: 20130332458

Abstract: Generally discussed herein are systems and methods for lexically enriching structured and semi-structured data. In one or more embodiments, a method can include receiving a code, lexicalizing the code, lexically combining the lexicalized code with a lexical descriptor, and sending the lexical combination to a keyword database.

Type: Application

Filed: June 12, 2013

Publication date: December 12, 2013

Inventor: Arthur R. Culbertson
Latent metonymical analysis and indexing (LMAI)

Patent number: 8583419

Abstract: The present invention relates to Latent Metonymical Analysis and Indexing (LMai), a novel concept for advanced machine learning or unsupervised machine learning techniques, which uses a statistical approach to identify the relationship between the words in a set of given documents (unstructured data). This approach does not necessarily need training data to make decisions on matching the related words together but has the ability to do the classification by itself. All that is needed is to give the algorithm a set of natural documents. The method is elegant enough to classify the relationships automatically without any human guidance during the process, as shown in FIGS. 6 and 7.

Type: Grant

Filed: April 2, 2007

Date of Patent: November 12, 2013

Inventor: Syed Yasin
Document representation transitioning

Patent number: 8584011

Abstract: One or more techniques and/or systems are provided for transitioning between representations of an electronic document. Elements, such as visual elements, common between a first set of elements from a first representation of the document and a second set of elements from a second representation of the document are identified. The non-intersecting elements from the first and second sets are respectively ranked in accordance with a representation relevance. First set non-intersecting elements are removed from an intermediate representation of the document, and second set non-intersecting elements are added to the intermediate representation, while the intermediate representation is not equivalent to the second representation; and respective iterations of the intermediate representation are output, such as to a display to depict a transition from the first representation of the document to the second representation of the document.

Type: Grant

Filed: June 22, 2010

Date of Patent: November 12, 2013

Assignee: Microsoft Corporation

Inventors: Jaime Teevan, Susan T. Dumais, Daniel J. Liebling
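
The transition described in the abstract above, where elements unique to the first representation are removed and elements unique to the second are added step by step, can be sketched as follows. This is a simplified illustration, not the patented technique: elements are plain strings, `relevance` is a caller-supplied scoring function, and the ordering (remove the least relevant first, then add the most relevant first) is an assumption. The name `transition_frames` is hypothetical.

```python
def transition_frames(first, second, relevance):
    """Return the intermediate representations from `first` to `second`, one change per frame."""
    # Non-intersecting elements of each set, ranked by relevance (assumed ordering).
    to_remove = sorted((e for e in first if e not in second), key=relevance)
    to_add = sorted((e for e in second if e not in first), key=relevance, reverse=True)

    current = list(first)
    frames = [list(current)]
    for element in to_remove:
        current.remove(element)
        frames.append(list(current))
    for element in to_add:
        current.append(element)
        frames.append(list(current))
    return frames

scores = {"ad": 0.1, "summary": 0.9}
frames = transition_frames(
    ["title", "ad", "body"],
    ["title", "body", "summary"],
    relevance=lambda e: scores.get(e, 0.0),
)
for frame in frames:
    print(frame)
```

Each frame differs from the previous one by a single element, and the last frame contains the same elements as the second representation.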
METHOD AND APPARATUS FOR PROCESSING ELECTRONIC DATA

Publication number: 20130290338

Abstract: A system (100) for generating a computer readable data file representative of a mapping between a first representation of a set of concepts or of a data structure (e.g. a database schema) and a second representation of a set of concepts or of a data structure (e.g. an ontology), each representation comprising a plurality of complex representational elements (e.g. tables in a database schema and concepts in an ontology) each of which may itself include a number of associated subordinate representational elements (e.g. columns/fields of a table in a database schema and attributes of a concept in an ontology).

Type: Application

Filed: December 23, 2011

Publication date: October 31, 2013

Applicant: BRITISH TELECOMMUNICATIONS PUBLIC LIMITED COMPANY

Inventors: Beum Seuk Lee, Zhan Cui
Generation of annotation tags based on multimodal metadata and structured semantic descriptors

Patent number: 8572086

Abstract: In one embodiment, a method of generating annotation tags (28) for a digital image (22) includes maintaining a library (16) of human-meaningful words or phrases organized as category entries (72) according to a number of defined image description categories (70), and receiving context metadata (20) associated with the capture of a given digital image (22). The method further includes selecting particular category entries (72-1, 72-2) as vocabulary metadata (24) for the digital image (22) by mapping the context metadata (20) into the library (16), and generating annotation tags (28) for the digital image (22) by logically combining the vocabulary metadata (24) according to a defined set of deductive logic rules (30) that are predicated on the defined image description categories (70). In another embodiment, a processing apparatus (12), such as a digital processor (18, 26) and supporting memory (14), etc., is configured to carry out the above method, or to carry out variations of the above method.

Type: Grant

Filed: January 21, 2009

Date of Patent: October 29, 2013

Assignee: Telefonaktiebolaget LM Ericsson (publ)

Inventors: Joakim Soderberg, Jonas Bjork, Andreas Fasbender
Entity clustering via data services

Patent number: 8572089

Abstract: A method is provided for forming an entity cluster. In this method, a plurality of entities found in one or more data sources are identified. An entity may represent a word or a phrase found in the one or more data sources. The plurality of entities may then be organized into groups, where each group has a master entity and a set of subordinate entities. The groups are formed using a first comparison criteria. Then, using a second comparison criteria, a first group is associated with a second group. The second comparison criteria may compare the master entities associated with the first and second groups. Based on the association between the first group and the second group, the method can then determine that the first entity is related to the second entity.

Type: Grant

Filed: December 15, 2011

Date of Patent: October 29, 2013

Assignee: Business Objects Software Ltd.

Inventor: Kimberly Starks
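
The two-stage grouping in the abstract above can be illustrated with a short sketch. The abstract does not specify either comparison criterion, so both are assumptions here: the first treats entities as equal after lowercasing and dropping non-alphanumeric characters, and the second associates two groups when one master entity's words are a subset of the other's. Taking the first entity seen as the master, and the names `form_groups`, `masters_match`, and `related`, are also hypothetical.

```python
from collections import defaultdict

def form_groups(entities):
    """First criterion (assumed): group entities that match after normalization."""
    buckets = defaultdict(list)
    for entity in entities:
        key = "".join(ch for ch in entity.lower() if ch.isalnum())
        buckets[key].append(entity)
    # The first entity seen is the master; the rest are subordinates.
    return [(members[0], members[1:]) for members in buckets.values()]

def masters_match(master_a, master_b):
    """Second criterion (assumed): one master's words are a subset of the other's."""
    words_a = set(master_a.lower().split())
    words_b = set(master_b.lower().split())
    return words_a <= words_b or words_b <= words_a

def related(entity_a, entity_b, groups):
    """Two entities are related if they share a group or their groups' masters match."""
    def master_of(entity):
        for master, subordinates in groups:
            if entity == master or entity in subordinates:
                return master
        return None

    ma, mb = master_of(entity_a), master_of(entity_b)
    if ma is None or mb is None:
        return False
    return ma == mb or masters_match(ma, mb)

groups = form_groups(["IBM", "I.B.M.", "IBM Corp", "ibm corp", "Oracle"])
print(groups)
print(related("I.B.M.", "ibm corp", groups))
print(related("I.B.M.", "Oracle", groups))
```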
Automated rich presentation of a semantic topic

Patent number: 8572088

Abstract: Automated rich presentation of a semantic topic is described. In one aspect, respective portions of multimodal information corresponding to a semantic topic are evaluated to locate events associated with the semantic topic. The probability that a document belongs to an event is determined based on document inclusion of one or more of persons, times, locations, and keywords, and document distribution along a timeline associated with the event. For each event, one or more documents objectively determined to be substantially representative of the event are identified. One or more other types of media (e.g., video, images, etc.) related to the event are then extracted from the multimodal information. The representative documents and the other media are for presentation to a user in a storyboard.

Type: Grant

Filed: October 21, 2005

Date of Patent: October 29, 2013

Assignee: Microsoft Corporation

Inventors: Lie Lu, Wei-Ying Ma, Zhiwei Li
