Latent Semantic Index or Analysis (LSI or LSA) Patents (Class 707/739)
  • Publication number: 20130013612
    Abstract: Certain example embodiments relate to techniques for analyzing documents. A plurality of documents/document portions are imported into a database, with at least some of the documents/document portions being structured and at least some being unstructured. The imported documents/document portions are organized into one or more collections. A selection of at least one of the one or more collections is made. An index of words and/or groups of words is built (and optionally refined in accordance with one or more predefined rules) based on each of the document or document portion in each selection. A document-word matrix is built (and optionally weighted using a semantic approach), with the matrix including a value indicative of a number of times each word and/or group of words in the index appears in each document/document portion. One or more clusters of documents are generated using the document-word matrix.
    Type: Application
    Filed: July 7, 2011
    Publication date: January 10, 2013
    Applicant: Software AG
    Inventors: Klaus FITTGES, Khalid El Mansouri
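The index and document-word matrix described in this abstract can be illustrated with a minimal sketch. It is not the applicant's implementation: the whitespace tokenizer, the function names (`build_index`, `document_word_matrix`, `cosine`), and the sample documents are all assumptions made for illustration, and cosine similarity stands in for whatever clustering input the filing actually uses.

```python
import math
from collections import Counter

def build_index(docs):
    """Return a sorted list of the distinct words across all documents (assumed tokenizer: lowercase + split)."""
    return sorted({word for doc in docs for word in doc.lower().split()})

def document_word_matrix(docs, index):
    """One row per document, one column per index word; each cell is a raw occurrence count."""
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts[word] for word in index])
    return rows

def cosine(u, v):
    """Cosine similarity between two count rows; 0.0 if either row is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical sample collection.
docs = [
    "latent semantic analysis",
    "semantic index semantic search",
    "backup storage index",
]
index = build_index(docs)
matrix = document_word_matrix(docs, index)
```

Rows with a high pairwise cosine value would be candidates for the same cluster; a weighting scheme could be applied to `matrix` before that comparison.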
  • Patent number: 8352472
    Abstract: Systems and methods for managing electronic data are disclosed. Various data management operations can be performed based on a metabase formed from metadata. Such metadata can be identified from an index of data interactions generated by a journaling module, and obtained from their associated data objects stored in one or more storage devices. In various embodiments, such processing of the index and storing of the metadata can facilitate, for example, enhanced data management operations, enhanced data identification operations, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data. In various embodiments, the metabase can be configured in different ways. For example, the metabase can be stored separately from the data objects so as to allow obtaining of information about the data objects without accessing the data objects or a data structure used by a file system.
    Type: Grant
    Filed: March 2, 2012
    Date of Patent: January 8, 2013
    Assignee: CommVault Systems, Inc.
    Inventors: Anand Prahlad, Jeremy Alan Schwartz, David Ngo, Brian Brockway, Marcus S. Muller
  • Patent number: 8346775
    Abstract: The different illustrative embodiments provide a method, a computer program product, and an apparatus for managing information. A request to store text in a table in a database is received. A determination is made as to whether a first collection of textual information having a first concept that is related to a second concept for the text is present in the database responsive to receiving the request containing the text. The text is associated with the first collection of textual information in the database responsive to a determination that the first collection of textual information in the database having the first concept that is related to the second concept for the text is present in the database. A second collection for the data with a third concept that is related to the second concept for the text within the degree of relatedness is created.
    Type: Grant
    Filed: August 31, 2010
    Date of Patent: January 1, 2013
    Assignee: International Business Machines Corporation
    Inventors: Sandra K. Johnson, Grant D. Miller, Robert F. Pryor
  • Publication number: 20120330959
    Abstract: A method for assessing a person's security risk includes receiving data from a plurality of disparate data sources in which at least two of the plurality of disparate data sources maintain their respective data in different manners. The method also includes identifying at least one item of data from at least two different data sources that correspond to a first real-world person. The method further includes merging the items from the at least two different data sources into a first record associated with the first real-world person. The method additionally includes identifying one or more relationships between the first real-world person and one or more other real-world people. The method also includes adding the identified one or more relationships to the first record associated with the first real-world person. The method further includes determining a level of risk associated with the first real-world person based on the first record.
    Type: Application
    Filed: June 27, 2011
    Publication date: December 27, 2012
    Applicant: Raytheon Company
    Inventors: Donald R. Kretz, Roderic W. Paulk
  • Patent number: 8341158
    Abstract: A computer-implemented method includes receiving a dataset representing a plurality of users, a plurality of items, and a plurality of ratings given to items by users; clustering the plurality of users into a plurality of user-groups such that at least one user belongs to more than one user-group; clustering the plurality of items into a plurality of item-groups such that at least one item belongs to more than one item-group; inducing a model describing a probabilistic relationship between the plurality of users, items, ratings, user-groups, and item-groups, the induced model defined by a plurality of model parameters; and predicting a rating of a user for an item using the induced model.
    Type: Grant
    Filed: November 21, 2005
    Date of Patent: December 25, 2012
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventor: Chiranjit Acharya
  • Patent number: 8341159
    Abstract: Methods, apparatus and systems are provided to generate from a set of training documents a set of training data and a set of features for a taxonomy of categories. In this generated taxonomy the degree of feature overlap among categories is minimized in order to optimize use with a machine-based categorizer. However, the categories still make sense to a human because a human makes the decisions regarding category definitions. In an example embodiment, for each category, a plurality of training documents selected using Web search engines is generated, the documents winnowed to produce a more refined set of training documents, and a set of features highly differentiating for that category within a set of categories (a supercategory) extracted. This set of training documents or differentiating features is used as input to a categorizer, which determines for a plurality of test documents the plurality of categories to which they best belong.
    Type: Grant
    Filed: April 12, 2007
    Date of Patent: December 25, 2012
    Assignee: International Business Machines Corporation
    Inventor: Stephen C. Gates
  • Publication number: 20120323920
    Abstract: A method for creating a semantically aggregated index in an indexer-agnostic index building system includes: extracting documents from a data source, each document including a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Application
    Filed: August 24, 2012
    Publication date: December 20, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Patent number: 8335791
    Abstract: Tools and techniques are described herein for detecting synonyms and merging synonyms into search indexes. The tools provide methods that include receiving input documents for indexing into a search index file. The tools may compare parts of the input documents to parts of other documents already indexed into the search index file. The methods may also evaluate, based on these comparisons, whether the input document and the existing document are sufficiently similar to justify an inference that any dissimilar terms between the input document and the existing document are candidate synonyms. Other methods may include receiving requests to perform searches that include one or more input keywords. The method then searches for links to synonyms of the input keyword, and returns search results responsive to the input keyword and to the synonyms.
    Type: Grant
    Filed: December 28, 2006
    Date of Patent: December 18, 2012
    Assignee: Amazon Technologies, Inc.
    Inventors: Michel L. Goldstein, Walter Manching Tseng, Randall Winston Puttick
  • Patent number: 8332416
    Abstract: A specification establishing method for controlling a semiconductor process includes the steps of: sampling a plurality of sample groups from a population, each sample group being a non-normal distribution; filtering the sample groups; summarizing the filtered sample groups to form a non-normal distribution diagram; getting a value-at-risk and a median by calculating from the non-normal distribution diagram; getting a critical value by calculating the value-at-risk and the median with a critical formula; getting a plurality of state values by calculating the filtered sample groups with a proportion formula; and getting an index value by calculating the non-normal distribution diagram with the proportion formula. Thus, comparing the state values to the index value indicates whether the states of the sample groups are abnormal.
    Type: Grant
    Filed: January 11, 2011
    Date of Patent: December 11, 2012
    Assignee: Inotera Memories, Inc.
    Inventors: Cheng-Hao Chen, Yun-Zong Tian, Shih-Chang Kao, Yij Chieh Chu, Wei Jun Chen
  • Publication number: 20120310939
    Abstract: In accordance with the teachings described herein, systems and methods are provided for clustering time series based on forecast distributions. A method for clustering time series based on forecast distributions may include: receiving time series data relating to one or more aspects of a physical process; applying a forecasting model to the time series data to generate forecasted values and confidence intervals associated with the forecasted values, the confidence intervals being generated based on distribution information relating to the forecasted values; generating a distance matrix that identifies divergence in the forecasted values, the distance matrix being generated based on the distribution information relating to the forecasted values; and performing a clustering operation on the plurality of forecasted values based on the distance matrix. The distance matrix may be generated using a symmetric Kullback-Leibler divergence algorithm.
    Type: Application
    Filed: June 6, 2011
    Publication date: December 6, 2012
    Inventors: Taiyeong Lee, David Rawlins Duling
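The symmetric Kullback-Leibler distance matrix mentioned in this abstract can be sketched as follows. The abstract does not say which distribution family the forecasts follow, so this example assumes each forecast is a univariate Gaussian given as a hypothetical `(mean, std)` pair; the function names are likewise illustrative only.

```python
import math

def kl_gaussian(mean_p, std_p, mean_q, std_q):
    """KL(P || Q) for two univariate Gaussians (closed form)."""
    return (
        math.log(std_q / std_p)
        + (std_p ** 2 + (mean_p - mean_q) ** 2) / (2 * std_q ** 2)
        - 0.5
    )

def symmetric_kl(p, q):
    """Symmetrized divergence: KL(P || Q) + KL(Q || P)."""
    return kl_gaussian(*p, *q) + kl_gaussian(*q, *p)

def distance_matrix(forecasts):
    """Pairwise symmetric KL divergence between forecast distributions."""
    n = len(forecasts)
    return [
        [symmetric_kl(forecasts[i], forecasts[j]) for j in range(n)]
        for i in range(n)
    ]

# Hypothetical forecasts: (forecast mean, forecast standard deviation).
forecasts = [(100.0, 5.0), (102.0, 5.0), (150.0, 20.0)]
dist = distance_matrix(forecasts)
```

The resulting matrix has a zero diagonal and is symmetric, so it can be passed to any clustering routine that accepts precomputed distances. Here the first two series sit much closer to each other than either does to the third.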
  • Patent number: 8326836
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing time series information with search results. In one aspect, a method includes determining that a first query is indicative of a request for time series information; generating a cost estimate that quantifies one or more costs of including the time series information with one or more search results, each search result including a resource locator that references a corresponding resource determined to be responsive to the query; generating a benefit estimate; determining to generate the time series information when the benefit estimate is greater than the cost estimate and generating the time series information in response to the determination, wherein generating the time series information includes collecting responsive time series information from one or more resources; and determining to not generate the time series information when the cost estimate is greater than the benefit estimate.
    Type: Grant
    Filed: July 13, 2010
    Date of Patent: December 4, 2012
    Assignee: Google Inc.
    Inventors: Geoffrey Roeder Pike, Luigi Semenzato
  • Publication number: 20120303610
    Abstract: A system and method are provided for determining a dynamic relation tree based on images in an image collection. An example system includes a memory for storing computer executable instructions, and a processing unit for accessing the memory and executing the computer executable instructions. The computer executable instructions include an event classifier to classify main characters of images in an image collection as to an event identification based on events in which the main characters appear, wherein each main character is characterized as to at least one attribute; a relation determination engine to determine relation circles of the main characters; and a construction engine to construct a dynamic relation tree representative of relations among the main characters, where the dynamic relation tree provides representations of the positions of the main characters in the relation circles, and where views of the dynamic relation tree change when different time periods are specified.
    Type: Application
    Filed: May 25, 2011
    Publication date: November 29, 2012
    Inventor: Tong Zhang
  • Publication number: 20120290571
    Abstract: Aggregation, analysis, and presentation of patent and business data in a common interface are described. The analysis includes techniques for evaluating a patent or patent application by examining claim-related information. These techniques include deriving unique signatures of individual claims and ascertaining scope of individual claims relative to other claims in a collection (such as claims found in a common class). The signature and scope of patent claims may be graphically depicted using various graphics elements in a user interface.
    Type: Application
    Filed: April 15, 2012
    Publication date: November 15, 2012
    Applicant: IP Street
    Inventors: Lewis C. Lee, John Charles Vogel, Chad Eberle
  • Patent number: 8312005
    Abstract: A semantically aware relational database management system includes suitable programming to relate attributes of the relational database to semantic equivalents of such attributes. In response to receiving a query, the relational database management system performs at least one semantically aware operation on the data in the relational database in order to determine what data is to be retrieved in response to the query. Results of the query presented to a user may include data derived from performing the semantically aware operations.
    Type: Grant
    Filed: December 31, 2009
    Date of Patent: November 13, 2012
    Assignee: SAP AG
    Inventors: Maria E. Orlowska, Wasim Sadiq, Shazia Sadiq
  • Patent number: 8312021
    Abstract: One embodiment of the present invention provides a system that builds an association tensor (such as a matrix) to facilitate document and word-level processing operations. During operation, the system uses terms from a collection of documents to build an association tensor, which contains values representing pair-wise similarities between terms in the collection of documents. During this process, if a given value in the association tensor is calculated based on an insufficient number of samples, the system determines a corresponding value from a reference document collection, and then substitutes the corresponding value for the given value in the association tensor. After the association tensor is obtained, a dimensionality reduction method is applied to compute a low-dimensional vector space representation for the vocabulary terms. Document vectors are computed as linear combinations of term vectors.
    Type: Grant
    Filed: September 16, 2005
    Date of Patent: November 13, 2012
    Assignee: Palo Alto Research Center Incorporated
    Inventors: Irina Matveeva, Ayman Farahat
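Two steps from this abstract lend themselves to a short sketch: substituting reference-collection values for association entries that rest on too few samples, and computing a document vector as a linear combination of term vectors. The dimensionality reduction itself is not shown. The data layout (dicts keyed by term pairs), the `min_samples` threshold, and every name and value below are assumptions for illustration, not details taken from the patent.

```python
def fill_sparse_associations(assoc, sample_counts, reference, min_samples=5):
    """Replace association values backed by fewer than min_samples observations
    with the corresponding value from a reference collection, when one exists."""
    filled = {}
    for pair, value in assoc.items():
        if sample_counts.get(pair, 0) < min_samples and pair in reference:
            filled[pair] = reference[pair]
        else:
            filled[pair] = value
    return filled

def document_vector(term_counts, term_vectors):
    """Document vector as a count-weighted sum of low-dimensional term vectors."""
    dims = len(next(iter(term_vectors.values())))
    vec = [0.0] * dims
    for term, count in term_counts.items():
        for i, component in enumerate(term_vectors[term]):
            vec[i] += count * component
    return vec

# Hypothetical pairwise similarities and the number of samples behind each.
assoc = {("car", "auto"): 0.1, ("car", "road"): 0.6}
sample_counts = {("car", "auto"): 2, ("car", "road"): 40}
reference = {("car", "auto"): 0.9}
filled = fill_sparse_associations(assoc, sample_counts, reference)

# Hypothetical term vectors, assumed to come from an earlier reduction step.
term_vectors = {"car": [1.0, 0.0], "road": [0.5, 0.5]}
doc_vec = document_vector({"car": 2, "road": 1}, term_vectors)
```

Only the under-sampled pair is overwritten; well-supported values from the working collection are kept as they are.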
  • Patent number: 8306983
    Abstract: Representing in a database, a collection of items characterized by features. In a data processing system, determining a semantic space representations of the features across the collection. Each representation characterized by parameters and settings, and differing from each other by only one of: the value of one parameter, and the configuration of one setting. Determining, for each feature pair of a set of feature pairs, the relatedness of the first feature to the second feature in each semantic space representation. And representing the collection by the semantic space that provides the best aggregate relatedness across the set of feature pairs.
    Type: Grant
    Filed: October 26, 2009
    Date of Patent: November 6, 2012
    Assignee: Agilex Technologies, Inc.
    Inventor: Roger B. Bradford
  • Patent number: 8301633
    Abstract: Systems and methods for semantic search are provided. A corpus of information grouped into passages are indexed by semantic key terms generated from packed knowledge representations that document the semantic relationships of information within those passages. When a search is conducted, a query is similarly transformed into a packed knowledge representation that documents the semantic relationships from which semantic key terms are also generated. An inverted index relating the semantic key terms associated to the passages is searched using the semantic key terms generated from the query. A set of candidate passages is selected and refined by analysis of the semantic key terms and other information. The semantic representations associated with the set of candidate passages are then matched to the semantic representation of the query to determine a search result set.
    Type: Grant
    Filed: October 1, 2007
    Date of Patent: October 30, 2012
    Assignee: Palo Alto Research Center Incorporated
    Inventor: Robert D. Cheslow
  • Publication number: 20120271828
    Abstract: In one implementation, a method includes receiving a request for translation of one or more first keywords from a source language to a target language; and translating, using a machine translation process, the first keywords from the source language into a plurality of second keywords in the target language. The method can also include determining, by a computer system, frequencies with which each of the second keywords occur in a corpus associated with the target language. The method can further include selecting, by the computer system, a subset of the second keywords to use in the target language based on the determined frequencies of occurrence.
    Type: Application
    Filed: April 21, 2011
    Publication date: October 25, 2012
    Applicant: Google Inc.
    Inventor: Mandayam Thondanur Raghunath
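The frequency-based selection step in this abstract can be sketched in a few lines. The machine translation step is not modeled: the candidate translations are supplied by hand, and the corpus, the `top_k` parameter, and the function name are hypothetical.

```python
from collections import Counter

def select_keywords(candidates, corpus_tokens, top_k=2):
    """Rank candidate target-language keywords by how often each occurs in the
    target-language corpus and keep the top_k most frequent."""
    counts = Counter(corpus_tokens)
    ranked = sorted(candidates, key=lambda kw: counts[kw], reverse=True)
    return ranked[:top_k]

# Hypothetical candidate translations of one source keyword.
candidates = ["carro", "auto", "coche"]
# Hypothetical target-language corpus, already tokenized.
corpus_tokens = "el coche rojo y el coche azul junto al auto".split()

selected = select_keywords(candidates, corpus_tokens)
```

Candidates that never appear in the corpus fall to the bottom of the ranking, which is how rare or unnatural translations would be filtered out under this assumption.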
  • Patent number: 8296297
    Abstract: A content analysis and correlation service system can include a summary manager service for generating content correlation summaries, wherein the generated content correlation summaries are based on discovered content and analyzed content based on the discovered content. The system can include a content search manager service for generating the discovered content based on search criteria and correlation criteria and a semantic analysis service for generating the analyzed content based on the discovered content. The system can also include a data store for storing the generated content correlation summaries and a notification service for providing notifications based on the generated content correlation summaries.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: October 23, 2012
    Assignee: Novell, Inc.
    Inventors: Tammy Green, Stephen R. Carter, Scott Alan Isaacson
  • Patent number: 8296302
    Abstract: The present invention provides a method and system for extending content based on the semantic meaning of content. It divides content into multiple content regions and finds words and/or phrases that are semantically relevant to the current content region and appends these words and/or phrases to the current content region as extended content. The extended content matches semantically with the original content in such a seamless way that users may think it is a part of the content.
    Type: Grant
    Filed: May 4, 2009
    Date of Patent: October 23, 2012
    Inventor: Gang Qiu
  • Patent number: 8290958
    Abstract: A system and method may be disclosed for facilitating the creation or modification of a document by providing a mechanism for locating relevant data from external sources and organizing and incorporating some or all of said data into the document. In the method for reusing data, there may be a set of documents that may be queried, where each document may be divided into a plurality of sections. A plurality of section text groups may be formed based on the set of documents, where each section text group may be associated with a respective section from the plurality of sections and each section group includes a plurality of items. Each item may be associated with a respective section from each document of the set of documents. A selected item within a selected section text group may be focused. The selected item may be extracted to a current document. The current document may be exported to a host application.
    Type: Grant
    Filed: May 30, 2003
    Date of Patent: October 16, 2012
    Assignee: Dictaphone Corporation
    Inventors: Keith W. Boone, Sunitha Chaparala, Cameron Fordyce, Sean Gervais, Roubik Manoukian, Harry J. Ogrinc, Robert G. Titemore, Jeffrey G. Hopkins
  • Publication number: 20120259854
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium including receiving user interaction data, wherein the user interaction specifies user interactions with content items and conversion items. A conversion item is a user action that satisfies a predetermined conversion criteria. The method includes receiving conversion data including conversion path data for a plurality of conversion paths, wherein each conversion path includes user interaction data prior to and including a conversion event. The method includes determining a first interaction, an assist interaction or a last interaction with content items for the conversion event. The method includes providing an ability to define a segment, using a processor, the conversion path data based on path-level dimensions and path-level metrics.
    Type: Application
    Filed: April 11, 2011
    Publication date: October 11, 2012
    Inventors: Sissie Ling-Ie HSIAO, Cameron Tangney, Nicholas Seckar, Brian Chatham
  • Publication number: 20120259853
    Abstract: Methods and systems for relating breaking news stories across content providers include receiving a breaking news headline for a breaking news from a content provider. The breaking news headline is tokenized in substantial real time by identifying a plurality of headline tokens. A plurality of news stories is received from a plurality of content providers. Each of the plurality of news stories is tokenized to identify a plurality of story tokens. The plurality of headline tokens and story tokens are analyzed to determine if one or more of the news stories are related to the breaking news headline. Based on the analysis, one or more of the news stories are mapped to the breaking news headline. The mapping enables presentation of the one or more news stories from one or more of the content providers while rendering the breaking news headline.
    Type: Application
    Filed: April 11, 2011
    Publication date: October 11, 2012
    Applicant: Yahoo!, Inc.
    Inventors: Abhijit Khasnis, Subramanian Narayanan
  • Publication number: 20120259856
    Abstract: A Website may be automatically categorized by (a) accepting Website information, (b) determining a set of scored clusters (e.g., semantic, term co-occurrence, etc.) for the Website using the Website information, and (c) determining at least one category (e.g., a vertical category) of a predefined taxonomy using at least some of the set of clusters.
    Type: Application
    Filed: June 20, 2012
    Publication date: October 11, 2012
    Inventors: David GEHRKING, Ching LAW, Andrew MAXWELL
  • Publication number: 20120259855
    Abstract: In the provided document clustering system (100), a concept tree structure accumulation unit (11) stores a concept tree structure that represents a hierarchical relationship among concepts represented by each of a plurality of words. For any two words, a concept similarity computation unit (12) obtains a concept similarity, which is an index indicating how close the concepts represented by the two words are. Using concept similarities for words that appear in two documents in a document set, an inter-document similarity computation unit (13) obtains an inter-document similarity, which indicates how similar the two documents are semantically. A clustering unit (14) uses inter-document similarities to cluster the documents in the document set.
    Type: Application
    Filed: December 21, 2010
    Publication date: October 11, 2012
    Applicant: NEC CORPORATION
    Inventors: Hironori Mizuguchi, Dai Kusui
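A concept tree and the two similarity computations in this abstract can be illustrated with a small sketch. The abstract does not give the similarity formula, so this example assumes a Wu-Palmer style measure (twice the depth of the lowest common ancestor divided by the sum of the two depths) and an all-pairs average for documents; the tree contents and all names are hypothetical.

```python
# Hypothetical concept tree stored as child -> parent (root has parent None).
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "poodle": "dog",
    "car": "vehicle",
}

def path_to_root(word):
    """Return the list of concepts from word up to the root, inclusive."""
    path = []
    while word is not None:
        path.append(word)
        word = PARENT[word]
    return path

def concept_similarity(a, b):
    """Assumed measure: 2 * depth(lca) / (depth(a) + depth(b)), root depth = 1."""
    path_a, path_b = path_to_root(a), path_to_root(b)
    ancestors_b = set(path_b)
    lca = next(node for node in path_a if node in ancestors_b)
    return 2 * len(path_to_root(lca)) / (len(path_a) + len(path_b))

def document_similarity(words_a, words_b):
    """Assumed aggregation: mean concept similarity over all word pairs."""
    pairs = [(a, b) for a in words_a for b in words_b]
    return sum(concept_similarity(a, b) for a, b in pairs) / len(pairs)

score = document_similarity(["dog", "cat"], ["poodle"])
```

Words that share a deep common ancestor score close to 1, while words that meet only at the root score low, so the inter-document values can feed a standard clustering step.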
  • Patent number: 8285719
    Abstract: Relational clustering has attracted more and more attention due to its phenomenal impact in various important applications which involve multi-type interrelated data objects, such as Web mining, search marketing, bioinformatics, citation analysis, and epidemiology. A probabilistic model is presented for relational clustering, which also provides a principal framework to unify various important clustering tasks including traditional attributes-based clustering, semi-supervised clustering, co-clustering and graph clustering. The model seeks to identify cluster structures for each type of data objects and interaction patterns between different types of objects. Under this model, parametric hard and soft relational clustering algorithms are provided under a large number of exponential family distributions.
    Type: Grant
    Filed: August 10, 2009
    Date of Patent: October 9, 2012
    Assignee: The Research Foundation of State University of New York
    Inventors: Bo Long, Zhongfei (Mark) Zhang
  • Patent number: 8285745
    Abstract: Systems and methods to determine relevant keywords from a user's search query sessions are disclosed. The described method includes identifying search session logs of a user, segmenting the search session logs into one or more search sessions. After the segmentation, the search sessions are analyzed to compose a list of semantically relevant keyword sets including at least a first keyword set and a second keyword set. The described method further includes determining a semantic relevance between the first and second keyword sets according to the frequency at which the first and second keyword sets are reported in the query results and displaying one or more semantically high relevant keyword sets after being filtered by a threshold.
    Type: Grant
    Filed: August 31, 2007
    Date of Patent: October 9, 2012
    Assignee: Microsoft Corporation
    Inventors: Hua Li, HuaJun Zeng, Jian Hu, Zheng Chen, Jian Wang
  • Patent number: 8280877
    Abstract: Systems and methods for implementing diverse topic phrase extraction are disclosed. According to one implementation, multiple word candidate phrases are extracted from a corpus and weighed. One or more documents are re-weighed to identify less obvious candidate topics using latent semantic analysis (LSA). Phrase diversification is then used to remove redundancy and select informative and distinct topic phrases.
    Type: Grant
    Filed: September 21, 2007
    Date of Patent: October 2, 2012
    Assignee: Microsoft Corporation
    Inventors: Benyu Zhang, Jilin Chen, Zheng Chen, HuaJun Zeng, Jian Wang
  • Patent number: 8275774
    Abstract: A streaming query system for extensible markup language is provided. An XPath query translator receives and analyzes a user-input XPath document. An abstract syntax tree analyzer establishes an abstract syntax tree. A XML parser receives and parses an XML document. An index generator generates an index for the XML document. A computation module performs a format calculation based on the abstract syntax tree and the index, and generates a query result accordingly.
    Type: Grant
    Filed: July 23, 2010
    Date of Patent: September 25, 2012
    Assignee: National Taiwan University of Science and Technology
    Inventors: Hahn-Ming Lee, Li-Zhen Liu, Chieh-Hung Lin, Jerome Yeh, Chia-Hsin Huang
  • Publication number: 20120239655
    Abstract: A system for storing digital images and accessing and storing digital image information using a communication network includes a plurality of independently controlled digital storage repositories associated with one or more different authorization groups, wherein a first digital storage repository includes a first digital image with associated first semantic information and wherein a second digital storage repository in a common authorization group with the first digital storage repository includes a second digital image with associated second semantic information and an associated second category, and wherein the processor of the first digital storage repository uses its computer program to independently access and match the first semantic information with the second semantic information, to associate the second category with the first semantic information, and to store the second category in association with the first semantic information in the first digital storage repository.
    Type: Application
    Filed: March 15, 2011
    Publication date: September 20, 2012
    Inventors: Ronald Steven Cok, Joseph Anthony Manico
  • Patent number: 8271496
    Abstract: A computer-readable medium stores computer-readable instructions that control a communication on a communication apparatus that obtains a content summary information having at least content location information from a server. The instructions cause the communication apparatus to perform steps. The steps include receiving a delivery source information inputted through a user operation, determining whether the delivery source information includes a predetermined character string. Content summary information corresponding to the inputted delivery source information is obtained when the determining step determines that the predetermined character string is not included in the delivery source information. Content summary information corresponding to a predetermined alternative delivery source information is obtained when the determining step determines that the predetermined character string is included in the delivery source information.
    Type: Grant
    Filed: September 30, 2009
    Date of Patent: September 18, 2012
    Assignee: Brother Kogyo Kabushiki Kaisha
    Inventor: Yusaku Takahashi
  • Publication number: 20120233150
    Abstract: Technologies pertaining to annotation aggregation are described herein. A user of a computing device assigns an annotation to a portion of a document, wherein the annotation comprises a tuple. The tuple comprises semantic relationships amongst words or phrases in the document. Relationship data is also generated, wherein the relationship data identifies the document, the author of the document, the author of the annotation, and other data.
    Type: Application
    Filed: March 11, 2011
    Publication date: September 13, 2012
    Applicant: Microsoft Corporation
    Inventors: Oscar Gerardo Naim, Lucretia Henrica Vanderwende, Krist Wongsuphasawat
  • Patent number: 8260664
    Abstract: Advertisements are selected for presentation on search result pages and web pages based on phrases generated from lateral concepts and topics identified for the search result pages and web pages. A search query or an indication of a web page is received for which advertisements are to be provided. Lateral concepts and topics are identified based on the search query or content of the web page. The lateral concepts and topics are used as phrases for selecting advertisements from an advertisement inventory. Selected advertisements are provided for presentation on a search results page in response to a search query or on a web page initially identified.
    Type: Grant
    Filed: February 5, 2010
    Date of Patent: September 4, 2012
    Assignee: Microsoft Corporation
    Inventors: Viswanath Vadlamani, Abhinai Srivastava, Tarek Najm, Munirathnam Srikanth, Phani Vaddadi, Arungunram Chandrasekaran Surendran, Rajeev Prasad
  • Publication number: 20120221574
    Abstract: A pivot is determined from enrolled data by a pivot determination unit, raw data is acquired, features are extracted from the raw data, a score is calculated as one of a distance and a degree of similarity between the features, an index vector is generated by using the score for the pivot, a ? score is calculated as one of a distance and a degree of similarity between the index vectors, a parameter of each non-pivot including a regression coefficient is trained by using training data, the order in which the non-pivots are selected is determined, by using the ? score between search data and the non-pivot as well as the regression coefficient, in descending order of posterior probability through logistic regression, and a search result is outputted based on the score between the search data and the enrolled data.
    Type: Application
    Filed: February 9, 2012
    Publication date: August 30, 2012
    Applicant: HITACHI, LTD.
    Inventors: Takao Murakami, Kenta Takahashi
  • Publication number: 20120209851
    Abstract: An apparatus and a method manage a received mobile transaction coupon in a mobile terminal. The apparatus includes a communication unit, an information analyzer, a schedule manager, an output unit, and a controller. The communication unit receives a mobile transaction coupon. The information analyzer obtains the received mobile transaction coupon information. The schedule manager registers the obtained mobile transaction coupon information in an alarm program. The output unit outputs the registered mobile transaction coupon information on a relevant date via the alarm program. The controller causes the mobile transaction coupon information to be registered in the alarm program, and causes the received mobile transaction coupon to be stored in a storage area corresponding to a reception type or a folder for a widget function.
    Type: Application
    Filed: February 9, 2012
    Publication date: August 16, 2012
    Applicant: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Byung-Kwon Kong, Soon-Mi Cho
  • Patent number: 8244701
    Abstract: Systems and methods for applying user behavior data to improve search query result ranking are provided. Upon receiving an update file indicating that recent, significant user behavior data is available for a document associated with an inverted index, the update file is published periodically and frequently to an index server. After filtering out the relevant update information from the update file, the index server extracts identifiers of the documents having the associated user behavior data. The update file and the identifier of the documents are utilized to update an in-memory index containing representations of metadata indicative of the user behavior. The in-memory index is continuously updated and utilized to serve search query results in response to user search queries. Search query results from the in-memory index are ranked using the user behavior data prior to serving. Thus, results associated with recent, significant user-behavior metadata receive prominent placement on the search results page.
    Type: Grant
    Filed: June 27, 2011
    Date of Patent: August 14, 2012
    Assignee: Microsoft Corporation
    Inventors: Walter Sun, Jay Kumar Goyal, Pratibha Permandla, Yinzhe Yu, Jingfeng Li
  • Patent number: 8244700
    Abstract: Systems and methods for performing an updating process to an in-memory index are provided. Upon receiving notice of document modifications covered by an inverted index associated with a search engine, in the form of an update file, a representation of the modification is published onto various index serving machines. Each index serving machine receiving the update file determines if the modifications are applicable to the index serving machine. If an index serving machine determines that it contains mapping information corresponding to the modified documents, the index serving machine utilizes the update file and associated mapping information to update an in-memory index. In embodiments, the in-memory index is used to provide results to user queries in tandem with the inverted index. In some embodiments, an extra in-memory index is maintained that is revised with constantly incoming metadata updates and the existing in-memory index is periodically swapped with the revised in-memory index.
    Type: Grant
    Filed: February 12, 2010
    Date of Patent: August 14, 2012
    Assignee: Microsoft Corporation
    Inventors: Pratibha Permandla, Yinzhe Yu, Guarav Sareen, Abhas Kumar
  • Patent number: 8234279
    Abstract: A streaming text data comparator performs real-time text data mining on streaming text data. The comparator receives a streaming text data document and generates a vector representation of the term frequencies relating to an existing document collection. The comparator then transforms the term frequency vector into a projection in a precomputed multidimensional subspace that represents the original document collection. The comparator further calculates a relationship value representing the similarities or differences between the vector representation and the subspace, and compares the relationship value to a predetermined threshold to determine whether the streaming text data document is related to the original document collection. If the streaming text data document is related, the streaming text data comparator intercalates the new document into the document collection. If the new document is not related, the comparator may store or delete the unrelated document.
    Type: Grant
    Filed: October 11, 2005
    Date of Patent: July 31, 2012
    Assignee: The Boeing Company
    Inventors: Yuan-Jye Wu, Anne S-W Kao, Stephen R. Poteet, William Ferng, Robert E. Cranfill
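
The comparison step described in the abstract above (term-frequency vector, projection into a precomputed subspace, relationship value, threshold) can be illustrated with a minimal sketch. It is not the patented method: the subspace here is simply the span of the collection's term-frequency vectors, built with Gram-Schmidt rather than a truncated SVD, and the names `term_vector`, `orthonormal_basis`, `relatedness`, and `THRESHOLD` are hypothetical.

```python
import math
from collections import Counter

def term_vector(text, vocab):
    """Term-frequency vector of `text` over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[t]) for t in vocab]

def orthonormal_basis(vectors, eps=1e-12):
    """Gram-Schmidt: an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            dot = sum(x * y for x, y in zip(w, b))
            w = [x - dot * y for x, y in zip(w, b)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm > eps:  # skip vectors already inside the span
            basis.append([x / norm for x in w])
    return basis

def relatedness(vec, basis):
    """Fraction of the vector's length that lies inside the subspace (0 to 1)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return 0.0
    coords = [sum(x * y for x, y in zip(vec, b)) for b in basis]
    return math.sqrt(sum(c * c for c in coords)) / norm

# Toy collection and vocabulary (assumed data, for illustration only).
collection = ["wing flap actuator fault", "engine fault sensor", "wing sensor fault"]
vocab = sorted({w for d in collection for w in d.split()} | {"recipe", "butter", "sugar"})
basis = orthonormal_basis([term_vector(d, vocab) for d in collection])

THRESHOLD = 0.5  # hypothetical cut-off
for doc in ["engine sensor fault", "butter sugar recipe"]:
    score = relatedness(term_vector(doc, vocab), basis)
    print(doc, "->", "related" if score >= THRESHOLD else "unrelated")
```

In this toy run the first streaming document lies inside the collection's subspace and is classed as related, while the second shares no terms with the collection and is classed as unrelated.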
  • Publication number: 20120179684
    Abstract: A computer program product for an indexer-agnostic index building system includes a computer readable storage medium to store a computer readable program, wherein the computer readable program, when executed on a computer, causes the computer to perform operations for creating a semantically aggregated index. The operations include: extracting documents from a data source, wherein each document includes a data object; distributing the documents to a plurality of processing nodes within the system; for each node: indexing the data objects for each document into fields using semantic rules; and grouping indexed data objects for related fields by: classifying the documents into logical groups based on the semantic rules; and creating a searchable index shard for related logical groups.
    Type: Application
    Filed: January 12, 2011
    Publication date: July 12, 2012
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Alfredo Alba, Chad E. DeLuca, Vuk Ercegovac, Thomas D. Griffin, Jun Rao, Asim V. Singh, Kevin B. Wang
  • Publication number: 20120173532
    Abstract: According to one embodiment, a determination tree generating apparatus includes a determination unit, a condition generating unit, a determining unit, and a point branch generating unit. The determination unit provisionally and sequentially determines all component categories to be classification component categories for a first point of a determination tree. The point branch generating unit generates a first point assigned to a classification component category, and generates component names to be assigned to one or more branches leading from an assigned first point to one or more child points.
    Type: Application
    Filed: March 15, 2012
    Publication date: July 5, 2012
    Applicant: KABUSHIKI KAISHA TOSHIBA
    Inventor: Shigeta KUNINOBU
  • Publication number: 20120173531
    Abstract: Systems and methods for managing electronic data are disclosed. Various data management operations can be performed based on a metabase formed from metadata. Such metadata can be identified from an index of data interactions generated by a journaling module, and obtained from their associated data objects stored in one or more storage devices. In various embodiments, such processing of the index and storing of the metadata can facilitate, for example, enhanced data management operations, enhanced data identification operations, enhanced storage operations, data classification for organizing and storing the metadata, cataloging of metadata for the stored metadata, and/or user interfaces for managing data. In various embodiments, the metabase can be configured in different ways. For example, the metabase can be stored separately from the data objects so as to allow obtaining of information about the data objects without accessing the data objects or a data structure used by a file system.
    Type: Application
    Filed: March 2, 2012
    Publication date: July 5, 2012
    Applicant: COMMVAULT SYSTEMS, INC.
    Inventors: Anand Prahlad, Jeremy Alan Schwartz, David Ngo, Brian Brockway, Marcus S. Muller
  • Patent number: 8214368
    Abstract: An extracting unit extracts keywords from metadata extracted from played scenes. An attaching unit attaches a semantic class to the keywords. A semantic class determining unit determines whether the semantic class is a should-be-played class. When there is a keyword with the should-be-played class attached, an acquiring unit acquires at least one keyword without having the should-be-played class as a should-be-observed keyword. When the metadata includes the should-be-observed keyword and a keyword to which a should-be-stopped class is attached, an appearance determining unit determines that a scene including the should-be-observed keyword appears in contents.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: July 3, 2012
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Tomohiro Yamasaki, Takahiro Kawamura
  • Patent number: 8214367
    Abstract: Systems for recording, searching, and outputting display information are provided. In some embodiments, systems for recording display information are provided. The systems include a virtual display that: intercepts display-changes describing changes to be made to a state of a display; sends the display-changes to a client for display; records the display-changes; and a context recorder that records context information describing a state of the display derived from a source independently of the display changes and independently of screen-images. In some embodiments, the systems further include a display system that generates an output screen-image based at least in part on at least one of the display-changes and in response to a search of the context information. In some embodiments, the virtual display further records screen-images; and the display system further generates the output screen-image based at least in part on a recorded-screen-image of the recorded screen-images.
    Type: Grant
    Filed: February 27, 2008
    Date of Patent: July 3, 2012
    Assignee: The Trustees of Columbia University in the City of New York
    Inventors: Ricardo Baratto, Oren Laadan, Dan Phung, Shaya Joseph Potter, Jason Nieh
  • Patent number: 8209321
    Abstract: Computer-readable media, computerized methods, and computer systems for conducting semantic processes to present search results that include highlighted regions which are relevant to a conceptual meaning of a query are provided. Initially, content of document(s) is accessed and semantic representations are derived by distilling linguistic representations from the content. These semantic representations may be stored at a semantic index. Also, a proposition is derived from the query by parsing search terms of the query, and distilling the proposition from the search terms. Typically, the proposition is a logical representation of the conceptual meaning of the query. The proposition is compared against the semantic representations at the semantic index to identify a matching set. Regions of the content within the document, from which the matching set of semantic representations are derived, are targeted.
    Type: Grant
    Filed: August 29, 2008
    Date of Patent: June 26, 2012
    Assignee: Microsoft Corporation
    Inventors: Barney Pell, Scott Prevost, Giovanni Lorenzo Thione, Brendan O'Connor, Lukas Biewald
  • Patent number: 8204903
    Abstract: Semantic queries are expressed and executed within a relational database. This can be done by defining semantic rules applied to execute the semantic queries using table valued functions and common table expressions, and then simply calling the defined table valued functions to execute the queries.
    Type: Grant
    Filed: February 16, 2010
    Date of Patent: June 19, 2012
    Assignee: Microsoft Corporation
    Inventors: Stuart M. Bowers, Thomas E. Jackson, Chris Demetrios Karkanias, Allen L. Brown, David G. Campbell, Brian S. Aust
  • Patent number: 8204736
    Abstract: A mechanism is provided for determining a second document of a set of documents in a second language having the same textual content as a first document in a first language. A first histogram that is indicative of the textual content of the first document is generated. A second histogram is generated for each document of the set of documents. Each second histogram is indicative of the textual content of a document of the set of documents. Each second histogram is compared with the first histogram to determine at least one histogram from the plurality of second histograms which matches the first histogram. The second document is then identified as the document having the at least one histogram.
    Type: Grant
    Filed: November 6, 2008
    Date of Patent: June 19, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ossama Emam, Ahmed Hassan, Hany M. Hassan
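
The histogram comparison in the abstract above can be sketched as follows. The abstract does not say what the histogram counts, so this sketch assumes a histogram over tokens that tend to survive translation (numbers and punctuation) and uses histogram intersection as the match score; `content_histogram`, `similarity`, and `best_match` are hypothetical names.

```python
import re
from collections import Counter

def content_histogram(text):
    """Histogram over numbers and punctuation marks found in the text."""
    return Counter(re.findall(r"\d+|[.,;:!?%]", text))

def similarity(h1, h2):
    """Histogram intersection, normalised by the larger histogram's mass."""
    total = max(sum(h1.values()), sum(h2.values()))
    if total == 0:
        return 0.0
    return sum((h1 & h2).values()) / total

def best_match(source, candidates):
    """Name of the candidate whose histogram is closest to the source's."""
    src = content_histogram(source)
    return max(candidates,
               key=lambda name: similarity(src, content_histogram(candidates[name])))

# Toy documents (assumed data, for illustration only).
english = "Revenue rose 12% in 2008, to 340 million."
candidates = {
    "doc_a": "Le chiffre d'affaires a augmenté de 12% en 2008, à 340 millions.",
    "doc_b": "La réunion aura lieu le 5 mars; 20 personnes sont attendues.",
}
print(best_match(english, candidates))
```

A real system would likely need richer, language-independent features than this; the sketch only shows the generate-compare-select flow.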
  • Patent number: 8200672
    Abstract: In a search support server, a related word extraction unit generates frequency information and co-occurrence information of keywords, a graph generation unit generates coordinate information of a spring graph including the keywords as nodes, on the basis of the co-occurrence information, a cluster generation unit groups the nodes into clusters and thereby generates cluster definition information, and a display information generation unit generates display information of the spring graph. In addition, an operation determination unit determines which operation is performed on the spring graph. Then, when a level change is instructed, the display information generation unit generates display information of the spring graph after the level is changed. When a node change is instructed, a cluster re-generation unit changes the cluster definition information and the frequency information.
    Type: Grant
    Filed: June 24, 2009
    Date of Patent: June 12, 2012
    Assignee: International Business Machines Corporation
    Inventors: Noritaka Adachi, Shinya Kawanaka, Yoshitaka Matsumoto, Raymond Harry Putra Rudy
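
The frequency and co-occurrence information mentioned in the abstract above might be computed along these lines. This is a sketch under assumptions: co-occurrence is counted once per document for each unordered keyword pair, and the edge threshold of 2 is arbitrary; `keyword_stats` is a hypothetical name, and graph layout and clustering are not shown.

```python
from collections import Counter
from itertools import combinations

def keyword_stats(documents):
    """Return (document frequency per keyword, co-occurrence count per keyword pair)."""
    frequency = Counter()
    cooccurrence = Counter()
    for keywords in documents:
        unique = sorted(set(keywords))  # count each keyword once per document
        frequency.update(unique)
        cooccurrence.update(combinations(unique, 2))
    return frequency, cooccurrence

# Toy keyword lists, one per document (assumed data).
docs = [["search", "index", "query"], ["index", "cluster"], ["search", "index"]]
frequency, cooccurrence = keyword_stats(docs)

# Keep only pairs that co-occur in at least two documents as graph edges.
edges = [pair for pair, n in cooccurrence.items() if n >= 2]
print(frequency.most_common(2), edges)
```

Sorting the keywords before pairing keeps each pair in a canonical order, so ("index", "search") and ("search", "index") are never counted separately.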
  • Patent number: 8195662
    Abstract: A density-based data clustering method, comprising a parameter-setting step, a first retrieving step, a first determination step, a second determination step, a second retrieving step, a third determination step and first and second termination determination steps. The parameter-setting step sets parameters. The first retrieving step retrieves one data point and defines neighboring points. The first determination step determines whether the number of the data points exceeds the minimum threshold value. The second determination step arranges a plurality of first border symbols. The second retrieving step retrieves one seed data point from the seed list, arranges a plurality of second border symbols and defines seed neighboring points. The third determination step determines whether a data point density of searching ranges of the seed neighboring points is the same. The first termination determination step determines whether the clustering is finished.
    Type: Grant
    Filed: January 6, 2010
    Date of Patent: June 5, 2012
    Assignee: National Pingtung University Of Science & Technology
    Inventors: Cheng-Fa Tsai, Yi-Ching Huang
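
For orientation, the general idea of density-based clustering with a search radius and a minimum-point threshold can be sketched with a plain DBSCAN-style loop. This is a swapped-in textbook technique, not the patented method: the border symbols and density-comparison step described in the abstract are not reproduced, and `region_query`, `dbscan`, `eps`, and `min_pts` are names chosen for this sketch.

```python
import math

def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or -1 for noise."""
    NOISE = -1
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        labels[i] = cluster
        seeds = [j for j in neighbors if j != i]
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # core point: expand the cluster
                seeds.extend(j_neighbors)
        cluster += 1
    return labels

# Toy 2-D data: two tight groups and one outlier (assumed data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

The sketch uses `math.dist`, which requires Python 3.8 or later.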
  • Publication number: 20120136865
    Abstract: An approach is provided for determining and utilizing geographical locations contextually relevant to a user. A contextually relevant location platform determines location-based data associated with a user and/or user device. The contextually relevant location platform determines stationary points based, at least in part, on the location-based data. The contextually relevant location platform determines context data associated with the stationary points. The contextually relevant location platform determines at least one location anchor based, at least in part, on the stationary points and the associated context data, wherein the at least one location anchor represents a bounded geographical area of contextual relevance to the user.
    Type: Application
    Filed: November 30, 2010
    Publication date: May 31, 2012
    Applicant: Nokia Corporation
    Inventors: Jan Otto Blom, Gian Paolo Perrucci, Mats Lönngren, Juha Kalevi Laurila, Niko Tapani Kiukkonen, Julien Eberle, Daniel Gatica-Perez, Raul Montoliu-Colas, Julian Charles Nolan
  • Publication number: 20120124050
    Abstract: A system for harmonized commodity description and coding system (HS) code recommendation includes an ontology editor for creating an HS code ontology based on HS codes of export and import items, and a feature vector processor for extracting, in response to a request from a company for an HS code of a product, feature vectors of the product with reference to the description of the product. An HS code recommendation unit extracts one or more HS codes appropriate for the product by comparing the extracted feature vectors with feature vectors of the product searched from a feature vector database. The extracted HS codes are provided to the company requesting an HS code of the product.
    Type: Application
    Filed: November 16, 2011
    Publication date: May 17, 2012
    Applicant: Electronics and Telecommunications Research Institute
    Inventors: Kyung-Ah YANG, Moonyoung CHUNG, Kyong-I KU