Latent Semantic Index or Analysis (LSI or LSA) Patents (Class 707/739)
  • Patent number: 8566323
    Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position. This position relative to another file's position reveals closest neighbors. When multiple such neighbors are arranged, first “patterns” of data are created that further define digital spectrums of new files. It is within this sorted new data that emergent relationships or second “patterns” are examined, according to the techniques for its underlying files, or “patterns of patterns.” Representatively, original files are stored on computing devices. If encoded, they have pluralities of symbols representing an underlying data stream of original bits of data. The original files are examined for relationships between each of the files. The original relationships are converted to new files. The new files are representatively encoded and examined for other relationships.
    Type: Grant
    Filed: December 29, 2009
    Date of Patent: October 22, 2013
    Assignee: Novell, Inc.
    Inventors: Scott A. Isaacson, Craig N. Teerlink, Nadeem A. Nazeer
  • Publication number: 20130275432
    Abstract: A server includes an input information database (14) that stores input information where position information indicating a geographic position, a word given to the position, and a user ID identifying a user having given the word to the position are associated with one another, a dictionary database (15) that stores dictionary data indicating associations between words, and an association unit (17) that extracts a plurality of input information where the geographic positions are included in one geographic range and the words are associated with each other by referring to those databases, associates the extracted plurality of input information with each other by assigning a common identifier to the plurality of input information, and enters the plurality of input information into the input information database (14).
    Type: Application
    Filed: August 23, 2011
    Publication date: October 17, 2013
    Applicant: RAKUTEN, INC.
    Inventor: Udana Bandara
  • Patent number: 8560548
    Abstract: A computer-implemented method for accessing content items in a content store is described. In one embodiment, the computer-implemented method includes maintaining a text index of content items in a content store to enable a keyword search on the content items, receiving a query having a keyword and generating a hit list from the text index using the keyword, and extracting frequent phrases from text within content items of the hit list. The computer-implemented method also includes assigning a relative relevance to the frequent phrases and grouping content items into topics based on presence of relevant phrases within the content items of the hit list. The hit list includes one or more content items of the content store. The frequent phrases having a relatively high relevance are relevant phrases.
    Type: Grant
    Filed: August 19, 2009
    Date of Patent: October 15, 2013
    Assignee: International Business Machines Corporation
    Inventors: Akanksha Baid, Berthold Reinwald, Alkis Simitsis, John Sismanis
  • Patent number: 8560549
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating action trails from web history are described. In one aspect, a method includes receiving a web content access history of a user, the content access history including one or more user actions, each user action being associated with a content item upon which the user action is performed, and identifying one or more action trails from the content access history, each action trail including a sequence of user actions performed on content items relating to a topic. Identifying a particular action trail includes clustering the user actions into a series of segments using temporal criteria; calculating semantic similarities between the content items; and adding a segment of the series of segments to the action trail when the semantic similarities between the segment and another segment satisfy a similarity threshold.
    Type: Grant
    Filed: May 14, 2012
    Date of Patent: October 15, 2013
    Assignee: Google Inc.
    Inventors: Elin R. Pedersen, Karl A. Gyllstrom, Shengyin Gu, Peter Jin Hong
  • Publication number: 20130268516
    Abstract: Systems and methods for analyzing and visualizing social events include historical, real-time, and predictive analytics and visualization of physical or virtual social events based on social network communications.
    Type: Application
    Filed: April 8, 2013
    Publication date: October 10, 2013
    Inventors: Imran Noor Chaudhri, Musa Ghani
  • Patent number: 8548999
    Abstract: An expanded queries data structure is described. The data structure is produced on the basis of a set of seed queries, and consists of entries each specifying an expanded query submitted by a user that has been determined to have a high degree of relatedness to at least a plurality of the seed queries of the set. The expanded queries specified by the entries of the expanded queries data structure can be used to define a segment of users expected to have interests characterized by the seed queries.
    Type: Grant
    Filed: August 15, 2011
    Date of Patent: October 1, 2013
    Assignee: AudienceScience Inc.
    Inventors: Yair Even-Zohar, Basem Nayfeh
  • Patent number: 8543380
    Abstract: In one embodiment, determining a document specificity includes accessing a record that records the clusters of documents. The number of themes of a document is determined from the number of clusters of the document. The specificity of the document is determined from the number of themes.
    Type: Grant
    Filed: October 1, 2008
    Date of Patent: September 24, 2013
    Assignee: Fujitsu Limited
    Inventors: David L. Marvit, Jawahar Jain, Stergios Stergiou
  • Publication number: 20130238606
    Abstract: The invention provides a system and method for retrieving and storing industrial data, the system comprising a data retriever that includes a data retrieval manager and one or more watchers for monitoring data associated with one or more industrial devices, wherein if the data associated with the one or more industrial devices is new or modified, the one or more watchers notifies the data retrieval manager thereof and the data retrieval manager uploads the new or modified data. The system further includes a database manager for receiving the new or modified data in a first form from the data retrieval manager and for storing the new or modified data in a structural data form in one or more databases.
    Type: Application
    Filed: May 2, 2013
    Publication date: September 12, 2013
    Applicant: Rockwell Automation Technologies, Inc.
    Inventors: Marek Obitko, Ivan Havel, Michal Fortik, Robert Mavrov, Radek Marik
  • Patent number: 8533195
    Abstract: Electronic documents are retrieved from a database and/or from a network of servers. The documents are topic modeled in accordance with a Regularized Latent Semantic Indexing approach. The Regularized Latent Semantic Indexing approach may allow an equation involving an approximation of a term-document matrix to be solved in parallel by multiple calculating units. The equation may include terms that are regularized via either l1 norm and/or via l2 norm. The Regularized Latent Semantic Indexing approach may be applied to a set, or a fixed number, of documents such that the set of documents is topic modeled. Alternatively, the Regularized Latent Semantic Indexing approach may be applied to a variable number of documents such that, over time, the variable number of documents is topic modeled.
    Type: Grant
    Filed: June 27, 2011
    Date of Patent: September 10, 2013
    Assignee: Microsoft Corporation
    Inventors: Jun Xu, Hang Li, Nicholas Craswell
  • Patent number: 8527497
    Abstract: An indexing system for graph data. In particular implementations, the indexing system provides for denormalization and replica index functionality to improve query performance.
    Type: Grant
    Filed: September 8, 2011
    Date of Patent: September 3, 2013
    Assignee: Facebook, Inc.
    Inventors: Sanjeev Singh, Bret Steven Taylor, Paul Buchheit, James Norris, Tudor Bosman, Benjamin Darnell
  • Patent number: 8521744
    Abstract: An apparatus for authoring data in a communication system includes: an extraction unit configured to receive media corresponding to contents and extract contents information regarding the contents from the received media; a generation unit configured to generate a DMB ECG XML-based metadata comprising the extracted contents information; and a processing unit configured to visualize particulars of the DMB ECG XML-based metadata through a user interface and process the user interface so that the DMB ECG XML-based metadata is generated and edited on a template.
    Type: Grant
    Filed: November 12, 2010
    Date of Patent: August 27, 2013
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Seung-Jun Yang, Min-Sik Park, Han-Kyu Lee, Jin-Woo Hong
  • Patent number: 8521745
    Abstract: One or more classification algorithms are applied to at least one natural language document in order to extract both attributes and values of a given product. Supervised classification algorithms, semi-supervised classification algorithms, unsupervised classification algorithms or combinations of such classification algorithms may be employed for this purpose. The at least one natural language document may be obtained via a public communication network. Two or more attributes (or two or more values) thus identified may be merged to form one or more attribute phrases or value phrases. Once attributes and values have been extracted in this manner, association or linking operations may be performed to establish attribute-value pairs that are descriptive of the product. In a presently preferred embodiment, an (unsupervised) algorithm is used to generate seed attributes and values which can then support a supervised or semi-supervised classification algorithm.
    Type: Grant
    Filed: June 13, 2011
    Date of Patent: August 27, 2013
    Assignee: Accenture Global Services Limited
    Inventors: Katharina Probst, Rayid Ghani, Andrew E. Fano, Marko Krema, Yan Liu
  • Patent number: 8515959
    Abstract: A self-organizing personal file system is disclosed that evaluates the “importance” of terms and phrases in a document in a personal corpus relative to usage in a reference corpus. A personalized term weighting scheme assigns a weight to terms or phrases based on the frequency of occurrence of the corresponding term or phrase in a reference corpus. The personalized term weighting for a given term or phrase can be used to store and access documents containing the corresponding term or phrase in the spatial file system and provides coordinates in a spatial file system, for one or more documents containing the corresponding term or phrase. The location of a given document in a file space may be specified by the relative frequency distribution of the stems of its significant terms or phrases compared to the occurrence of such terms or phrases in a reference corpus.
    Type: Grant
    Filed: April 25, 2005
    Date of Patent: August 20, 2013
    Assignee: International Business Machines Corporation
    Inventors: Thomas A. Cofino, Jonathan Lenchner
  • Publication number: 20130212107
    Abstract: Enlargement values indicating a degree of enlargement when spatial data is stored in a partial spatial region are calculated for one or more partial spatial regions within a multidimensional index, and in the case where the enlargement value is greater than or equal to a threshold value, a new partial spatial region that contains at least the spatial data is generated.
    Type: Application
    Filed: January 22, 2013
    Publication date: August 15, 2013
    Applicant: CANON KABUSHIKI KAISHA
    Inventor: CANON KABUSHIKI KAISHA
  • Publication number: 20130212098
    Abstract: A computer-implemented system and method for generating a display of document clusters is described. Clusters of documents are presented in a multi-dimensional concept space. At least one document is selected from a collection of documents to be clustered. An angle θ of the document relative to a common origin of the multi-dimensional concept space is computed. The selected document is compared with each of the clusters. An angle θ from the common origin is determined for each cluster. A difference between the angle θ for the document and the angle θ for the cluster is determined. The difference is compared to the variance, and a new cluster is created when the difference exceeds the variance for all the clusters.
    Type: Application
    Filed: March 14, 2013
    Publication date: August 15, 2013
    Applicant: FTI TECHNOLOGY LLC
    Inventor: FTI TECHNOLOGY LLC
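
The angle comparison in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation. The reference axis `ref`, the single shared `variance` threshold, the dictionary layout of a cluster, and the function names are all assumptions made for illustration; the abstract does not say how the angle or the variance is computed.

```python
import math

def angle_from_origin(vec, ref):
    """Angle (radians) between vec and a reference axis, both anchored at the common origin."""
    dot = sum(a * b for a, b in zip(vec, ref))
    norm_v = math.sqrt(sum(a * a for a in vec))
    norm_r = math.sqrt(sum(b * b for b in ref))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos = max(-1.0, min(1.0, dot / (norm_v * norm_r)))
    return math.acos(cos)

def assign(doc_vec, clusters, ref, variance):
    """Place doc_vec in the closest cluster by angle, or open a new cluster.

    A new cluster is created when the angle difference exceeds `variance`
    for every existing cluster (hypothetical single threshold).
    """
    theta = angle_from_origin(doc_vec, ref)
    best = None
    for cluster in clusters:
        diff = abs(theta - cluster["theta"])
        if diff <= variance and (best is None or diff < best[0]):
            best = (diff, cluster)
    if best is None:
        new_cluster = {"theta": theta, "docs": [doc_vec]}
        clusters.append(new_cluster)
        return new_cluster
    best[1]["docs"].append(doc_vec)
    return best[1]

clusters = []
ref = (1.0, 0.0)
for doc in [(1.0, 0.0), (1.0, 0.05), (0.0, 1.0)]:
    assign(doc, clusters, ref, variance=0.1)
```

In this run the first two vectors fall within the threshold of each other and share a cluster, while the third sits at a right angle and opens a second cluster.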
  • Publication number: 20130204877
    Abstract: A method, system, and computer program product for semantic attribution of a request. Source data statements for the request are received. A selection of a domain for the received source data statements is received. The received source data statements are semantically analyzed, which includes matching elements in the received source data statements to respective one or more entries in an ontology associated with the selected domain. The ontology includes items and relationships that define the selected domain. Each element in the received source data statements is a word or a phrase. The one or more entries are assigned to the matched elements, respectively, to annotate each matched element with a respective annotation consisting of the respective one or more entries. The annotated elements are saved with the respective annotations.
    Type: Application
    Filed: November 30, 2012
    Publication date: August 8, 2013
    Applicant: International Business Machines Corporation
    Inventor: International Business Machines Corporation
  • Patent number: 8504564
    Abstract: A method, apparatus and computer program product provides for a semantic analyzer to produce and rank semantic terms to reflect their relationship to the theme and topics of a document. The text and the document can have no relationship to any pre-selected keywords before the semantic analyzer performs text extraction. The semantic analyzer extracts text from a document and performs semantic analysis on the extracted text. The semantic analyzer provides a plurality of ranked semantic terms as a result of the semantic analysis and associates semantic terms with the document as semantic keywords. The semantic terms define content to be presented with the document where the content is an advertisement, a link to a remote information resource or a second document.
    Type: Grant
    Filed: December 15, 2010
    Date of Patent: August 6, 2013
    Assignee: Adobe Systems Incorporated
    Inventors: Walter Chang, Nadia Ghamrawi
  • Patent number: 8504565
    Abstract: A hierarchical distributed search mechanism is integrated into a distributed file system. Traditional file system APIs (create, open, close, read, write, link, rename, delete, . . . ) and the over-the-wire protocols employed to project these APIs into remote client sites (CIFS, NFS, DDS, Appletalk) are extended to enable the dynamic creation of temporary directories containing links to objects identified by search engines (executing at sites “close” to “their” data) as meeting the search criteria specified by the first parameter of a search function call. The search function, derived from the standard file system API function create, is added to the file system API.
    Type: Grant
    Filed: September 9, 2005
    Date of Patent: August 6, 2013
    Inventor: William M. Pitts
  • Patent number: 8498972
    Abstract: Inverted indexes for terms and for term separators are separately provided to minimize data redundancy. Search queries are parsed to identify terms and term separators, if any, and the corresponding inverted indexes are searched for responsive documents. Related apparatus, systems, techniques and articles are also described.
    Type: Grant
    Filed: December 16, 2010
    Date of Patent: July 30, 2013
    Assignee: SAP AG
    Inventors: Frederik Transier, Franz Faerber
  • Patent number: 8495513
    Abstract: Method and system for merging two objects in a business intelligence system. A first member is selected in the business intelligence system, the business intelligence system includes a user space, a content space, a data space, a master-data space and a metadata space. A relationship between the first member and a plurality of members selected from the group consisting of the user space, the content space, the data space, the master-data space, the metadata space is determined, which results in determined relationships for every member in the business intelligence system. Two members in the content space are then selected. Relationships between the two members in the plurality of determined relationships are traversed to determine the members in the traversed relationships. A preference is assigned to the members in the traversed relationships with close or exact relationships; and the members with the preference are merged.
    Type: Grant
    Filed: May 12, 2009
    Date of Patent: July 23, 2013
    Assignee: International Business Machines Corporation
    Inventors: Graham Douglas MacKintosh, John Andrew Kowal
  • Patent number: 8489607
    Abstract: A method and system are described for providing context sensitive data to a system user. The method includes the steps of identifying the user and querying databases to create a user context. Information is aggregated from the network databases and filtered using the user context. This provides the correct data needed by the user for that particular time, location and job function.
    Type: Grant
    Filed: November 10, 2009
    Date of Patent: July 16, 2013
    Assignee: General Electric Company
    Inventors: Christopher Scott Fuselier, John James Dougherty, Joseph John Fisher, Thomas A. Digate, Richard Alan Carpenter, Bernardo Anger
  • Patent number: 8484218
    Abstract: In one implementation, a method includes receiving a request for translation of one or more first keywords from a source language to a target language; and translating, using a machine translation process, the first keywords from the source language into a plurality of second keywords in the target language. The method can also include determining, by a computer system, frequencies with which each of the second keywords occur in a corpus associated with the target language. The method can further include selecting, by the computer system, a subset of the second keywords to use in the target language based on the determined frequencies of occurrence.
    Type: Grant
    Filed: April 21, 2011
    Date of Patent: July 9, 2013
    Assignee: Google Inc.
    Inventor: Mandayam Thondanur Raghunath
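
The frequency-based selection step in the abstract above might look like the following sketch. It assumes single-word keywords, a pre-tokenized target-language corpus, and a fixed cutoff `top_n`; the function name and those parameters are hypothetical, and the machine translation step itself is not modeled.

```python
from collections import Counter

def select_keywords(candidates, corpus_tokens, top_n=2):
    """Keep the candidate translations that occur most often in the corpus."""
    freq = Counter(corpus_tokens)
    # Counter returns 0 for unseen keys, so absent candidates rank last.
    ranked = sorted(candidates, key=lambda kw: freq[kw], reverse=True)
    return ranked[:top_n]

# Hypothetical candidate translations and a toy target-language corpus.
candidates = ["coche", "auto", "carro"]
corpus = "el coche rojo y el coche azul y un auto".split()
selected = select_keywords(candidates, corpus)
```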
  • Patent number: 8484212
    Abstract: In an embodiment, a method comprises dividing collected data into data clusters based on proximity of the data and adjusting the clusters based on density of data in individual clusters. Based on first data points in a first cluster, a first average point in the first cluster is determined. Based on second data points in a second cluster, a second average point in the second cluster is determined. Aggregate data, comprising the first average point and the second average point, are stored in storage. Upon receiving a request to provide data for a particular coordinate, the reconstructed data point is determined by interpolating between the first average point and the second average point at the particular coordinate. Accordingly, aggregated data may be stored and when a request specifies data that was not actually stored, a reconstructed data point with an approximated data value may be provided as a substitute.
    Type: Grant
    Filed: January 21, 2011
    Date of Patent: July 9, 2013
    Assignee: Cisco Technology, Inc.
    Inventors: Ying Liu, Shahrokh Sadjadi
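
As a rough illustration of the aggregation and reconstruction described in the abstract above, the sketch below clusters one-dimensional `(x, y)` samples by proximity, stores one average point per cluster, and interpolates linearly between two average points. The gap-based clustering rule, the `max_gap` parameter, and the function names are assumptions; the density-based cluster adjustment mentioned in the abstract is omitted.

```python
def cluster_by_proximity(points, max_gap):
    """Group (x, y) points whose x values lie within max_gap of the previous point."""
    pts = sorted(points)
    if not pts:
        return []
    clusters = [[pts[0]]]
    for p in pts[1:]:
        if p[0] - clusters[-1][-1][0] <= max_gap:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return clusters

def average_point(cluster):
    """Mean x and mean y of the points in one cluster."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)

def reconstruct(avg_a, avg_b, x):
    """Linearly interpolate a value at x between two stored average points."""
    (x0, y0), (x1, y1) = avg_a, avg_b
    t = (x - x0) / (x1 - x0)
    return y0 + t * (y1 - y0)

points = [(0, 10), (2, 14), (10, 30), (12, 34)]
clusters = cluster_by_proximity(points, max_gap=3)
averages = [average_point(c) for c in clusters]
value_at_6 = reconstruct(averages[0], averages[1], 6)
```

Only the two average points need to be stored; a request for x = 6, which was never recorded, is answered from the line between them.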
  • Publication number: 20130166561
    Abstract: An application server includes a Semantic Analysis Core Service (SACS) function that communicates with a Semantic Analysis Client (SAC) in a Set Top Box (STB). The SACS groups programs available for rendering to a subscriber into program clusters. The SACS generates the program clusters based on a determined semantic similarity between the programs, and on parameters that indicate a subscriber's preference for certain program content. The programs that are semantically similar to existing clusters within a predetermined viewing window are provided to the STB and output to the subscriber on a display as a program preference list or channel line-up. The STB also monitors the subscriber's interaction with the programs and calculates a preference score for each program indicating the subscriber's continuing, or waning, interest in a given program. The preference score is used to update the score of the program cluster to which the program belongs.
    Type: Application
    Filed: December 22, 2011
    Publication date: June 27, 2013
    Inventors: Sorin Marian GEORGESCU, Edoardo GAVITA
  • Patent number: 8473845
    Abstract: An online video search system, including a tag discoverer including a web encyclopedia crawler for (i) accessing a web encyclopedia to find web pages related to at least one designated reference topic, and (ii) retrieving a plurality of web pages by performing an n-level depth recursive traversal of the web pages found, and web pages that are hyper-linked thereto, a concept extractor for extracting important concepts found in the retrieved plurality of web pages, and a user interface for providing at least one of the important concepts extracted by the web page processor to an online video search engine. A method and a computer-readable storage medium are also described and claimed.
    Type: Grant
    Filed: June 6, 2007
    Date of Patent: June 25, 2013
    Assignee: Reazer Investments L.L.C.
    Inventors: Marvin Igelman, Aleksandar Zivkovic
  • Publication number: 20130159313
    Abstract: A method includes accessing text, identifying a plurality of terms from the text, determining a plurality of term vectors associated with the identified plurality of terms, and clustering the determined plurality of term vectors into a plurality of clusters, the plurality of clusters comprising a first and a second cluster, the first and second clusters each comprising two or more of the determined term vectors. The method further includes creating a first pseudo-document according to the first cluster, creating a second pseudo-document according to the second cluster, identifying a first set of terms associated with the first cluster using latent semantic analysis (LSA) of the first pseudo-document, identifying a second set of terms associated with the second cluster using LSA of the second pseudo-document, and combining the first and second sets of terms into a list of output terms.
    Type: Application
    Filed: December 14, 2011
    Publication date: June 20, 2013
    Applicant: PUREDISCOVERY CORPORATION
    Inventor: Paul A. Jakubik
  • Patent number: 8458182
    Abstract: A method for clustering data or objects in an array, each element of the array corresponding to a similarity between the objects, implemented within a computer linked with a database containing the data or objects. The method includes determining a number of classes of objects based on values of the relationships computed between an object and a previously established class, for each class found, determining the value of each of the relationships between a class and the other classes, and merging certain classes, and taking each object of each class one by one, determining the value of the relationship of each object with each of the classes other than the class into which the object was initially classed, and, if the value of the relationship is greater, transferring the object to the new class; this is continued until all the values of the relationships are negative.
    Type: Grant
    Filed: December 21, 2009
    Date of Patent: June 4, 2013
    Assignee: Thales
    Inventor: Hamid Benhadda
  • Patent number: 8452772
    Abstract: Disclosed are methods, systems, and articles of manufacture for addressing popular topics in a social sphere. The method or the system continuously monitors conversations in online forum(s), identifies trend(s) of interest, identifies one or more content items that match the trend(s), and delivers the one or more content items to appropriate forum(s). The method or the system may aggregate conversations in a target forum to identify a trend and automatically respond to the trend by identifying and delivering matching existing content items to a target forum. The method or the system may further catalog a newly created content item upon creation and may identify a trend by employing some third-party products or services, by executing one or more Internet bots to monitor the online conversations, or by using trending application(s) offered by forums or social network websites.
    Type: Grant
    Filed: August 1, 2011
    Date of Patent: May 28, 2013
    Assignee: Intuit Inc.
    Inventors: Aliza D. Carpio, Alan F. Buhler, Joseph P. Elwell
  • Patent number: 8452760
    Abstract: According to one embodiment, a relevancy presentation apparatus includes a storage, an extraction unit, a first expansion unit, a second expansion unit, a determination unit and a generation unit. The storage stores topic networks. The extraction unit extracts subject keywords. The first expansion unit acquires first relevant words from the topic networks. The second expansion unit searches ontologies for the subject keywords. The determination unit extracts common relevant words, and determines whether frequencies of appearances of relevant words are stationary. The generation unit generates search queries based on whether the frequencies of appearances are stationary, and generates search results.
    Type: Grant
    Filed: January 25, 2012
    Date of Patent: May 28, 2013
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Tomohiro Yamasaki, Masaru Suzuki
  • Publication number: 20130117268
    Abstract: A method includes identifying a table within a first document. The method includes analyzing at least one of: a column heading in the table, a row heading in the table, and data in a cell in the table. The method includes determining, based on the analysis, that the table contains financial data classifiable according to a taxonomy. The method includes analyzing, by a classification component comprising at least one classification engine, at least one of a column heading in the table and a row heading in the table. The method includes generating, by the classification component, a classification suggestion for at least one element in the table, based on the analysis of the classification component.
    Type: Application
    Filed: November 3, 2011
    Publication date: May 9, 2013
    Inventors: Ben Smith, Paul Warren, David North, Richard Ashby, Martin Hutchinson
  • Patent number: 8433709
    Abstract: Systems and methods for categorizing lexical data, accurately describing the structure of hierarchical data, accommodating lexicons having disparate data structures, pooling data from separate lexicons into aggregate lists, gathering data from participating users, and specified interfaces for handwriting recognition, optical character recognition, and text-to-speech and speech-to-text conversion are described. Some implementations can include a linguistic services center that interfaces with various natural language processing modules such that users of one module can take advantage of linguistic information provided in the system.
    Type: Grant
    Filed: November 25, 2008
    Date of Patent: April 30, 2013
    Inventor: Warren Daniel Child
  • Patent number: 8429178
    Abstract: In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.
    Type: Grant
    Filed: July 18, 2011
    Date of Patent: April 23, 2013
    Assignee: Facebook, Inc.
    Inventors: Joshua Alspector, Aleksander Kolcz, Abdur R. Chowdhury
  • Patent number: 8429167
    Abstract: A method and apparatus for determining contexts of information analyzed. Contexts may be determined for words, expressions, and other combinations of words in bodies of knowledge such as encyclopedias. Analysis of use provides a division of the universe of communication or information into domains, and selects words or expressions unique to those domains of subject matter as an aid in classifying information. A vocabulary list is created with a macro-context (context vector) for each, dependent upon the number of occurrences of unique terms from a domain, over each of the domains. This system may be used to find information or classify information by subsequent inputs of text, in calculation of macro-contexts, with ultimate determination of lists of micro-contexts including terms closely aligned with the subject matter.
    Type: Grant
    Filed: August 8, 2006
    Date of Patent: April 23, 2013
    Assignee: Google Inc.
    Inventor: David C. Taylor
  • Publication number: 20130097123
    Abstract: A method and system for determining eligible communication partners utilizing an entity discovery engine is provided. The entity discovery engine coordinates the discovery of eligible communication partners. The entity discovery engine enables participants to discover other communication partners through the application of inputs. Starting with a data set of potential communication partners, the entity discovery engine uses inputs to identify eligible communication partners from the data set of potential communication partners. Inputs include policies that are applied broadly to limit categories of potential communication partners from being suggested as eligible communication partners. Identified eligible communication partners are suggested to enable communication relationships. Suggested eligible communication partners may be selected by a user or by an electronic communication device for initiating a communication relationship.
    Type: Application
    Filed: October 18, 2011
    Publication date: April 18, 2013
    Applicant: RESEARCH IN MOTION LIMITED
    Inventors: Brian Edward Anthony McColgan, Bruno Richard Preiss
  • Patent number: 8411305
    Abstract: A system and method is disclosed for identifying a record template within a file having reused objects. The method discloses: identifying, in the input file, a reused object and a set of pages upon which the reused object is located; computing a page distance between at least two adjacent instances of the reused object; generating an object recurrence pattern for the reused object; and reconstructing a record template, based on the object recurrence pattern, thereby identifying the records in the input file. The system discloses a processor, a profiler module, a pattern identification module, and a template reconstruction module for effecting the method.
    Type: Grant
    Filed: October 27, 2009
    Date of Patent: April 2, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Fabio Giannetti
  • Patent number: 8407242
    Abstract: Described are techniques to facilitate temporal features in a semantic data store. Information about lifetimes of facts in a semantic store is maintained. Even when a fact is logically deleted, a physical record is kept available. The record of a logically deleted or invalid fact has associated lifetime information, for example valid-from and valid-to time values. The record of a fact not yet deleted may have a valid-from time value indicating when it was created, became valid, etc. Queries against the semantic store may specify a timeslice (a point in time or a time range). The lifetime information can be used to satisfy such time-specific queries. Because records are maintained after they are logically deleted, it is also possible to accurately query a past state of the semantic store. Even if such a query is run at different times, the same results may be obtained.
    Type: Grant
    Filed: December 16, 2010
    Date of Patent: March 26, 2013
    Assignee: Microsoft Corporation
    Inventors: Thomas E Jackson, Stuart Bowers, Chris Karkanias, Allen Brown, David Campbell, Brian Aust
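
The lifetime bookkeeping described in the abstract above can be illustrated with a minimal sketch. It is not the patented implementation: the class and field names (`FactRecord`, `TemporalStore`, `valid_from`, `valid_to`) are hypothetical, times are plain integers, and a half-open lifetime interval is assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FactRecord:
    subject: str
    predicate: str
    obj: str
    valid_from: int                  # time the fact became valid
    valid_to: Optional[int] = None   # None means not yet logically deleted


class TemporalStore:
    """Hypothetical store that keeps records after logical deletion."""

    def __init__(self):
        self.records = []

    def assert_fact(self, s, p, o, now):
        self.records.append(FactRecord(s, p, o, now))

    def retract_fact(self, s, p, o, now):
        # Logical delete: close the lifetime but keep the physical record.
        for r in self.records:
            if (r.subject, r.predicate, r.obj) == (s, p, o) and r.valid_to is None:
                r.valid_to = now

    def query(self, at):
        # Point-in-time timeslice; assumes the interval [valid_from, valid_to).
        return [
            (r.subject, r.predicate, r.obj)
            for r in self.records
            if r.valid_from <= at and (r.valid_to is None or at < r.valid_to)
        ]
```

Because retracted records are kept, a query for a past time returns the same answer no matter when it is run.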
  • Patent number: 8407226
    Abstract: Systems, methods, and apparatus, including computer program products, for collaborative filtering are provided. In one implementation, a computer-implemented method is provided. The method includes receiving a shard of data representing a subset of a set of entities and a subset of a set of items, generating an iteration of a maximum likelihood estimate of a probability distribution model of a relationship between the set of entities and the set of items, the probability distribution model comprising a probability distribution of the set of items with respect to latent variables and a probability distribution of the latent variables with respect to the set of users, and generating statistics from results from the generating step which are passed to different shards for use in a next iteration of the maximum likelihood estimate.
    Type: Grant
    Filed: March 2, 2011
    Date of Patent: March 26, 2013
    Assignee: Google Inc.
    Inventors: Abhinandan S. Das, Ashutosh Garg, Mayur Datar
  • Patent number: 8402018
    Abstract: A semantic search system using a semantic ranking scheme including: an ontology analyzer analyzing ontology data related to a search target to determine a weight value of each property according to a weighting method for properties; a semantic path extractor extracting all the semantic paths between resources and query keywords and determining a weight value of each extracted semantic path according to the semantic path weight value determination scheme by using the weight value of each property; a relevant resource searcher traversing an instance graph of ontology based on a semantic path having a pre-set length and weight value of more than an expectation level to search resources that have a semantic relationship with the query keywords and are declared as a type presented in the query; and a semantic relevance ranker selecting the top-k results having the highest rank from among the candidate results extracted by the relevant resource searcher by using a relevance scoring function.
    Type: Grant
    Filed: February 12, 2010
    Date of Patent: March 19, 2013
    Assignee: Korea Advanced Institute of Science and Technology
    Inventors: Ji-Hyun Lee, Chin-Wan Chung
  • Patent number: 8402026
    Abstract: A system and method for efficiently generating cluster groupings in a multi-dimensional concept space is described. A plurality of terms is extracted from each document in a collection of stored unstructured documents. A concept space is built over the document collection. Terms substantially correlated between a plurality of documents within the document collection are identified. Each correlated term is expressed as a vector mapped along an angle θ originating from a common axis in the concept space. A difference between the angle θ for each document and an angle σ for each cluster within the concept space is determined. Each such cluster is populated with those documents having such difference between the angle θ for each such document and the angle σ for each such cluster falling within a predetermined variance.
    Type: Grant
    Filed: August 3, 2004
    Date of Patent: March 19, 2013
    Assignee: FTI Technology LLC
    Inventor: Dan Gallivan
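
The angle-based grouping in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the function names are hypothetical, document vectors are assumed to be non-zero, and the greedy rule of seeding a new cluster with the angle of the first unmatched document is an assumption the abstract does not specify.

```python
import math


def angle_from_axis(vec, axis):
    """Angle in radians between a non-zero document vector and the common axis."""
    dot = sum(v * a for v, a in zip(vec, axis))
    norm = math.sqrt(sum(v * v for v in vec)) * math.sqrt(sum(a * a for a in axis))
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def cluster_by_angle(doc_vectors, axis, variance):
    """Place each document in the first cluster whose angle is within `variance`."""
    clusters = []  # list of (cluster_angle, [doc ids])
    for doc_id, vec in doc_vectors.items():
        doc_angle = angle_from_axis(vec, axis)
        for cluster_angle, members in clusters:
            if abs(doc_angle - cluster_angle) <= variance:
                members.append(doc_id)
                break
        else:
            # Assumption: an unmatched document seeds a new cluster.
            clusters.append((doc_angle, [doc_id]))
    return clusters
```

Comparing single angles instead of full vectors keeps each membership test to one subtraction, which is presumably where the claimed efficiency comes from.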
  • Patent number: 8392429
    Abstract: Methods, systems, and apparatus, including computer program products, are provided for responding to search queries having results that identify books. In one aspect, a search query and multiple web pages that satisfy the search query and have a ranked order as responses to the search query are received. A subset of web pages that are each a reference page for a respective book are selected. A web page is a reference page for a book when the web page includes a reference to the book and satisfies a citation criterion for the book. A book score is assigned to each of the books for which there is at least one reference page in the group of highest ranking web pages. The book scores are used to select one or more of the books. A book reference is generated for each of the books and the book references are provided in response to the search query.
    Type: Grant
    Filed: November 26, 2008
    Date of Patent: March 5, 2013
    Assignee: Google Inc.
    Inventors: Daniel J. Clancy, Xuefu Wang
  • Patent number: 8392421
    Abstract: The present invention relates to a method of profiling an Internet endpoint associated with an Internet Protocol (IP) address, an IP prefix, or a domain name. The method includes generating a profiling rule using an Internet search engine, obtaining a search result by inputting the IP address, the IP prefix, or the domain name to the Internet search engine, and classifying the Internet endpoint based on the search result using the profiling rule.
    Type: Grant
    Filed: March 25, 2011
    Date of Patent: March 5, 2013
    Assignee: Narus, Inc.
    Inventors: Antonio Nucci, Supranamaya Ranjan, Aleksandar Kuzmanovic
  • Publication number: 20130054604
    Abstract: An approach is provided for providing information clustering based on predictive social graphs. An information clustering platform processes and/or facilitates a processing of one or more social graphs associated with one or more users to cause, at least in part, a prediction of one or more future states of the one or more social graphs. The information clustering platform further causes, at least in part, a clustering of one or more data items associated with at least one information space based, at least in part, on the one or more social graphs, the one or more future states, or a combination thereof.
    Type: Application
    Filed: August 30, 2011
    Publication date: February 28, 2013
    Applicant: Nokia Corporation
    Inventors: Sergey Boldyrev, Pavandeep Kalra
  • Patent number: 8386488
    Abstract: A method and system are provided for classifying and labeling information content (e.g., websites, databases, or the like) and also for profiling a user (e.g., interests or responsibilities) for accessing the information content, both using a coordinated labeling technique so that the content from multiple sources may be searched, identified and/or presented to the user according to the user's profile. This technique provides an ongoing update of information content and sources while at the same time filtering out unnecessary information that is irrelevant to the user's profile, resulting in more focused availability of information to the user. The user profile is matched with content of interest (as tagged by content creators reflective of categories that are also employed by a user profile) and matching content information may automatically be updated and made available to a user, in conformity with the user's profile.
    Type: Grant
    Filed: April 27, 2004
    Date of Patent: February 26, 2013
    Assignee: International Business Machines Corporation
    Inventors: Gregory L. Jones, Brian N. Phoenix, Ralph Tamlyn
  • Patent number: 8386490
    Abstract: A method of classifying a set of semantic concepts on a second multimedia collection based upon adapting a set of semantic concept classifiers and updating concept affinity relations that were developed to classify the set of semantic concepts for a first multimedia collection. The method comprises providing the second multimedia collection from a different domain and a processor automatically classifying the semantic concepts from the second multimedia collection by adapting the semantic concept classifiers and updating the concept affinity relations to the second multimedia collection based upon the local smoothness over the concept affinity relations and the local smoothness over data affinity relations.
    Type: Grant
    Filed: October 27, 2010
    Date of Patent: February 26, 2013
    Assignee: Eastman Kodak Company
    Inventors: Wei Jiang, Alexander C. Loui
  • Patent number: 8370362
    Abstract: An improved human user computer interface system, wherein a user characteristic or set of characteristics, such as demographic profile or societal “role”, is employed to define a scope or domain of operation. The operation itself may be a database search, to interactively define a taxonomic context for the operation, a business negotiation, or other activity. After retrieval of results, a scoring or ranking may be applied according to user-defined criteria, which are, for example, commensurate with the relevance to the context, but may be, for example, by date, source, or other secondary criteria. A user profile is preferably stored in a computer accessible form, and may be used to provide a history of use, persistent customization, collaborative filtering and demographic information for the user.
    Type: Grant
    Filed: September 14, 2010
    Date of Patent: February 5, 2013
    Assignee: Alberti Anemometer LLC
    Inventor: Andrew J. Szabo
  • Publication number: 20130031100
    Abstract: The present invention includes a system and method for generating a discussion group based on different electronic images. A mixed media reality database receives MMR objects that correspond to source material and indexes the MMR objects. A content management engine generates a cluster that includes MMR objects based on a similarity of source material. An MMR engine receives an electronic image from a user device, performs a visual search and identifies an MMR object that is associated with the electronic image. A social network application identifies a discussion group associated with the cluster that includes the MMR object and provides the user device with access to the discussion group.
    Type: Application
    Filed: October 13, 2011
    Publication date: January 31, 2013
    Applicant: RICOH COMPANY, LTD.
    Inventors: Jamey Graham, Timothee Bailloeul, Adit Gupta
  • Patent number: 8364682
    Abstract: Methods, systems, and apparatus, for refining log file join data. In one aspect, a method includes receiving first join data defining first joins of records in a first log file to records in a second log file. Each first join of a record in the first log file to a record in the second log file is based on an association of the first identifier of the record in the first log file to the second identifier of the record in the second log file. Associations of the first identifiers to the second identifiers in the first join data that meet a confidence threshold are stored in a mapping of first identifiers to second identifiers as a mapped association. For each mapped association, records that include the first identifier from the first log file are associated with records that include the second identifier from the second log file.
    Type: Grant
    Filed: May 5, 2011
    Date of Patent: January 29, 2013
    Assignee: Google Inc.
    Inventors: Ori Gershony, Wei Zheng, Andrei Pascovici, Tim Hesterberg
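
The refinement step in the abstract above (keep only identifier associations that meet a confidence threshold, then re-join all records through that mapping) can be sketched as below. The confidence measure used here, the share of a first identifier's initial joins that point to one second identifier, is an assumption, as are the function names and the `id1`/`id2` record fields. A threshold above 0.5 is assumed so that each first identifier maps to at most one second identifier.

```python
from collections import Counter


def build_mapping(first_joins, threshold):
    """first_joins: list of (first_id, second_id) pairs from the initial join."""
    pair_counts = Counter(first_joins)
    totals = Counter(first_id for first_id, _ in first_joins)
    mapping = {}
    for (first_id, second_id), n in pair_counts.items():
        # Assumed confidence: fraction of this first_id's joins hitting second_id.
        if n / totals[first_id] >= threshold:
            mapping[first_id] = second_id
    return mapping


def rejoin(log1, log2, mapping):
    """Associate every log1 record with the log2 records of its mapped identifier."""
    by_second = {}
    for rec in log2:
        by_second.setdefault(rec["id2"], []).append(rec)
    return [
        (r1, r2)
        for r1 in log1
        if r1["id1"] in mapping
        for r2 in by_second.get(mapping[r1["id1"]], [])
    ]
```

For example, with initial joins `[("c1", "u1"), ("c1", "u1"), ("c1", "u2")]` and a threshold of 0.6, only the association `c1 -> u1` is kept.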
  • Publication number: 20130024440
    Abstract: Methods, systems and computer-readable media enable various techniques related to semantic navigation. One aspect is a technique for displaying semantically derived facets in the search engine interface. Each of the facets comprises faceted search results. Each of the faceted search results is displayed in association with user interface elements for including or excluding the faceted search result as additional search terms to subsequently refine the search query. Another aspect automatically infers new metadata from the content and from existing metadata and then automatically annotates the content with the new metadata to improve recall and navigation. Another aspect identifies semantic annotations by determining semantic connections between the semantic annotations and then dynamically generating a topic page based on the semantic connections.
    Type: Application
    Filed: July 22, 2011
    Publication date: January 24, 2013
    Inventors: Pascal Dimassimo, Steve Pettigrew, Martin Brousseau, Charles-Olivier Simard, Eric Williams, Francis Lacroix, Alex Dowgailenko, Agostino Deligia, Jean-Michel Texier
  • Patent number: 8356025
    Abstract: A method for analyzing sentiment comprising: collecting an object from an external content repository, the collected objects forming a content database; extracting a snippet related to the subject from the content database; calculating a sentiment score for the snippet; classifying the snippet into a sentiment category; creating a sentiment taxonomy using the sentiment categories, the sentiment taxonomy classifying the snippets as positive, negative or neutral; identifying topic words within the sentiment taxonomy; classifying the topic words as sentiment topic word candidates or non-sentiment topic word candidates; filtering the non-sentiment topic word candidates; identifying the frequency of the non-sentiment topic words in each of the sentiment categories; identifying the importance of the non-sentiment topic word for each of the sentiment categories; and ranking the topic word, wherein the rank is calculated by combining the frequency of the topic words in each of the categories with its importance.
    Type: Grant
    Filed: December 9, 2009
    Date of Patent: January 15, 2013
    Assignee: International Business Machines Corporation
    Inventors: Keke Cai, Ying Chen, William Scott Spangler, Li Zhang
  • Publication number: 20130013612
    Abstract: Certain example embodiments relate to techniques for analyzing documents. A plurality of documents/document portions are imported into a database, with at least some of the documents/document portions being structured and at least some being unstructured. The imported documents/document portions are organized into one or more collections. A selection of at least one of the one or more collections is made. An index of words and/or groups of words is built (and optionally refined in accordance with one or more predefined rules) based on each document or document portion in each selection. A document-word matrix is built (and optionally weighted using a semantic approach), with the matrix including a value indicative of a number of times each word and/or group of words in the index appears in each document/document portion. One or more clusters of documents are generated using the document-word matrix.
    Type: Application
    Filed: July 7, 2011
    Publication date: January 10, 2013
    Applicant: Software AG
    Inventors: Klaus FITTGES, Khalid El Mansouri
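
The index, document-word matrix, and clustering steps in the abstract above can be sketched as follows. The abstract does not name a clustering algorithm, so the greedy cosine-similarity grouping here is an assumption, as are the function names, the lowercase alphabetic tokenizer, and the use of raw counts without weighting.

```python
import math
import re


def build_matrix(docs):
    """Return (word index, matrix) where matrix[i][j] counts word j in document i."""
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    index = sorted({w for toks in tokenized for w in toks})
    col = {w: j for j, w in enumerate(index)}
    matrix = [[0] * len(index) for _ in docs]
    for i, toks in enumerate(tokenized):
        for w in toks:
            matrix[i][col[w]] += 1
    return index, matrix


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cluster(matrix, min_similarity):
    """Greedy grouping: compare each row with the first row of each cluster."""
    clusters = []  # lists of row indices; the first member is the representative
    for i, row in enumerate(matrix):
        for members in clusters:
            if cosine(row, matrix[members[0]]) >= min_similarity:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

A weighting scheme such as TF-IDF could be applied to the matrix before clustering, which would correspond to the optional weighting step the abstract mentions.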