Latent Semantic Index or Analysis (LSI or LSA) Patents (Class 707/739)
-
Patent number: 8566323Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position. This position relative to another file's position reveals closest neighbors. When multiple such neighbors are arranged, first “patterns” of data are created that further define digital spectrums of new files. It is within this sorted new data that emergent relationships or second “patterns” are examined, according to the techniques for its underlying files, or “patterns of patterns.” Representatively, original files are stored on computing devices. If encoded, they have pluralities of symbols representing an underlying data stream of original bits of data. The original files are examined for relationships between each of the files. The original relationships are converted to new files. The new files are representatively encoded and examined for other relationships.Type: GrantFiled: December 29, 2009Date of Patent: October 22, 2013Assignee: Novell, Inc.Inventors: Scott A. Isaacson, Craig N. Teerlink, Nadeem A. Nazeer
-
Publication number: 20130275432Abstract: A server includes an input information database (14) that stores input information where position information indicating a geographic position, a word given to the position, and a user ID identifying a user having given the word to the position are associated with one another, a dictionary database (15) that stores dictionary data indicating associations between words, and an association unit (17) that extracts a plurality of input information where the geographic positions are included in one geographic range and the words are associated with each other by referring to those databases, associates the extracted plurality of input information with each other by assigning a common identifier to the plurality of input information, and enters the plurality of input information into the input information database (14).Type: ApplicationFiled: August 23, 2011Publication date: October 17, 2013Applicant: RAKUTEN, INC.Inventor: Udana Bandara
-
Patent number: 8560548Abstract: A computer-implemented method for accessing content items in a content store is described. In one embodiment, the computer-implemented method includes maintaining a text index of content items in a content store to enable a keyword search on the content items, receiving a query having a keyword and generating a hit list from the text index using the keyword, and extracting frequent phrases from text within content items of the hit list. The computer-implemented method also includes assigning a relative relevance to the frequent phrases and grouping content items into topics based on presence of relevant phrases within the content items of the hit list. The hit list includes one or more content items of the content store. The frequent phrases having a relatively high relevance are relevant phrases.Type: GrantFiled: August 19, 2009Date of Patent: October 15, 2013Assignee: International Business Machines CorporationInventors: Akanksha Baid, Berthold Reinwald, Alkis Simitsis, John Sismanis
-
Patent number: 8560549Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating action trails from web history are described. In one aspect, a method includes receiving a web content access history of a user, the content access history including one or more user actions, each user action being associated with a content item upon which the user action is performed, and identifying one or more action trails from the content access history, each action trail including a sequence of user actions performed on content items relating to a topic. Identifying a particular action trail includes clustering the user actions into a series of segments using temporal criteria, calculating semantic similarities between the content items, and adding a segment of the series of segments to the action trail when the semantic similarities between the segment and another segment satisfy a similarity threshold.Type: GrantFiled: May 14, 2012Date of Patent: October 15, 2013Assignee: Google Inc.Inventors: Elin R. Pedersen, Karl A. Gyllstrom, Shengyin Gu, Peter Jin Hong
-
Publication number: 20130268516Abstract: Systems and methods for analyzing and visualizing social events include historical, real-time, and predictive analytics and visualization of physical or virtual social events based on social network communications.Type: ApplicationFiled: April 8, 2013Publication date: October 10, 2013Inventors: Imran Noor Chaudhri, Musa Ghani
-
Patent number: 8548999Abstract: An expanded queries data structure is described. The data structure is produced on the basis of a set of seed queries, and consists of entries each specifying an expanded query submitted by a user that has been determined to have a high degree of relatedness to at least a plurality of the seed queries of the set. The expanded queries specified by the entries of the expanded queries data structure can be used to define a segment of users expected to have interests characterized by the seed queries.Type: GrantFiled: August 15, 2011Date of Patent: October 1, 2013Assignee: AudienceScience Inc.Inventors: Yair Even-Zohar, Basem Nayfeh
-
Patent number: 8543380Abstract: In one embodiment, determining a document specificity includes accessing a record that records the clusters of documents. The number of themes of a document is determined from the number of clusters of the document. The specificity of the document is determined from the number of themes.Type: GrantFiled: October 1, 2008Date of Patent: September 24, 2013Assignee: Fujitsu LimitedInventors: David L. Marvit, Jawahar Jain, Stergios Stergiou
-
Publication number: 20130238606Abstract: The invention provides a system and method for retrieving and storing industrial data, the system comprising a data retriever that includes a data retrieval manager and one or more watchers for monitoring data associated with one or more industrial devices, wherein if the data associated with the one or more industrial devices is new or modified, the one or more watchers notifies the data retrieval manager thereof and the data retrieval manager uploads the new or modified data. The system further includes a database manager for receiving the new or modified data in a first form from the data retrieval manager and for storing the new or modified data in a structural data form in one or more databases.Type: ApplicationFiled: May 2, 2013Publication date: September 12, 2013Applicant: Rockwell Automation Technologies, Inc.Inventors: Marek Obitko, Ivan Havel, Michal Fortik, Robert Mavrov, Radek Marik
-
Patent number: 8533195Abstract: Electronic documents are retrieved from a database and/or from a network of servers. The documents are topic modeled in accordance with a Regularized Latent Semantic Indexing approach. The Regularized Latent Semantic Indexing approach may allow an equation involving an approximation of a term-document matrix to be solved in parallel by multiple calculating units. The equation may include terms that are regularized via l1 norm and/or l2 norm. The Regularized Latent Semantic Indexing approach may be applied to a set, or a fixed number, of documents such that the set of documents is topic modeled. Alternatively, the Regularized Latent Semantic Indexing approach may be applied to a variable number of documents such that, over time, the variable number of documents is topic modeled.Type: GrantFiled: June 27, 2011Date of Patent: September 10, 2013Assignee: Microsoft CorporationInventors: Jun Xu, Hang Li, Nicholas Craswell
-
Patent number: 8527497Abstract: An indexing system for graph data. In particular implementations, the indexing system provides for denormalization and replica index functionality to improve query performance.Type: GrantFiled: September 8, 2011Date of Patent: September 3, 2013Assignee: Facebook, Inc.Inventors: Sanjeev Singh, Bret Steven Taylor, Paul Buchheit, James Norris, Tudor Bosman, Benjamin Darnell
-
Patent number: 8521744Abstract: An apparatus for authoring data in a communication system includes: an extraction unit configured to receive media corresponding to contents and extract contents information regarding the contents from the received media; a generation unit configured to generate a DMB ECG XML-based metadata comprising the extracted contents information; and a processing unit configured to visualize particulars of the DMB ECG XML-based metadata through a user interface and process the user interface so that the DMB ECG XML-based metadata is generated and edited on a template.Type: GrantFiled: November 12, 2010Date of Patent: August 27, 2013Assignee: Electronics and Telecommunications Research InstituteInventors: Seung-Jun Yang, Min-Sik Park, Han-Kyu Lee, Jin-Woo Hong
-
Patent number: 8521745Abstract: One or more classification algorithms are applied to at least one natural language document in order to extract both attributes and values of a given product. Supervised classification algorithms, semi-supervised classification algorithms, unsupervised classification algorithms or combinations of such classification algorithms may be employed for this purpose. The at least one natural language document may be obtained via a public communication network. Two or more attributes (or two or more values) thus identified may be merged to form one or more attribute phrases or value phrases. Once attributes and values have been extracted in this manner, association or linking operations may be performed to establish attribute-value pairs that are descriptive of the product. In a presently preferred embodiment, an (unsupervised) algorithm is used to generate seed attributes and values which can then support a supervised or semi-supervised classification algorithm.Type: GrantFiled: June 13, 2011Date of Patent: August 27, 2013Assignee: Accenture Global Services LimitedInventors: Katharina Probst, Rayid Ghani, Andrew E. Fano, Marko Krema, Yan Liu
-
Patent number: 8515959Abstract: A self-organizing personal file system is disclosed that evaluates the “importance” of terms and phrases in a document in a personal corpus relative to usage in a reference corpus. A personalized term weighting scheme assigns a weight to terms or phrases based on the frequency of occurrence of the corresponding term or phrase in a reference corpus. The personalized term weighting for a given term or phrase can be used to store and access documents containing the corresponding term or phrase in the spatial file system and provides coordinates in a spatial file system, for one or more documents containing the corresponding term or phrase. The location of a given document in a file space may be specified by the relative frequency distribution of the stems of its significant terms or phrases compared to the occurrence of such terms or phrases in a reference corpus.Type: GrantFiled: April 25, 2005Date of Patent: August 20, 2013Assignee: International Business Machines CorporationInventors: Thomas A. Cofino, Jonathan Lenchner
-
Publication number: 20130212107Abstract: Enlargement values indicating a degree of enlargement when spatial data is stored in a partial spatial region are calculated for one or more partial spatial regions within a multidimensional index, and in the case where the enlargement value is greater than or equal to a threshold value, a new partial spatial region that contains at least the spatial data is generated.Type: ApplicationFiled: January 22, 2013Publication date: August 15, 2013Applicant: CANON KABUSHIKI KAISHAInventor: CANON KABUSHIKI KAISHA
-
Publication number: 20130212098Abstract: A computer-implemented system and method for generating a display of document clusters is described. Clusters of documents are presented in a multi-dimensional concept space. At least one document is selected from a collection of documents to be clustered. An angle θ of the document relative to a common origin of the multi-dimensional concept space is computed. The selected document is compared with each of the clusters. An angle θ from the common origin is determined for each cluster. A difference between the angle θ for the document and the angle θ for the cluster is determined. The difference is compared to the variance, and a new cluster is created when the difference exceeds the variance for all the clusters.Type: ApplicationFiled: March 14, 2013Publication date: August 15, 2013Applicant: FTI TECHNOLOGY LLCInventor: FTI TECHNOLOGY LLC
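
The angle comparison in this abstract can be pictured with a minimal sketch. It assumes the angle is measured between the document vector and a fixed reference vector through the common origin, and that a single variance value applies to every cluster; neither detail is given in the abstract, and the names `angle_to_reference` and `assign` are hypothetical.

```python
import math

def angle_to_reference(vec, ref):
    """Angle in radians between vec and ref, both measured from the origin."""
    dot = sum(a * b for a, b in zip(vec, ref))
    norm = math.sqrt(sum(a * a for a in vec)) * math.sqrt(sum(b * b for b in ref))
    # Clamp to guard against rounding just outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def assign(doc_vec, clusters, ref, variance):
    """Place doc_vec in the closest cluster by angle, or open a new cluster.

    clusters is a list of dicts with an 'angle' and a 'docs' list.
    Assumption: a cluster keeps the angle of its first document.
    """
    theta = angle_to_reference(doc_vec, ref)
    best = None
    for cluster in clusters:
        diff = abs(theta - cluster["angle"])
        if diff <= variance and (best is None or diff < best[0]):
            best = (diff, cluster)
    if best is None:
        # The difference exceeds the variance for every cluster: start a new one.
        clusters.append({"angle": theta, "docs": [doc_vec]})
    else:
        best[1]["docs"].append(doc_vec)
    return clusters
```

Feeding the vectors (1.0, 0.1), (1.0, 0.12) and (0.1, 1.0) through `assign` with reference (1.0, 0.0) and variance 0.2 groups the first two together and opens a second cluster for the third.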
-
Publication number: 20130204877Abstract: A method, system, and computer program product for semantic attribution of a request. Source data statements for the request are received. A selection of a domain for the received source data statements is received. The received source data statements are semantically analyzed, which includes matching elements in the received source data statements to respective one or more entries in an ontology associated with the selected domain. The ontology includes items and relationships that define the selected domain. Each element in the received source data statements is a word or a phrase. The one or more entries are assigned to the matched elements, respectively, to annotate each matched element with a respective annotation consisting of the respective one or more entries. The annotated elements are saved with the respective annotations.Type: ApplicationFiled: November 30, 2012Publication date: August 8, 2013Applicant: International Business Machines CorporationInventor: International Business Machines Corporation
-
Patent number: 8504564Abstract: A method, apparatus and computer program product provides for a semantic analyzer to produce and rank semantic terms to reflect their relationship to the theme and topics of a document. The text and the document can have no relationship to any pre-selected keywords before the semantic analyzer performs text extraction. The semantic analyzer extracts text from a document and performs semantic analysis on the extracted text. The semantic analyzer provides a plurality of ranked semantic terms as a result of the semantic analysis and associates semantic terms with the document as semantic keywords. The semantic terms define content to be presented with the document where the content is an advertisement, a link to a remote information resource or a second document.Type: GrantFiled: December 15, 2010Date of Patent: August 6, 2013Assignee: Adobe Systems IncorporatedInventors: Walter Chang, Nadia Ghamrawi
-
Full text search capabilities integrated into distributed file systems— incrementally indexing files
Patent number: 8504565Abstract: A hierarchical distributed search mechanism is integrated into a distributed file system. Traditional file system APIs (create, open, close, read, write, link, rename, delete, . . . ) and the over-the-wire protocols employed to project these APIs into remote client sites (CIFS, NFS, DDS, Appletalk) are extended to enable the dynamic creation of temporary directories containing links to objects identified by search engines (executing at sites “close” to “their” data) as meeting the search criteria specified by the first parameter of a search function call. The search function, derived from the standard file system API function create, is added to the file system API.Type: GrantFiled: September 9, 2005Date of Patent: August 6, 2013Inventor: William M. Pitts
-
Patent number: 8498972Abstract: Inverted indexes for terms and for term separators are separately provided to minimize data redundancy. Search queries are parsed to identify terms and term separators, if any, and the corresponding inverted indexes are searched for responsive documents. Related apparatus, systems, techniques and articles are also described.Type: GrantFiled: December 16, 2010Date of Patent: July 30, 2013Assignee: SAP AGInventors: Frederik Transier, Franz Faerber
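
A minimal sketch of the idea of keeping terms and term separators in separate inverted indexes follows. The tokenizer, the choice of punctuation runs as "separators", and the conjunctive document-level matching are all assumptions made for illustration; the abstract does not specify them.

```python
import re
from collections import defaultdict

# Assumption: a term is a run of word characters, a separator is a run of punctuation.
TOKEN = re.compile(r"(\w+)|([^\w\s]+)")

def tokenize(text):
    """Yield (kind, token) pairs, kind being 'term' or 'sep'."""
    for m in TOKEN.finditer(text.lower()):
        if m.group(1):
            yield "term", m.group(1)
        else:
            yield "sep", m.group(2)

def build_indexes(docs):
    """Build one inverted index for terms and a separate one for separators."""
    term_index, sep_index = defaultdict(set), defaultdict(set)
    for doc_id, text in docs.items():
        for kind, tok in tokenize(text):
            (term_index if kind == "term" else sep_index)[tok].add(doc_id)
    return term_index, sep_index

def search(query, term_index, sep_index):
    """Return ids of documents containing every term and separator in the query."""
    result = None
    for kind, tok in tokenize(query):
        index = term_index if kind == "term" else sep_index
        postings = set(index.get(tok, set()))
        result = postings if result is None else result & postings
    return result or set()
```

With the documents "state-of-the-art search", "state of the art" and "e-mail search", the query "state-of" matches only the first document, while "state of" matches the first two. The sketch ignores token positions, so it does not check that the separator actually sits between the two terms.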
-
Patent number: 8495513Abstract: Method and system for merging two objects in a business intelligence system. A first member is selected in the business intelligence system, the business intelligence system includes a user space, a content space, a data space, a master-data space and a metadata space. A relationship between the first member and a plurality of members selected from the group consisting of the user space, the content space, the data space, the master-data space, the metadata space is determined, which results in determined relationships for every member in the business intelligence system. Two members in the content space are then selected. Relationships between the two members in the plurality of determined relationships are traversed to determine the members in the traversed relationships. A preference is assigned to the members in the traversed relationships with close or exact relationships; and the members with the preference are merged.Type: GrantFiled: May 12, 2009Date of Patent: July 23, 2013Assignee: International Business Machines CorporationInventors: Graham Douglas MacKintosh, John Andrew Kowal
-
Patent number: 8489607Abstract: A method and system are described for providing context sensitive data to a system user. The method includes the steps of identifying the user and querying databases to create a user context. Information is aggregated from the network databases and filtered using the user context, providing the correct data needed by the user for that particular time, location and job function.Type: GrantFiled: November 10, 2009Date of Patent: July 16, 2013Assignee: General Electric CompanyInventors: Christopher Scott Fuselier, John James Dougherty, Joseph John Fisher, Thomas A. Digate, Richard Alan Carpenter, Bernardo Anger
-
Patent number: 8484218Abstract: In one implementation, a method includes receiving a request for translation of one or more first keywords from a source language to a target language; and translating, using a machine translation process, the first keywords from the source language into a plurality of second keywords in the target language. The method can also include determining, by a computer system, frequencies with which each of the second keywords occur in a corpus associated with the target language. The method can further include selecting, by the computer system, a subset of the second keywords to use in the target language based on the determined frequencies of occurrence.Type: GrantFiled: April 21, 2011Date of Patent: July 9, 2013Assignee: Google Inc.Inventor: Mandayam Thondanur Raghunath
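
The frequency-based selection step can be sketched as below. The machine translation step is not modeled: the candidate list is assumed to have been produced already, and the function name `select_keywords` and the `top_n` cutoff are hypothetical.

```python
from collections import Counter

def select_keywords(candidates, corpus_tokens, top_n=2):
    """Keep the top_n candidate translations that occur most often in the corpus."""
    counts = Counter(corpus_tokens)
    ranked = sorted(candidates, key=lambda kw: counts[kw], reverse=True)
    # Drop candidates that never occur in the target-language corpus.
    return [kw for kw in ranked[:top_n] if counts[kw] > 0]
```

For candidates "coche", "auto" and "carro" and a corpus in which "coche" occurs twice and "carro" once, the function returns "coche" followed by "carro".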
-
Patent number: 8484212Abstract: In an embodiment, a method comprises dividing collected data into data clusters based on proximity of the data and adjusting the clusters based on density of data in individual clusters. Based on first data points in a first cluster, a first average point in the first cluster is determined. Based on second data points in a second cluster, a second average point in the second cluster is determined. Aggregate data, comprising the first average point and the second average point, are stored in storage. Upon receiving a request to provide data for a particular coordinate, the reconstructed data point is determined by interpolating between the first average point and the second average point at the particular coordinate. Accordingly, aggregated data may be stored and when a request specifies data that was not actually stored, a reconstructed data point with an approximated data value may be provided as a substitute.Type: GrantFiled: January 21, 2011Date of Patent: July 9, 2013Assignee: Cisco Technology, Inc.Inventors: Ying Liu, Shahrokh Sadjadi
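
The store-averages-then-interpolate idea can be illustrated with a minimal sketch. It assumes two-dimensional (x, y) data, proximity clustering by a simple gap threshold on x, and linear interpolation; the density-based adjustment mentioned in the abstract is left out, and all function names are hypothetical.

```python
def cluster_by_gap(points, max_gap):
    """Group (x, y) points into clusters, splitting where the x gap exceeds max_gap."""
    clusters = []
    for p in sorted(points):
        if clusters and p[0] - clusters[-1][-1][0] <= max_gap:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return clusters

def average_point(cluster):
    """Return the mean (x, y) of a cluster; only this aggregate would be stored."""
    n = len(cluster)
    return (sum(x for x, _ in cluster) / n, sum(y for _, y in cluster) / n)

def interpolate(p1, p2, x):
    """Linearly interpolate y at x between two stored average points."""
    (x1, y1), (x2, y2) = p1, p2
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
```

For the points (0, 10), (1, 12), (2, 14), (10, 30), (11, 32), (12, 34) and a gap of 2, the stored averages are (1, 12) and (11, 32), and a request for x = 6 is answered with the reconstructed value 22.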
-
Publication number: 20130166561Abstract: An application server includes a Semantic Analysis Core Service (SACS) function that communicates with a Semantic Analysis Client (SAC) in a Set Top Box (STB). The SACS groups programs available for rendering to a subscriber into program clusters. The SACS generates the program clusters based on a determined semantic similarity between the programs, and on parameters that indicate a subscriber's preference for certain program content. The programs that are semantically similar to existing clusters within a predetermined viewing window are provided to the STB and output to the subscriber on a display as a program preference list or channel line-up. The STB also monitors the subscriber's interaction with the programs and calculates a preference score for each program indicating the subscriber's continuing, or waning, interest in a given program. The preference score is used to update the score of the program cluster to which the program belongs.Type: ApplicationFiled: December 22, 2011Publication date: June 27, 2013Inventors: Sorin Marian GEORGESCU, Edoardo GAVITA
-
Patent number: 8473845Abstract: An online video search system, including a tag discoverer including a web encyclopedia crawler for (i) accessing a web encyclopedia to find web pages related to at least one designated reference topic, and (ii) retrieving a plurality of web pages by performing an n-level depth recursive traversal of the web pages found, and web pages that are hyper-linked thereto, a concept extractor for extracting important concepts found in the retrieved plurality of web pages, and a user interface for providing at least one of the important concepts extracted by the web page processor to an online video search engine. A method and a computer-readable storage medium are also described and claimed.Type: GrantFiled: June 6, 2007Date of Patent: June 25, 2013Assignee: Reazer Investments L.L.C.Inventors: Marvin Igelman, Aleksandar Zivkovic
-
Publication number: 20130159313Abstract: A method includes accessing text, identifying a plurality of terms from the text, determining a plurality of term vectors associated with the identified plurality of terms, and clustering the determined plurality of term vectors into a plurality of clusters, the plurality of clusters comprising a first and a second cluster, the first and second clusters each comprising two or more of the determined term vectors. The method further includes creating a first pseudo-document according to the first cluster, creating a second pseudo-document according to the second cluster, identifying a first set of terms associated with the first cluster using latent semantic analysis (LSA) of the first pseudo-document, identifying a second set of terms associated with the second cluster using LSA of the second pseudo-document, and combining the first and second sets of terms into a list of output terms.Type: ApplicationFiled: December 14, 2011Publication date: June 20, 2013Applicant: PUREDISCOVERY CORPORATIONInventor: Paul A. Jakubik
-
Patent number: 8458182Abstract: A method for clustering data or objects in an array, each element of the array corresponding to a similarity between the objects, implemented within a computer linked with a database containing the data or objects. The method includes determining a number of classes of objects based on values of the relationships computed between an object and a previously established class, for each class found, determining the value of each of the relationships between a class and the other classes, and merging certain classes, and taking each object of each class one by one, determining the value of the relationship of each object with each of the classes other than the class into which the object was initially classed, and, if the value of the relationship is greater, transferring the object to the new class; this is continued until all the values of the relationships are negative.Type: GrantFiled: December 21, 2009Date of Patent: June 4, 2013Assignee: ThalesInventor: Hamid Benhadda
-
Patent number: 8452772Abstract: Disclosed are methods, systems, and articles of manufactures for addressing popular topics in a social sphere. The method or the system continuously monitors conversations in online forum(s), identifies trend(s) of interest, identifies one or more content items that match the trend(s), and delivers the one or more content items to appropriate forum(s). The method or the system may aggregate conversations in a target forum to identify a trend and automatically respond to the trend by identifying and delivering matching existing content items to a target forum. The method or the system may further catalog a newly created content item upon creation and may identify a trend by employing some third-party products or services, by executing one or more Internet bots to monitor the online conversations, or by using trending application(s) offered by forums or social network websites.Type: GrantFiled: August 1, 2011Date of Patent: May 28, 2013Assignee: Intuit Inc.Inventors: Aliza D. Carpio, Alan F. Buhler, Joseph P. Elwell
-
Patent number: 8452760Abstract: According to one embodiment, a relevancy presentation apparatus includes a storage, an extraction unit, a first expansion unit, a second expansion unit, a determination unit and a generation unit. The storage stores topic networks. The extraction unit extracts subject keywords. The first expansion unit acquires first relevant words from the topic networks. The second expansion unit searches ontologies for the subject keywords. The determination unit extracts common relevant words, and determines whether frequencies of appearances of relevant words are stationary. The generation unit generates search queries based on whether the frequencies of appearances are stationary, and generates search results.Type: GrantFiled: January 25, 2012Date of Patent: May 28, 2013Assignee: Kabushiki Kaisha ToshibaInventors: Tomohiro Yamasaki, Masaru Suzuki
-
Publication number: 20130117268Abstract: A method includes identifying a table within a first document. The method includes analyzing at least one of: a column heading in the table, a row heading in the table, and data in a cell in the table. The method includes determining, based on the analysis, that the table contains financial data classifiable according to a taxonomy. The method includes analyzing, by a classification component comprising at least one classification engine, at least one of a column heading in the table and a row heading in the table. The method includes generating, by the classification component, a classification suggestion for at least one element in the table, based on the analysis of the classification component.Type: ApplicationFiled: November 3, 2011Publication date: May 9, 2013Inventors: Ben Smith, Paul Warren, David North, Richard Ashby, Martin Hutchinson
-
Patent number: 8433709Abstract: Systems and methods for categorizing lexical data, accurately describing the structure of hierarchical data, accommodating lexicons having disparate data structures, pooling data from separate lexicons into aggregate lists, gathering data from participating users, and specified interfaces for handwriting recognition, optical character recognition, and text-to-speech and speech-to-text conversion are described. Some implementations can include a linguistic services center that interfaces with various natural language processing modules such that users of one module can take advantage of linguistic information provided in the system.Type: GrantFiled: November 25, 2008Date of Patent: April 30, 2013Inventor: Warren Daniel Child
-
Patent number: 8429178Abstract: In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.Type: GrantFiled: July 18, 2011Date of Patent: April 23, 2013Assignee: Facebook, Inc.Inventors: Joshua Alspector, Aleksander Kolcz, Abdur R. Chowdhury
-
Patent number: 8429167Abstract: A method and apparatus for determining contexts of information analyzed. Contexts may be determined for words, expressions, and other combinations of words in bodies of knowledge such as encyclopedias. Analysis of use provides a division of the universe of communication or information into domains, and selects words or expressions unique to those domains of subject matter as an aid in classifying information. A vocabulary list is created with a macro-context (context vector) for each, dependent upon the number of occurrences of unique terms from a domain, over each of the domains. This system may be used to find information or classify information by subsequent inputs of text, in calculation of macro-contexts, with ultimate determination of lists of micro-contexts including terms closely aligned with the subject matter.Type: GrantFiled: August 8, 2006Date of Patent: April 23, 2013Assignee: Google Inc.Inventor: David C. Taylor
-
Publication number: 20130097123Abstract: A method and system for determining eligible communication partners utilizing an entity discovery engine is provided. The entity discovery engine coordinates the discovery of eligible communication partners. The entity discovery engine enables participants to discover other communication partners through the application of inputs. Starting with a data set of potential communication partners, the entity discovery engine uses inputs to identify eligible communication partners from the data set of potential communication partners. Inputs include policies that are applied broadly to limit categories of potential communication partners from being suggested as eligible communication partners. Identified eligible communication partners are suggested to enable communication relationships. Suggested eligible communication partners may be selected by a user or by an electronic communication device for initiating a communication relationship.Type: ApplicationFiled: October 18, 2011Publication date: April 18, 2013Applicant: RESEARCH IN MOTION LIMITEDInventors: Brian Edward Anthony McColgan, Bruno Richard Preiss
-
Patent number: 8411305Abstract: A system and method is disclosed for identifying a record template within a file having reused objects. The method discloses: identifying, in the input file, a reused object and a set of pages upon which the reused object is located; computing a page distance between at least two adjacent instances of the reused object; generating an object recurrence pattern for the reused object; and reconstructing a record template, based on the object recurrence pattern, thereby identifying the records in the input file. The system discloses a processor, a profiler module, a pattern identification module, and a template reconstruction module for effecting the method.Type: GrantFiled: October 27, 2009Date of Patent: April 2, 2013Assignee: Hewlett-Packard Development Company, L.P.Inventor: Fabio Giannetti
-
Patent number: 8407242Abstract: Described are techniques to facilitate temporal features in a semantic data store. Information about lifetimes of facts in a semantic store is maintained. Even when a fact is logically deleted, a physical record is kept available. The record of a logically deleted or invalid fact has associated lifetime information. For example, valid-from and valid-to time values. The record of a fact not yet deleted may have a valid-from time value indicating when it was created, became valid, etc. Queries against the semantic store may specify a timeslice (a point in time or a time range). The lifetime information can be used to satisfy such time-specific queries. Because records are maintained after they are logically deleted, it is also possible to accurately query a past state of the semantic store. Even if such a query is run at different times, same results may be obtained.Type: GrantFiled: December 16, 2010Date of Patent: March 26, 2013Assignee: Microsoft CorporationInventors: Thomas E Jackson, Stuart Bowers, Chris Karkanias, Allen Brown, David Campbell, Brian Aust
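
The lifetime bookkeeping described in this abstract can be sketched as follows. The triple layout, the integer timestamps, and the class names `Fact` and `TemporalStore` are assumptions for illustration; the abstract only states that valid-from and valid-to values are kept and that logically deleted records remain available.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Fact:
    subject: str
    predicate: str
    obj: str
    valid_from: int
    valid_to: Optional[int] = None  # None means the fact has not been deleted

class TemporalStore:
    def __init__(self):
        self.records = []

    def add(self, subject, predicate, obj, now):
        self.records.append(Fact(subject, predicate, obj, valid_from=now))

    def delete(self, subject, predicate, obj, now):
        # Logical delete: close the lifetime but keep the physical record.
        for f in self.records:
            if (f.subject, f.predicate, f.obj) == (subject, predicate, obj) and f.valid_to is None:
                f.valid_to = now

    def query(self, at):
        """Return facts valid at time `at`, i.e. valid_from <= at < valid_to."""
        return [f for f in self.records
                if f.valid_from <= at and (f.valid_to is None or at < f.valid_to)]
```

Because deleted records are retained, `query(3)` keeps returning the same facts no matter when it is run, which matches the abstract's point about reproducible queries against a past state.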
-
Patent number: 8407226
Abstract: Systems, methods, and apparatus, including computer program products, for collaborative filtering are provided. In one implementation, a computer-implemented method is provided. The method includes receiving a shard of data representing a subset of a set of entities and a subset of a set of items, generating an iteration of a maximum likelihood estimate of a probability distribution model of a relationship between the set of entities and the set of items, the probability distribution model comprising a probability distribution of the set of items with respect to latent variables and a probability distribution of the latent variables with respect to the set of users, and generating statistics from results from the generating step which are passed to different shards for use in a next iteration of the maximum likelihood estimate.
Type: Grant
Filed: March 2, 2011
Date of Patent: March 26, 2013
Assignee: Google Inc.
Inventors: Abhinandan S. Das, Ashutosh Garg, Mayur Datar
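The model in this abstract has the shape of a latent-variable mixture, with p(item | z) and p(z | user) estimated by iterated maximum likelihood. The sketch below shows one expectation-maximization iteration over sharded (user, item) observations. It is an assumption-laden illustration, not the patented method: the function name, the in-process loop standing in for distributed workers, and the toy data are all hypothetical, and every user and item is assumed to appear in at least one shard with strictly positive initial probabilities.

```python
from collections import defaultdict

def em_iteration(shards, p_item_given_z, p_z_given_user, num_z):
    """One EM iteration; each shard is a list of (user, item) observations."""
    item_stats = defaultdict(float)  # (z, item) -> expected count
    user_stats = defaultdict(float)  # (user, z) -> expected count

    # E-step: each shard contributes statistics that are then summed.
    for shard in shards:  # in a distributed setting, one worker per shard
        for user, item in shard:
            weights = [p_item_given_z[z][item] * p_z_given_user[user][z]
                       for z in range(num_z)]
            total = sum(weights)
            for z in range(num_z):
                q = weights[z] / total  # posterior of z for this observation
                item_stats[(z, item)] += q
                user_stats[(user, z)] += q

    # M-step: renormalize the summed statistics into new distributions.
    new_p_item = []
    for z in range(num_z):
        norm = sum(v for (zz, _), v in item_stats.items() if zz == z)
        new_p_item.append({item: v / norm
                           for (zz, item), v in item_stats.items() if zz == z})
    new_p_z = {}
    for user in p_z_given_user:
        norm = sum(user_stats[(user, z)] for z in range(num_z))
        new_p_z[user] = [user_stats[(user, z)] / norm for z in range(num_z)]
    return new_p_item, new_p_z

# Toy data: two shards, two latent variables, three items.
shards = [[("u1", "a"), ("u1", "b")], [("u2", "b"), ("u2", "c")]]
p_item = [{"a": 0.5, "b": 0.3, "c": 0.2}, {"a": 0.2, "b": 0.3, "c": 0.5}]
p_z = {"u1": [0.5, 0.5], "u2": [0.5, 0.5]}
new_p_item, new_p_z = em_iteration(shards, p_item, p_z, num_z=2)
```

The returned distributions would be fed back in as inputs to the next iteration, which corresponds to passing the generated statistics on for the next round of the estimate.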
-
Patent number: 8402018
Abstract: A semantic search system using a semantic ranking scheme including: an ontology analyzer analyzing ontology data related to a search target to determine a weight value of each property according to a weighing method for property; a semantic path extractor extracting all the semantic paths between resources and query keywords and determining a weight value of each extracted semantic path according to the semantic path weight value determination scheme by using the weight value of each property; a relevant resource searcher traversing an instance graph of ontology based on a semantic path having a pre-set length and weight value of more than an expectation level to search resources that have a semantic relationship with the query keywords and are declared as a type presented in the query; and a semantic relevance ranker selecting a top-k results having the highest rank from among the candidate results extracted by the relevant resource researcher by using a relevance scoring function.
Type: Grant
Filed: February 12, 2010
Date of Patent: March 19, 2013
Assignee: Korea Advanced Institute of Science and Technology
Inventors: Ji-Hyun Lee, Chin-Wan Chung
-
Patent number: 8402026
Abstract: A system and method for efficiently generating cluster groupings in a multi-dimensional concept space is described. A plurality of terms is extracted from each document in a collection of stored unstructured documents. A concept space is built over the document collection. Terms substantially correlated between a plurality of documents within the document collection are identified. Each correlated term is expressed as a vector mapped along an angle θ originating from a common axis in the concept space. A difference between the angle θ for each document and an angle σ for each cluster within the concept space is determined. Each such cluster is populated with those documents having such difference between the angle θ for each such document and the angle σ for each such cluster falling within a predetermined variance.
Type: Grant
Filed: August 3, 2004
Date of Patent: March 19, 2013
Assignee: FTI Technology LLC
Inventor: Dan Gallivan
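The angle comparison in this abstract can be sketched in a few lines. The sketch makes several assumptions that are not in the abstract: each document is already a term-weight vector, each cluster's angle is simply the angle of its first document, and a document that fits no existing cluster starts a new one. The function names and sample vectors are hypothetical.

```python
import math

def angle_from_axis(vec, axis):
    """Angle in radians between a document vector and the common axis."""
    dot = sum(v * a for v, a in zip(vec, axis))
    norm = math.sqrt(sum(v * v for v in vec)) * math.sqrt(sum(a * a for a in axis))
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def cluster_by_angle(doc_vectors, axis, variance):
    """Place each document in the first cluster whose angle is close enough."""
    clusters = []  # each cluster: {"angle": cluster angle, "docs": [ids]}
    for doc_id, vec in doc_vectors.items():
        doc_angle = angle_from_axis(vec, axis)
        for cluster in clusters:
            if abs(doc_angle - cluster["angle"]) <= variance:
                cluster["docs"].append(doc_id)
                break
        else:
            # Assumption: an unmatched document seeds a new cluster.
            clusters.append({"angle": doc_angle, "docs": [doc_id]})
    return clusters

# Toy two-term concept space; the first term's direction is the common axis.
doc_vectors = {"d1": (1.0, 0.1), "d2": (1.0, 0.15), "d3": (0.1, 1.0)}
clusters = cluster_by_angle(doc_vectors, axis=(1.0, 0.0), variance=0.2)
cluster_docs = [c["docs"] for c in clusters]
```

Here d1 and d2 lie within 0.2 radians of each other and share a cluster, while d3 points along the other term and lands in a cluster of its own.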
-
Patent number: 8392429
Abstract: Methods, systems, and apparatus, including computer program products are provided for responding to search queries having results that identify books. In one aspect, a search query and multiple web pages that satisfy the search query and have a ranked order as responses to the search query are received. A subset of web pages that are each a reference page for a respective book are selected. A web page is a reference page for a book when the web page includes a reference to the book and satisfies a citation criterion for the book. A book score is assigned to each of the books for which there is at least one reference page in the group of highest ranking web pages. The book scores are used to select one or more of the books. A book reference is generated for each of the books and the book references are provided in response to the search query.
Type: Grant
Filed: November 26, 2008
Date of Patent: March 5, 2013
Assignee: Google Inc.
Inventors: Daniel J. Clancy, Xuefu Wang
-
Patent number: 8392421
Abstract: The present invention relates to a method of profiling an Internet endpoint associated with an Internet Protocol (IP) address, an IP prefix, or a domain name, the method includes generating a profiling rule using an Internet search engine, obtaining a search result by inputting the IP address, the IP prefix, or the domain name to the Internet search engine, and classifying the Internet endpoint based on the search result using the profiling rule.
Type: Grant
Filed: March 25, 2011
Date of Patent: March 5, 2013
Assignee: Narus, Inc.
Inventors: Antonio Nucci, Supranamaya Ranjan, Aleksandar Kuzmanovic
-
Publication number: 20130054604
Abstract: An approach is provided for providing information clustering based on predictive social graphs. An information clustering platform processes and/or facilitates a processing of one or more social graphs associated with one or more users to cause, at least in part, a prediction of one or more future states of the one or more social graphs. The information clustering platform further causes, at least in part, a clustering of one or more data items associated with at least one information space based, at least in part, on the one or more social graphs, the one or more future states, or a combination thereof.
Type: Application
Filed: August 30, 2011
Publication date: February 28, 2013
Applicant: Nokia Corporation
Inventors: Sergey Boldyrev, Pavandeep Kalra
-
Patent number: 8386488
Abstract: A method and system is provided for classifying and labeling information content (e.g., websites, databases, or the like) and also for profiling a user (e.g., interests or responsibilities) for accessing the information content, both using a coordinated labeling technique so that the content from multiple sources may be searched, identified and/or presented to the user according to the user's profile. This technique provides an ongoing update of information content and sources while at the same time filtering out unnecessary information that is irrelevant to the user's profile, resulting in more focused availability of information to the user. The user profile is matched with content of interest (as tagged by content creators reflective of categories that is also employed by a user profile) and matching content information may automatically be updated and made available to a user, in conformity with the user's profile.
Type: Grant
Filed: April 27, 2004
Date of Patent: February 26, 2013
Assignee: International Business Machines Corporation
Inventors: Gregory L. Jones, Brian N. Phoenix, Ralph Tamlyn
-
Patent number: 8386490
Abstract: A method of classifying a set of semantic concepts on a second multimedia collection based upon adapting a set of semantic concept classifiers and updating concept affinity relations that were developed to classify the set of semantic concepts for a first multimedia collection. The method comprises providing the second multimedia collection from a different domain and a processor automatically classifying the semantic concepts from the second multimedia collection by adapting the semantic concept classifiers and updating the concept affinity relations to the second multimedia collection based upon the local smoothness over the concept affinity relations and the local smoothness over data affinity relations.
Type: Grant
Filed: October 27, 2010
Date of Patent: February 26, 2013
Assignee: Eastman Kodak Company
Inventors: Wei Jiang, Alexander C. Loui
-
Patent number: 8370362
Abstract: An improved human user computer interface system, wherein a user characteristic or set of characteristics, such as demographic profile or societal “role”, is employed to define a scope or domain of operation. The operation itself may be a database search, to interactively define a taxonomic context for the operation, a business negotiation, or other activity. After retrieval of results, a scoring or ranking may be applied according to user define criteria, which are, for example, commensurate with the relevance to the context, but may be, for example, by date, source, or other secondary criteria. A user profile is preferably stored in a computer accessible form, and may be used to provide a history of use, persistent customization, collaborative filtering and demographic information for the user.
Type: Grant
Filed: September 14, 2010
Date of Patent: February 5, 2013
Assignee: Alberti Anemometer LLC
Inventor: Andrew J. Szabo
-
Publication number: 20130031100
Abstract: The present invention includes a system and method for generating a discussion group based on different electronic images. A mixed media reality database receives MMR objects that correspond to source material and indexes the MMR objects. A content management engine generates a cluster that includes MMR objects based on a similarity of source material. An MMR engine receives an electronic image from a user device, performs a visual search and identifies an MMR object that is associated with the electronic image. A social network application identifies a discussion group associated with the cluster that includes the MMR object and provides the user device with access to the discussion group.
Type: Application
Filed: October 13, 2011
Publication date: January 31, 2013
Applicant: RICOH COMPANY, LTD.
Inventors: Jamey Graham, Timothee Bailloeul, Adit Gupta
-
Patent number: 8364682
Abstract: Methods, systems, and apparatus, for refining log file join data. In one aspect, a method includes receiving first join data defining first joins of records in a first log file to records in a second log file. Each first join of a record in the first log file to a record in the second log file is based on an association of the first identifier of the record in the first log file to the second identifier of the record in the second log file. Associations of the first identifiers to the second identifiers in the first join data that meet a confidence threshold are stored in a mapping of first identifiers to second identifiers as a mapped association. For each mapped association, records that include the first identifier from the first log file are associated with records that include the second identifier from the second log file.
Type: Grant
Filed: May 5, 2011
Date of Patent: January 29, 2013
Assignee: Google Inc.
Inventors: Ori Gershony, Wei Zheng, Andrei Pascovici, Tim Hesterberg
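The confidence-threshold mapping in this abstract can be sketched as follows. The abstract does not say how confidence is measured, so the sketch assumes it is the share of a first identifier's joins that point to a given second identifier, with a threshold above 0.5 so that at most one second identifier can qualify. The function names, record fields (`id1`, `id2`), and sample data are hypothetical.

```python
from collections import Counter

def build_mapping(first_joins, threshold):
    """Keep (first_id -> second_id) associations that meet the threshold.

    Assumed confidence: joins for this pair / all joins for first_id.
    """
    pair_counts = Counter(first_joins)
    first_counts = Counter(first_id for first_id, _ in first_joins)
    mapping = {}
    for (first_id, second_id), n in pair_counts.items():
        if n / first_counts[first_id] >= threshold:
            mapping[first_id] = second_id
    return mapping

def refine_joins(log1, log2, mapping):
    """Associate log1 records with log2 records through the mapping."""
    by_second = {}
    for rec in log2:
        by_second.setdefault(rec["id2"], []).append(rec)
    return [(r1, r2)
            for r1 in log1 if r1["id1"] in mapping
            for r2 in by_second.get(mapping[r1["id1"]], [])]

# Toy first join data: c1 mostly joins to u1; c2 is split evenly.
first_joins = [("c1", "u1"), ("c1", "u1"), ("c1", "u2"),
               ("c2", "u3"), ("c2", "u4")]
mapping = build_mapping(first_joins, threshold=0.6)

log1 = [{"id1": "c1", "event": "click"}, {"id1": "c2", "event": "click"}]
log2 = [{"id2": "u1", "event": "view"}, {"id2": "u3", "event": "view"}]
refined = refine_joins(log1, log2, mapping)
```

With these inputs only the c1 to u1 association clears the threshold, so the c2 record is left unjoined rather than joined on an ambiguous association.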
-
Publication number: 20130024440
Abstract: Methods, systems and computer-readable media enable various techniques related to semantic navigation. One aspect is a technique for displaying semantically derived facets in the search engine interface. Each of the facets comprises faceted search results. Each of the faceted search results is displayed in association with user interface elements for including or excluding the faceted search result as additional search terms to subsequently refine the search query. Another aspect automatically infers new metadata from the content and from existing metadata and then automatically annotates the content with the new metadata to improve recall and navigation. Another aspect identifies semantic annotations by determining semantic connections between the semantic annotations and then dynamically generating a topic page based on the semantic connections.
Type: Application
Filed: July 22, 2011
Publication date: January 24, 2013
Inventors: Pascal Dimassimo, Steve Pettigrew, Martin Brousseau, Charles-Olivier Simard, Eric Williams, Francis Lacroix, Alex Dowgailenko, Agostino Deligia, Jean-Michel Texier
-
Patent number: 8356025
Abstract: A method for analyzing sentiment comprising: collecting an object from an external content repository, the collected objects forming a content database; extracting a snippet related to the subject from the content database; calculating a sentiment score for the snippet; classifying the snippet into a sentiment category; creating sentiment taxonomy using the sentiment categories, the sentiment taxonomy classifying the snippets as positive, negative or neutral; identifying topic words within the sentiment taxonomy; classifying the topic words as a sentiment topic word candidates or a non-sentiment topic word candidate, filtering the non-sentiment topic word candidates; identifying the frequency of the non-sentiment topic words in each of the sentiment categories; identifying the importance of the non-sentiment topic word for each of the sentiment categories; and, ranking the topic word, wherein the rank is calculated by combining the frequency of the topic words in each of the categories with its importance.
Type: Grant
Filed: December 9, 2009
Date of Patent: January 15, 2013
Assignee: International Business Machines Corporation
Inventors: Keke Cai, Ying Chen, William Scott Spangler, Li Zhang
-
Publication number: 20130013612
Abstract: Certain example embodiments relate to techniques for analyzing documents. A plurality of documents/document portions are imported into a database, with at least some of the documents/document portions being structured and at least some being unstructured. The imported documents/document portions are organized into one or more collections. A selection of at least one of the one or more collections is made. An index of words and/or groups of words is built (and optionally refined in accordance with one or more predefined rules) based on each of the document or document portion in each selection. A document-word matrix is built (and optionally weighted using a semantic approach), with the matrix including a value indicative of a number of times each word and/or group of words in the index appears in each document/document portion. One or more clusters of documents are generated using the document-word matrix.
Type: Application
Filed: July 7, 2011
Publication date: January 10, 2013
Applicant: Software AG
Inventors: Klaus FITTGES, Khalid El Mansouri
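The index, document-word matrix, and clustering steps in this abstract can be sketched as follows. The abstract does not name a clustering algorithm, so the sketch swaps in a simple greedy grouping by cosine similarity to each cluster's first document; the whitespace tokenizer, the 0.5 threshold, and the sample documents are likewise assumptions, and the optional refinement and semantic weighting steps are omitted.

```python
import math
from collections import Counter

def build_index(docs):
    """Sorted list of distinct words across all documents."""
    return sorted({word for doc in docs for word in doc.lower().split()})

def build_matrix(docs, index):
    """Rows are documents, columns are index words, values are counts."""
    rows = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        rows.append([counts[word] for word in index])
    return rows

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def cluster(matrix, threshold):
    """Greedy grouping: join the first cluster whose seed row is similar."""
    clusters = []  # lists of row numbers; the first entry is the seed
    for i, row in enumerate(matrix):
        for members in clusters:
            if cosine(row, matrix[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

docs = ["latent semantic analysis of text",
        "semantic analysis of latent topics",
        "stock market prices fell"]
index = build_index(docs)
matrix = build_matrix(docs, index)
clusters = cluster(matrix, threshold=0.5)
```

The first two documents share four of their five words and group together, while the third has no words in common with them and forms its own cluster.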