Latent Semantic Index or Analysis (LSI or LSA) Patents (Class 707/739)

Grouping and differentiating files based on underlying grouped and differentiated files

Patent number: 8566323

Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position. This position relative to another file's position reveals closest neighbors. When multiple such neighbors are arranged, first “patterns” of data are created that further define digital spectrums of new files. It is within this sorted new data that emergent relationships or second “patterns” are examined, according to the techniques for its underlying files, or “patterns of patterns.” Representatively, original files are stored on computing devices. If encoded, they have pluralities of symbols representing an underlying data stream of original bits of data. The original files are examined for relationships between each of the files. The original relationships are converted to new files. The new files are representatively encoded and examined for other relationships.

Type: Grant

Filed: December 29, 2009

Date of Patent: October 22, 2013

Assignee: Novell, Inc.

Inventors: Scott A. Isaacson, Craig N. Teerlink, Nadeem A. Nazeer
SERVER, INFORMATION-MANAGEMENT METHOD, INFORMATION-MANAGEMENT PROGRAM, AND COMPUTER-READABLE RECORDING MEDIUM WITH SAID PROGRAM RECORDED THEREON

Publication number: 20130275432

Abstract: A server includes an input information database (14) that stores input information where position information indicating a geographic position, a word given to the position, and a user ID identifying a user having given the word to the position are associated with one another, a dictionary database (15) that stores dictionary data indicating associations between words, and an association unit (17) that extracts a plurality of input information where the geographic positions are included in one geographic range and the words are associated with each other by referring to those databases, associates the extracted plurality of input information with each other by assigning a common identifier to the plurality of input information, and enters the plurality of input information into the input information database (14).

Type: Application

Filed: August 23, 2011

Publication date: October 17, 2013

Applicant: RAKUTEN, INC.

Inventor: Udana Bandara
System, method, and apparatus for multidimensional exploration of content items in a content store

Patent number: 8560548

Abstract: A computer-implemented method for accessing content items in a content store is described. In one embodiment, the computer-implemented method includes maintaining a text index of content items in a content store to enable a keyword search on the content items, receiving a query having a keyword and generating a hit list from the text index using the keyword, and extracting frequent phrases from text within content items of the hit list. The computer-implemented method also includes assigning a relative relevance to the frequent phrases and grouping content items into topics based on presence of relevant phrases within the content items of the hit list. The hit list includes one or more content items of the content store. The frequent phrases having a relatively high relevance are relevant phrases.

Type: Grant

Filed: August 19, 2009

Date of Patent: October 15, 2013

Assignee: International Business Machines Corporation

Inventors: Akanksha Baid, Berthold Reinwald, Alkis Simitsis, John Sismanis
Generating action trails from web history

Patent number: 8560549

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating action trails from web history are described. In one aspect, a method includes receiving a web content access history of a user, the content access history including one or more user actions, each user action being associated with a content item upon which the user action is performed, and identifying one or more action trails from the content access history, each action trail including a sequence of user actions performed on content items relating to a topic. Identifying a particular action trail includes clustering the user actions into a series of segments using temporal criteria, calculating semantic similarities between the content items, and adding a segment of the series of segments to the action trail when the semantic similarities between the segment and another segment satisfy a similarity threshold.

Type: Grant

Filed: May 14, 2012

Date of Patent: October 15, 2013

Assignee: Google Inc.

Inventors: Elin R. Pedersen, Karl A. Gyllstrom, Shengyin Gu, Peter Jin Hong
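
Illustrative sketch (not taken from the patent): the abstract for "Generating action trails from web history" describes two steps, temporal segmentation and a similarity threshold between segments. The Python below shows one way those steps could fit together. The `Action` type, the 300-second gap, the Jaccard measure, and the 0.3 threshold are all assumptions chosen for illustration; the patent does not specify them here.

```python
from dataclasses import dataclass


@dataclass
class Action:
    timestamp: float   # seconds since an arbitrary epoch (assumed unit)
    terms: frozenset   # hypothetical bag of topic terms for the content item


def segment_by_time(actions, max_gap=300.0):
    """Split time-ordered actions into segments wherever the gap exceeds max_gap."""
    segments = []
    for action in sorted(actions, key=lambda a: a.timestamp):
        if segments and action.timestamp - segments[-1][-1].timestamp <= max_gap:
            segments[-1].append(action)
        else:
            segments.append([action])
    return segments


def jaccard(a, b):
    """Jaccard overlap of two term sets; stands in for 'semantic similarity'."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def segment_terms(segment):
    """Union of the term sets of every action in a segment."""
    return frozenset().union(*(a.terms for a in segment))


def build_trail(segments, seed_index=0, threshold=0.3):
    """Keep the segments whose term overlap with the seed segment meets the threshold."""
    seed = segment_terms(segments[seed_index])
    return [s for s in segments if jaccard(seed, segment_terms(s)) >= threshold]
```

A trail built this way can span interruptions: a later segment on the same topic rejoins the trail even if unrelated browsing happened in between.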
Systems And Methods For Analyzing And Visualizing Social Events

Publication number: 20130268516

Abstract: Systems and methods for analyzing and visualizing social events include historical, real-time, and predictive analytics and visualization of physical or virtual social events based on social network communications.

Type: Application

Filed: April 8, 2013

Publication date: October 10, 2013

Inventors: Imran Noor Chaudhri, Musa Ghani
Query expansion

Patent number: 8548999

Abstract: An expanded queries data structure is described. The data structure is produced on the basis of a set of seed queries, and consists of entries each specifying an expanded query submitted by a user that has been determined to have a high degree of relatedness to at least a plurality of the seed queries of the set. The expanded queries specified by the entries of the expanded queries data structure can be used to define a segment of users expected to have interests characterized by the seed queries.

Type: Grant

Filed: August 15, 2011

Date of Patent: October 1, 2013

Assignee: AudienceScience Inc.

Inventors: Yair Even-Zohar, Basem Nayfeh
Determining a document specificity

Patent number: 8543380

Abstract: In one embodiment, determining a document specificity includes accessing a record that records the clusters of documents. The number of themes of a document is determined from the number of clusters of the document. The specificity of the document is determined from the number of themes.

Type: Grant

Filed: October 1, 2008

Date of Patent: September 24, 2013

Assignee: Fujitsu Limited

Inventors: David L. Marvit, Jawahar Jain, Stergios Stergiou
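
Illustrative sketch (not taken from the patent): the abstract for "Determining a document specificity" derives a theme count from the number of clusters a document belongs to, then derives specificity from the theme count. The mapping below (one theme per cluster, specificity as the inverse of the theme count) is an assumption made only to show the shape of the computation; `cluster_record` is a hypothetical dictionary from document id to a set of cluster ids.

```python
def specificity(cluster_record, doc_id):
    """Return a score in (0, 1]; fewer themes means a more specific document.

    Assumes one theme per cluster and an inverse relationship, neither of
    which is stated in the abstract. Unknown documents score 0.0.
    """
    themes = len(cluster_record.get(doc_id, ()))
    return 1.0 / themes if themes else 0.0
```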
SYSTEM AND METHOD FOR RETRIEVING AND STORING INDUSTRIAL DATA

Publication number: 20130238606

Abstract: The invention provides a system and method for retrieving and storing industrial data, the system comprising a data retriever that includes a data retrieval manager and one or more watchers for monitoring data associated with one or more industrial devices, wherein if the data associated with the one or more industrial devices is new or modified, the one or more watchers notifies the data retrieval manager thereof and the data retrieval manager uploads the new or modified data. The system further includes a database manager for receiving the new or modified data in a first form from the data retrieval manager and for storing the new or modified data in a structural data form in one or more databases.

Type: Application

Filed: May 2, 2013

Publication date: September 12, 2013

Applicant: Rockwell Automation Technologies, Inc.

Inventors: Marek Obitko, Ivan Havel, Michal Fortik, Robert Mavrov, Radek Marik
Regularized latent semantic indexing for topic modeling

Patent number: 8533195

Abstract: Electronic documents are retrieved from a database and/or from a network of servers. The documents are topic modeled in accordance with a Regularized Latent Semantic Indexing approach. The Regularized Latent Semantic Indexing approach may allow an equation involving an approximation of a term-document matrix to be solved in parallel by multiple calculating units. The equation may include terms that are regularized via the l1 norm and/or the l2 norm. The Regularized Latent Semantic Indexing approach may be applied to a set, or a fixed number, of documents such that the set of documents is topic modeled. Alternatively, the Regularized Latent Semantic Indexing approach may be applied to a variable number of documents such that, over time, the variable number of documents is topic modeled.

Type: Grant

Filed: June 27, 2011

Date of Patent: September 10, 2013

Assignee: Microsoft Corporation

Inventors: Jun Xu, Hang Li, Nicholas Craswell
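
Illustrative note (not taken from the patent text above): one objective consistent with the abstract for "Regularized latent semantic indexing for topic modeling" approximates each document vector d_n by a term-topic matrix U times a topic-weight vector v_n, with an l1 penalty on the topic columns u_k and an l2 penalty on the weights. The symbols, the choice of which factor gets which norm, and the weights λ1 and λ2 are assumptions for illustration.

```latex
\min_{U,\,V}\;
  \sum_{n=1}^{N} \lVert d_n - U v_n \rVert_2^2
  \;+\; \lambda_1 \sum_{k=1}^{K} \lVert u_k \rVert_1
  \;+\; \lambda_2 \sum_{n=1}^{N} \lVert v_n \rVert_2^2
```

Under this form, with U held fixed each v_n can be solved independently of the others, and with V held fixed the problem separates across the rows of U. That separability is what would let multiple calculating units work in parallel.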
Composite term index for graph data

Patent number: 8527497

Abstract: An indexing system for graph data. In particular implementations, the indexing system provides for denormalization and replica index functionality to improve query performance.

Type: Grant

Filed: September 8, 2011

Date of Patent: September 3, 2013

Assignee: Facebook, Inc.

Inventors: Sanjeev Singh, Bret Steven Taylor, Paul Buchheit, James Norris, Tudor Bosman, Benjamin Darnell
Apparatus and method for authoring data in communication system

Patent number: 8521744

Abstract: An apparatus for authoring data in a communication system includes: an extraction unit configured to receive media corresponding to contents and extract contents information regarding the contents from the received media; a generation unit configured to generate a DMB ECG XML-based metadata comprising the extracted contents information; and a processing unit configured to visualize particulars of the DMB ECG XML-based metadata through a user interface and process the user interface so that the DMB ECG XML-based metadata is generated and edited on a template.

Type: Grant

Filed: November 12, 2010

Date of Patent: August 27, 2013

Assignee: Electronics and Telecommunications Research Institute

Inventors: Seung-Jun Yang, Min-Sik Park, Han-Kyu Lee, Jin-Woo Hong
Extraction of attributes and values from natural language documents

Patent number: 8521745

Abstract: One or more classification algorithms are applied to at least one natural language document in order to extract both attributes and values of a given product. Supervised classification algorithms, semi-supervised classification algorithms, unsupervised classification algorithms or combinations of such classification algorithms may be employed for this purpose. The at least one natural language document may be obtained via a public communication network. Two or more attributes (or two or more values) thus identified may be merged to form one or more attribute phrases or value phrases. Once attributes and values have been extracted in this manner, association or linking operations may be performed to establish attribute-value pairs that are descriptive of the product. In a presently preferred embodiment, an (unsupervised) algorithm is used to generate seed attributes and values which can then support a supervised or semi-supervised classification algorithm.

Type: Grant

Filed: June 13, 2011

Date of Patent: August 27, 2013

Assignee: Accenture Global Services Limited

Inventors: Katharina Probst, Rayid Ghani, Andrew E. Fano, Marko Krema, Yan Liu
Method and apparatus for maintaining and navigating a non-hierarchical personal spatial file system

Patent number: 8515959

Abstract: A self-organizing personal file system is disclosed that evaluates the “importance” of terms and phrases in a document in a personal corpus relative to usage in a reference corpus. A personalized term weighting scheme assigns a weight to terms or phrases based on the frequency of occurrence of the corresponding term or phrase in a reference corpus. The personalized term weighting for a given term or phrase can be used to store and access documents containing the corresponding term or phrase in the spatial file system and provides coordinates in a spatial file system, for one or more documents containing the corresponding term or phrase. The location of a given document in a file space may be specified by the relative frequency distribution of the stems of its significant terms or phrases compared to the occurrence of such terms or phrases in a reference corpus.

Type: Grant

Filed: April 25, 2005

Date of Patent: August 20, 2013

Assignee: International Business Machines Corporation

Inventors: Thomas A. Cofino, Jonathan Lenchner
INFORMATION PROCESSING APPARATUS AND CONTROL METHOD THEREOF

Publication number: 20130212107

Abstract: Enlargement values indicating a degree of enlargement when spatial data is stored in a partial spatial region are calculated for one or more partial spatial regions within a multidimensional index, and in the case where the enlargement value is greater than or equal to a threshold value, a new partial spatial region that contains at least the spatial data is generated.

Type: Application

Filed: January 22, 2013

Publication date: August 15, 2013

Applicant: CANON KABUSHIKI KAISHA

Inventor: CANON KABUSHIKI KAISHA
Computer-Implemented System And Method For Generating A Display Of Document Clusters

Publication number: 20130212098

Abstract: A computer-implemented system and method for generating a display of document clusters is described. Clusters of documents are presented in a multi-dimensional concept space. At least one document is selected from a collection of documents to be clustered. An angle of the document relative to a common origin of the multi-dimensional concept space is computed. The selected document is compared with each of the clusters. An angle from the common origin is determined for each cluster. A difference between the angle for the document and the angle for the cluster is determined. The difference is compared to the variance, and a new cluster is created when the difference exceeds the variance for all the clusters.

Type: Application

Filed: March 14, 2013

Publication date: August 15, 2013

Applicant: FTI TECHNOLOGY LLC

Inventor: FTI TECHNOLOGY LLC
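
Illustrative sketch (not taken from the application): the abstract for "Generating A Display Of Document Clusters" compares a document's angle from a common origin with each cluster's angle and opens a new cluster when the difference exceeds the variance for every cluster. The Python below is a minimal version of that loop. The reference `axis`, the names `theta` and `sigma`, the dictionary layout of a cluster, and the default variance for a new cluster are all assumptions; vectors are assumed non-zero.

```python
import math


def angle_from_origin(vec, axis):
    """Angle in radians between vec and a reference axis, both anchored at the origin."""
    dot = sum(v * a for v, a in zip(vec, axis))
    norm = math.hypot(*vec) * math.hypot(*axis)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def assign_or_create(doc_vec, clusters, axis, new_variance=0.1):
    """Return the index of the matching cluster, creating one if none matches.

    clusters is a list of dicts with a 'center' vector and a 'variance'
    (allowed angular spread in radians); this layout is hypothetical.
    """
    theta = angle_from_origin(doc_vec, axis)
    for index, cluster in enumerate(clusters):
        sigma = angle_from_origin(cluster["center"], axis)
        if abs(theta - sigma) <= cluster["variance"]:
            return index
    # The difference exceeded the variance for all clusters: start a new one.
    clusters.append({"center": list(doc_vec), "variance": new_variance})
    return len(clusters) - 1
```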
ATTRIBUTION USING SEMANTIC ANALYISIS

Publication number: 20130204877

Abstract: A method, system, and computer program product for semantic attribution of a request. Source data statements for the request are received. A selection of a domain for the received source data statements is received. The received source data statements are semantically analyzed, which includes matching elements in the received source data statements to respective one or more entries in an ontology associated with the selected domain. The ontology includes items and relationships that define the selected domain. Each element in the received source data statements is a word or a phrase. The one or more entries are assigned to the matched elements, respectively, to annotate each matched element with a respective annotation consisting of the respective one or more entries. The annotated elements are saved with the respective annotations.

Type: Application

Filed: November 30, 2012

Publication date: August 8, 2013

Applicant: International Business Machines Corporation

Inventor: International Business Machines Corporation
Semantic analysis of documents to rank terms

Patent number: 8504564

Abstract: A method, apparatus and computer program product provides for a semantic analyzer to produce and rank semantic terms to reflect their relationship to the theme and topics of a document. The text and the document can have no relationship to any pre-selected keywords before the semantic analyzer performs text extraction. The semantic analyzer extracts text from a document and performs semantic analysis on the extracted text. The semantic analyzer provides a plurality of ranked semantic terms as a result of the semantic analysis and associates semantic terms with the document as semantic keywords. The semantic terms define content to be presented with the document where the content is an advertisement, a link to a remote information resource or a second document.

Type: Grant

Filed: December 15, 2010

Date of Patent: August 6, 2013

Assignee: Adobe Systems Incorporated

Inventors: Walter Chang, Nadia Ghamrawi
Full text search capabilities integrated into distributed file systems— incrementally indexing files

Patent number: 8504565

Abstract: A hierarchical distributed search mechanism is integrated into a distributed file system. Traditional file system APIs (create, open, close, read, write, link, rename, delete, . . . ) and the over-the-wire protocols employed to project these APIs into remote client sites (CIFS, NFS, DDS, Appletalk) are extended to enable the dynamic creation of temporary directories containing links to objects identified by search engines (executing at sites “close” to “their” data) as meeting the search criteria specified by the first parameter of a search function call. The search function, derived from the standard file system API function create, is added to the file system API.

Type: Grant

Filed: September 9, 2005

Date of Patent: August 6, 2013

Inventor: William M. Pitts
String and sub-string searching using inverted indexes

Patent number: 8498972

Abstract: Inverted indexes for terms and for term separators are separately provided to minimize data redundancy. Search queries are parsed to identify terms and term separators, if any, and the corresponding inverted indexes are searched for responsive documents. Related apparatus, systems, techniques and articles are also described.

Type: Grant

Filed: December 16, 2010

Date of Patent: July 30, 2013

Assignee: SAP AG

Inventors: Frederik Transier, Franz Faerber
Automated content generation through selective combination

Patent number: 8495513

Abstract: Method and system for merging two objects in a business intelligence system. A first member is selected in the business intelligence system, the business intelligence system includes a user space, a content space, a data space, a master-data space and a metadata space. A relationship between the first member and a plurality of members selected from the group consisting of the user space, the content space, the data space, the master-data space, the metadata space is determined, which results in determined relationships for every member in the business intelligence system. Two members in the content space are then selected. Relationships between the two members in the plurality of determined relationships are traversed to determine the members in the traversed relationships. A preference is assigned to the members in the traversed relationships with close or exact relationships; and the members with the preference are merged.

Type: Grant

Filed: May 12, 2009

Date of Patent: July 23, 2013

Assignee: International Business Machines Corporation

Inventors: Graham Douglas MacKintosh, John Andrew Kowal
Methods and system for providing context sensitive information

Patent number: 8489607

Abstract: A method and system are described for providing context sensitive data to a system user. The method includes the steps of identifying the user and querying databases to create a user context. Information is aggregated from the network databases and filtered using the user context, providing the correct data needed by the user for that particular time, location and job function.

Type: Grant

Filed: November 10, 2009

Date of Patent: July 16, 2013

Assignee: General Electric Company

Inventors: Christopher Scott Fuselier, John James Dougherty, Joseph John Fisher, Thomas A. Digate, Richard Alan Carpenter, Bernardo Anger
Translating keywords from a source language to a target language

Patent number: 8484218

Abstract: In one implementation, a method includes receiving a request for translation of one or more first keywords from a source language to a target language; and translating, using a machine translation process, the first keywords from the source language into a plurality of second keywords in the target language. The method can also include determining, by a computer system, frequencies with which each of the second keywords occur in a corpus associated with the target language. The method can further include selecting, by the computer system, a subset of the second keywords to use in the target language based on the determined frequencies of occurrence.

Type: Grant

Filed: April 21, 2011

Date of Patent: July 9, 2013

Assignee: Google Inc.

Inventor: Mandayam Thondanur Raghunath
Providing reconstructed data based on stored aggregate data in response to queries for unavailable data

Patent number: 8484212

Abstract: In an embodiment, a method comprises dividing collected data into data clusters based on proximity of the data and adjusting the clusters based on density of data in individual clusters. Based on first data points in a first cluster, a first average point in the first cluster is determined. Based on second data points in a second cluster, a second average point in the second cluster is determined. Aggregate data, comprising the first average point and the second average point, are stored in storage. Upon receiving a request to provide data for a particular coordinate, the reconstructed data point is determined by interpolating between the first average point and the second average point at the particular coordinate. Accordingly, aggregated data may be stored and when a request specifies data that was not actually stored, a reconstructed data point with an approximated data value may be provided as a substitute.

Type: Grant

Filed: January 21, 2011

Date of Patent: July 9, 2013

Assignee: Cisco Technology, Inc.

Inventors: Ying Liu, Shahrokh Sadjadi
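
Illustrative sketch (not taken from the patent): the abstract for "Providing reconstructed data based on stored aggregate data" stores one average point per cluster and answers a query for an unstored coordinate by interpolating between two average points. The Python below shows that idea for two-dimensional points with linear interpolation. The function names and the choice of linear interpolation are assumptions; the clustering and density-adjustment steps are omitted.

```python
def average_point(points):
    """Mean (x, y) of a non-empty list of (x, y) data points in one cluster."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def reconstruct(x, first_avg, second_avg):
    """Approximate the value at coordinate x by linear interpolation
    between the two stored cluster averages."""
    (x1, y1), (x2, y2) = first_avg, second_avg
    if x1 == x2:
        raise ValueError("average points share the same coordinate")
    t = (x - x1) / (x2 - x1)
    return y1 + t * (y2 - y1)
```

Only the two averages need to be kept in storage; the raw points can be discarded once `average_point` has run.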
SYMANTIC FRAMEWORK FOR DYNAMICALLY CREATING A PROGRAM GUIDE

Publication number: 20130166561

Abstract: An application server includes a Semantic Analysis Core Service (SACS) function that communicates with a Semantic Analysis Client (SAC) in a Set Top Box (STB). The SACS groups programs available for rendering to a subscriber into program clusters. The SACS generates the program clusters based on a determined semantic similarity between the programs, and on parameters that indicate a subscriber's preference for certain program content. The programs that are semantically similar to existing clusters within a predetermined viewing window are provided to the STB and output to the subscriber on a display as a program preference list or channel line-up. The STB also monitors the subscriber's interaction with the programs and calculates a preference score for each program indicating the subscriber's continuing, or waning, interest in a given program. The preference score is used to update the score of the program cluster to which the program belongs.

Type: Application

Filed: December 22, 2011

Publication date: June 27, 2013

Inventors: Sorin Marian GEORGESCU, Edoardo GAVITA
Video manager and organizer

Patent number: 8473845

Abstract: An online video search system, including a tag discoverer including a web encyclopedia crawler for (i) accessing a web encyclopedia to find web pages related to at least one designated reference topic, and (ii) retrieving a plurality of web pages by performing an n-level depth recursive traversal of the web pages found, and web pages that are hyper-linked thereto, a concept extractor for extracting important concepts found in the retrieved plurality of web pages, and a user interface for providing at least one of the important concepts extracted by the web page processor to an online video search engine. A method and a computer-readable storage medium are also described and claimed.

Type: Grant

Filed: June 6, 2007

Date of Patent: June 25, 2013

Assignee: Reazer Investments L.L.C.

Inventors: Marvin Igelman, Aleksandar Zivkovic
Multi-Concept Latent Semantic Analysis Queries

Publication number: 20130159313

Abstract: A method includes accessing text, identifying a plurality of terms from the text, determining a plurality of term vectors associated with the identified plurality of terms, and clustering the determined plurality of term vectors into a plurality of clusters, the plurality of clusters comprising a first and a second cluster, the first and second clusters each comprising two or more of the determined term vectors. The method further includes creating a first pseudo-document according to the first cluster, creating a second pseudo-document according to the second cluster, identifying a first set of terms associated with the first cluster using latent semantic analysis (LSA) of the first pseudo-document, identifying a second set of terms associated with the second cluster using LSA of the second pseudo-document, and combining the first and second sets of terms into a list of output terms.

Type: Application

Filed: December 14, 2011

Publication date: June 20, 2013

Applicant: PUREDISCOVERY CORPORATION

Inventor: Paul A. Jakubik
Method and system for clustering data arising from a database

Patent number: 8458182

Abstract: A method for clustering data or objects in an array, each element of the array corresponding to a similarity between the objects, implemented within a computer linked with a database containing the data or objects. The method includes determining a number of classes of objects based on values of the relationships computed between an object and a previously established class, for each class found, determining the value of each of the relationships between a class and the other classes, and merging certain classes, and taking each object of each class one by one, determining the value of the relationship of each object with each of the classes other than the class into which the object was initially classed, and, if the value of the relationship is greater, transferring the object to the new class; this is continued until all the values of the relationships are negative.

Type: Grant

Filed: December 21, 2009

Date of Patent: June 4, 2013

Assignee: Thales

Inventor: Hamid Benhadda
Methods, systems, and articles of manufacture for addressing popular topics in a socials sphere

Patent number: 8452772

Abstract: Disclosed are methods, systems, and articles of manufacture for addressing popular topics in a social sphere. The method or the system continuously monitors conversations in online forum(s), identifies trend(s) of interest, identifies one or more content items that match the trend(s), and delivers the one or more content items to appropriate forum(s). The method or the system may aggregate conversations in a target forum to identify a trend and automatically respond to the trend by identifying and delivering matching existing content items to a target forum. The method or the system may further catalog a newly created content item upon creation and may identify a trend by employing some third-party products or services, by executing one or more Internet bots to monitor the online conversations, or by using trending application(s) offered by forums or social network websites.

Type: Grant

Filed: August 1, 2011

Date of Patent: May 28, 2013

Assignee: Intuit Inc.

Inventors: Aliza D. Carpio, Alan F. Buhler, Joseph P. Elwell
Relevancy presentation apparatus, method, and program

Patent number: 8452760

Abstract: According to one embodiment, a relevancy presentation apparatus includes a storage, an extraction unit, a first expansion unit, a second expansion unit, a determination unit and a generation unit. The storage stores topic networks. The extraction unit extracts subject keywords. The first expansion unit acquires first relevant words from the topic networks. The second expansion unit searches ontologies for the subject keywords. The determination unit extracts common relevant words, and determines whether frequencies of appearances of relevant words are stationary. The generation unit generates search queries based on whether the frequencies of appearances are stationary, and generates search results.

Type: Grant

Filed: January 25, 2012

Date of Patent: May 28, 2013

Assignee: Kabushiki Kaisha Toshiba

Inventors: Tomohiro Yamasaki, Masaru Suzuki
Identifying and suggesting classifications for financial data according to a taxonomy

Publication number: 20130117268

Abstract: A method includes identifying a table within a first document. The method includes analyzing at least one of: a column heading in the table, a row heading in the table, and data in a cell in the table. The method includes determining, based on the analysis, that the table contains financial data classifiable according to a taxonomy. The method includes analyzing, by a classification component comprising at least one classification engine, at least one of a column heading in the table and a row heading in the table. The method includes generating, by the classification component, a classification suggestion for at least one element in the table, based on the analysis of the classification component.

Type: Application

Filed: November 3, 2011

Publication date: May 9, 2013

Inventors: Ben Smith, Paul Warren, David North, Richard Ashby, Martin Hutchinson
Modular system and method for managing Chinese, Japanese and Korean linguistic data in electronic form

Patent number: 8433709

Abstract: Systems and methods for categorizing lexical data, accurately describing the structure of hierarchical data, accommodating lexicons having disparate data structures, pooling data from separate lexicons into aggregate lists, gathering data from participating users, and specified interfaces for handwriting recognition, optical character recognition, and text-to-speech and speech-to-text conversion are described. Some implementations can include a linguistic services center that interfaces with various natural language processing modules such that users of one module can take advantage of linguistic information provided in the system.

Type: Grant

Filed: November 25, 2008

Date of Patent: April 30, 2013

Inventor: Warren Daniel Child
Reliability of duplicate document detection algorithms

Patent number: 8429178

Abstract: In a single-signature duplicate document system, a secondary set of attributes is used in addition to a primary set of attributes so as to improve the precision of the system. When the projection of a document onto the primary set of attributes is below a threshold, then a secondary set of attributes is used to supplement the primary lexicon so that the projection is above the threshold.

Type: Grant

Filed: July 18, 2011

Date of Patent: April 23, 2013

Assignee: Facebook, Inc.

Inventors: Joshua Alspector, Aleksander Kolcz, Abdur R. Chowdhury
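
Illustrative sketch (not taken from the patent): the abstract for "Reliability of duplicate document detection algorithms" computes a single signature from a document's projection onto a primary attribute set, and brings in a secondary set when that projection falls below a threshold. The Python below is one reading of that rule. Treating attributes as lexicon terms, measuring the projection by its term count, the `min_terms` value, and the SHA-1 hash are all assumptions.

```python
import hashlib


def signature(tokens, primary, secondary, min_terms=3):
    """Hash the document's terms that fall in the primary lexicon.

    If fewer than min_terms primary terms are present, the secondary
    lexicon supplements the projection before hashing.
    """
    projection = {t for t in tokens if t in primary}
    if len(projection) < min_terms:
        projection |= {t for t in tokens if t in secondary}
    # Sorting makes the signature independent of word order.
    joined = " ".join(sorted(projection))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()
```

Two documents are treated as duplicates when their signatures are equal, so a short document that barely touches the primary lexicon no longer collides with every other short document.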
User-context-based search engine

Patent number: 8429167

Abstract: A method and apparatus for determining contexts of information analyzed. Contexts may be determined for words, expressions, and other combinations of words in bodies of knowledge such as encyclopedias. Analysis of use provides a division of the universe of communication or information into domains, and selects words or expressions unique to those domains of subject matter as an aid in classifying information. A vocabulary list is created with a macro-context (context vector) for each, dependent upon the number of occurrences of unique terms from a domain, over each of the domains. This system may be used to find information or classify information by subsequent inputs of text, in calculation of macro-contexts, with ultimate determination of lists of micro-contexts including terms closely aligned with the subject matter.

Type: Grant

Filed: August 8, 2006

Date of Patent: April 23, 2013

Assignee: Google Inc.

Inventor: David C. Taylor
Method and System for Determining Eligible Communication Partners Utilizing an Entity Discovery Engine

Publication number: 20130097123

Abstract: A method and system for determining eligible communication partners utilizing an entity discovery engine is provided. The entity discovery engine coordinates the discovery of eligible communication partners. The entity discovery engine enables participants to discover other communication partners through the application of inputs. Starting with a data set of potential communication partners, the entity discovery engine uses inputs to identify eligible communication partners from the data set of potential communication partners. Inputs include policies that are applied broadly to limit categories of potential communication partners from being suggested as eligible communication partners. Identified eligible communication partners are suggested to enable communication relationships. Suggested eligible communication partners may be selected by a user or by an electronic communication device for initiating a communication relationship.

Type: Application

Filed: October 18, 2011

Publication date: April 18, 2013

Applicant: RESEARCH IN MOTION LIMITED

Inventors: Brian Edward Anthony McColgan, Bruno Richard Preiss
System and method for identifying a record template within a file having reused objects

Patent number: 8411305

Abstract: A system and method is disclosed for identifying a record template within a file having reused objects. The method discloses: identifying, in the input file, a reused object and a set of pages upon which the reused object is located; computing a page distance between at least two adjacent instances of the reused object; generating an object recurrence pattern for the reused object; and reconstructing a record template, based on the object recurrence pattern, thereby identifying the records in the input file. The system discloses a processor, a profiler module, a pattern identification module, and a template reconstruction module for effecting the method.

Type: Grant

Filed: October 27, 2009

Date of Patent: April 2, 2013

Assignee: Hewlett-Packard Development Company, L.P.

Inventor: Fabio Giannetti
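
Illustrative sketch (not taken from the patent): the record-template abstract above computes a page distance between adjacent instances of a reused object and derives a recurrence pattern from it. The Python below shows one way that could look. The function names, and the rule that a constant distance gives the record length in pages, are assumptions made for this example.

```python
def page_distances(pages):
    """Distances between adjacent pages on which a reused object appears."""
    ordered = sorted(set(pages))
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def recurrence_period(pages):
    """Return the constant page distance, or None if the object does not recur regularly.

    Assumption for this sketch: a constant distance is read as the number
    of pages in one record.
    """
    distances = page_distances(pages)
    if distances and all(d == distances[0] for d in distances):
        return distances[0]
    return None


# Hypothetical input: a logo object reused on pages 1, 4, 7 and 10.
logo_pages = [1, 4, 7, 10]
print(page_distances(logo_pages))
print(recurrence_period(logo_pages))
```

With this input every adjacent distance is 3, so the sketch would treat each record as spanning three pages.
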
Temporal binding for semantic queries

Patent number: 8407242

Abstract: Described are techniques to facilitate temporal features in a semantic data store. Information about lifetimes of facts in a semantic store is maintained. Even when a fact is logically deleted, a physical record is kept available. The record of a logically deleted or invalid fact has associated lifetime information. For example, valid-from and valid-to time values. The record of a fact not yet deleted may have a valid-from time value indicating when it was created, became valid, etc. Queries against the semantic store may specify a timeslice (a point in time or a time range). The lifetime information can be used to satisfy such time-specific queries. Because records are maintained after they are logically deleted, it is also possible to accurately query a past state of the semantic store. Even if such a query is run at different times, same results may be obtained.

Type: Grant

Filed: December 16, 2010

Date of Patent: March 26, 2013

Assignee: Microsoft Corporation

Inventors: Thomas E Jackson, Stuart Bowers, Chris Karkanias, Allen Brown, David Campbell, Brian Aust
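
Illustrative sketch (not taken from the patent): the temporal-binding abstract above keeps a physical record for logically deleted facts and answers queries for a given timeslice from valid-from and valid-to values. The Python below shows the point-in-time case. The `FactRecord` class, the `facts_at` function, the sample facts, and the half-open lifetime interval are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class FactRecord:
    subject: str
    predicate: str
    obj: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means not yet logically deleted


def facts_at(records: List[FactRecord], when: datetime) -> List[FactRecord]:
    """Return the facts whose lifetime covers the instant `when`.

    Assumption: lifetimes are half-open, [valid_from, valid_to).
    """
    return [
        r for r in records
        if r.valid_from <= when and (r.valid_to is None or when < r.valid_to)
    ]


# Hypothetical store: the first fact was logically deleted on 2011-06-01,
# but its record is still kept.
records = [
    FactRecord("alice", "worksFor", "Acme", datetime(2010, 1, 1), datetime(2011, 6, 1)),
    FactRecord("alice", "worksFor", "Initech", datetime(2011, 6, 1)),
]

print([r.obj for r in facts_at(records, datetime(2010, 12, 16))])  # ['Acme']
print([r.obj for r in facts_at(records, datetime(2012, 1, 1))])    # ['Initech']
```

Because deleted records are retained, the same past-state query returns the same result whenever it is run, which is the property the abstract describes.
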
Collaborative filtering

Patent number: 8407226

Abstract: Systems, methods, and apparatus, including computer program products, for collaborative filtering are provided. In one implementation, a computer-implemented method is provided. The method includes receiving a shard of data representing a subset of a set of entities and a subset of a set of items, generating an iteration of a maximum likelihood estimate of a probability distribution model of a relationship between the set of entities and the set of items, the probability distribution model comprising a probability distribution of the set of items with respect to latent variables and a probability distribution of the latent variables with respect to the set of users, and generating statistics from results from the generating step which are passed to different shards for use in a next iteration of the maximum likelihood estimate.

Type: Grant

Filed: March 2, 2011

Date of Patent: March 26, 2013

Assignee: Google Inc.

Inventors: Abhinandan S. Das, Ashutosh Garg, Mayur Datar
Semantic search system using semantic ranking scheme

Patent number: 8402018

Abstract: A semantic search system using a semantic ranking scheme including: an ontology analyzer analyzing ontology data related to a search target to determine a weight value of each property according to a weighing method for property; a semantic path extractor extracting all the semantic paths between resources and query keywords and determining a weight value of each extracted semantic path according to the semantic path weight value determination scheme by using the weight value of each property; a relevant resource searcher traversing an instance graph of ontology based on a semantic path having a pre-set length and weight value of more than an expectation level to search resources that have a semantic relationship with the query keywords and are declared as a type presented in the query; and a semantic relevance ranker selecting the top-k results having the highest rank from among the candidate results extracted by the relevant resource searcher by using a relevance scoring function.

Type: Grant

Filed: February 12, 2010

Date of Patent: March 19, 2013

Assignee: Korea Advanced Institute of Science and Technology

Inventors: Ji-Hyun Lee, Chin-Wan Chung
System and method for efficiently generating cluster groupings in a multi-dimensional concept space

Patent number: 8402026

Abstract: A system and method for efficiently generating cluster groupings in a multi-dimensional concept space is described. A plurality of terms is extracted from each document in a collection of stored unstructured documents. A concept space is built over the document collection. Terms substantially correlated between a plurality of documents within the document collection are identified. Each correlated term is expressed as a vector mapped along an angle θ originating from a common axis in the concept space. A difference between the angle θ for each document and an angle σ for each cluster within the concept space is determined. Each such cluster is populated with those documents having such difference between the angle θ for each such document and the angle σ for each such cluster falling within a predetermined variance.

Type: Grant

Filed: August 3, 2004

Date of Patent: March 19, 2013

Assignee: FTI Technology LLC

Inventor: Dan Gallivan
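
Illustrative sketch (not taken from the patent): the cluster-grouping abstract above places a document in a cluster when the difference between the document's angle and the cluster's angle falls within a predetermined variance. The Python below shows that test in isolation. Measuring the angle against the first coordinate axis, the function names, and the sample vectors are assumptions made for this example.

```python
import math


def angle_from_axis(vec):
    """Angle in radians between a vector and the first coordinate axis."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("zero vector has no direction")
    cosine = max(-1.0, min(1.0, vec[0] / norm))  # clamp against rounding error
    return math.acos(cosine)


def assign_to_clusters(doc_angles, cluster_angles, variance):
    """Populate each cluster with the documents whose angle is within `variance` of it."""
    clusters = {cluster_id: [] for cluster_id in cluster_angles}
    for doc_id, doc_angle in doc_angles.items():
        for cluster_id, cluster_angle in cluster_angles.items():
            if abs(doc_angle - cluster_angle) <= variance:
                clusters[cluster_id].append(doc_id)
    return clusters


# Hypothetical two-dimensional concept space.
doc_angles = {
    "d1": angle_from_axis([1.0, 0.1]),
    "d2": angle_from_axis([1.0, 1.0]),
    "d3": angle_from_axis([0.1, 1.0]),
}
cluster_angles = {"c0": 0.0, "c1": math.pi / 4}

print(assign_to_clusters(doc_angles, cluster_angles, variance=0.2))
```

Here d1 lands in c0 and d2 in c1, while d3 is too far from both cluster angles and stays unassigned.
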
Informational book query

Patent number: 8392429

Abstract: Methods, systems, and apparatus, including computer program products are provided for responding to search queries having results that identify books. In one aspect, a search query and multiple web pages that satisfy the search query and have a ranked order as responses to the search query are received. A subset of web pages that are each a reference page for a respective book are selected. A web page is a reference page for a book when the web page includes a reference to the book and satisfies a citation criterion for the book. A book score is assigned to each of the books for which there is at least one reference page in the group of highest ranking web pages. The book scores are used to select one or more of the books. A book reference is generated for each of the books and the book references are provided in response to the search query.

Type: Grant

Filed: November 26, 2008

Date of Patent: March 5, 2013

Assignee: Google Inc.

Inventors: Daniel J. Clancy, Xuefu Wang
System and method for internet endpoint profiling

Patent number: 8392421

Abstract: The present invention relates to a method of profiling an Internet endpoint associated with an Internet Protocol (IP) address, an IP prefix, or a domain name, the method includes generating a profiling rule using an Internet search engine, obtaining a search result by inputting the IP address, the IP prefix, or the domain name to the Internet search engine, and classifying the Internet endpoint based on the search result using the profiling rule.

Type: Grant

Filed: March 25, 2011

Date of Patent: March 5, 2013

Assignee: Narus, Inc.

Inventors: Antonio Nucci, Supranamaya Ranjan, Aleksandar Kuzmanovic
METHOD AND APPARATUS FOR INFORMATION CLUSTERING BASED ON PREDICTIVE SOCIAL GRAPHS

Publication number: 20130054604

Abstract: An approach is provided for providing information clustering based on predictive social graphs. An information clustering platform processes and/or facilitates a processing of one or more social graphs associated with one or more users to cause, at least in part, a prediction of one or more future states of the one or more social graphs. The information clustering platform further causes, at least in part, a clustering of one or more data items associated with at least one information space based, at least in part, on the one or more social graphs, the one or more future states, or a combination thereof.

Type: Application

Filed: August 30, 2011

Publication date: February 28, 2013

Applicant: Nokia Corporation

Inventors: Sergey Boldyrev, Pavandeep Kalra
Method and system for matching appropriate content with users by matching content tags and profiles

Patent number: 8386488

Abstract: A method and system is provided for classifying and labeling information content (e.g., websites, databases, or the like) and also for profiling a user (e.g., interests or responsibilities) for accessing the information content, both using a coordinated labeling technique so that the content from multiple sources may be searched, identified and/or presented to the user according to the user's profile. This technique provides an ongoing update of information content and sources while at the same time filtering out unnecessary information that is irrelevant to the user's profile, resulting in more focused availability of information to the user. The user profile is matched with content of interest (as tagged by content creators reflective of categories that is also employed by a user profile) and matching content information may automatically be updated and made available to a user, in conformity with the user's profile.

Type: Grant

Filed: April 27, 2004

Date of Patent: February 26, 2013

Assignee: International Business Machines Corporation

Inventors: Gregory L. Jones, Brian N. Phoenix, Ralph Tamlyn
Adaptive multimedia semantic concept classifier

Patent number: 8386490

Abstract: A method of classifying a set of semantic concepts on a second multimedia collection based upon adapting a set of semantic concept classifiers and updating concept affinity relations that were developed to classify the set of semantic concepts for a first multimedia collection. The method comprises providing the second multimedia collection from a different domain and a processor automatically classifying the semantic concepts from the second multimedia collection by adapting the semantic concept classifiers and updating the concept affinity relations to the second multimedia collection based upon the local smoothness over the concept affinity relations and the local smoothness over data affinity relations.

Type: Grant

Filed: October 27, 2010

Date of Patent: February 26, 2013

Assignee: Eastman Kodak Company

Inventors: Wei Jiang, Alexander C. Loui
Database access system

Patent number: 8370362

Abstract: An improved human user computer interface system, wherein a user characteristic or set of characteristics, such as demographic profile or societal “role”, is employed to define a scope or domain of operation. The operation itself may be a database search, to interactively define a taxonomic context for the operation, a business negotiation, or other activity. After retrieval of results, a scoring or ranking may be applied according to user-defined criteria, which are, for example, commensurate with the relevance to the context, but may be, for example, by date, source, or other secondary criteria. A user profile is preferably stored in a computer accessible form, and may be used to provide a history of use, persistent customization, collaborative filtering and demographic information for the user.

Type: Grant

Filed: September 14, 2010

Date of Patent: February 5, 2013

Assignee: Alberti Anemometer LLC

Inventor: Andrew J. Szabo
Generating a Discussion Group in a Social Network Based on Similar Source Materials

Publication number: 20130031100

Abstract: The present invention includes a system and method for generating a discussion group based on different electronic images. A mixed media reality database receives MMR objects that correspond to source material and indexes the MMR objects. A content management engine generates a cluster that includes MMR objects based on a similarity of source material. An MMR engine receives an electronic image from a user device, performs a visual search and identifies an MMR object that is associated with the electronic image. A social network application identifies a discussion group associated with the cluster that includes the MMR object and provides the user device with access to the discussion group.

Type: Application

Filed: October 13, 2011

Publication date: January 31, 2013

Applicant: RICOH COMPANY, LTD.

Inventors: Jamey Graham, Timothee Bailloeul, Adit Gupta
Identifier mapping from joined data

Patent number: 8364682

Abstract: Methods, systems, and apparatus, for refining log file join data. In one aspect, a method includes receiving first join data defining first joins of records in a first log file to records in a second log file. Each first join of a record in the first log file to a record in the second log file is based on an association of the first identifier of the record in the first log file to the second identifier of the record in the second log file. Associations of the first identifiers to the second identifiers in the first join data that meet a confidence threshold are stored in a mapping of first identifiers to second identifiers as a mapped association. For each mapped association, records that include the first identifier from the first log file are associated with records that include the second identifier from the second log file.

Type: Grant

Filed: May 5, 2011

Date of Patent: January 29, 2013

Assignee: Google Inc.

Inventors: Ori Gershony, Wei Zheng, Andrei Pascovici, Tim Hesterberg
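
Illustrative sketch (not taken from the patent): the identifier-mapping abstract above stores an association between a first and a second identifier only when it meets a confidence threshold. The abstract does not say how confidence is measured, so the Python below assumes it is the share of a first identifier's joins that point at its most frequent second identifier. The function name and sample data are likewise hypothetical.

```python
from collections import Counter, defaultdict


def build_mapping(joins, threshold):
    """Map each first identifier to a second identifier when the association is confident enough.

    joins: iterable of (first_id, second_id) pairs taken from joined log records.
    Confidence here is an assumed measure: hits for the most common second_id
    divided by all joins seen for that first_id.
    """
    counts = defaultdict(Counter)
    for first_id, second_id in joins:
        counts[first_id][second_id] += 1

    mapping = {}
    for first_id, seen in counts.items():
        second_id, hits = seen.most_common(1)[0]
        confidence = hits / sum(seen.values())
        if confidence >= threshold:
            mapping[first_id] = second_id
    return mapping


# Hypothetical join data: "a" joins to "x" in 3 of 4 records, "b" is split evenly.
joins = [("a", "x"), ("a", "x"), ("a", "x"), ("a", "y"), ("b", "y"), ("b", "z")]
print(build_mapping(joins, threshold=0.75))  # {'a': 'x'}
```

The stored mapping can then be used to associate every record carrying a first identifier with the records carrying its mapped second identifier.
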
METHODS, SYSTEMS, AND COMPUTER-READABLE MEDIA FOR SEMANTICALLY ENRICHING CONTENT AND FOR SEMANTIC NAVIGATION

Publication number: 20130024440

Abstract: Methods, systems and computer-readable media enable various techniques related to semantic navigation. One aspect is a technique for displaying semantically derived facets in the search engine interface. Each of the facets comprises faceted search results. Each of the faceted search results is displayed in association with user interface elements for including or excluding the faceted search result as additional search terms to subsequently refine the search query. Another aspect automatically infers new metadata from the content and from existing metadata and then automatically annotates the content with the new metadata to improve recall and navigation. Another aspect identifies semantic annotations by determining semantic connections between the semantic annotations and then dynamically generating a topic page based on the semantic connections.

Type: Application

Filed: July 22, 2011

Publication date: January 24, 2013

Inventors: Pascal Dimassimo, Steve Pettigrew, Martin Brousseau, Charles-Olivier Simard, Eric Williams, Francis Lacroix, Alex Dowgailenko, Agostino Deligia, Jean-Michel Texier
Systems and methods for detecting sentiment-based topics

Patent number: 8356025

Abstract: A method for analyzing sentiment comprising: collecting an object from an external content repository, the collected objects forming a content database; extracting a snippet related to the subject from the content database; calculating a sentiment score for the snippet; classifying the snippet into a sentiment category; creating sentiment taxonomy using the sentiment categories, the sentiment taxonomy classifying the snippets as positive, negative or neutral; identifying topic words within the sentiment taxonomy; classifying the topic words as a sentiment topic word candidate or a non-sentiment topic word candidate; filtering the non-sentiment topic word candidates; identifying the frequency of the non-sentiment topic words in each of the sentiment categories; identifying the importance of the non-sentiment topic word for each of the sentiment categories; and, ranking the topic word, wherein the rank is calculated by combining the frequency of the topic words in each of the categories with its importance.

Type: Grant

Filed: December 9, 2009

Date of Patent: January 15, 2013

Assignee: International Business Machines Corporation

Inventors: Keke Cai, Ying Chen, William Scott Spangler, Li Zhang
TECHNIQUES FOR COMPARING AND CLUSTERING DOCUMENTS

Publication number: 20130013612

Abstract: Certain example embodiments relate to techniques for analyzing documents. A plurality of documents/document portions are imported into a database, with at least some of the documents/document portions being structured and at least some being unstructured. The imported documents/document portions are organized into one or more collections. A selection of at least one of the one or more collections is made. An index of words and/or groups of words is built (and optionally refined in accordance with one or more predefined rules) based on each of the document or document portion in each selection. A document-word matrix is built (and optionally weighted using a semantic approach), with the matrix including a value indicative of a number of times each word and/or group of words in the index appears in each document/document portion. One or more clusters of documents are generated using the document-word matrix.

Type: Application

Filed: July 7, 2011

Publication date: January 10, 2013

Applicant: Software AG

Inventors: Klaus FITTGES, Khalid El Mansouri
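
Illustrative sketch (not taken from the application): the document-clustering abstract above builds an index of words and then a document-word matrix holding the number of times each indexed word appears in each document. The Python below covers only those two steps and leaves out the optional weighting and the clustering itself. The tokenizer, the function names, and the sample documents are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Sorted list of every distinct word across the documents."""
    words = set()
    for text in documents:
        words.update(tokenize(text))
    return sorted(words)


def document_word_matrix(documents, index):
    """One row per document, one column per indexed word, values are occurrence counts."""
    rows = []
    for text in documents:
        counts = Counter(tokenize(text))
        rows.append([counts[word] for word in index])
    return rows


# Hypothetical two-document collection.
documents = ["Latent semantic index", "Semantic search, semantic ranking"]
index = build_index(documents)
matrix = document_word_matrix(documents, index)

print(index)   # ['index', 'latent', 'ranking', 'search', 'semantic']
print(matrix)  # [[1, 1, 0, 0, 1], [0, 0, 1, 1, 2]]
```

Rows of this matrix are the vectors a clustering step would compare, for example by cosine similarity.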
