Based On Term Frequency Of Appearance Patents (Class 707/750)

Grouping and differentiating files based on underlying grouped and differentiated files

Patent number: 8566323

Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position. This position relative to another file's position reveals closest neighbors. When multiple such neighbors are arranged, first “patterns” of data are created that further define digital spectrums of new files. It is within this sorted new data that emergent relationships or second “patterns” are examined, according to the techniques for its underlying files, or “patterns of patterns.” Representatively, original files are stored on computing devices. If encoded, they have pluralities of symbols representing an underlying data stream of original bits of data. The original files are examined for relationships between each of the files. The original relationships are converted to new files. The new files are representatively encoded and examined for other relationships.

Type: Grant

Filed: December 29, 2009

Date of Patent: October 22, 2013

Assignee: Novell, Inc.

Inventors: Scott A. Isaacson, Craig N. Teerlink, Nadeem A. Nazeer
PSEUDO-DOCUMENTS TO FACILITATE DATA DISCOVERY

Publication number: 20130275436

Abstract: Various embodiments promote the discoverability of data that can be contained within a database. In one or more embodiments, data within a database is organized in a structure having a schema. The structure and data can be processed in a manner that renders one or more pseudo-documents, each of which constitutes a sub-structure that can be indexed. Once produced and indexed, the pseudo-documents constitute a set of searchable objects, each of which relationally points back to its associated structure within the database. Searches can then be performed against the pseudo-documents, which in turn return a set of search results. The set of search results can include multiple sub-sets of pseudo-documents, each sub-set of which is associated with a different structure.

Type: Application

Filed: April 11, 2012

Publication date: October 17, 2013

Applicant: Microsoft Corporation

Inventors: Surajit Chaudhuri, Lev Novik, John C. Platt
Apparatus and method for generating additional information about moving picture content

Patent number: 8559724

Abstract: An apparatus and method for generating additional information about moving picture content, including: comparing image feature information about each image frame in moving picture content with image feature information about each image frame in web information, searching for an image frame in the moving picture content, the image frame matching the image frame in the web information, determining location information about the found image frame in the moving picture content, and generating additional information by use of the determined location information and the web information.

Type: Grant

Filed: February 24, 2010

Date of Patent: October 15, 2013

Assignee: Samsung Electronics Co., Ltd.

Inventors: Yoon-hee Choi, Il-hwan Choi, Hee-seon Park
DETERMINING ENTITY POPULARITY USING SEARCH QUERIES

Publication number: 20130268482

Abstract: Systems, methods, and computer-readable media for determining the Internet search popularity of an entity are provided. Embodiments of the present invention include receiving a group of Internet search records and assigning a popularity ranking based on the number of times an entity descriptor associated with an entity occurs within the group of Internet search records created over a designated time period. An entity descriptor is one or more terms commonly used to identify an entity. The trend in an entity's popularity rank may also be calculated. An entity's popularity rank and trend in popularity rank may be presented in a graph or in a list.

Type: Application

Filed: March 14, 2013

Publication date: October 10, 2013

Inventors: Tabreez Govani, Hugh Williams, Jamie Buckley, Nitin Agrawal, Andy Lam, Kenneth A. Moss
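
The counting step in the abstract of publication 20130268482 can be illustrated with a minimal sketch. It assumes that search records are plain query strings and that a record counts once for an entity if it contains any of that entity's descriptors. The function name, the data layout, and the alphabetical tie-breaking are hypothetical and are not taken from the application.

```python
from collections import Counter


def rank_entities(search_records, descriptors):
    """Rank entities by how many search records mention one of their descriptors.

    search_records: iterable of query strings from the designated time period.
    descriptors: dict mapping entity name -> list of descriptor phrases.
    Both structures are assumptions made for this sketch.
    """
    counts = Counter({entity: 0 for entity in descriptors})
    for record in search_records:
        text = record.lower()
        for entity, phrases in descriptors.items():
            # A record is counted once per entity, even if several descriptors match.
            if any(phrase.lower() in text for phrase in phrases):
                counts[entity] += 1
    # Highest count first; ties are broken alphabetically for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, entity, n) for rank, (entity, n) in enumerate(ordered, start=1)]


records = ["acme phone review", "buy Acme phone", "zenith laptop"]
ranking = rank_entities(records, {"Acme": ["acme"], "Zenith": ["zenith"]})
```

Running the same function over records from two consecutive periods and comparing the ranks would give a simple version of the popularity trend the abstract mentions.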
ASSISTED HYBRID MOBILE BROWSER

Publication number: 20130262481

Abstract: A system and a method are disclosed for identifying video files on a webpage and streaming video files to a client device. A server receives browsing data including uniform resource locator for a webpage and identifies missing videos on the webpage. The server identifies a source file for the missing videos including identifying a location for each missing video. The server retrieves a thumbnail for each missing video and provides it to a client device. Additionally, the server transcodes the video file responsive to a user input provided by a user. The transcoded video is streamed to the client device.

Type: Application

Filed: May 10, 2013

Publication date: October 3, 2013

Applicant: Skyfire Labs, Inc.

Inventors: Nitin Bhandari, Erik R. Swenson, Geoffrey Dale Benson, Ishika Paul, James Marzano, Jaime Heilpern, Robert Oberhofer, Michael Guzewicz, Vijay Kumar
IDENTIFYING KEY PHRASES WITHIN DOCUMENTS

Publication number: 20130246386

Abstract: Systems are used for identifying key phrases within documents. These systems utilize tags and a tag index to determine what a document primarily relates to. For example, an integrated data flow and extract-transform-load pipeline crawls, parses, and word breaks large corpuses of documents in database tables. Documents can be broken into tuples. The tuples can be sent to a heuristically based algorithm that uses statistical language models and weight plus cross-entropy threshold functions to summarize the document into its “top N” most statistically significant phrases. These systems can scale efficiently (e.g., linearly) and (potentially large numbers of) documents can be characterized by salient and relevant key phrases (tags).

Type: Application

Filed: March 11, 2013

Publication date: September 19, 2013

Applicant: MICROSOFT CORPORATION

Inventors: Sorin Gherman, Kunal Mukerjee
Regularized latent semantic indexing for topic modeling

Patent number: 8533195

Abstract: Electronic documents are retrieved from a database and/or from a network of servers. The documents are topic modeled in accordance with a Regularized Latent Semantic Indexing approach. The Regularized Latent Semantic Indexing approach may allow an equation involving an approximation of a term-document matrix to be solved in parallel by multiple calculating units. The equation may include terms that are regularized via the l1 norm and/or the l2 norm. The Regularized Latent Semantic Indexing approach may be applied to a set, or a fixed number, of documents such that the set of documents is topic modeled. Alternatively, the Regularized Latent Semantic Indexing approach may be applied to a variable number of documents such that, over time, the variable number of documents is topic modeled.

Type: Grant

Filed: June 27, 2011

Date of Patent: September 10, 2013

Assignee: Microsoft Corporation

Inventors: Jun Xu, Hang Li, Nicholas Craswell
SOCIAL NETWORK MESSAGE CATEGORIZATION SYSTEMS AND METHODS

Publication number: 20130232154

Abstract: Systems and methods of identifying and categorizing social network messages that are relevant to selected categories and text terms are provided. The frequency of text terms appearing in social network messages is calculated for multiple categories. Based on the calculated text term frequency, social network messages that match a provided set of text terms can be identified and/or categorized. The selection and/or association of text terms and categories is determined by repeatedly analyzing social network messages.

Type: Application

Filed: April 11, 2013

Publication date: September 5, 2013

Applicant: CitizenNet Inc.

Inventors: Michael Aaron Hall, Daniel Benyamin, Aaron Chu
Method for assisting in making a decision on biometric data

Patent number: 8515971

Abstract: The present invention relates to a method for assisting a user in making a decision when comparing biometric data of an individual with data from a database relating to a large number of individuals. In the method, biometric data is acquired for the individual concerned; this data is encoded; the data items are compared in pairs with corresponding data from the database; for each comparison score, the ratio of duplicate occurrence frequency to non-duplicate occurrence frequency is established; the product of all the available ratios is calculated; this product is standardized; the standardized ratio is compared to a pre-set threshold; the values greater than the pre-set threshold are kept; and this result is submitted to the user to validate as appropriate.

Type: Grant

Filed: November 2, 2006

Date of Patent: August 20, 2013

Assignee: Thales

Inventor: Jean Beaudet
Search entity transition matrix and applications of the transition matrix

Patent number: 8515975

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using search entity transition probabilities. In some implementations, data identifying entities and transition probabilities between entities is stored in a computer readable medium. Each transition probability represents a strength of a relationship between a pair of entities as they are related in search history data. In some implementations, an increase in popularity for a query is identified and a different query is identified as temporally related to the query. Scoring data for documents responsive to the different query is modified to favor newer documents. In other implementations, data identifying a first session as spam is received, and a spam score is calculated for either a second session of queries or a single query using transition probabilities. The second session (or single query) is identified as spam from the spam score.

Type: Grant

Filed: December 7, 2009

Date of Patent: August 20, 2013

Assignee: Google Inc.

Inventor: Diego Federici
Using message sampling to determine the most frequent words in a user mailbox

Patent number: 8515974

Abstract: A method is presented for generating a list of frequently used words for an email application on a server computer. When a request is received for a word frequency list for emails stored in a user's mailbox, a word frequency list is returned if one exists. If the word frequency list does not exist, an asynchronous process is started on the server computer to generate a word frequency list. If the word frequency list exists but it is older than an aging limit, an asynchronous process is started on the server computer to regenerate the word frequency list. The word frequency list is stored in the user's mailbox along with a timestamp indicating the date and time that the list was created or updated.

Type: Grant

Filed: September 2, 2011

Date of Patent: August 20, 2013

Assignee: Microsoft Corporation

Inventors: Ashish Consul, Suryanarayana M. Gorti, Michael Geoffrey Andrew Wilson, James C. Kleewein
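
The request handling in the abstract of patent 8515974 (return the stored list if one exists, build it asynchronously if it is missing, rebuild it asynchronously if it is older than an aging limit) can be sketched roughly as follows. The mailbox is modeled as a plain dict, and the aging limit, function names, and tokenizing rule are all hypothetical. The message sampling named in the title is not described in the abstract and is not modeled here.

```python
import re
import time
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

AGING_LIMIT_SECONDS = 24 * 60 * 60  # hypothetical aging limit (one day)
_executor = ThreadPoolExecutor(max_workers=1)  # stands in for the server's async process


def build_word_list(mailbox, top_n=10):
    """Count words across the mailbox and store the list with a timestamp."""
    words = re.findall(r"[a-z']+", " ".join(mailbox["emails"]).lower())
    mailbox["word_list"] = {
        "words": Counter(words).most_common(top_n),
        "created": time.time(),
    }


def request_word_list(mailbox, now=None):
    """Return (stored list or None, Future for a background rebuild or None)."""
    now = time.time() if now is None else now
    stored = mailbox.get("word_list")
    job = None
    if stored is None or now - stored["created"] > AGING_LIMIT_SECONDS:
        # Missing or stale: start generation in the background, do not block.
        job = _executor.submit(build_word_list, mailbox)
    return stored, job


mailbox = {"emails": ["hello world", "hello again"]}
first, first_job = request_word_list(mailbox)  # nothing stored yet
first_job.result()                             # wait only for demonstration
second, second_job = request_word_list(mailbox)
```

In this sketch a stale list is still returned to the caller while the rebuild runs, which is one reading of the abstract rather than a confirmed detail of the patent.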
Identifying geographic features from query prefixes

Patent number: 8515973

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying geographic features. In one aspect, a method includes receiving a query. Geographic features are identified, each geographic feature being associated with one or more names, each geographic feature being associated with at least one name that includes the query. A feature-query score is computed for each geographic feature, including: for each name of the geographic feature that includes the query, identifying a computed feature-name score, wherein the feature-name score is computed based on a count of a number of occurrences of the name in a query log, wherein each occurrence is attributed to the feature; and computing the feature-query score based on the identified feature-name scores. The geographic features are ranked according to the feature-query scores.

Type: Grant

Filed: March 25, 2011

Date of Patent: August 20, 2013

Assignee: Google Inc.

Inventors: Radu Jurca, Anja Hauth, Ivan Zauharodneu, Matsvei Zhdanovich, Luuk Van Dijk, Steffen Meschkat, David E. Lecomte
Finding relevant documents

Patent number: 8515972

Abstract: A programmed computer receives one or more documents that contain text that is relevant to a user (“interest documents”). The programmed computer automatically identifies groups of words that match the interest documents. The matching word groups are ranked by a weight that is assigned based on how infrequently a word group matches a reference corpus and how frequently the word group matches one or more interest document(s), in comparison to other word groups. A set of word groups are automatically identified based on ranking, and displayed to a user to select documents from a corpus. Selected documents are displayed to the user, e.g. with one or more group of words used in selecting the documents.

Type: Grant

Filed: February 10, 2010

Date of Patent: August 20, 2013

Assignee: Python 4 Fun, Inc.

Inventors: Devabhaktuni Srikrishna, Marc Coram
Book content item search

Patent number: 8510314

Abstract: Methods, systems, and apparatus, including computer program products are provided for ranking distinct book content items based on implicit links to other distinct book content items. The implicit links are defined based on the identification of matching features in the distinct book content items. In some implementations, the matching features are uncommon phrases in textual content of the distinct book content items. Edges representing implicit links are generated between distinct nodes representing distinct book content items in a weighted graph. Search results for distinct book content items can be ordered based on the edges connected to the distinct nodes in the weighted graph that represent the distinct book content items.

Type: Grant

Filed: October 6, 2011

Date of Patent: August 13, 2013

Assignee: Google Inc.

Inventors: Shumeet Baluja, Yushi Jing
Method and apparatus for sorting inquiry results

Patent number: 8504563

Abstract: Sorting inquiry results includes, based on extracted inquiry results matching search conditions of a user, collecting features of the inquiry results. The collected features may be used as features of a respective inquiry result and feature fitting may be conducted based on a support vector machine (SVM) regression model to obtain a feature fitting value of the respective inquiry result. The inquiry results may be sorted based on relevancy values of the inquiry results, and, for inquiry results having a same relevancy level, the inquiry results may be sorted in a top-down manner based on feature fitting values of the inquiry results.

Type: Grant

Filed: July 22, 2011

Date of Patent: August 6, 2013

Assignee: Alibaba Group Holding Limited

Inventors: Chao Chen, Xiaomei Han
Related word presentation device

Patent number: 8504357

Abstract: A related word presentation device includes a program information storage unit that stores program information of each program; and an information dividing unit that generates, for each of the attributes of the words included in the program information, at least one group which includes a reference word belonging to the attribute and a set of words which co-occur with the reference word in a program. A degree-of-relevance calculating unit stores attribute-based association dictionaries each of which indicates, for the corresponding attribute of words, (i) the words and (ii) the degrees of relevance between the words calculated based on the frequency of co-occurrence in each of groups. A search condition obtaining unit obtains the search word and the attribute; a substitute word obtaining unit selects substitute words from the attribute-based association dictionary for the obtained attribute; and an output unit presents the selected substitute word.

Type: Grant

Filed: July 30, 2008

Date of Patent: August 6, 2013

Assignee: Panasonic Corporation

Inventors: Takashi Tsuzuki, Satoshi Matsuura, Kazutoyo Takata
Semantic analysis of documents to rank terms

Patent number: 8504564

Abstract: A method, apparatus and computer program product provides for a semantic analyzer to produce and rank semantic terms to reflect their relationship to the theme and topics of a document. The text and the document can have no relationship to any pre-selected keywords before the semantic analyzer performs text extraction. The semantic analyzer extracts text from a document and performs semantic analysis on the extracted text. The semantic analyzer provides a plurality of ranked semantic terms as a result of the semantic analysis and associates semantic terms with the document as semantic keywords. The semantic terms define content to be presented with the document where the content is an advertisement, a link to a remote information resource or a second document.

Type: Grant

Filed: December 15, 2010

Date of Patent: August 6, 2013

Assignee: Adobe Systems Incorporated

Inventors: Walter Chang, Nadia Ghamrawi
System and method for near and exact de-duplication of documents

Patent number: 8504578

Abstract: A system, method and computer program product for identifying near and exact-duplicate documents in a document collection, including for each document in the collection, reading textual content from the document; filtering the textual content based on user settings; determining N most frequent words from the filtered textual content of the document; performing a quorum search of the N most frequent words in the document with a threshold M; and sorting results from the quorum search based on relevancy. Based on the values of N and M, near and exact-duplicate documents are identified in the document collection.

Type: Grant

Filed: August 16, 2012

Date of Patent: August 6, 2013

Assignee: MSC Intellectual Properties B.V.

Inventors: Johannes C. Scholtes, Siebe Bloembergen
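
The steps listed in the abstract of patent 8504578 can be sketched as below, assuming that a quorum search means "match documents containing at least M of the N query words". The stop-word filter stands in for the user settings, relevancy is approximated by the number of matched words, and all names are hypothetical.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}  # hypothetical filter setting


def tokens(text):
    """Lowercase, split into words, and drop filtered words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS]


def top_words(text, n):
    """Return the n most frequent words; ties are broken alphabetically."""
    counts = Counter(tokens(text))
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ordered[:n]]


def quorum_search(query_words, docs, m):
    """Return (doc_id, matched) for documents containing at least m query words."""
    hits = []
    for doc_id, text in docs.items():
        present = set(tokens(text))
        matched = sum(1 for w in query_words if w in present)
        if matched >= m:
            hits.append((doc_id, matched))
    # Most matched words first, as a stand-in for relevancy sorting.
    return sorted(hits, key=lambda h: (-h[1], h[0]))


def find_near_duplicates(docs, n=5, m=4):
    """Map each document id to the other documents that pass its quorum search."""
    result = {}
    for doc_id, text in docs.items():
        words = top_words(text, n)
        if not words:
            result[doc_id] = []
            continue
        hits = quorum_search(words, docs, min(m, len(words)))
        result[doc_id] = [other for other, _ in hits if other != doc_id]
    return result


docs = {
    "a": "invoice payment overdue invoice reminder account",
    "b": "invoice payment overdue invoice reminder customer",
    "c": "holiday schedule office closed monday",
}
duplicates = find_near_duplicates(docs, n=5, m=4)
```

Setting M equal to N asks for every frequent word to match, which moves the check toward exact duplicates; lowering M relative to N admits looser near-duplicates.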
Using domain intent to provide more search results that correspond to a domain

Patent number: 8504561

Abstract: Techniques are described herein for using intent to access a domain (i.e., domain intent) to provide more search results that correspond to the domain. For example, a rule may specify a maximum number of search results that are allowed to be provided from a domain (or a host that corresponds to the domain) in response to a search query. Each search query may include any number of ngrams. An ngram is a subsequence of elements in a sequence (e.g., a search query). An intent to access a domain may be determined based on one or more of the ngrams in a search query. A number of search results that correspond to a domain may be increased to be greater than the maximum number based on one or more of the ngrams that are included in the search query being associated with the intent to access the domain.

Type: Grant

Filed: September 2, 2011

Date of Patent: August 6, 2013

Assignee: Microsoft Corporation

Inventors: Timothy C. Hoad, Deepak Vijaywargi, Yatharth Saraf
Automated detection of null field values and effectively null field values

Patent number: 8489617

Abstract: Disclosed are systems for, and methods of, automatically detecting and treating field values of a particular field as null field values in records of a database. The system and method provide automatic treatment of these field values as null field values by calculating a critical frequency for the field. Based on the critical frequency of the field, the system and method treats field values that occur more than the critical frequency of the field as null field values and treats field values that occur less than the critical frequency as non-null field values.

Type: Grant

Filed: June 5, 2012

Date of Patent: July 16, 2013

Assignee: LexisNexis Risk Solutions FL Inc.

Inventor: David Alan Bayliss
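
The thresholding described in the abstract of patent 8489617 can be sketched as follows. The abstract does not say how the critical frequency is calculated, so this sketch simply assumes it is a fixed fraction of the record count; that rule, the parameter name, and the sample values are hypothetical.

```python
from collections import Counter


def split_null_values(values, critical_fraction=0.2):
    """Return the set of values treated as null and the critical frequency used.

    The critical frequency here is an assumed rule (a fraction of all records),
    not the calculation used in the patent.
    """
    counts = Counter(values)
    critical_frequency = critical_fraction * len(values)
    # Values occurring more often than the critical frequency are treated as null.
    effectively_null = {v for v, c in counts.items() if c > critical_frequency}
    return effectively_null, critical_frequency


phone_field = ["N/A"] * 6 + ["555-0100", "555-0101", "555-0102", "555-0103"]
null_values, threshold = split_null_values(phone_field)
```

The placeholder "N/A" appears in 6 of 10 records, which is above the assumed critical frequency of 2, so it is treated as a null value while the distinct phone numbers are kept.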
METHOD OR SYSTEM FOR IDENTIFYING WEBSITE LINK SUGGESTIONS

Publication number: 20130173568

Abstract: Methods and systems are provided that may be utilized to generate website link suggestions.

Type: Application

Filed: December 28, 2011

Publication date: July 4, 2013

Applicant: YAHOO! INC.

Inventors: Vanja Josifovski, Evgeniy Gabrilovich, Bo Pang, Fernando Diaz, Jangwon Seo
Natural language text analytics

Patent number: 8473498

Abstract: A method of text analytics includes filtering a plurality of unfiltered records having unstructured data into at least a first group and a second group. The first group and said second group each include at least two records and the first group is different than the second group. The method includes determining a first proportion of occurrence for a term by comparing a first number of records having at least one occurrence of the term in the first group to a first total number of records in the first group, determining a second proportion of occurrence for the term by comparing a second number of records having at least one occurrence of the term in said second group to a second total number of records in the second group, and comparing the first proportion of occurrence to the second proportion of occurrence to yield a resultant comparison occurrence.

Type: Grant

Filed: August 2, 2011

Date of Patent: June 25, 2013

Inventor: Tom H. C. Anderson
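
The two proportions in the abstract of patent 8473498 can be illustrated with a short sketch. It assumes that records are plain strings, that a term occurs in a record when it appears as a whitespace-separated word, and that the final comparison is a simple difference of proportions; the abstract does not specify the form of the comparison, and the names are hypothetical.

```python
def proportion_with_term(records, term):
    """Fraction of records that contain the term at least once."""
    term = term.lower()
    hits = sum(1 for record in records if term in record.lower().split())
    return hits / len(records)


def compare_term(group_a, group_b, term):
    """Return both proportions and their difference (an assumed comparison)."""
    p_a = proportion_with_term(group_a, term)
    p_b = proportion_with_term(group_b, term)
    return p_a, p_b, p_a - p_b


satisfied = ["great service", "slow service", "friendly staff", "great food"]
unsatisfied = ["slow delivery", "cold food"]
comparison = compare_term(satisfied, unsatisfied, "service")
```

Each record counts at most once no matter how often the term repeats inside it, matching the "at least one occurrence" wording of the abstract.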
ENTITY SUMMARIZATION AND COMPARISON

Publication number: 20130151538

Abstract: An entity summarization system is described herein that mines the Internet and other data sources to provide answers to questions such as the relative sentiment of users towards various brands. The system uses a controlled vocabulary list describing a specific aspect of entities of interest. Given an entity name, the system scans the whole content corpus to collect statistics on the words that occur most frequently in the context of the entity name, taking into account proximity information, to produce a weighted list of vocabulary terms describing the entity. Two entities can be compared by normalizing and comparing their weighted term lists. In some embodiments, the system performs these procedures efficiently by leveraging an N-gram web model. Thus, the system provides an automated way to compare two entities to derive information about how users feel about the entities at any given time.

Type: Application

Filed: December 12, 2011

Publication date: June 13, 2013

Applicant: MICROSOFT CORPORATION

Inventors: Pavel Dmitriev, Wei Zhuang
Mining global email folders for identifying auto-folder tags

Patent number: 8463827

Abstract: Embodiments are directed towards identifying auto-folder tags for messages by using a combinational optimization approach of bi-clustering folder names and features of messages based on relationship strengths. The combinational optimization approach of bi-clustering, generally, groups a plurality of folder names and a plurality of features into one or more metafolders to optimize a cost. The cost is based on an aggregate of cut relationship strengths, where a cut results when a relationship folder name and feature are grouped in separate metafolders. Furthermore, the plurality of folder names and the plurality of features are obtained by monitoring actions of a plurality of users, where the folder names are user generated folder names and features are from a plurality of messages. The metafolders may be used to tag new user messages with an auto-folder tag.

Type: Grant

Filed: January 4, 2011

Date of Patent: June 11, 2013

Assignee: Yahoo! Inc.

Inventors: Vishwanath Tumkur Ramarao, Andrei Broder, Idan Szpektor, Edo Liberty, Yehuda Koren, Mark E. Risher, Yoelle Maarek Smadja
Document analysis and multi-word term detector

Patent number: 8458198

Abstract: A term analyzer receives an ordered collection of text-based terms. The term analyzer analyzes groupings of consecutive text-based terms in the ordered collection to identify occurrences of different combinations of text-based terms. In addition, the term analyzer maintains frequency information representing the occurrences of the different combinations of text-based terms in the collection. The frequency information can then be used to determine relatively significant keywords and/or keyword phrases in the document. In an example configuration, the term analyzer creates a tree in which a first term in a given grouping of the groupings is defined as a parent node in the tree and a second term in the given grouping is defined as a child node of the parent node in the tree. The method of the analyzer generalizes to create a tree of multi-word terms in which the terms can be efficiently ranked by occurrence.

Type: Grant

Filed: December 5, 2011

Date of Patent: June 4, 2013

Assignee: Adobe Systems Incorporated

Inventors: Michael J. Welch, Walter W. Chang
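
The example configuration in the abstract of patent 8458198, where the first term of a grouping is a parent node and the second term is its child, can be sketched for two-word groupings as below. The nested-dictionary representation and function names are hypothetical, and longer multi-word terms are left out of this sketch.

```python
from collections import defaultdict


def build_term_tree(terms):
    """Map each first term (parent) to its following terms (children) with counts."""
    tree = defaultdict(lambda: defaultdict(int))
    for first, second in zip(terms, terms[1:]):
        tree[first][second] += 1
    return tree


def ranked_phrases(tree):
    """List two-word phrases by occurrence count, most frequent first."""
    phrases = [
        (f"{parent} {child}", count)
        for parent, children in tree.items()
        for child, count in children.items()
    ]
    return sorted(phrases, key=lambda item: (-item[1], item[0]))


terms = "new york city new york times".split()
tree = build_term_tree(terms)
phrases = ranked_phrases(tree)
```

Extending the same idea one level deeper (children of children) would count three-word groupings, which is the kind of generalization the abstract alludes to.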
Methodology to establish term co-relationship using sentence boundary detection

Patent number: 8452774

Abstract: A method and system for splitting a text document into individual sentences using sentence boundary detection, and establishing co-relationships between terms which are present in the same sentence. A document corpus, or collection of text records, is provided, containing text with terms to be extracted. The text records in the document corpus are divided into individual sentences, using a set of rules for sentence boundary detection. The individual sentences are then analyzed to extract and correlate terms, such as parts and symptoms, symptoms and actions, or parts and failure modes. The correlated terms are then validated based on frequency of occurrence, with term pairs being considered valid if their frequency of occurrence exceeds a minimum frequency threshold. The validated term correlations can be used for fault model development, document classification, and document clustering.

Type: Grant

Filed: March 10, 2011

Date of Patent: May 28, 2013

Assignee: GM Global Technology Operations LLC

Inventor: Dnyanesh Rajpathak
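
The pipeline in the abstract of patent 8452774 (split records into sentences, pair terms that share a sentence, keep pairs whose frequency exceeds a threshold) can be sketched as follows. The single punctuation rule is a much simpler stand-in for the patent's rule set for sentence boundary detection, and the term lists, names, and threshold are hypothetical.

```python
import re
from collections import Counter
from itertools import product


def split_sentences(text):
    """Split after ., ! or ? followed by whitespace (a deliberately simple rule)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def correlated_pairs(records, parts, symptoms, min_frequency=1):
    """Count part/symptom pairs that share a sentence; keep those above the threshold."""
    pair_counts = Counter()
    for record in records:
        for sentence in split_sentences(record):
            lowered = sentence.lower()
            found_parts = [p for p in parts if p in lowered]
            found_symptoms = [s for s in symptoms if s in lowered]
            pair_counts.update(product(found_parts, found_symptoms))
    # A pair is valid only if its frequency exceeds the minimum threshold.
    return {pair: n for pair, n in pair_counts.items() if n > min_frequency}


records = [
    "Battery is leaking. Customer reports radio noise.",
    "Replaced battery, found leaking again. Radio works.",
    "Radio noise after rain.",
    "Battery noise reported.",
]
valid = correlated_pairs(records, parts=["battery", "radio"], symptoms=["leaking", "noise"])
```

Pairing terms per sentence rather than per record avoids linking "battery" with "noise" in the first record, where the two terms sit in different sentences.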
Robust Fitting of Surfaces from Noisy Data

Publication number: 20130132407

Abstract: Various embodiments of methods and apparatus for fitting a surface to a data set are disclosed. A frequency distribution of an input data set is determined. Determining the frequency distribution includes assigning each data point of the input data set to a category representing a value of a variable for the respective data point. Responsive to identifying one or more discontinuities of the frequency distribution, a continuous section of the frequency distribution is identified as a first data set. A first equation is fit to the first data set.

Type: Application

Filed: February 25, 2011

Publication date: May 23, 2013

Inventors: Balaji Krishnmurthy, Anubha Rastogi
COLLABORATIVE BOOKMARKING

Publication number: 20130124541

Abstract: A method and system for collaborating tags in a bookmarking system wherein the bookmarking system includes a plurality of tags applied to content items by a plurality of users, the method and system including, examining all the tags that are applied to all the content items, determining whether two tags have been assigned to the same content item, if two tags have been assigned to the same content item, computing the relative strength of each of the two tags with respect to each other.

Type: Application

Filed: January 2, 2013

Publication date: May 16, 2013

Applicant: International Business Machines Corporation

Inventor: International Business Machines Corporation
Adaptive cell-specific dictionaries for frequency-partitioned multi-dimensional data

Patent number: 8442988

Abstract: A cell-specific dictionary is applied adaptively to adequate cells, where the cell-specific dictionary subsequently optimizes the handling of frequency-partitioned multi-dimensional data. This includes improved data partitioning with super cells or adjusting resulting cells by sub-dividing very large cells and merging multiple small cells, both of which avoid the highly skewed data distribution in cells and improve the query processing. In addition, more efficient encoding is taught within a cell in case the number of distinct values that actually appear in that cell is much smaller than the size of the column dictionary.

Type: Grant

Filed: November 4, 2010

Date of Patent: May 14, 2013

Assignee: International Business Machines Corporation

Inventors: Oliver Draese, Namik Hrle, Oliver Koeth, Tianchao Li, Vijayshankar Raman, Knut Stolze
Extending media annotations using collective knowledge

Patent number: 8429176

Abstract: The present invention is directed towards systems and methods for extending media annotations using collective knowledge. The method according to one embodiment of the present invention comprises receiving a plurality of content items and associated annotations. The method further normalizes the plurality of associated annotations and calculates pair frequencies for the plurality of associated annotations. The method then retrieves a plurality of alternative annotations and provides the plurality of alternative annotations.

Type: Grant

Filed: March 28, 2008

Date of Patent: April 23, 2013

Assignee: Yahoo! Inc.

Inventors: Borkur Sigurbjornsson, Roelof van Zwol
METHODS AND SYSTEMS FOR PERFORMING TIME-PARTITIONED COLLABORATIVE FILTERING

Publication number: 20130091151

Abstract: In accordance with disclosed embodiments, there are provided methods, systems, and apparatuses for performing time-partitioned collaborative filtering in an on-demand service environment including, for example, receiving as input, a plurality of access requests for data stored within the host organization and a corresponding plurality of actions for the data to which access is requested; accessing an input table having a time field, action field, item field, and agent field therein; recording time data and agent data for each of the received plurality of access requests and the corresponding plurality of actions; recording an item within the item field and an action within the action field for each of the received plurality of access requests and the corresponding plurality of actions based on the action performed on an item of the data to which access is requested; and analyzing the input table to generate one or more pairs of first actions and items to second actions and items and a time based score for eac

Type: Application

Filed: October 2, 2012

Publication date: April 11, 2013

Applicant: SALESFORCE.COM, INC.

Inventor: Salesforce.com, Inc.
INFORMATION GENERATING COMPUTER PRODUCT, APPARATUS, AND METHOD; AND INFORMATION SEARCH COMPUTER PRODUCT, APPARATUS, AND METHOD

Publication number: 20130086086

Abstract: A computer-readable recording medium stores a program causing a computer to execute an information generating process that includes tabulating an appearance frequency for each designated word in an object file group in which character strings are described; identifying for each designated word and based on the appearance frequency tabulated for the designated word, a rank in descending order up to a target appearance rate for the designated words; detecting in an object file selected from the object file group, specific designated words among the identified ranks; and generating for each of the detected specific designated words, index information that indicates the presence/absence of the specific designated word in each object file among the object file group.

Type: Application

Filed: November 27, 2012

Publication date: April 4, 2013

Applicant: FUJITSU LIMITED

Inventor: FUJITSU LIMITED
COMPUTER PRODUCT, ANALYSIS SUPPORT METHOD, ANALYSIS SUPPORT APPARATUS, AND SYSTEM

Publication number: 20130086085

Abstract: A computer-readable recording medium has stored therein a program for causing a computer to execute an analysis support process that includes storing to a storage device, a name of a second process that is a process included among a plurality of processes called in response to execution of a program, the computer storing the name of the second process when a first process having a name that matches a keyword stored in a storage device is included among the processes.

Type: Application

Filed: August 13, 2012

Publication date: April 4, 2013

Applicant: FUJITSU LIMITED

Inventor: Shingo KATO
Method for calculating relevance between words based on document set and system for executing the method

Patent number: 8407233

Abstract: A method and system for calculating a relevance between words using a document set is provided. The method of calculating the relevance between words based on a document set, includes: obtaining statistical information about the words based on at least one of the words, documents, a word classification of the words, and a document classification of the documents, wherein the words and the documents are included in the document set; standardizing the statistical information; and calculating the relevance between the words based on the standardized statistical information.

Type: Grant

Filed: December 10, 2007

Date of Patent: March 26, 2013

Assignee: NHN Business Platform Corporation

Inventors: Ki Ho Song, Byoung Hak Kim, Min uk Kim, Tae Yeong Kwak
Automated tagging of objects in databases

Patent number: 8407216

Abstract: Embodiments of the present invention provide systems and methods for automatically generating tag terms (or tags) for objects in databases of a web site. The metadata of the objects (or data) of the web site are processed and parsed to automatically generate tag terms for the corresponding objects. Information (or data, or content) downloaded from the Internet often comes with metadata, which can exist in titles, description, sources, and authors of the information, etc. The metadata of downloaded information can be processed and parsed to generate tag terms for the corresponding objects. The system can automatically generate tag terms for the data, which are stored as objects in the databases, and make the data (or objects) searchable. In addition, the automatically generated tag terms allow associated data to maintain their relationship. For example, data from the same sources, same authors, or same subjects can be identified based on the common tag terms.
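
As a rough illustration of generating tags from metadata and then relating objects through shared tags, the sketch below parses a few metadata fields into tag terms and builds a tag-to-object index. The field names, the stopword list, and the functions `generate_tags` and `index_by_tag` are hypothetical; the abstract does not specify any of them.

```python
import re
from collections import defaultdict

# Assumed stopword list; the abstract does not define one.
STOPWORDS = {"a", "an", "and", "by", "for", "in", "of", "on", "the", "to"}


def generate_tags(metadata):
    """Derive tag terms from a metadata dict (field names are assumed)."""
    tags = set()
    # Free-text fields are split into individual word tags.
    for field in ("title", "description"):
        words = re.findall(r"[a-z0-9]+", metadata.get(field, "").lower())
        tags.update(w for w in words if w not in STOPWORDS)
    # Source and author are kept whole so that objects sharing them match.
    for field in ("source", "author"):
        value = metadata.get(field, "").strip().lower()
        if value:
            tags.add(value)
    return tags


def index_by_tag(objects):
    """Map each tag to the set of object ids that carry it."""
    index = defaultdict(set)
    for object_id, metadata in objects.items():
        for tag in generate_tags(metadata):
            index[tag].add(object_id)
    return index
```

Looking up a tag such as an author name in the resulting index returns every object that shares it, which is one simple way to realize the "same sources, same authors, or same subjects" relationship the abstract mentions.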

Type: Grant

Filed: September 25, 2008

Date of Patent: March 26, 2013

Assignee: Yahoo! Inc.

Inventors: Hubert M. Walker, Noel C. Morrison, Timothy Caplis, Scott Bedard, Ankarino S. Lara, Stephen James Blake
Methods and systems for determing media value

Patent number: 8402035

Abstract: Exemplary embodiments are directed to determining a media value associated with mentions of an entity in one or more documents based on a sentiment attributed to the mentions of the entity and/or a frequency with which the entity is mentioned. Exemplary embodiments can include a media value engine that can identify mentions of an entity in documents, attribute sentiment to the mentions of the entity, determine a polarity of the sentiment, and calculate a media value attributed to the entity based on the sentiment.

Type: Grant

Filed: March 14, 2011

Date of Patent: March 19, 2013

Assignee: General Sentiment, Inc.

Inventors: Greg Artzt, Mark Fasciano, Steve Skiena, Levon Lloyd
Generating context-based spell corrections of entity names

Patent number: 8402032

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for correcting entity names. One method includes receiving texts and deriving a plurality of name-context pairs from the texts. The method further includes calculating a context consistency measure for each name-context pair and storing context-entity name data representing the name-context pairs. Another method includes identifying an entity name and one or more context terms from a query and generating candidate names for the entity name. The method further includes determining a score for each of the candidate names, selecting a number of top scoring candidate names, and using the selected candidate names to respond to the query.
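
The two methods in this abstract (collecting name-context pairs from texts, then scoring candidate names for a query) can be sketched as below. This is an assumption-laden illustration: candidates come from Python's standard `difflib` string similarity, and the score simply adds context co-occurrence counts to that similarity. The patent's actual context consistency measure is not given in the abstract, and `build_name_context_counts` and `correct_name` are hypothetical names.

```python
import difflib
from collections import Counter


def build_name_context_counts(texts, known_names):
    """Count how often each known name co-occurs with each other word."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        for name in known_names:
            if name in words:
                counts.update((name, w) for w in words if w != name)
    return counts


def correct_name(name, context_terms, counts, known_names, top_n=1):
    """Return the top-scoring candidate names for a possibly misspelled name."""
    name = name.lower()
    # Candidate generation: known names that are textually close to the input.
    candidates = difflib.get_close_matches(name, known_names, n=5, cutoff=0.6)
    scored = []
    for cand in candidates:
        similarity = difflib.SequenceMatcher(None, name, cand).ratio()
        # Context evidence: how often the candidate was seen with these terms.
        context = sum(counts[(cand, t.lower())] for t in context_terms)
        scored.append((context + similarity, cand))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cand for _, cand in scored[:top_n]]
```

With this scoring, the same misspelling can resolve to different names depending on the context terms in the query, which is the behavior the abstract describes.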

Type: Grant

Filed: March 24, 2011

Date of Patent: March 19, 2013

Assignee: Google Inc.

Inventors: Lawrence J. Brunsman, Matthieu Devin, Uri N. Lerner, Simon Tong
Using a dynamically-generated content-level newsworthiness rating to provide content recommendations

Patent number: 8402034

Abstract: A method for providing content-level data artifact recommendations can begin with the creation of a semantic library from the textual content of data artifacts by a newsworthy content recommendation engine. A base newsworthiness rating can be calculated using global newsworthiness parameters and behavioral functions that model newsworthy influences for each relationship contained in the semantic library. A user-specific search network can be generated that represents user-entered criteria and/or user task-related criteria. Within the semantic library, potential newsworthy semantic networks can be identified. Newsworthy content from each identified potential newsworthy semantic network can be dynamically determined based upon the base newsworthiness rating and a predefined newsworthiness threshold. The newsworthy content from the identified potential newsworthy semantic network can be related to the user-specific search network at the common node, creating a newsworthy content recommendation graph.

Type: Grant

Filed: March 2, 2012

Date of Patent: March 19, 2013

Assignee: International Business Machines Corporation

Inventors: Daniel John McCloskey, Marcello Trovati, Carol Sue Zimmet
Phrase based snippet generation

Patent number: 8402036

Abstract: Disclosed herein is a method, a system and a computer product for generating a snippet for an entity, wherein each snippet comprises a plurality of sentiments about the entity. One or more textual reviews associated with the entity is selected. A plurality of sentiment phrases are identified based on the one or more textual reviews, wherein each sentiment phrase comprises a sentiment about the entity. One or more sentiment phrases from the plurality of sentiment phrases are selected to generate a snippet.

Type: Grant

Filed: June 24, 2011

Date of Patent: March 19, 2013

Assignee: Google Inc.

Inventors: Sasha Blair-Goldensohn, Kerry Hannan, Ryan T. McDonald, Tyler Neylon, Jeffrey C. Reynar
Convergence of terms within a collaborative tagging environment

Patent number: 8402022

Abstract: Tools and techniques for converging terms within a collaborative tagging environment are described herein. Methods for converging divergent contributions to the collaborative tagging environment may include receiving respective contributions from users within the environment. The methods may identify at least some of the contributions as divergent, and enable the users to converge the divergent contributions.

Type: Grant

Filed: September 29, 2006

Date of Patent: March 19, 2013

Inventors: Martin R. Frank, Walter Manching Tseng
Ranking authors and their content in the same framework

Patent number: 8396879

Abstract: One or more server devices may simultaneously calculate first ranking scores for a group of users and second ranking scores for a group of comments authored by the group of users. The calculating may occur during a same process. The one or more server devices may further provide one of a first ranked list that includes information identifying the group of users, the information identifying the group of users being ordered based on the first ranking scores, or a second group of comments of the group of comments, the comments in the second group of comments being ordered based on the second ranking scores.

Type: Grant

Filed: February 28, 2012

Date of Patent: March 12, 2013

Assignee: Google Inc.

Inventors: Michal Cierniak, Na Tang
Query optimization over graph data streams

Patent number: 8392398

Abstract: A method for executing a query on a graph data stream. The graph stream comprises data representing edges that connect vertices of a graph. The method comprises constructing a plurality of synopsis data structures based on at least a subset of the graph data stream. Each vertex connected to an edge represented within the subset of the graph data stream is assigned to a synopsis data structure such that each synopsis data structure represents a corresponding section of the graph. The method further comprises mapping each received edge represented within the graph data stream onto the synopsis data structure which corresponds to the section of the graph which includes that edge, and using the plurality of synopsis data structures to execute the query on the graph data stream.

Type: Grant

Filed: July 29, 2009

Date of Patent: March 5, 2013

Assignee: International Business Machines Corporation

Inventors: Charu C. Aggarwal, Min Wang, Peixiang Zhao
System and method for grouping similar documents

Patent number: 8380718

Abstract: A system and method for grouping similar documents is provided. Frequencies of occurrences are determined for terms and noun phrases within a set of documents. A subset of the documents is selected by removing those documents having terms and noun phrases that fall outside a bounded range of upper and lower conditions for frequency of occurrence. Each of the documents in the subset is mapped to a cluster of documents based on a similarity of the documents to the cluster documents.
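
One way to read this abstract is sketched below: keep only terms whose document frequency lies within a bounded range, drop documents left with no such terms, and assign each remaining document to the most similar cluster. Several choices here are assumptions rather than the patented method: single-word terms stand in for "terms and noun phrases", cosine similarity against a running centroid is used as the similarity measure, and `min_df`, `max_df`, and `threshold` are hypothetical parameters.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def group_documents(docs, min_df=2, max_df=None, threshold=0.5):
    """docs: {doc_id: text}. Returns a list of clusters (lists of doc ids)."""
    max_df = len(docs) if max_df is None else max_df
    bags = {d: Counter(text.lower().split()) for d, text in docs.items()}

    # Document frequency: in how many documents each term occurs.
    df = Counter(t for bag in bags.values() for t in bag)

    # Keep in-range terms only; documents with none left are removed.
    vectors = {}
    for d, bag in bags.items():
        vec = Counter({t: c for t, c in bag.items() if min_df <= df[t] <= max_df})
        if vec:
            vectors[d] = vec

    # Assign each document to the most similar cluster, or start a new one.
    clusters = []
    for d, vec in vectors.items():
        best = max(clusters, key=lambda c: cosine(vec, c["centroid"]), default=None)
        if best is not None and cosine(vec, best["centroid"]) >= threshold:
            best["members"].append(d)
            best["centroid"].update(vec)  # running sum acts as the centroid
        else:
            clusters.append({"centroid": Counter(vec), "members": [d]})
    return [c["members"] for c in clusters]
```

Because assignment is single-pass, the result depends on document order; a production system would likely refine clusters iteratively.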

Type: Grant

Filed: September 2, 2011

Date of Patent: February 19, 2013

Assignee: FTI Technology LLC

Inventors: Dan Gallivan, Kenji Kawai
SYSTEM AND METHOD FOR PROFILING CLIENTS WITHIN A SYSTEM FOR HARVESTING COMMUNITY KNOWLEDGE

Publication number: 20130041906

Abstract: A privacy-preserving system and method is disclosed for profiling clients within a system for knowledge management. The method of the present invention discloses steps for generating a client profile in support of receiving and processing messages using scoring techniques and/or filtering techniques. The method of the present invention further includes steps for generating a client profile in support of a method for generating and obtaining responses to messages using scoring techniques and/or filtering techniques. The system of the present invention includes all means for implementing the method.

Type: Application

Filed: August 7, 2012

Publication date: February 14, 2013

Inventors: Eytan Adar, Rajan Mathew Lukose, Joshua Rogers Tyler, Caesar Sengupta
Book content item search

Patent number: 8375036

Abstract: Methods, systems, and apparatus, including computer program products are provided for ranking distinct book content items based on implicit links to other distinct book content items. The implicit links are defined based on the identification of matching features in the distinct book content items. In some implementations, the matching features are uncommon phrases in textual content of the distinct book content items. Edges representing implicit links are generated between distinct nodes representing distinct book content items in a weighted graph. Search results for distinct book content items can be ordered based on the edges connected to the distinct nodes in the weighted graph that represent the distinct book content items.
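
The implicit-link idea can be illustrated with a small weighted graph. In this sketch, which is only one possible reading, a "phrase" is a word bigram, a phrase is "uncommon" when at most `max_books` items contain it, an edge's weight is the number of uncommon phrases two items share, and items are ordered by the total weight of their edges. All of these choices, and the function names, are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def bigrams(text):
    """Set of adjacent word pairs in the text."""
    words = text.lower().split()
    return {" ".join(pair) for pair in zip(words, words[1:])}


def rank_by_implicit_links(books, max_books=2):
    """books: {book_id: text}. Returns (ranked ids, edge weights)."""
    # Which books contain each phrase.
    phrase_to_books = defaultdict(set)
    for book_id, text in books.items():
        for phrase in bigrams(text):
            phrase_to_books[phrase].add(book_id)

    # An uncommon phrase shared by two books adds weight to their edge.
    edges = defaultdict(int)
    for holders in phrase_to_books.values():
        if 2 <= len(holders) <= max_books:
            for a, b in combinations(sorted(holders), 2):
                edges[(a, b)] += 1

    # Score each node by the total weight of its connected edges.
    score = {book_id: 0 for book_id in books}
    for (a, b), weight in edges.items():
        score[a] += weight
        score[b] += weight

    ranked = sorted(books, key=lambda b: (-score[b], b))
    return ranked, dict(edges)
```

Weighted degree is the simplest edge-based ordering; a link-analysis score computed over the same graph would be a natural alternative.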

Type: Grant

Filed: November 17, 2011

Date of Patent: February 12, 2013

Assignee: Google Inc.

Inventors: Shumeet Baluja, Yushi Jing
Methods and apparatus to identify images in print advertisements

Patent number: 8368918

Abstract: Methods and apparatus to identify images in print advertisements are disclosed. An example method comprises computing a first image feature vector for a first presented image, comparing the first image feature vector to a second image feature vector, and when the first image feature vector matches the second image feature vector, storing printed-media information associated with the first presented image in a database record associated with the second image feature vector.

Type: Grant

Filed: September 14, 2007

Date of Patent: February 5, 2013

Assignee: The Nielsen Company (US), LLC

Inventors: Kevin Deng, Alan Nguyen Bosworth
Method and system for comparing attributes such as business names

Patent number: 8370366

Abstract: Embodiments of systems and methods for comparing attributes of a data record are presented herein. Broadly speaking, embodiments of the present invention generate a weight based on a comparison of the name (or other) attributes of data records. More particularly, embodiments of the present invention generate a weight based on a comparison of name attributes. More specifically, embodiments of the present invention may calculate an information score for each of two name attributes to be compared to get an average information score for the two name attributes. The two name attributes may then be compared against one another to generate a weight between the two attributes. This weight can then be normalized to generate a final weight between the two business name attributes.

Type: Grant

Filed: January 14, 2010

Date of Patent: February 5, 2013

Assignee: International Business Machines Corporation

Inventors: Norm Adams, Scott Ellard, Scott Schumacher
System and methods for ranking documents based on content characteristics

Patent number: 8370347

Abstract: A system is described for assessing information in natural language contents. A user interface receives an object name as a query term and a value for a customized ranking parameter from a user. A computer storage device stores an object-specific data set related to the object name, wherein the object-specific data set includes a plurality of property names and association-strength values. A computer processing system can count a first frequency of a first property name and count a second frequency of a second property name in a document containing text in a natural language, calculate a relevance score as a function of the first frequency and the second frequency, and rank the plurality of documents using their respective relevance scores, and return one or more documents to the user based on the ranking of the plurality of documents. The function is in part defined by the customized ranking parameter.
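
A minimal sketch of the ranking described here: count how often each property name occurs in a document, combine the counts with the stored association-strength values, and let a user-supplied parameter shape the function. The specific formula (strength times frequency raised to `alpha`), the restriction to single-word property names, and the function names are assumptions; the abstract only says the function depends on the frequencies and is partly defined by the customized ranking parameter.

```python
import re


def relevance(text, properties, alpha=1.0):
    """Score one document against {property_name: association_strength}.

    alpha is a hypothetical ranking parameter: 1.0 rewards repeated
    mentions linearly, 0.0 only rewards presence.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    score = 0.0
    for name, strength in properties.items():
        freq = words.count(name.lower())
        if freq:
            score += strength * freq ** alpha
    return score


def rank_documents(docs, properties, alpha=1.0):
    """Return ids of documents with a positive score, best first."""
    scored = [(relevance(text, properties, alpha), doc_id)
              for doc_id, text in docs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored if score > 0]
```

For an object name such as "computer" with properties `{"cpu": 0.9, "keyboard": 0.4}`, changing `alpha` can reorder the results, which mirrors the user-adjustable ranking in the abstract.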

Type: Grant

Filed: February 17, 2012

Date of Patent: February 5, 2013

Inventor: Guangsheng Zhang
System and method for information identification

Patent number: 8364706

Abstract: A system and a method of retrieving information are described. In a system according to the invention, software modules may be used to provide the user with information that is most likely to be the information desired.

Type: Grant

Filed: June 18, 2004

Date of Patent: January 29, 2013

Assignee: ZI Corporation of Canada, Inc.

Inventor: Todd Garrett Simpson
METHOD AND SYSTEM OF EXTRACTING CONCEPTS AND RELATIONSHIPS FROM TEXTS

Publication number: 20130007020

Abstract: An exemplary embodiment of the present techniques extracts concepts and relationships from a text. Concepts may be generated from the text using singular value decomposition, and ranked based on a term weight and a distance metric. The concepts that are ranked above a particular threshold may be iteratively extracted, and the concepts may be merged to form larger concepts until the generation of concepts has stabilized. Relationships may be generated based on the concepts using singular value decomposition, then ranked based on various metrics. The relationships that are ranked above a particular threshold may be extracted.

Type: Application

Filed: June 30, 2011

Publication date: January 3, 2013

Inventors: Sujoy Basu, Sharad Singhal
