Based On Term Frequency Of Appearance Patents (Class 707/750)

Estimating Unique Entry Counts Using a Counting Bloom Filter

Publication number: 20140149433

Abstract: A method of estimating a number of unique entry counts of an attribute in a database comprises, with a processor: identifying a sample of entries from an attribute database, determining frequencies of a number of input observations of the sample of entries, determining a number of high frequency values of the sample of entries, and estimating a number of unique entry counts of an attribute within the attribute database using a counting Bloom filter and based on the frequencies of the input observations and the high frequency values.

Type: Application

Filed: November 27, 2012

Publication date: May 29, 2014

Applicant: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.

Inventors: Choudur Lakshminarayan, Hansjorg Zeller, QiFan Chen, Ramakumar Kosuru
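
Illustration for publication 20140149433 above: the abstract names a counting Bloom filter but does not give the estimator itself, so the following is only a minimal sketch of the data structure and of a naive unique-value count built on it. The class name, the hash construction (salted SHA-256), the slot sizes, and the `estimate_unique` helper are all assumptions made for illustration; the sample-frequency and high-frequency-value steps described in the abstract are not reproduced.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter whose slots are small counters instead of single bits."""

    def __init__(self, num_slots=1024, num_hashes=3):
        self.num_slots = num_slots
        self.num_hashes = num_hashes
        self.counters = [0] * num_slots

    def _slots(self, value):
        # Derive num_hashes slot indices by salting one SHA-256 hash (assumed scheme).
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_slots

    def add(self, value):
        for slot in self._slots(value):
            self.counters[slot] += 1

    def might_contain(self, value):
        # False positives are possible; false negatives are not.
        return all(self.counters[slot] > 0 for slot in self._slots(value))

    def min_count(self, value):
        # Upper bound on how many times value was added.
        return min(self.counters[slot] for slot in self._slots(value))


def estimate_unique(sample, num_slots=1024, num_hashes=3):
    """Count values the filter has not yet reported; collisions can only undercount."""
    cbf = CountingBloomFilter(num_slots, num_hashes)
    unique = 0
    for value in sample:
        if not cbf.might_contain(value):
            unique += 1
        cbf.add(value)
    return unique, cbf
```

Because a false positive makes a new value look already seen, this naive count can fall below the true number of distinct values but never above it.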
Generating sharable recommended and popular E-mails

Patent number: 8738637

Abstract: A method of determining popularity of an e-mail is provided. The method includes receiving an e-mail and determining if a generated signature is associated with the e-mail. If there is no generated signature, then a signature is generated for associating with the e-mail. A popularity measure associated with the e-mail is determined based on the signature. Furthermore, a method of determining popularity of an e-mail is provided. The method includes receiving an e-mail and identifying a generated signature associated with the e-mail. The method further includes determining a match of the associated generated signature with a record of the generated signature, if the generated signature is identified. If the identified generated signature is determined to match the record of the generated signature, then a popularity measure associated with the e-mail is increased.

Type: Grant

Filed: June 3, 2008

Date of Patent: May 27, 2014

Assignee: Yahoo! Inc.

Inventors: Jyh-Shin Shue, Jeff Weng
Method and system for document presentation and analysis

Patent number: 8739032

Abstract: A document analysis system receives multiple concepts along with multiple reference documents and generates sensory indicators that assist a researcher in assessing the relevance of each of the documents to the concepts. In one exemplary aspect, the document analysis system displays a table of keywords separated into blocks, each block of keywords corresponding to one of the concepts. Each block is colored according to the prevalence of any keyword within a given keyword group. The color of a block thus indicates the relative presence of a concept in the document. The document analysis system also determines a unique color for each block of keywords for highlighting in the text of the document. In this manner a researcher can quickly identify passages that contain multiple concepts. Additionally, the researcher is provided the ability to quickly locate reference characters, figure numbers and patent numbers in the document.

Type: Grant

Filed: October 12, 2010

Date of Patent: May 27, 2014

Inventor: Patrick Sander Walsh
Selecting content for publication

Patent number: 8732185

Abstract: Among other disclosed subject matter, a computer-implemented method relating to selecting content for publication includes receiving a term to be used in selecting content for publication. The method includes obtaining information from a record using the received term, the information reflecting a correspondence between contents in a repository and the received term. The method includes determining, using at least the obtained information, a query to be performed on the repository for selecting at least part of the content.

Type: Grant

Filed: September 10, 2012

Date of Patent: May 20, 2014

Assignee: Google Inc.

Inventors: Nicholas Lynn, Alexander P. Carobus
Method and Apparatus of Generating Update Parameters and Displaying Correlated Keywords

Publication number: 20140136551

Abstract: Provided is a method of generating updating parameters. The method obtains search keywords used by users within a predetermined time period; counts the search keywords to obtain primary keywords, related keywords, co-search frequencies of each primary keyword and the respective related keywords being searched together, and search frequencies of the primary keywords being searched alone; computes first feature values based on the search frequencies of the primary keywords being searched alone; and then computes second feature values based on the first feature values and the co-search frequencies of the primary keywords and the respective related keywords. The second feature values serve as updating parameters for determining displaying modes of the related keywords. An apparatus of generating updating parameters, and a method and an apparatus of displaying related keywords according to the updating parameters are also provided.

Type: Application

Filed: January 21, 2014

Publication date: May 15, 2014

Applicant: Alibaba Group Holding Limited

Inventors: Lei Pan, Yuanhu Yao, Zhen Yang, Tianji Zhang
Session-based query suggestions

Patent number: 8725756

Abstract: Methods, systems, and apparatus, including computer program products, in which one or more search query suggestions are made for a current search session. Similar previous search sessions which include search queries common to the current search session are identified. Based upon the similar previous search sessions, one or more suggested search queries are derived and provided to a search engine interface for serving to a user or a client.

Type: Grant

Filed: November 11, 2008

Date of Patent: May 13, 2014

Assignee: Google Inc.

Inventors: Ashutosh Garg, Kedar Dhamdhere
Computer-implemented system and method for clustering similar documents

Patent number: 8725736

Abstract: A computer-implemented system and method for clustering similar documents is provided. Concepts are identified for a set of documents and occurrence frequencies are determined for each concept in the documents set. A distance quantifying a similarity for each of the documents in the set with one or more clusters of documents is calculated. Each document is mapped to at least one of the one or more document clusters.

Type: Grant

Filed: February 14, 2013

Date of Patent: May 13, 2014

Assignee: FTI Technology LLC

Inventors: Dan Gallivan, Kenji Kawai
Method for searching relation sudden rising word and system thereof

Patent number: 8725723

Abstract: A method and system for searching for a related term having rapidly increasing popularity is provided. The method includes: analyzing a search log and extracting a daily search frequency for each search term; comparing peaks of the daily search frequency, extracted for each search term in a predetermined period; and analyzing relevance between candidate search terms in which the peaks have occurred together in the predetermined period as a result of the comparison and filtering out a candidate search term having no relevance.

Type: Grant

Filed: August 8, 2008

Date of Patent: May 13, 2014

Assignee: NHN Corporation

Inventor: Dong Wook Kim
Classifying text into hierarchical categories

Patent number: 8725732

Abstract: Systems, methods and program products for classifying text. A system classifies text into first subject matter categories. The system identifies one or more second subject matter categories in a collection of second subject matter categories, each of the second categories is a hierarchical classification of a collection of confirmed valid search results for queries, in which at least one query for each identified second category includes a term in the text. The system filters the identified categories by excluding identified categories whose ancestors are not among the first categories. The system selects categories from the filtered categories based on one or more thresholds in which a threshold specifies a degree of relatedness between a selected category and the text. The selected categories are a sufficient basis for recommending content to a user, the content being associated with one or more of the selected categories.

Type: Grant

Filed: March 22, 2012

Date of Patent: May 13, 2014

Assignee: Google Inc.

Inventors: Glen M. Jeh, Beverly Yang
Summarizing reviews

Patent number: 8719283

Abstract: Summarizing a set of reviews is disclosed. In some embodiments, a set of reviews is analyzed, e.g., by an at least partially automated process. A summary of the information included in the set of reviews is provided. The summary includes a visual indication of a range and distribution of opinions expressed in the set of reviews. In some embodiments, the set of reviews includes reviews from one or more members of an online or other user community, such as customers of an online store, subscribers to a podcast, blog, or other online source of content, etc.

Type: Grant

Filed: September 29, 2006

Date of Patent: May 6, 2014

Assignee: Apple Inc.

Inventor: David A. Koski
Document-related representative information

Patent number: 8712991

Abstract: Some implementations include techniques and arrangements to provide document-related representative information with search results. For example, a representative query and/or representative results may be provided for one or more individual documents identified in a set of search results to supplement the search results returned in response to a received search query. The representative queries may be determined by correlating a plurality of previously submitted queries in search log data with a plurality of documents returned in response to the queries. In some implementations, click-through frequency for a particular document with respect to the plurality of queries may be taken into consideration when determining the representative queries for the particular document.

Type: Grant

Filed: July 7, 2011

Date of Patent: April 29, 2014

Assignee: Microsoft Corporation

Inventors: Jingdong Wang, Shipeng Li
Associating objects in databases by rate-based tagging

Patent number: 8713009

Abstract: Embodiments of the present invention provide automatic systems and methods for associating objects in databases of a web site by rate-based tagging. The frequencies of users entering specific tag terms for objects stored in the databases of the web site are used to determine hard associations between objects and tag terms and between objects. When the frequencies of user tags exceed established thresholds, hard associations between objects and tag terms are established. When objects are identified or determined to have hard association with tag terms, the objects are determined to be more clearly associated with the corresponding tag terms. Therefore, they should be highlighted or featured in more prominent locations on web pages of the web site to increase users' confidence in content of the web site. To identify hard-associated objects, more weights can be assigned to the hard-associated objects, which allows them to be more likely to be selected for display in prominent locations.

Type: Grant

Filed: September 25, 2008

Date of Patent: April 29, 2014

Assignee: Yahoo! Inc.

Inventors: Hubert M. Walker, Noel C. Morrison, Ankarino S. Lara, Scott Bedard, Stephen James Blake
Configurable Dynamic Matching System

Publication number: 20140101172

Abstract: A system is provided that dynamically matches data originating from one or more data sources. The system analyzes a matching configuration file, where the matching configuration file includes one or more matching configurations. The system modifies a probabilistic matching algorithm of a matching engine at runtime based on the one or more matching configurations and based on two or more data records of the plurality of data records that require matching. The system compares two data records of a plurality of data records using the modified probabilistic matching algorithm. The system generates a match score for the two data records based on the match weight for each data record field.

Type: Application

Filed: October 5, 2012

Publication date: April 10, 2014

Applicant: ORACLE INTERNATIONAL CORPORATION

Inventor: Swaranjit Singh DUA
METHOD OF PROVIDING INFORMATION OF MAIN KNOWLEDGE STREAM AND APPARATUS FOR PROVIDING INFORMATION OF MAIN KNOWLEDGE STREAM

Publication number: 20140101173

Abstract: A method for providing information about a main knowledge stream is disclosed. According to an embodiment of the present invention, the method includes obtaining reference links representing reference relationships among reference documents in each of a plurality of documents stored in a database, determining one or more basic paths connecting the reference links, calculating probability values of the reference links by overlapping the determined basic paths, determining a first document among the documents and an input reference link associated with the first document, and performing a Markov chain model using a probability value of the input reference link, and calculating information about the main knowledge stream associated with the first document using the result obtained by performing the Markov chain model.

Type: Application

Filed: December 24, 2012

Publication date: April 10, 2014

Applicant: KOREA INSTITUTE OF SCIENCE AND TECHNOLOGY INFORMATION

Inventor: Korea Institute of Science and Technology Information
Evaluation of substitute terms

Patent number: 8682907

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for evaluating substitute terms. One of the methods includes selecting a first term and a second term. A first co-occurrence frequency is determined for co-occurring terms in search queries that include the first term. A first vector is generated for the first term using the first co-occurrence frequencies. A second co-occurrence frequency is determined for the co-occurring terms in the search queries that include the first term adjacent to the second term. A second vector is generated for the second term using the second co-occurrence frequencies. A score for the second term as a context for a substitution rule based on the first term is computed, wherein the score is based on a comparison between the first vector and the second vector.

Type: Grant

Filed: March 30, 2012

Date of Patent: March 25, 2014

Assignee: Google Inc.

Inventors: Ke Yang, Zachary A. Garrett, Daisuke Ikeda
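
Illustration for patent 8682907 above: the abstract describes building one co-occurrence vector for a first term, another for the first term adjacent to a second term, and scoring by comparing the two. A minimal sketch of that idea follows. Whitespace tokenization, the function names, and cosine similarity as the comparison are assumptions; the abstract does not say which comparison is used.

```python
import math
from collections import Counter


def cooccurrence_vector(queries, required_phrase):
    """Count other terms in queries that contain required_phrase as adjacent tokens."""
    phrase = required_phrase.split()
    n = len(phrase)
    counts = Counter()
    for query in queries:
        tokens = query.split()
        # A query qualifies if the phrase tokens appear next to each other in order.
        if any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1)):
            counts.update(t for t in tokens if t not in phrase)
    return counts


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (assumed comparison)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


queries = ["cheap car rental", "car rental paris", "car wash", "used car parts"]
first = cooccurrence_vector(queries, "car")          # context of the first term alone
second = cooccurrence_vector(queries, "car rental")  # context when the second term is adjacent
score = cosine(first, second)
```

A higher score suggests the first term keeps a similar context when the second term is adjacent to it.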
Defragmentation during multiphase deduplication

Patent number: 8682870

Abstract: Defragmentation during multiphase deduplication. In one example embodiment, a method of defragmentation during multiphase deduplication includes an analysis phase that includes analyzing each allocated block stored in a source storage at a point in time to determine if the block is duplicated in a vault storage, a defragmentation phase that includes reordering the duplicate blocks stored in the source storage to match the order of the duplicate blocks as stored in the vault storage, and a backup phase that is performed after completion of the defragmentation phase and that includes storing, in the vault storage, each unique nonduplicate block from the source storage.

Type: Grant

Filed: March 1, 2013

Date of Patent: March 25, 2014

Assignee: Storagecraft Technology Corporation

Inventors: Andrew Lynn Gardner, Nathan S. Bushman
SHARING MODELING DATA BETWEEN PLUG-IN APPLICATIONS

Publication number: 20140081901

Abstract: Embodiments of the present invention provide various techniques for sharing modeling data between plug-in applications. The plug-in applications may use or generate various modeling data. In an example, the host application that interfaces with the plug-in applications can access and store this modeling data at a location where it is accessible to the other plug-in applications.

Type: Application

Filed: April 24, 2009

Publication date: March 20, 2014

Applicant: NetApp, Inc.

Inventor: Martin Szymczak
Method and System for Creating a Data Profile Engine, Tool Creation Engines and Product Interfaces for Identifying and Analyzing File and Sections of Files

Publication number: 20140081995

Abstract: A data profile engine identifies, classifies, analyzes, searches, compares and cross-references entire files and sections of files, records and other forms of electronic media, and a tool creation engine in combination with the data profile engine builds custom solutions and product interfaces.

Type: Application

Filed: November 6, 2013

Publication date: March 20, 2014

Applicant: Kiiac LLC

Inventors: Kingsley Martin, Tracy Scott Liggett
Dynamic visual representation of phrases

Patent number: 8676795

Abstract: A plurality of phrases may be extracted from documents associated with one or more document sources. The plurality of phrases may be filtered and processed to determine a frequency in which the plurality of phrases appear in the documents and/or a number of the document sources in which each phrase appears. A weight may be assigned to each of the phrases and, based at least in part on the assigned weight, a visual representation of the plurality of phrases may be presented. The visual representation may be dynamically updated based at least in part on an updated frequency or an updated total number of document sources associated with any one of the plurality of phrases.

Type: Grant

Filed: August 4, 2011

Date of Patent: March 18, 2014

Assignee: Amazon Technologies, Inc.

Inventors: Cyrus J. Durgin, George N. Stathakopoulos, Dominique I. Brezinski, Emilia S. Buneci, Martin M. O'Reilly, Lane R. LaRue, Benjamin S. Kirzhner
Parallel, side-effect based DNS pre-caching

Patent number: 8677018

Abstract: Embodiments of the present invention include methods and systems for domain name system (DNS) pre-caching. A method for DNS pre-caching is provided. The method includes receiving uniform resource locator (URL) hostnames for DNS pre-fetch resolution prior to a user hostname request for any of the URL hostnames. The method also includes making a DNS lookup call for at least one of the URL hostnames that are not cached by a DNS cache prior to the user hostname request. The method further includes discarding at least one IP address provided by a DNS resolver for the URL hostnames, wherein a resolution result for at least one of the URL hostnames is cached in the DNS cache in preparation for the user hostname request. A system for DNS pre-caching is provided. The system includes a renderer, an asynchronous DNS pre-fetcher and a hostname table.

Type: Grant

Filed: August 25, 2008

Date of Patent: March 18, 2014

Assignee: Google Inc.

Inventor: James Roskind
Computer product, data conversion apparatus, and conversion method

Patent number: 8676786

Abstract: A computer-readable medium storing therein a data conversion program that causes a computer to execute a process that includes receiving after a schema of a database has been changed from a former schema to a new schema, a processing request concerning the database; judging based on difference information concerning the former schema and the new schema, whether in the processing request, a condition that specifies process data subject to processing, has been changed by the new schema; searching the database for conversion data whose format is to be converted from the former schema to the new schema, the searching based on judgment results obtained at the judging and on the processing request; and converting the format of the retrieved conversion data, from the former schema to the new schema.

Type: Grant

Filed: October 18, 2011

Date of Patent: March 18, 2014

Assignee: Fujitsu Limited

Inventors: Hiroshi Otsuka, Atsuji Sekiguchi, Masazumi Matsubara, Shinya Kitajima, Yuji Wada, Yasuhide Matsumoto
METHOD AND APPARATUS FOR GENERATING A QUERY CANDIDATE SET

Publication number: 20140074816

Abstract: The present invention provides a method and apparatus for generating a query candidate set. The method comprises automatically tagging a sequence of words in a digital document to obtain a sequence of tags, comparing the sequence of tags with one or more reference sequences and including the sequence of words in the query candidate set if the sequence of tags matches the one or more reference sequences. Each tag of the sequence of tags represents a part of speech.

Type: Application

Filed: June 25, 2013

Publication date: March 13, 2014

Inventors: KALPANA BANERJEE, Surabhi Khandavalli, Vishal Shah, Gaurav Ruhela
METHOD AND APPARATUS FOR PROVIDING A CUSTOMIZED SELECTION OF AUDIO CONTENT OVER THE INTERNET

Publication number: 20140074775

Abstract: A method and apparatus is provided for providing selected media files, which are chosen from among a plurality of media files, to a user over a packet-switched network such as the Internet. The method begins by receiving over the packet-switched network a request from the user to receive media content. Next, a user profile associated with the user is retrieved from a database. The user profile reflects user preferences in media content to be received over the packet-switched network. The plurality of media files are ranked based at least in part on the user profile. At least one highly ranked media file is selected from among the ranked plurality of media files. At least one of the highly ranked media files is forwarded to the user over the packet-switched network.

Type: Application

Filed: November 15, 2013

Publication date: March 13, 2014

Applicants: Sony Electronics Inc., Sony Corporation

Inventors: Brian M. Siegel, Philip M. Abram, Marc Beckwitt, Gregory D. Gudorf, Kazuaki Iso, Brian Raymond, Christopher M. Tobin
ESTABLISHING "IS A" RELATIONSHIPS FOR A TAXONOMY

Publication number: 20140067832

Abstract: Disclosed are methods for returning to a user an answer to the question “what is <string>.” Concepts and classes to which the concepts belong are determined from a corpus, such as taxonomy. The concepts are mapped to categories according to the structure of the taxonomy. Homonyms for words are collected and scored according to likeliness of use. Concept vectors are assembled for the identified concepts based on articles in the corpus and social media usage. Words are evaluated for generic-ness and a generic score is associated therewith. In responding to a query, the generic-ness of the terms of the query is evaluated and additional context solicited if the terms are generic. Candidate homonym concepts for a string in the query are selected according to context vectors for the homonym concepts. One or more homonym concepts are selected and the one or more categories corresponding to these concepts are returned.

Type: Application

Filed: September 28, 2012

Publication date: March 6, 2014

Applicant: Wal-Mart Stores, Inc.

Inventors: Digvijay Singh Lamba, Xiaoyong Chai
Method and system to augment vehicle domain ontologies for vehicle diagnosis

Patent number: 8666982

Abstract: A document may be received at a processing module. One or more tags may be applied to the document, each tag applied to a term, each tag representing a part of speech. One or more terms may be extracted from the document based on the tag. A weighting assignment parameter may be determined for each of the one or more extracted terms. Based on the weighting assignment parameter associated with each of the extracted terms, it may be determined whether the domain ontology includes the one or more extracted terms. If the domain ontology does not include the one or more extracted terms, the domain ontology may be augmented such that the domain ontology comprises the one or more extracted terms.

Type: Grant

Filed: October 6, 2011

Date of Patent: March 4, 2014

Assignee: GM Global Technology Operations LLC

Inventors: Dnyanesh Rajpathak, Vineet R Khare, Rahul Chougule
CANDIDATE GENERATION FOR PREDICTIVE INPUT USING INPUT HISTORY

Publication number: 20140059058

Abstract: A computing device maintains an input history in memory. This input history includes input strings that have been previously entered into the computing device. When the user begins entering characters of an input string, a predictive input engine is activated. The predictive input engine receives the input string and the input history to generate a candidate list of predictive inputs which are presented to the user. The user can select one of the inputs from the list, or otherwise continue entering characters. The computing device generates the candidate list by combining frequency and recency information of the matching strings from the input history. Additionally, the candidate list can be manipulated to present a variety of candidates. By using a combination of frequency, recency and variety, a favorable user experience is provided.

Type: Application

Filed: August 24, 2012

Publication date: February 27, 2014

Applicant: Microsoft Corporation

Inventors: Katsutoshi Ohtsuki, Koji Watanabe
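
Illustration for publication 20140059058 above: the abstract says candidates are generated by combining frequency and recency information from the input history, without giving a formula. The sketch below assumes one plausible combination, in which every past use of a string contributes a weight that halves every `half_life` entries. The function name, the decay rule, and the parameters are hypothetical, and the "variety" step mentioned in the abstract is not modeled.

```python
from collections import defaultdict


def rank_candidates(prefix, history, half_life=10.0, limit=5):
    """Rank history strings matching prefix by recency-decayed frequency.

    history is ordered oldest first; each use adds 0.5 ** (age / half_life).
    """
    scores = defaultdict(float)
    newest = len(history) - 1
    for position, text in enumerate(history):
        if text.startswith(prefix):
            age = newest - position  # 0 for the most recent entry
            scores[text] += 0.5 ** (age / half_life)
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [text for text, _ in ranked[:limit]]


history = ["tokyo", "toast", "tokyo", "table", "today"]
candidates = rank_candidates("to", history)
```

Here "tokyo" ranks first because it was entered twice, while "today" outranks "toast" because it is more recent.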
Brand name synonymy

Patent number: 8655737

Abstract: A product catalog includes information regarding products for sale online by various merchants. An analysis software module can identify brand names in the product catalog that relate to the same brand. The analysis module can compute parameters of pairs of product offers having matching product identifiers. The analysis module can group the product offer pairs into brand pair groups based on the brand names for the products subject to the product offers. The analysis module can compute parameters of each brand pair group based on product offer pairs in the brand pair group and attributes of product offers in the product catalog. The analysis module can use the computed parameters to determine whether the brand names of each brand pair are related. The analysis module can use the identified related brand names and additional attributes of product offers to identify product offers related to the same product.

Type: Grant

Filed: January 31, 2011

Date of Patent: February 18, 2014

Assignee: Google Inc.

Inventor: Roy Tromble
System for targeting advertising content to a plurality of mobile communication facilities

Patent number: 8655891

Abstract: A system for targeting advertising content includes the steps of: (a) receiving respective requests for advertising content corresponding to a plurality of mobile communication facilities operated by a group of users, wherein the plurality includes first and second types of mobile communication facilities with different rendering capabilities; (b) receiving a datum corresponding to the group; (c) selecting from a first and second sponsor respective content based on a relevancy to the datum, wherein each content includes a first and second item requiring respective rendering capabilities; (d) receiving bids from the first and second sponsors; (e) attributing a priority to the content of the first sponsor based upon a determination that a yield associated with the first sponsor is greater than a yield associated with the second sponsor; and (f) transmitting the first and second items of the first sponsor to the first and second types of mobile communication facilities respectively.

Type: Grant

Filed: November 18, 2012

Date of Patent: February 18, 2014

Assignee: Millennial Media

Inventors: Jorey Ramer, Adam Soroca, Dennis Doughty
Selective indexing of content portions

Patent number: 8655886

Abstract: A request monitor may monitor user requests, each user request including at least one keyword. A portion evaluator may determine inclusive portions of content file portions of indexed content files within an index, and may assign values to the inclusive portions, based on a providing of at least one of the indexed content files to the user in response to the user request. A portion selector may select, from the inclusive portions and based on the values, retained portions to be retained within the index. An index updater may be configured to update the index to replace the indexed content files with the retained portions.

Type: Grant

Filed: March 25, 2011

Date of Patent: February 18, 2014

Assignee: Google Inc.

Inventor: Erik Gross
Method and apparatus for propagating updates in databases

Patent number: 8645397

Abstract: A method and apparatus for propagating updates in databases are disclosed. For example, the present method uses “blocking” and/or “thresholding” to delay update propagation and/or to limit the propagation of updates to an optimal stage. For example, the present method receives at least one database update and extracts at least one token from the at least one database update. The method then determines whether a threshold for propagating the at least one database update for the at least one token is reached. The method then propagates the at least one database update for updating an index structure of a database pertaining to the at least one token whose threshold has been reached.

Type: Grant

Filed: November 30, 2006

Date of Patent: February 4, 2014

Assignee: AT&T Intellectual Property II, L.P.

Inventors: Nikolaos Koudas, Amit Jaywant Marathe, Divesh Srivastava
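
Illustration for patent 8645397 above: the abstract describes extracting tokens from a database update and propagating the update to the index only for tokens whose threshold has been reached. A minimal sketch of that thresholding idea follows. The class name, the whitespace tokenizer, and the choice of "number of buffered records per token" as the threshold measure are assumptions; the "blocking" variant is not shown.

```python
from collections import defaultdict


class ThresholdedIndex:
    """Inverted index that buffers postings per token until a threshold is hit."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.index = defaultdict(set)    # token -> record ids visible to queries
        self.pending = defaultdict(set)  # token -> buffered record ids

    def update(self, record_id, text):
        """Buffer the update; return the tokens whose postings were propagated."""
        propagated = []
        for token in set(text.lower().split()):
            self.pending[token].add(record_id)
            if len(self.pending[token]) >= self.threshold:
                # Threshold reached: move all buffered postings into the index.
                self.index[token] |= self.pending.pop(token)
                propagated.append(token)
        return sorted(propagated)
```

Deferring work this way trades index freshness for fewer index writes: a token seen in only one update stays in the buffer and is not yet searchable.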
Predicting data for document attributes based on aggregated data for repeated URL patterns

Patent number: 8645367

Abstract: One or more hierarchies of string patterns are generated from a plurality of URL strings according to a pattern extraction procedure. Repeated string patterns are selected from the generated hierarchies of string patterns. A URL class is defined for each of the selected repeated string patterns. Each URL class is associated with a respective group of URL strings in the plurality of URL strings, where the respective group of URL strings contains a repeated string pattern that defines the URL class. Respective aggregated data is calculated for each URL class. The respective aggregated data is based on respective data of each respective document of each URL string in the group of URL strings associated with the URL class. Respective data for a respective document referenced by a lookup-URL is predicted based on respective aggregated data of one or more of the URL classes.

Type: Grant

Filed: March 8, 2010

Date of Patent: February 4, 2014

Assignee: Google Inc.

Inventors: Nissan Hajaj, Chi Zhang, Changxun Wu, Erik Gross
Presenting sponsored content on a mobile communication facility

Patent number: 8631018

Abstract: A computer-implemented method for positioning targeted sponsored content on a cellular phone includes the steps of (a) assessing a likelihood of an interaction by a user of the cellular phone with a sponsored content, wherein the assessment is based on a plurality of user characteristics associated with the cellular phone including (i) a credit card datum; and (ii) a predefined hardware or software characteristic of the cellular phone; (c) prioritizing the placement of the sponsored content within one of a plurality of predefined areas of a graphical user interface of the cellular phone over the placement of other sponsored content within the same area, wherein the prioritization is based on the assessment of the likelihood of the interaction of the user of the cellular phone with the sponsored content; and (d) presenting the sponsored content within the one of a plurality of predefined areas of the graphical user interface.

Type: Grant

Filed: December 6, 2012

Date of Patent: January 14, 2014

Assignee: Millennial Media

Inventors: Jorey Ramer, Adam Soroca, Dennis Doughty
SYSTEM AND METHOD FOR TOPIC EXTRACTION AND OPINION MINING

Publication number: 20140012863

Abstract: Techniques for topic extraction and opinion mining are described. For example, a document that is pertinent to a topic is selected based on searching, using a key phrase, a plurality of documents. A subtopic referenced in the document is identified. A feature of the subtopic is identified based on the document. A rating of the feature of the subtopic is determined based on the document. Using at least one processor, a sentiment of the document is determined based in part on the feature and the rating of the feature.

Type: Application

Filed: September 6, 2013

Publication date: January 9, 2014

Applicant: eBay Inc.

Inventors: Neelakantan Sundaresan, Yongzheng Zhang, Catherine Baudin, Dan Shen, Shen Huang
INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, PROGRAM, AND INFORMATION PROCESSING SYSTEM

Publication number: 20140012862

Abstract: An information processing apparatus includes a calculation unit and a generation unit. The calculation unit is configured to calculate a frequency function which is a function relating to an appearance frequency of one or more attribute values of a database having a predetermined attribute and the one or more attribute values relating to the attribute. The generation unit is configured to generate sample data in accordance with the appearance frequency relating to the database on the basis of the frequency function calculated, the sample data including at least a part of the one or more attribute values as one or more sample attribute values.

Type: Application

Filed: May 28, 2013

Publication date: January 9, 2014

Inventors: Yohei KAWAMOTO, Taizo SHIRAI, Kazuya KAMIO, Yu TANAKA, Koichi SAKUMOTO
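
As a rough illustration of the abstract above, the sketch below computes an appearance-frequency function for one attribute and then draws sample attribute values in proportion to it. The function names and the use of relative frequency as the "frequency function" are assumptions for illustration, not details taken from the application.

```python
import random
from collections import Counter


def frequency_function(values: list[str]) -> dict[str, float]:
    """Relative appearance frequency of each attribute value (assumed form)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def generate_sample(freq: dict[str, float], size: int, seed=None) -> list[str]:
    """Draw sample attribute values with probability proportional to frequency."""
    rng = random.Random(seed)
    values = list(freq)
    weights = [freq[v] for v in values]
    return rng.choices(values, weights=weights, k=size)
```

A column holding `["red", "red", "red", "blue"]` yields the frequency function `{"red": 0.75, "blue": 0.25}`, and a large generated sample would contain roughly three "red" values for every "blue".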
Delivery performance analysis for internet services

Patent number: 8621076

Abstract: One preferred embodiment of the present invention provides systems and methods for analyzing the delivery performance of newsgroup services. Briefly described, in architecture, one embodiment, among others, includes a newsgroup evaluation system configured to determine a delivery rate for a newsgroup server. In other embodiments, methods and systems are provided for analyzing completion and retention for newsgroup services.

Type: Grant

Filed: August 15, 2012

Date of Patent: December 31, 2013

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Richard J. Gerlach, Charles S. Shull, David Edward Haslam
COMPUTING TF-IDF VALUES FOR TERMS IN DOCUMENTS IN A LARGE DOCUMENT CORPUS

Publication number: 20130346424

Abstract: Technologies pertaining to computing a respective TF-IDF value for each term in each document of a relatively large document corpus are described herein. TF-IDF values are computed with respect to terms in documents of a large document corpus in a single pass over the document corpus. Secondary sorting functionality of a distributed computing framework is exploited to compute TF-IDF values efficiently.

Type: Application

Filed: June 21, 2012

Publication date: December 26, 2013

Applicant: MICROSOFT CORPORATION

Inventors: Xiong Zhang, Hung-chih Yang, Danny Lange
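
For reference, the quantity named in the abstract above can be sketched in a few lines. This is a plain in-memory computation using the common definition tf × log(N / df); it does not attempt the single-pass, secondary-sort approach of the application, and the function name `tf_idf` is an assumption.

```python
import math
from collections import Counter


def tf_idf(corpus: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Return {doc_id: {term: tf-idf}} using tf = count/len(doc), idf = ln(N/df)."""
    n_docs = len(corpus)
    # Document frequency: number of documents containing each term at least once.
    doc_freq = Counter()
    for terms in corpus.values():
        doc_freq.update(set(terms))
    scores = {}
    for doc_id, terms in corpus.items():
        counts = Counter(terms)
        total = len(terms)
        scores[doc_id] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores
```

A term that appears in every document gets an IDF of zero under this definition, so its TF-IDF value is zero regardless of how often it occurs in any one document.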
Clustering documents using citation patterns

Patent number: 8612411

Abstract: Systems and methods for clustering documents, such as for scientific documents, taking into account the citation patterns of the documents are disclosed. In one embodiment, the method includes locating citations to other documents, e.g., search result documents, comparing each pair of documents to be clustered for overlapping citations in a first, a more specific second, and an even more specific optional third citation generality, and determining clusters of related documents based on the comparisons. The levels of generalities may be, for example, document-, paragraph-, and/or citation-level generalities. The locating may locate only citations to the other documents to be clustered. The clusters may be determined based on a weighted score of the amount of overlapping citations in the various generalities and/or by performing factor analysis using the comparison results. The clusters may be ranked to determine the dominant clusters.

Type: Grant

Filed: December 31, 2003

Date of Patent: December 17, 2013

Assignee: Google Inc.

Inventor: Vibhu O. Mittal
Frequency based keyword extraction method and system using a statistical measure

Patent number: 8606795

Abstract: Frequency based keyword extraction method and system utilizing a statistical measure is disclosed which generates keywords within a page and/or document that can distinguish the document from an average document. A simple frequency threshold parameter can be utilized to determine a number of common stop words if a word in the document possesses a frequency in a corpus that is more than the threshold parameter. A statistical confidence interval of the frequency in the document can be compared against a frequency confidence interval of the word in the corpus. The extracted keyword possesses a greater intra-document frequency confidence interval than the frequency confidence interval of the word within the corpus. A statistical hypothesis test can also be utilized to determine the keyword by calculating a test statistic and testing whether the test statistic is greater than some threshold.

Type: Grant

Filed: July 1, 2008

Date of Patent: December 10, 2013

Assignee: Xerox Corporation

Inventors: Stephen C. Morgana, John C. Handley
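
One way to read the abstract above is sketched below, under stated assumptions: word frequencies are treated as binomial proportions with normal-approximation confidence intervals, a word is skipped as a stop word when its corpus frequency exceeds `stop_threshold`, and a word counts as a keyword when the lower bound of its in-document interval lies above the upper bound of its corpus interval. The function names, the interval formula, and the default values are illustrative choices, not details from the patent.

```python
import math
from collections import Counter


def interval(count: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation confidence interval for a proportion (assumed model)."""
    p = count / total
    half = z * math.sqrt(p * (1 - p) / total)
    return p - half, p + half


def extract_keywords(doc_words, corpus_counts, corpus_total,
                     stop_threshold=0.01, z=1.96):
    """Return words whose document frequency clearly exceeds corpus frequency."""
    doc_counts = Counter(doc_words)
    n = len(doc_words)
    keywords = []
    for word, count in doc_counts.items():
        corpus_count = corpus_counts.get(word, 0)
        # Simple frequency threshold: very common corpus words are stop words.
        if corpus_count / corpus_total > stop_threshold:
            continue
        doc_low, _ = interval(count, n, z)
        _, corpus_high = interval(corpus_count, corpus_total, z)
        if doc_low > corpus_high:
            keywords.append(word)
    return sorted(keywords)
```

Comparing interval bounds rather than raw frequencies keeps a word that appears only once in a short document from being reported as a keyword merely because it is rare in the corpus.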
Method and system for creating a data profile engine, tool creation engines and product interfaces for identifying and analyzing files and sections of files

Patent number: 8606796

Abstract: A data profile engine identifies, classifies, analyzes, searches, compares and cross-references entire files and sections of files, records and other forms of electronic media, and a tool creation engine in combination with the data profile engine builds custom solutions and product interfaces.

Type: Grant

Filed: September 15, 2009

Date of Patent: December 10, 2013

Assignee: Kilac, LLC

Inventors: Kingsley Martin, Tracy S. Liggett
Website, user interfaces, and applications facilitating improved media search capability

Patent number: 8600838

Abstract: A method for improving media search capability includes providing a user with access to an interface that allows the user to provide one or more inputs relating to an item of media (such as an audio or video recording of a song or a cover song), performing a media search in response to the one or more inputs, and presenting search results via an interactive display generated depending upon media ratings, wherein one or more of the media ratings is determined from media ratings inputs depending upon one or more metrics associated with sources or providers of the media ratings inputs.

Type: Grant

Filed: March 21, 2011

Date of Patent: December 3, 2013

Inventors: Joshua Beroukhim, Joseph Michael
METHOD AND SYSTEM FOR ANALYZING DATA IN ARTIFACTS AND CREATING A MODIFIABLE DATA NETWORK

Publication number: 20130318104

Abstract: Computer-implemented systems, methods, and computer-readable media for analyzing data in one or more artifacts and creating a modifiable data network include: extracting the key elements from the one or more artifacts; identifying a relationship among the key elements for each of the one or more artifacts; determining a first frequency of each of the key elements; determining a second frequency for each relationship among the key elements; creating a data network showing the key elements and the relationship among the key elements; and enabling a user to modify the data network based on one or more of: the key elements; the relationship among the key elements; the first frequency; and the second frequency.

Type: Application

Filed: May 22, 2013

Publication date: November 28, 2013

Applicant: Infosys Limited

Inventor: Sanal Kumar Sundaresan Nair
Product idea sharing algorithm

Patent number: 8595209

Abstract: Methods and systems for identifying products and product idea lists. A method is provided which includes searching a product index for a result. The result is used to search an idea list index for idea lists related to the result wherein each idea list includes at least one product and has an associated popularity and relevance to the search. The method also includes outputting at least some of the idea lists based on the popularity and relevance of the idea lists. In one embodiment a method of identifying product idea lists is provided. The method includes searching a product index for keywords associated with products in a product idea list. The method also includes using the keywords to search a product idea index for other idea lists and outputting the other idea lists based on their popularities. In some embodiments, the popularities may be based on time-weighted events.

Type: Grant

Filed: January 29, 2008

Date of Patent: November 26, 2013

Assignee: Boundless Network, Inc.

Inventor: Jeremy Kraybill
Labeling samples in a similarity graph

Patent number: 8583659

Abstract: In one embodiment, one or more computing devices determine a confidence score between a user node and a concept node of a social graph based on similarity numbers associated with edges between the user node and the concept node in one or more hops between them on the social graph.

Type: Grant

Filed: July 9, 2012

Date of Patent: November 12, 2013

Assignee: Facebook, Inc.

Inventors: Tudor Andrei Cristian Alexandrescu, Pierre Moreels
Measuring informative content of words in documents in a document collection relative to a probability function including a concavity control parameter

Publication number: 20130297622

Abstract: Processing methods and systems are provided for representing documents relative to importance of words in the document. A processor comprising a weighting model of word importance in a document in a collection relative to an importance of the word in other documents in the collection computes a deviation of distribution of the word from a probability distribution of the word in other documents in the collection, where the deviation distribution is weighted in accordance with a concavity control function. A concavity control parameter is adjustable relative to word frequency.

Type: Application

Filed: May 3, 2012

Publication date: November 7, 2013

Applicant: Xerox Corporation

Inventor: Stephane Clinchant
Utilizing affinity groups to allocate data items and computing resources

Patent number: 8577892

Abstract: Systems and methods for utilizing affinity groups to allocate data items and computing resources are disclosed. Upon receipt of a user preference indicating an affinity group, a token associated with that affinity group may be stored in a database. The affinity group may be associated with a geographic region or a number of data centers. Data items and computing resources may be associated with the affinity group. These data items and computing resources may be allocated to a geographic region or data center based on their association with the affinity group. These data items and computing resources may also be reallocated based on efficiency analyses or user preferences. In this way, data items and computing resources may be efficiently allocated with lower user effort.

Type: Grant

Filed: June 5, 2009

Date of Patent: November 5, 2013

Assignee: Microsoft Corporation

Inventors: Remy Pairault, Zhe Yang, Sriram Krishnan, George Moore
Detecting duplicates in a shared knowledge base

Patent number: 8577899

Abstract: Methods and systems supporting curation of items in a searchable knowledge base are provided. The methods and systems include mining one or more search queries of the searchable knowledge base, where each of the search queries includes a plurality of the items. The method further includes determining one or more pairs of items using a processor, where each of the pairs of items includes a correlation value exceeding a threshold. The correlation values for the pairs of items are based upon the frequency the items of the pairs of items co-occur within the search queries. The method further includes providing the pairs of items to a curator, where the curator reviews the pairs of items.

Type: Grant

Filed: March 5, 2010

Date of Patent: November 5, 2013

Assignee: Palo Alto Research Center Incorporated

Inventor: John T. Maxwell
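
The co-occurrence step in the abstract above can be sketched as follows. The abstract does not say which correlation measure is used, so this example substitutes the Jaccard coefficient over query co-occurrence as an assumption; `correlated_pairs` and its `threshold` default are likewise hypothetical.

```python
from collections import Counter
from itertools import combinations


def correlated_pairs(queries: list[list[str]], threshold: float = 0.5):
    """Return (item_a, item_b, score) for pairs whose Jaccard score exceeds threshold."""
    item_counts = Counter()
    pair_counts = Counter()
    for query in queries:
        items = sorted(set(query))  # de-duplicate and fix pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    results = []
    for (a, b), both in pair_counts.items():
        # Jaccard: queries containing both items / queries containing either.
        score = both / (item_counts[a] + item_counts[b] - both)
        if score > threshold:
            results.append((a, b, score))
    # Highest-scoring candidate duplicates first, for review by a curator.
    return sorted(results, key=lambda r: (-r[2], r[0], r[1]))
```

Items that almost always appear together in queries, such as two spellings of the same entity, score near 1.0 and surface at the top of the list handed to the curator.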
Method and system for recording search trails across one or more search engines in a communications network

Patent number: 8572100

Abstract: An automated method for recording sites accessed by a client in a communications network, the method including the steps of: detecting submission of a search query from the client to one or more search engines; and recording a search trail of one or more parameters of sites accessed consecutively following return of search query results to the client.

Type: Grant

Filed: December 15, 2004

Date of Patent: October 29, 2013

Inventor: Nigel Hamilton
Grouping and differentiating files based on underlying grouped and differentiated files

Patent number: 8566323

Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position. This position relative to another file's position reveals closest neighbors. When multiple such neighbors are arranged, first “patterns” of data are created that further define digital spectrums of new files. It is within this sorted new data that emergent relationships or second “patterns” are examined, according to the techniques for its underlying files, or “patterns of patterns.” Representatively, original files are stored on computing devices. If encoded, they have pluralities of symbols representing an underlying data stream of original bits of data. The original files are examined for relationships between each of the files. The original relationships are converted to new files. The new files are representatively encoded and examined for other relationships.

Type: Grant

Filed: December 29, 2009

Date of Patent: October 22, 2013

Assignee: Novell, Inc.

Inventors: Scott A. Isaacson, Craig N. Teerlink, Nadeem A. Nazeer
PSEUDO-DOCUMENTS TO FACILITATE DATA DISCOVERY

Publication number: 20130275436

Abstract: Various embodiments promote the discoverability of data that can be contained within a database. In one or more embodiments, data within a database is organized in a structure having a schema. The structure and data can be processed in a manner that renders one or more pseudo-documents each of which constitutes a sub-structure that can be indexed. Once produced and indexed, the pseudo-documents constitute a set of searchable objects each of which relationally points back to its associated structure within the database. Searches can now be performed against the pseudo-documents which, in turn, return a set of search results. The set of search results can include multiple sub-sets of pseudo-documents, each sub-set of which is associated with a different structure.

Type: Application

Filed: April 11, 2012

Publication date: October 17, 2013

Applicant: Microsoft Corporation

Inventors: Surajit Chaudhuri, Lev Novik, John C. Platt
Apparatus and method for generating additional information about moving picture content

Patent number: 8559724

Abstract: An apparatus and method for generating additional information about moving picture content, including: comparing image feature information about each image frame in moving picture content with image feature information about each image frame in web information, searching for an image frame in the moving picture content, the image frame matching the image frame in the web information, determining location information about the found image frame in the moving picture content, and generating additional information by use of the determined location information and the web information.

Type: Grant

Filed: February 24, 2010

Date of Patent: October 15, 2013

Assignee: Samsung Electronics Co., Ltd.

Inventors: Yoon-hee Choi, Il-hwan Choi, Hee-seon Park
