Based On Term Frequency Of Appearance Patents (Class 707/750)
  • Publication number: 20140149433
    Abstract: A method of estimating a number of unique entry counts of an attribute in a database comprises, with a processor: identifying a sample of entries from an attribute database, determining frequencies of a number of input observations of the sample of entries, determining a number of high frequency values of the sample of entries, and estimating a number of unique entry counts of an attribute within the attribute database using a counting Bloom filter and based on the frequencies of the input observations and the high frequency values.
    Type: Application
    Filed: November 27, 2012
    Publication date: May 29, 2014
    Applicant: HEWLETT-PACKARD DEVELOPMENT COMPANY, L.P.
    Inventors: Choudur Lakshminarayan, Hansjorg Zeller, QiFan Chen, Ramakumar Kosuru
  • Patent number: 8738637
    Abstract: A method of determining popularity of an e-mail is provided. The method includes receiving an e-mail and determining if a generated signature is associated with the e-mail. If there is no generated signature, then a signature is generated for associating with the e-mail. A popularity measure associated with the e-mail is determined based on the signature. Furthermore, a method of determining popularity of an e-mail is provided. The method includes receiving an e-mail and identifying a generated signature associated with the e-mail. The method further includes determining a match of the associated generated signature with a record of the generated signature, if the generated signature is identified. If the identified generated signature is determined to match the record of the generated signature, then a popularity measure associated with the e-mail is increased.
    Type: Grant
    Filed: June 3, 2008
    Date of Patent: May 27, 2014
    Assignee: Yahoo! Inc.
    Inventors: Jyh-Shin Shue, Jeff Weng
  • Patent number: 8739032
    Abstract: A document analysis system receives multiple concepts along with multiple reference documents and generates sensory indicators that assist a researcher in assessing the relevance of each of the documents to the concepts. In one exemplary aspect, the document analysis system displays a table of keywords separated into blocks, each block of keywords corresponding to one of the concepts. Each block is colored according to the prevalence of any keyword within a given keyword group. The color of a block thus indicates the relative presence of a concept in the document. The document analysis system also determines a unique color for each block of keywords for highlighting in the text of the document. In this manner a researcher can quickly identify passages that contain multiple concepts. Additionally, the researcher is provided the ability to quickly locate reference characters, figure numbers and patent numbers in the document.
    Type: Grant
    Filed: October 12, 2010
    Date of Patent: May 27, 2014
    Inventor: Patrick Sander Walsh
  • Patent number: 8732185
    Abstract: Among other disclosed subject matter, a computer-implemented method relating to selecting content for publication includes receiving a term to be used in selecting content for publication. The method includes obtaining information from a record using the received term, the information reflecting a correspondence between contents in a repository and the received term. The method includes determining, using at least the obtained information, a query to be performed on the repository for selecting at least part of the content.
    Type: Grant
    Filed: September 10, 2012
    Date of Patent: May 20, 2014
    Assignee: Google Inc.
    Inventors: Nicholas Lynn, Alexander P. Carobus
  • Publication number: 20140136551
    Abstract: Provided is a method of generating updating parameters. The method obtains search keywords used by users within a predetermined time period; counts the search keywords to obtain primary keywords, related keywords, co-search frequencies of each primary keyword and the respective related keywords being searched together, and search frequencies of the primary keywords being searched alone; computes first feature values based on the search frequencies of the primary keywords being searched alone; and then computes second feature values based on the first feature values and the co-search frequencies of the primary keywords and the respective related keywords. The second feature values serve as updating parameters for determining displaying modes of the related keywords. An apparatus of generating updating parameters, and a method and an apparatus of displaying related keywords according to the updating parameters are also provided.
    Type: Application
    Filed: January 21, 2014
    Publication date: May 15, 2014
    Applicant: Alibaba Group Holding Limited
    Inventors: Lei Pan, Yuanhu Yao, Zhen Yang, Tianji Zhang
  • Patent number: 8725756
    Abstract: Methods, systems, and apparatus, including computer program products, in which one or more search query suggestions are made for a current search session. Similar previous search sessions which include search queries common to the current search session are identified. Based upon the similar previous search sessions, one or more suggested search queries are derived and provided to a search engine interface for serving to a user or a client.
    Type: Grant
    Filed: November 11, 2008
    Date of Patent: May 13, 2014
    Assignee: Google Inc.
    Inventors: Ashutosh Garg, Kedar Dhamdhere
  • Patent number: 8725736
    Abstract: A computer-implemented system and method for clustering similar documents is provided. Concepts are identified for a set of documents and occurrence frequencies are determined for each concept in the document set. A distance quantifying a similarity for each of the documents in the set with one or more clusters of documents is calculated. Each document is mapped to at least one of the one or more document clusters.
    Type: Grant
    Filed: February 14, 2013
    Date of Patent: May 13, 2014
    Assignee: FTI Technology LLC
    Inventors: Dan Gallivan, Kenji Kawai
  • Patent number: 8725723
    Abstract: A method and system for searching for a related term having rapidly increasing popularity is provided. The method includes: analyzing a search log and extracting a daily search frequency for each search term; comparing peaks of the daily search frequency, extracted for each search term in a predetermined period; and analyzing relevance between candidate search terms in which the peaks have occurred together in the predetermined period as a result of the comparison and filtering out a candidate search term having no relevance.
    Type: Grant
    Filed: August 8, 2008
    Date of Patent: May 13, 2014
    Assignee: NHN Corporation
    Inventor: Dong Wook Kim
  • Patent number: 8725732
    Abstract: Systems, methods and program products for classifying text. A system classifies text into first subject matter categories. The system identifies one or more second subject matter categories in a collection of second subject matter categories, each of the second categories is a hierarchical classification of a collection of confirmed valid search results for queries, in which at least one query for each identified second category includes a term in the text. The system filters the identified categories by excluding identified categories whose ancestors are not among the first categories. The system selects categories from the filtered categories based on one or more thresholds in which a threshold specifies a degree of relatedness between a selected category and the text. The selected categories are a sufficient basis for recommending content to a user, the content being associated with one or more of the selected categories.
    Type: Grant
    Filed: March 22, 2012
    Date of Patent: May 13, 2014
    Assignee: Google Inc.
    Inventors: Glen M. Jeh, Beverly Yang
  • Patent number: 8719283
    Abstract: Summarizing a set of reviews is disclosed. In some embodiments, a set of reviews is analyzed, e.g., by an at least partially automated process. A summary of the information included in the set of reviews is provided. The summary includes a visual indication of a range and distribution of opinions expressed in the set of reviews. In some embodiments, the set of reviews includes reviews from one or more members of an online or other user community, such as customers of an online store, subscribers to a podcast, blog, or other online source of content, etc.
    Type: Grant
    Filed: September 29, 2006
    Date of Patent: May 6, 2014
    Assignee: Apple Inc.
    Inventor: David A. Koski
  • Patent number: 8712991
    Abstract: Some implementations include techniques and arrangements to provide document-related representative information with search results. For example, a representative query and/or representative results may be provided for one or more individual documents identified in a set of search results to supplement the search results returned in response to a received search query. The representative queries may be determined by correlating a plurality of previously submitted queries in search log data with a plurality of documents returned in response to the queries. In some implementations, click-through frequency for a particular document with respect to the plurality of queries may be taken into consideration when determining the representative queries for the particular document.
    Type: Grant
    Filed: July 7, 2011
    Date of Patent: April 29, 2014
    Assignee: Microsoft Corporation
    Inventors: Jingdong Wang, Shipeng Li
  • Patent number: 8713009
    Abstract: Embodiments of the present invention provide automatic systems and methods for associating objects in databases of a web site by rate-based tagging. The frequencies of users entering specific tag terms for objects stored in the databases of the web site are used to determine hard associations between objects and tag terms and between objects. When the frequencies of user tags exceed established thresholds, hard associations between objects and tag terms are established. When objects are identified or determined to have hard association with tag terms, the objects are determined to be more clearly associated with the corresponding tag terms. Therefore, they should be highlighted or featured in more prominent locations on web pages of the web site to increase users' confidence in content of the web site. To identify hard-associated objects, more weights can be assigned to the hard-associated objects, which allows them to be more likely to be selected for display in prominent locations.
    Type: Grant
    Filed: September 25, 2008
    Date of Patent: April 29, 2014
    Assignee: Yahoo! Inc.
    Inventors: Hubert M. Walker, Noel C. Morrison, Ankarino S. Lara, Scott Bedard, Stephen James Blake
  • Publication number: 20140101172
    Abstract: A system is provided that dynamically matches data originating from one or more data sources. The system analyzes a matching configuration file, where the matching configuration file includes one or more matching configurations. The system modifies a probabilistic matching algorithm of a matching engine at runtime based on the one or more matching configurations and based on two or more data records of the plurality of data records that require matching. The system compares two data records of a plurality of data records using the modified probabilistic matching algorithm. The system generates a match score for the two data records based on the match weight for each data record field.
    Type: Application
    Filed: October 5, 2012
    Publication date: April 10, 2014
    Applicant: ORACLE INTERNATIONAL CORPORATION
    Inventor: Swaranjit Singh DUA
  • Publication number: 20140101173
    Abstract: A method for providing information about a main knowledge stream is disclosed. According to an embodiment of the present invention, the method includes obtaining reference links representing reference relationships among reference documents in each of a plurality of documents stored in a database, determining one or more basic paths connecting the reference links, calculating probability values of the reference links by overlapping the determined basic paths, determining a first document among the documents and an input reference link associated with the first document, and performing a Markov chain model using a probability value of the input reference link, and calculating information about the main knowledge stream associated with the first document using the result obtained by performing the Markov chain model.
    Type: Application
    Filed: December 24, 2012
    Publication date: April 10, 2014
    Applicant: KOREA INSTITUTE OF SCIENCE AND TECHNOLOGY INFORMATION
    Inventor: Korea Institute of Science and Technology Information
  • Patent number: 8682907
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for evaluating substitute terms. One of the methods includes selecting a first term and a second term. A first co-occurrence frequency is determined for co-occurring terms in search queries that include the first term. A first vector is generated for the first term using the first co-occurrence frequencies. A second co-occurrence frequency is determined for the co-occurring terms in the search queries that include the first term adjacent to the second term. A second vector is generated for the second term using the second co-occurrence frequencies. A score for the second term as a context for a substitution rule based on the first term is computed, wherein the score is based on a comparison between the first vector and the second vector.
    Type: Grant
    Filed: March 30, 2012
    Date of Patent: March 25, 2014
    Assignee: Google Inc.
    Inventors: Ke Yang, Zachary A. Garrett, Daisuke Ikeda
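The vector comparison described in the abstract of patent 8682907 can be sketched as follows. This is a minimal illustration, not the patented method: the whitespace tokenizer, the helper names `cooccurrence_vector` and `cosine`, the sample queries, and the choice of cosine similarity as the "comparison" are all assumptions, since the abstract does not specify them.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vector(queries, required):
    """Count the terms that co-occur with `required` in each query.

    `required` is a list of tokens that must appear adjacent and in order,
    e.g. ["car"] for the first term alone or ["car", "rental"] for the
    first term followed by the context term (hypothetical convention).
    """
    counts = Counter()
    n = len(required)
    for query in queries:
        tokens = query.split()
        # Keep the query only if the required tokens appear contiguously.
        if any(tokens[i:i + n] == required for i in range(len(tokens) - n + 1)):
            counts.update(t for t in tokens if t not in required)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Illustrative query log (made up for this sketch).
queries = [
    "cheap car rental",
    "car rental deals",
    "car wash near me",
    "cheap car insurance",
]
first_vector = cooccurrence_vector(queries, ["car"])
second_vector = cooccurrence_vector(queries, ["car", "rental"])
score = cosine(first_vector, second_vector)
```

A score near 1 would suggest that the context term barely changes which other words accompany the first term; a low score would suggest that it narrows the usage considerably.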
  • Patent number: 8682870
    Abstract: Defragmentation during multiphase deduplication. In one example embodiment, a method of defragmentation during multiphase deduplication includes an analysis phase that includes analyzing each allocated block stored in a source storage at a point in time to determine if the block is duplicated in a vault storage, a defragmentation phase that includes reordering the duplicate blocks stored in the source storage to match the order of the duplicate blocks as stored in the vault storage, and a backup phase that is performed after completion of the defragmentation phase and that includes storing, in the vault storage, each unique nonduplicate block from the source storage.
    Type: Grant
    Filed: March 1, 2013
    Date of Patent: March 25, 2014
    Assignee: Storagecraft Technology Corporation
    Inventors: Andrew Lynn Gardner, Nathan S. Bushman
  • Publication number: 20140081901
    Abstract: Embodiments of the present invention provide various techniques for sharing modeling data between plug-in applications. The plug-in applications may use or generate various modeling data. In an example, the host application that interfaces with the plug-in applications can access and store this modeling data at a location where it is accessible to the other plug-in applications.
    Type: Application
    Filed: April 24, 2009
    Publication date: March 20, 2014
    Applicant: NetApp, Inc.
    Inventor: Martin Szymczak
  • Publication number: 20140081995
    Abstract: A data profile engine identifies, classifies, analyzes, searches, compares and cross-references entire files and sections of files, records and other forms of electronic media, and a tool creation engine in combination with the data profile engine builds custom solutions and product interfaces.
    Type: Application
    Filed: November 6, 2013
    Publication date: March 20, 2014
    Applicant: Kiiac LLC
    Inventors: Kingsley Martin, Tracy Scott Liggett
  • Patent number: 8676795
    Abstract: A plurality of phrases may be extracted from documents associated with one or more document sources. The plurality of phrases may be filtered and processed to determine a frequency in which the plurality of phrases appear in the documents and/or a number of the document sources in which each phrase appears. A weight may be assigned to each of the phrases and, based at least in part on the assigned weight, a visual representation of the plurality of phrases may be presented. The visual representation may be dynamically updated based at least in part on an updated frequency or an updated total number of document sources associated with any one of the plurality of phrases.
    Type: Grant
    Filed: August 4, 2011
    Date of Patent: March 18, 2014
    Assignee: Amazon Technologies, Inc.
    Inventors: Cyrus J. Durgin, George N. Stathakopoulos, Dominique I. Brezinski, Emilia S. Buneci, Martin M. O'Reilly, Lane R. LaRue, Benjamin S. Kirzhner
  • Patent number: 8677018
    Abstract: Embodiments of the present invention include methods and systems for domain name system (DNS) pre-caching. A method for DNS pre-caching is provided. The method includes receiving uniform resource locator (URL) hostnames for DNS pre-fetch resolution prior to a user hostname request for any of the URL hostnames. The method also includes making a DNS lookup call for at least one of the URL hostnames that are not cached by a DNS cache prior to the user hostname request. The method further includes discarding at least one IP address provided by a DNS resolver for the URL hostnames, wherein a resolution result for at least one of the URL hostnames is cached in the DNS cache in preparation for the user hostname request. A system for DNS pre-caching is provided. The system includes a renderer, an asynchronous DNS pre-fetcher and a hostname table.
    Type: Grant
    Filed: August 25, 2008
    Date of Patent: March 18, 2014
    Assignee: Google Inc.
    Inventor: James Roskind
  • Patent number: 8676786
    Abstract: A computer-readable medium storing therein a data conversion program that causes a computer to execute a process that includes receiving after a schema of a database has been changed from a former schema to a new schema, a processing request concerning the database; judging based on difference information concerning the former schema and the new schema, whether in the processing request, a condition that specifies process data subject to processing, has been changed by the new schema; searching the database for conversion data whose format is to be converted from the former schema to the new schema, the searching based on judgment results obtained at the judging and on the processing request; and converting the format of the retrieved conversion data, from the former schema to the new schema.
    Type: Grant
    Filed: October 18, 2011
    Date of Patent: March 18, 2014
    Assignee: Fujitsu Limited
    Inventors: Hiroshi Otsuka, Atsuji Sekiguchi, Masazumi Matsubara, Shinya Kitajima, Yuji Wada, Yasuhide Matsumoto
  • Publication number: 20140074816
    Abstract: The present invention provides a method and apparatus for generating a query candidate set. The method comprises automatically tagging a sequence of words in a digital document to obtain a sequence of tags, comparing the sequence of tags with one or more reference sequences and including the sequence of words in the query candidate set if the sequence of tags matches the one or more reference sequences. Each tag of the sequence of tags represents a part of speech.
    Type: Application
    Filed: June 25, 2013
    Publication date: March 13, 2014
    Inventors: KALPANA BANERJEE, Surabhi Khandavalli, Vishal Shah, Gaurav Ruhela
  • Publication number: 20140074775
    Abstract: A method and apparatus is provided for providing selected media files, which are chosen from among a plurality of media files, to a user over a packet-switched network such as the Internet. The method begins by receiving over the packet-switched network a request from the user to receive media content. Next, a user profile associated with the user is retrieved from a database. The user profile reflects user preferences in media content to be received over the packet-switched network. The plurality of media files are ranked based at least in part on the user profile. At least one highly ranked media file is selected from among the ranked plurality of media files. At least one of the highly ranked media files is forwarded to the user over the packet-switched network.
    Type: Application
    Filed: November 15, 2013
    Publication date: March 13, 2014
    Applicants: Sony Electronics Inc., Sony Corporation
    Inventors: Brian M. Siegel, Philip M. Abram, Marc Beckwitt, Gregory D. Gudorf, Kazuaki Iso, Brian Raymond, Christopher M. Tobin
  • Publication number: 20140067832
    Abstract: Disclosed are methods for returning to a user an answer to the question “what is <string>.” Concepts and classes to which the concepts belong are determined from a corpus, such as a taxonomy. The concepts are mapped to categories according to the structure of the taxonomy. Homonyms for words are collected and scored according to likeliness of use. Concept vectors are assembled for the identified concepts based on articles in the corpus and social media usage. Words are evaluated for generic-ness and a generic score is associated therewith. In responding to a query, the generic-ness of the terms of the query is evaluated and additional context solicited if the terms are generic. Candidate homonym concepts for a string in the query are selected according to context vectors for the homonym concepts. One or more homonym concepts are selected and the one or more categories corresponding to these concepts are returned.
    Type: Application
    Filed: September 28, 2012
    Publication date: March 6, 2014
    Applicant: Wal-Mart Stores, Inc.
    Inventors: Digvijay Singh Lamba, Xiaoyong Chai
  • Patent number: 8666982
    Abstract: A document may be received at a processing module. One or more tags may be applied to the document, each tag applied to a term, each tag representing a part of speech. One or more terms may be extracted from the document based on the tag. A weighting assignment parameter may be determined for each of the one or more extracted terms. Based on the weighting assignment parameter associated with each of the extracted terms, it may be determined whether the domain ontology includes the one or more extracted terms. If the domain ontology does not include the one or more extracted terms, the domain ontology may be augmented such that the domain ontology comprises the one or more extracted terms.
    Type: Grant
    Filed: October 6, 2011
    Date of Patent: March 4, 2014
    Assignee: GM Global Technology Operations LLC
    Inventors: Dnyanesh Rajpathak, Vineet R Khare, Rahul Chougule
  • Publication number: 20140059058
    Abstract: A computing device maintains an input history in memory. This input history includes input strings that have been previously entered into the computing device. When the user begins entering characters of an input string, a predictive input engine is activated. The predictive input engine receives the input string and the input history to generate a candidate list of predictive inputs which are presented to the user. The user can select one of the inputs from the list, or otherwise continue entering characters. The computing device generates the candidate list by combining frequency and recency information of the matching strings from the input history. Additionally, the candidate list can be manipulated to present a variety of candidates. By using a combination of frequency, recency and variety, a favorable user experience is provided.
    Type: Application
    Filed: August 24, 2012
    Publication date: February 27, 2014
    Applicant: Microsoft Corporation
    Inventors: Katsutoshi Ohtsuki, Koji Watanabe
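The combination of frequency and recency described in the abstract of publication 20140059058 can be sketched as a simple scoring function. This is only an illustration under stated assumptions: the linear blend, the `recency_weight` parameter, the function name `rank_candidates`, and prefix matching are all hypothetical, and the "variety" step mentioned in the abstract is not modeled.

```python
from collections import Counter


def rank_candidates(history, prefix, recency_weight=0.5, limit=3):
    """Rank previously entered strings that start with `prefix`.

    `history` is ordered oldest first. Each candidate's score blends its
    relative frequency with how recently it was last entered.
    """
    if not history:
        return []
    total = len(history)
    freq = Counter(history)
    # Position of the most recent occurrence of each string (later wins).
    last_seen = {s: i for i, s in enumerate(history)}

    def score(s):
        frequency = freq[s] / total
        recency = (last_seen[s] + 1) / total
        return (1 - recency_weight) * frequency + recency_weight * recency

    matches = [s for s in freq if s.startswith(prefix)]
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(matches, key=lambda s: (-score(s), s))[:limit]


# Illustrative input history (made up for this sketch).
history = ["weather", "weather", "weather", "web mail"]
by_frequency = rank_candidates(history, "we", recency_weight=0.0)
by_recency = rank_candidates(history, "we", recency_weight=1.0)
```

Setting `recency_weight` to 0 ranks purely by how often a string was entered, while 1 ranks purely by how recently it was entered; intermediate values trade the two off.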
  • Patent number: 8655737
    Abstract: A product catalog includes information regarding products for sale online by various merchants. An analysis software module can identify brand names in the product catalog that relate to the same brand. The analysis module can compute parameters of pairs of product offers having matching product identifiers. The analysis module can group the product offer pairs into brand pair groups based on the brand names for the products subject to the product offers. The analysis module can compute parameters of each brand pair group based on product offer pairs in the brand pair group and attributes of product offers in the product catalog. The analysis module can use the computed parameters to determine whether the brand names of each brand pair are related. The analysis module can use the identified related brand names and additional attributes of product offers to identify product offers related to the same product.
    Type: Grant
    Filed: January 31, 2011
    Date of Patent: February 18, 2014
    Assignee: Google Inc.
    Inventor: Roy Tromble
  • Patent number: 8655891
    Abstract: A system for targeting advertising content includes the steps of: (a) receiving respective requests for advertising content corresponding to a plurality of mobile communication facilities operated by a group of users, wherein the plurality includes first and second types of mobile communication facilities with different rendering capabilities; (b) receiving a datum corresponding to the group; (c) selecting from a first and second sponsor respective content based on a relevancy to the datum, wherein each content includes a first and second item requiring respective rendering capabilities; (d) receiving bids from the first and second sponsors; (e) attributing a priority to the content of the first sponsor based upon a determination that a yield associated with the first sponsor is greater than a yield associated with the second sponsor; and (f) transmitting the first and second items of the first sponsor to the first and second types of mobile communication facilities respectively.
    Type: Grant
    Filed: November 18, 2012
    Date of Patent: February 18, 2014
    Assignee: Millennial Media
    Inventors: Jorey Ramer, Adam Soroca, Dennis Doughty
  • Patent number: 8655886
    Abstract: A request monitor may monitor user requests, each user request including at least one keyword. A portion evaluator may determine inclusive portions of content file portions of indexed content files within an index, and may assign values to the inclusive portions, based on a providing of at least one of the indexed content files to the user in response to the user request. A portion selector may select, from the inclusive portions and based on the values, retained portions to be retained within the index. An index updater may be configured to update the index to replace the indexed content files with the retained portions.
    Type: Grant
    Filed: March 25, 2011
    Date of Patent: February 18, 2014
    Assignee: Google Inc.
    Inventor: Erik Gross
  • Patent number: 8645397
    Abstract: A method and apparatus for propagating updates in databases are disclosed. For example, the present method uses “blocking” and/or “thresholding” to delay update propagation and/or to limit the propagation of updates to an optimal stage. For example, the present method receives at least one database update and extracts at least one token from the at least one database update. The method then determines whether a threshold for propagating the at least one database update for the at least one token is reached. The method then propagates the at least one database update for updating an index structure of a database pertaining to the at least one token whose threshold has been reached.
    Type: Grant
    Filed: November 30, 2006
    Date of Patent: February 4, 2014
    Assignee: AT&T Intellectual Property II, L.P.
    Inventors: Nikolaos Koudas, Amit Jaywant Marathe, Divesh Srivastava
  • Patent number: 8645367
    Abstract: One or more hierarchies of string patterns are generated from a plurality of URL strings according to a pattern extraction procedure. Repeated string patterns are selected from the generated hierarchies of string patterns. A URL class is defined for each of the selected repeated string patterns. Each URL class is associated with a respective group of URL strings in the plurality of URL strings, where the respective group of URL strings contains a repeated string pattern that defines the URL class. Respective aggregated data is calculated for each URL class. The respective aggregated data is based on respective data of each respective document of each URL string in the group of URL strings associated with the URL class. Respective data for a respective document referenced by a lookup-URL is predicted based on respective aggregated data of one or more of the URL classes.
    Type: Grant
    Filed: March 8, 2010
    Date of Patent: February 4, 2014
    Assignee: Google Inc.
    Inventors: Nissan Hajaj, Chi Zhang, Changxun Wu, Erik Gross
  • Patent number: 8631018
    Abstract: A computer-implemented method for positioning targeted sponsored content on a cellular phone includes the steps of (a) assessing a likelihood of an interaction by a user of the cellular phone with a sponsored content, wherein the assessment is based on a plurality of user characteristics associated with the cellular phone including (i) a credit card datum; and (ii) a predefined hardware or software characteristic of the cellular phone; (c) prioritizing the placement of the sponsored content within one of a plurality of predefined areas of a graphical user interface of the cellular phone over the placement of other sponsored content within the same area, wherein the prioritization is based on the assessment of the likelihood of the interaction of the user of the cellular phone with the sponsored content; and (d) presenting the sponsored content within the one of a plurality of predefined areas of the graphical user interface.
    Type: Grant
    Filed: December 6, 2012
    Date of Patent: January 14, 2014
    Assignee: Millennial Media
    Inventors: Jorey Ramer, Adam Soroca, Dennis Doughty
  • Publication number: 20140012863
    Abstract: Techniques for topic extraction and opinion mining are described. For example, a document that is pertinent to a topic is selected based on searching, using a key phrase, a plurality of documents. A subtopic referenced in the document is identified. A feature of the subtopic is identified based on the document. A rating of the feature of the subtopic is determined based on the document. Using at least one processor, a sentiment of the document is determined based in part on the feature and the rating of the feature.
    Type: Application
    Filed: September 6, 2013
    Publication date: January 9, 2014
    Applicant: eBay Inc.
    Inventors: Neelakantan Sundaresan, Yongzheng Zhang, Catherine Baudin, Dan Shen, Shen Huang
  • Publication number: 20140012862
    Abstract: An information processing apparatus includes a calculation unit and a generation unit. The calculation unit is configured to calculate a frequency function which is a function relating to an appearance frequency of one or more attribute values of a database having a predetermined attribute and the one or more attribute values relating to the attribute. The generation unit is configured to generate sample data in accordance with the appearance frequency relating to the database on the basis of the frequency function calculated, the sample data including at least a part of the one or more attribute values as one or more sample attribute values.
    Type: Application
    Filed: May 28, 2013
    Publication date: January 9, 2014
    Inventors: Yohei KAWAMOTO, Taizo SHIRAI, Kazuya KAMIO, Yu TANAKA, Koichi SAKUMOTO
  • Patent number: 8621076
    Abstract: One preferred embodiment of the present invention provides systems and methods for analyzing the delivery performance of newsgroup services. Briefly described, in architecture, one embodiment, among others, includes a newsgroup evaluation system configured to determine a delivery rate for a newsgroup server. In other embodiments, methods and systems are provided for analyzing completion and retention for newsgroup services.
    Type: Grant
    Filed: August 15, 2012
    Date of Patent: December 31, 2013
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Richard J. Gerlach, Charles S. Shull, David Edward Haslam
  • Publication number: 20130346424
    Abstract: Technologies pertaining to computing a respective TF-IDF value for each term in each document of a relatively large document corpus are described herein. TF-IDF values are computed with respect to terms in documents of a large document corpus in a single pass over the document corpus. Secondary sorting functionality of a distributed computing framework is exploited to compute TF-IDF values efficiently.
    Type: Application
    Filed: June 21, 2012
    Publication date: December 26, 2013
    Applicant: MICROSOFT CORPORATION
    Inventors: Xiong Zhang, Hung-chih Yang, Danny Lange
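For readers unfamiliar with the quantity named in the abstract of publication 20130346424, the following is a minimal sketch of TF-IDF on a toy in-memory corpus. It assumes a common textbook definition (term count divided by document length, multiplied by the log of corpus size over document frequency) and runs on a single machine; it does not reproduce the single-pass, secondary-sort approach the abstract describes. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: tf-idf} dict per document; docs are lists of tokens.

    Assumed definition (not taken from the patent):
    tf = count / document length, idf = log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical toy corpus.
docs = [
    ["patent", "search", "index"],
    ["patent", "claim"],
    ["search", "query", "query"],
]
scores = tf_idf(docs)
```

A term that appears in every document gets an IDF of zero under this definition, so it contributes nothing to the score, which is the usual motivation for the weighting.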
  • Patent number: 8612411
    Abstract: Systems and methods for clustering documents, such as for scientific documents, taking into account the citation patterns of the documents are disclosed. In one embodiment, the method includes locating citations to other documents, e.g., search result documents, comparing each pair of documents to be clustered for overlapping citations in a first, a more specific second, and an even more specific optional third citation generality, and determining clusters of related documents based on the comparisons. The levels of generalities may be, for example, document-, paragraph-, and/or citation-level generalities. The locating may locate only citations to the other documents to be clustered. The clusters may be determined based on a weighted score of the amount of overlapping citations in the various generalities and/or by performing factor analysis using the comparison results. The clusters may be ranked to determine the dominant clusters.
    Type: Grant
    Filed: December 31, 2003
    Date of Patent: December 17, 2013
    Assignee: Google Inc.
    Inventor: Vibhu O. Mittal
  • Patent number: 8606795
    Abstract: A frequency-based keyword extraction method and system utilizing a statistical measure is disclosed which generates keywords within a page and/or document that can distinguish the document from an average document. A simple frequency threshold parameter can be utilized to determine a number of common stop words if a word in the document possesses a frequency in a corpus that is more than the threshold parameter. A statistical confidence interval of the frequency in the document can be compared against a frequency confidence interval of the word in the corpus. The extracted keyword possesses a greater intra-document frequency confidence interval than the frequency confidence interval of the word within the corpus. A statistical hypothesis test can also be utilized to determine the keyword by calculating a test statistic and testing whether the test statistic is greater than some threshold.
    Type: Grant
    Filed: July 1, 2008
    Date of Patent: December 10, 2013
    Assignee: Xerox Corporation
    Inventors: Stephen C. Morgana, John C. Handley
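The comparison described in the abstract of patent 8606795 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: it assumes a normal-approximation (Wald) interval for word proportions, treats a word as a keyword when the lower bound of its in-document interval exceeds the upper bound of its corpus interval, and uses a hypothetical stop-word threshold of 1%. All names (`wald_interval`, `extract_keywords`) and the sample counts are invented for the example.

```python
import math
from collections import Counter

def wald_interval(count, total, z=1.96):
    """Normal-approximation confidence interval for a proportion (assumed choice)."""
    p = count / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p - half_width, p + half_width

def extract_keywords(doc_tokens, corpus_counts, corpus_total, stop_threshold=0.01):
    """Return words whose in-document frequency interval lies above the corpus interval."""
    doc_counts = Counter(doc_tokens)
    keywords = []
    for word, count in doc_counts.items():
        corpus_count = corpus_counts.get(word, 0)
        # Words that are very frequent in the corpus are treated as stop words.
        if corpus_count / corpus_total > stop_threshold:
            continue
        doc_low, _ = wald_interval(count, len(doc_tokens))
        _, corpus_high = wald_interval(corpus_count, corpus_total)
        if doc_low > corpus_high:
            keywords.append(word)
    return sorted(keywords)

# Hypothetical corpus statistics and a 50-token document.
corpus_counts = {"the": 6000, "of": 3000, "bloom": 5, "filter": 20}
corpus_total = 100_000
doc = ["the"] * 19 + ["of"] * 10 + ["bloom"] * 15 + ["filter"] * 1 + ["data"] * 5
keywords = extract_keywords(doc, corpus_counts, corpus_total)
```

Requiring the intervals not to overlap is a conservative rule: a word that appears only once in a short document has a wide interval and is not selected, even if its raw in-document frequency is higher than in the corpus.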
  • Patent number: 8606796
    Abstract: A data profile engine identifies, classifies, analyzes, searches, compares and cross-references entire files and sections of files, records and other forms of electronic media, and a tool creation engine in combination with the data profile engine builds custom solutions and product interfaces.
    Type: Grant
    Filed: September 15, 2009
    Date of Patent: December 10, 2013
    Assignee: Kilac, LLC
    Inventors: Kingsley Martin, Tracy S. Liggett
  • Patent number: 8600838
    Abstract: A method for improving media search capability includes providing a user with access to an interface that allows the user to provide one or more inputs relating to an item of media (such as an audio or video recording of a song or a cover song), performing a media search in response to the one or more inputs, and presenting search results via an interactive display generated depending upon media ratings, wherein one or more of the media ratings is determined from media ratings inputs depending upon one or more metrics associated with sources or providers of the media ratings inputs.
    Type: Grant
    Filed: March 21, 2011
    Date of Patent: December 3, 2013
    Inventors: Joshua Beroukhim, Joseph Michael
  • Publication number: 20130318104
    Abstract: Computer-implemented systems, methods, and computer-readable media for analyzing data in one or more artifacts and creating a modifiable data network include: extracting the key elements from the one or more artifacts; identifying a relationship among the key elements for each of the one or more artifacts; determining a first frequency of each of the key elements; determining a second frequency for each relationship among the key elements; creating a data network showing the key elements and the relationship among the key elements; and enabling a user to modify the data network based on one or more of: the key elements; the relationship among the key elements; the first frequency; and the second frequency.
    Type: Application
    Filed: May 22, 2013
    Publication date: November 28, 2013
    Applicant: Infosys Limited
    Inventor: Sanal Kumar Sundaresan Nair
  • Patent number: 8595209
    Abstract: Methods and systems for identifying products and product idea lists. A method is provided which includes searching a product index for a result. The result is used to search an idea list index for idea lists related to the result wherein each idea list includes at least one product and has an associated popularity and relevance to the search. The method also includes outputting at least some of the idea lists based on the popularity and relevance of the idea lists. In one embodiment a method of identifying product idea lists is provided. The method includes searching a product index for keywords associated with products in a product idea list. The method also includes using the keywords to search a product idea index for other idea lists and outputting the other idea lists based on their popularities. In some embodiments, the popularities may be based on time-weighted events.
    Type: Grant
    Filed: January 29, 2008
    Date of Patent: November 26, 2013
    Assignee: Boundless Network, Inc.
    Inventor: Jeremy Kraybill
  • Patent number: 8583659
    Abstract: In one embodiment, one or more computing devices determine a confidence score between a user node and a concept node of a social graph based on similarity numbers associated with edges between the user node and the concept node in one or more hops between them on the social graph.
    Type: Grant
    Filed: July 9, 2012
    Date of Patent: November 12, 2013
    Assignee: Facebook, Inc.
    Inventors: Tudor Andrei Cristian Alexandrescu, Pierre Moreels
  • Publication number: 20130297622
    Abstract: Processing methods and systems are provided for representing documents relative to importance of words in the document. A processor comprising a weighting model of word importance in a document in a collection relative to an importance of the word in other documents in the collection computes a deviation of distribution of the word from a probability distribution of the word in other documents in the collection, where the deviation distribution is weighted in accordance with a concavity control function. A concavity control parameter is adjustable relative to word frequency.
    Type: Application
    Filed: May 3, 2012
    Publication date: November 7, 2013
    Applicant: Xerox Corporation
    Inventor: Stephane Clinchant
  • Patent number: 8577892
    Abstract: Systems and methods for utilizing affinity groups to allocate data items and computing resources are disclosed. Upon receipt of a user preference indicating an affinity group, a token associated with that affinity group may be stored in a database. The affinity group may be associated with a geographic region or a number of data centers. Data items and computing resources may be associated with the affinity group. These data items and computing resources may be allocated to a geographic region or data center based on their association with the affinity group. These data items and computing resources may also be reallocated based on efficiency analyses or user preferences. In this way, data items and computing resources may be efficiently allocated with lower user effort.
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: November 5, 2013
    Assignee: Microsoft Corporation
    Inventors: Remy Pairault, Zhe Yang, Sriram Krishnan, George Moore
  • Patent number: 8577899
    Abstract: Methods and systems supporting curation of items in a searchable knowledge base are provided. The methods and systems include mining one or more search queries of the searchable knowledge base, where each of the search queries includes a plurality of the items. The method further includes determining one or more pairs of items using a processor, where each of the pairs of items includes a correlation value exceeding a threshold. The correlation values for the pairs of items are based upon the frequency the items of the pairs of items co-occur within the search queries. The method further includes providing the pairs of items to a curator, where the curator reviews the pairs of items.
    Type: Grant
    Filed: March 5, 2010
    Date of Patent: November 5, 2013
    Assignee: Palo Alto Research Center Incorporated
    Inventor: John T. Maxwell
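The co-occurrence step in the abstract of patent 8577899 can be illustrated with a short sketch. The abstract does not say how the correlation value is computed, so this example assumes the Jaccard coefficient (queries containing both items divided by queries containing either) as a stand-in. The function name `correlated_pairs`, the threshold, and the sample queries are hypothetical.

```python
from collections import Counter
from itertools import combinations

def correlated_pairs(queries, threshold=0.5):
    """Return (item_a, item_b, score) for pairs whose Jaccard score exceeds threshold."""
    item_counts = Counter()
    pair_counts = Counter()
    for query in queries:
        # Deduplicate and sort so each pair has one canonical ordering.
        items = sorted(set(query))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    result = []
    for (a, b), both in pair_counts.items():
        # Jaccard: co-occurrences over queries containing either item (assumed measure).
        score = both / (item_counts[a] + item_counts[b] - both)
        if score > threshold:
            result.append((a, b, score))
    # Highest-scoring pairs first, ready to hand to a reviewer.
    return sorted(result, key=lambda row: -row[2])

# Hypothetical mined search queries, each a list of knowledge-base items.
queries = [
    ["toner", "drum"],
    ["toner", "drum", "fuser"],
    ["toner", "paper"],
    ["drum", "toner"],
]
pairs = correlated_pairs(queries)
```

In this toy data only the "drum"/"toner" pair clears the threshold; pairs that co-occur once are filtered out before anything reaches the curator.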
  • Patent number: 8572100
    Abstract: An automated method for recording sites accessed by a client in a communications network, the method including the steps of: detecting submission of a search query from the client to one or more search engines; and recording a search trail of one or more parameters of sites accessed consecutively following return of search query results to the client.
    Type: Grant
    Filed: December 15, 2004
    Date of Patent: October 29, 2013
    Inventor: Nigel Hamilton
  • Patent number: 8566323
    Abstract: Methods and apparatus teach a digital spectrum of a file. The digital spectrum is used to map a file's position. This position relative to another file's position reveals closest neighbors. When multiple such neighbors are arranged, first “patterns” of data are created that further define digital spectrums of new files. It is within this sorted new data that emergent relationships or second “patterns” are examined, according to the techniques for its underlying files, or “patterns of patterns.” Representatively, original files are stored on computing devices. If encoded, they have pluralities of symbols representing an underlying data stream of original bits of data. The original files are examined for relationships between each of the files. The original relationships are converted to new files. The new files are representatively encoded and examined for other relationships.
    Type: Grant
    Filed: December 29, 2009
    Date of Patent: October 22, 2013
    Assignee: Novell, Inc.
    Inventors: Scott A. Isaacson, Craig N. Teerlink, Nadeem A. Nazeer
  • Publication number: 20130275436
    Abstract: Various embodiments promote the discoverability of data that can be contained within a database. In one or more embodiments, data within a database is organized in a structure having a schema. The structure and data can be processed in a manner that renders one or more pseudo-documents each of which constitutes a sub-structure that can be indexed. Once produced and indexed, the pseudo-documents constitute a set of searchable objects each of which relationally points back to its associated structure within the database. Searches can now be performed against the pseudo-documents which, in turn, return a set of search results. The set of search results can include multiple sub-sets of pseudo-documents, each sub-set of which is associated with a different structure.
    Type: Application
    Filed: April 11, 2012
    Publication date: October 17, 2013
    Applicant: Microsoft Corporation
    Inventors: Surajit Chaudhuri, Lev Novik, John C. Platt
  • Patent number: 8559724
    Abstract: An apparatus and method for generating additional information about moving picture content, including: comparing image feature information about each image frame in moving picture content with image feature information about each image frame in web information, searching for an image frame in the moving picture content, the image frame matching the image frame in the web information, determining location information about the found image frame in the moving picture content, and generating additional information by use of the determined location information and the web information.
    Type: Grant
    Filed: February 24, 2010
    Date of Patent: October 15, 2013
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yoon-hee Choi, Il-hwan Choi, Hee-seon Park