Category Specific Web Crawling Patents (Class 707/710)
  • Patent number: 8250092
    Abstract: Methods, apparatus, and systems directed to receiving search queries, retrieving documents, computing the number of categories to present for a given query, computing the number of results to show in each category, computing an ordering of categories, and for all the result pages beyond the first page employing user interface elements that optionally allow the user to quickly zoom in on a specific category and get more results belonging to that category.
    Type: Grant
    Filed: December 19, 2011
    Date of Patent: August 21, 2012
    Assignee: Microsoft Corporation
    Inventors: Sreenivas Gollapudi, Rakesh Agrawal, Samuel Ieong
  • Publication number: 20120209828
    Abstract: The present invention includes: acquiring plural web pages of an identical category into which targets stated in the web pages are classified (S1); acquiring an attribute-related term related to an attribute of the targets stated in the web pages or an attribute description pattern used to describe the attribute of the targets as initial data (S2); extracting the attribute-related term of the attribute matching the attribute description pattern from the plural web pages (S3); and extracting an attribute description pattern matching the attribute-related term from plural web pages (S4).
    Type: Application
    Filed: February 28, 2011
    Publication date: August 16, 2012
    Applicant: RAKUTEN, INC.
    Inventors: Takamasa Takenaka, Satoshi Sekine
  • Publication number: 20120209826
    Abstract: An approach is provided for providing location based information according to a predetermined format. A location information manager associates location information with web content. The location information manager also causes, at least in part, publication of the web content and the associated location information according to a predetermined format, wherein the predetermined format facilitates, at least in part, discovery of the location information.
    Type: Application
    Filed: February 10, 2011
    Publication date: August 16, 2012
    Applicant: Nokia Corporation
    Inventor: Petros Belimpasakis
  • Publication number: 20120209827
    Abstract: A method, apparatus, article of manufacture for generating a media program database having a plurality of media programs is disclosed. In one embodiment, the method comprises the steps of receiving first media program metadata from a first source, searching the Internet to find second media program metadata from a second source distinct from the first source, determining if the first media program metadata and the second media program metadata are associated with the same media program, merging the first media program metadata and the second media program metadata if the first media program metadata and the second media program metadata are associated with the same media program, and storing the merged first media program metadata and second media program metadata in the media program database.
    Type: Application
    Filed: February 29, 2012
    Publication date: August 16, 2012
    Applicant: HULU LLC
    Inventors: Zhibing Wang, Yizhe Tang, Qian Chang, Ting-hao Yang
  • Publication number: 20120209986
    Abstract: A method and system are disclosed for monitoring user interactions and generating proactive responses thereto within a social media environment. Social media interactions are monitored, collected, and processed to determine whether they contain content outside of a threshold. If so, they are processed to determine the content causing the content to be outside of the threshold. Once the issues have been determined, proactive actions are performed to counteract the effect of the content.
    Type: Application
    Filed: February 15, 2011
    Publication date: August 16, 2012
    Inventors: Shesha Shah, Rajiv Narang
  • Patent number: 8244710
    Abstract: Retrieving information from information sources using links. A set of information sources is preprocessed to extract content from text and existing links in the information sources according to some predetermined criteria. A set of search results is generated from amongst the preprocessed information sources in response to a received search query.
    Type: Grant
    Filed: August 6, 2007
    Date of Patent: August 14, 2012
    Assignee: Oracle International Corporation
    Inventors: Ajay Kumar Singh, Madhu Syamala
  • Publication number: 20120203759
    Abstract: The present invention relates to a method for writing a newly recognized image. The method includes the steps of: (a) comparing pre-stored image in the image database with a queried image; (b) storing the queried image onto a database for unrecognized images if there is no image similar to the queried image; (c) grouping the images in the database for unrecognized images based on degrees of similarity thereamong; and (d) comparing, if a specific image and its tag information are inputted, the specific image with some images included in a specific set of images among the organized sets of the images, determining whether there is any image in the specific set of images which has a degree of similarity exceeding the pre-set value and allowing images determined to have degrees of similarity exceeding the pre-set value with the tag information to be automatically written onto the image database.
    Type: Application
    Filed: November 17, 2011
    Publication date: August 9, 2012
    Applicant: OLAWORKS, INC.
    Inventors: Tae Hoon Kim, Min Je Park, Song Ki Choi
  • Publication number: 20120203760
    Abstract: Techniques for obtaining geographically-relevant product inventory information, in real-time, from heterogeneous data sources are described. Product inventory information, including the volume of available products in specific geographical locations, is obtained from at least three different sources. First, one or more data feeds may be received. Second, a data obtaining module uses one or more APIs to obtain product inventory information from one or more third-party inventory management systems. Finally, a structured data mining module uses a web crawler, at the direction of a crawler configuration, to systematically obtain product inventory information from various third-party websites. Accordingly, a user's search query is processed to provide geographically relevant product inventory information in near real time.
    Type: Application
    Filed: February 6, 2012
    Publication date: August 9, 2012
    Applicant: eBay Inc.
    Inventors: Jack Phillip Abraham, Aaron Adelson, Matthew Barto, Theodore James Dziuba, John Evans, Neville Newey, Justin Van Winkle
  • Patent number: 8239361
    Abstract: Disclosed is a method and system for user-centered information search. The user-centered information search may include generating an object as a classification unit of an information search structure and a property of the object, generating a class and determining a property of the class using the object; and detecting a search result corresponding to an information request from a user using at least one of the object, property, and class.
    Type: Grant
    Filed: June 16, 2008
    Date of Patent: August 7, 2012
    Assignee: NHN Corporation
    Inventors: Seok Ho Kang, Dohwan Kang
  • Patent number: 8239367
    Abstract: A system receives a search query from a user and searches a repository of documents based on the search query to obtain search results. The system provides the search results to the user and automatically bookmarks one or more of the search results without the user explicitly requesting that the one or more search results be bookmarked.
    Type: Grant
    Filed: January 31, 2007
    Date of Patent: August 7, 2012
    Assignee: Google Inc.
    Inventors: Oren Zamir, Jeffrey Korn
  • Patent number: 8239479
    Abstract: Systems and methods for synchronizing data between endpoints using elements of centralized and decentralized synchronization systems and communication topologies are disclosed. Such systems and methods may in some cases synchronize some subset of data with a centralized endpoint while another subset of data is synchronized in a decentralized fashion directly with other endpoints. Such systems and methods may include a variety of cooperative functionality to assist in the synchronization of data between endpoints.
    Type: Grant
    Filed: June 22, 2007
    Date of Patent: August 7, 2012
    Assignee: Microsoft Corporation
    Inventors: Akash J. Sagar, George P. Moromisato, Richard Yiu-Sai Chung, Raymond E. Ozzie, Jack E. Ozzie, David Richard Reed, Michael Steven Vernal, Vladimir Dmitri Fedorov, Muthukaruppan Annamalai
  • Patent number: 8239383
    Abstract: A method, system and article of manufacture for query execution management and, more particularly, for managing execution of queries against database samples. One embodiment provides a computer-implemented method for managing execution of a query against a database having a multiplicity of data records. The method comprises receiving, from a requesting entity, a query against the database, and performing an automated execution process, comprising: (i) iteratively executing the query against samples of the database, each sample including a subset of the multiplicity of data records, (ii) after each iterative execution of the query, determining whether a query result obtained for the iterative execution satisfies a predefined condition, and (iii) if the predefined condition is not satisfied, performing a predefined action.
    Type: Grant
    Filed: June 15, 2006
    Date of Patent: August 7, 2012
    Assignee: International Business Machines Corporation
    Inventor: John M. Santosuosso
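The iterative execution described in the abstract for patent 8239383 can be sketched roughly as follows. This is only an illustration, not the patented method: the function name `run_on_samples`, the use of list prefixes as samples, the "enough hits" condition, and the choice of enlarging the sample as the predefined action are all assumptions made for the example.

```python
def run_on_samples(records, predicate, min_hits, initial_size=2, growth=2):
    """Run a query (predicate) against growing samples of records.

    Hypothetical illustration: each sample is a prefix of `records`,
    the predefined condition is "at least `min_hits` matches", and the
    predefined action when it fails is to enlarge the sample.
    """
    if initial_size < 1 or growth < 2:
        raise ValueError("initial_size must be >= 1 and growth must be >= 2")
    size = initial_size
    while True:
        sample = records[:size]                      # subset of the data records
        hits = [r for r in sample if predicate(r)]   # execute the query on the sample
        # Stop when the condition holds or the whole database has been used.
        if len(hits) >= min_hits or size >= len(records):
            return hits, min(size, len(records))
        size *= growth                               # assumed action: larger sample


# Example: look for at least three even numbers among 1..20.
hits, sample_size = run_on_samples(list(range(1, 21)), lambda n: n % 2 == 0, 3)
```

A real system would more likely draw random or stratified samples; prefixes are used here only to keep the example deterministic.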
  • Publication number: 20120197862
    Abstract: Method and apparatus for creating an electronic database of disambiguated entity mentions and relations from a corpus of electronic documents. The invention automatically extracts from the corpus of electronic documents mentions about entities (e.g., references to people, organizations or places), parses the entity mentions into “mention objects,” and executes a series of grouping, comparison and hierarchical fuzzy object clustering algorithms to cluster together in an electronic database all of the mention objects referring to the same entity and all of the mention objects (e.g. “people”) associated with each other by a relationship (e.g., “co-authors” or “family members”).
    Type: Application
    Filed: August 9, 2011
    Publication date: August 2, 2012
    Applicant: COMSORT, INC.
    Inventors: Michael A. Woytowitz, Marshall Wells Hawks
  • Publication number: 20120197863
    Abstract: In an example, disclosed is a machine automated method of identifying a set of skills. In some examples, the method includes extracting a plurality of skill seed phrases from a plurality of member profiles of a social networking site, creating a plurality of disambiguated skill seed phrases by disambiguating the plurality of skill seed phrases using one or more computer processors, and de-duplicating the plurality of disambiguated skill seed phrases to create a plurality of de-duplicated skill seed phrases.
    Type: Application
    Filed: January 24, 2012
    Publication date: August 2, 2012
    Applicant: LinkedIn Corporation
    Inventors: Peter N. Skomoroch, Matthew T. Hayes, Abhishek Gupta, Dhanurjay A.S. Patil
  • Patent number: 8234584
    Abstract: Provided is a computer system including an information providing server and a computer which is coupled to the information providing server, and which collects information, the computer being configured to: record status histories including a history of an operation to a screen which shows a status of the computer, and which is displayed on the computer in chronological order to constitute a set of the status histories; and divide, in a case where a history of an operation of switching the screen is detected from the set of the status histories, based on the history of the operation of switching the screen, the set of the status histories. Accordingly, when a user collects information, navigation information is provided by taking the fact that the user has actually reached useful information into consideration.
    Type: Grant
    Filed: February 18, 2009
    Date of Patent: July 31, 2012
    Assignee: Hitachi, Ltd.
    Inventors: Masahiro Motobayashi, Toshio Okochi, Michiko Sakai, Maki Hayashi, Akio Azuma
  • Patent number: 8234706
    Abstract: A method for enabling access to software security data is provided. The method includes accessing data associated with software vulnerabilities from a plurality of on-line sources. The method further includes aggregating the data from the plurality of on-line sources and identifying attributes associated with the data. The method also includes enabling access to the aggregated data through a graphical user interface that can be used to analyze the data according to the attributes.
    Type: Grant
    Filed: June 20, 2007
    Date of Patent: July 31, 2012
    Assignee: Microsoft Corporation
    Inventors: Dongmei Zhang, Yingnong Dang, Xiaohui Hou, Song Huang, Jian Wang
  • Patent number: 8229911
    Abstract: An Internet infrastructure that supports searching of web links selects search results by processing browser activity information along with one or more of favorite lists, and related metadata, user profiles, and trends based on browser activity behavior and favorite behavior. The Internet infrastructure consists of a plurality of web browsers located on client devices. The web browsers are incorporated with a browser activity-monitoring module that tracks user's Internet usage, processes this information, and sends this information periodically or upon user request to the server to aid in improving search operation results. The search engine server is communicatively coupled to the plurality of web browsers and supports delivery of search results/web links to the client device based upon a search string, browser activity information, and possibly the favorite lists and related metadata.
    Type: Grant
    Filed: March 31, 2009
    Date of Patent: July 24, 2012
    Assignee: Enpulz, LLC
    Inventor: James D. Bennett
  • Patent number: 8214342
    Abstract: A method to identify a supplier of goods or services over the Internet by providing a home page with at least one link to a directory Web site for a class of goods or services. The directory Web site includes a directory Web site domain that at least partially describes a class of goods or services. The directory Web site also contains at least one supplier link to a corresponding supplier Web site and a rollover window. The home page and the directory Web site are configured to allow a user to access the home page; select a directory Web site based at least in part on the directory Web site domain name; activate the link to the selected directory Web site; and select and activate the supplier link for a supplier of goods or services.
    Type: Grant
    Filed: August 23, 2001
    Date of Patent: July 3, 2012
    Inventor: Michael Meiresonne
  • Patent number: 8209320
    Abstract: A computer-implemented system and method for keyword extraction are disclosed. The system in an example embodiment includes a keyword extraction component to extract relevant keywords from content of a web page, to identify items relevant to the extracted keywords, and to rank the relevant items.
    Type: Grant
    Filed: December 27, 2006
    Date of Patent: June 26, 2012
    Assignee: eBay Inc.
    Inventors: Alec Reitter, Barb Chang, Ken Sun, Raghav Gupta, Alvaro Bolivar, Alan Lewis
  • Patent number: 8200650
    Abstract: From when any one of several DMSs 2 is selected, until contents are downloaded from that DMS 2, a DMP 1 stores the search information which has been specified for that DMS 2. And if some other DMS 2 is selected before contents have been downloaded from the DMS 2 which was first selected, then the DMP 1 specifies this search information which is stored to that other DMS 2. Accordingly, it is possible to greatly enhance ease of use when it is not known on which DMS 2 the desired contents are stored.
    Type: Grant
    Filed: June 9, 2009
    Date of Patent: June 12, 2012
    Assignee: Funai Electric Co., Ltd.
    Inventor: Junya Senoo
  • Publication number: 20120143845
    Abstract: The present invention outlines a genuine entity following system that also addresses data source limitation. When reviewing entity-related objects in web content, a web user designates one or more entities to follow in real time. More particularly, the present invention is directed through strategic deployment of a dynamic crawler upon selection of a “follow” pointer over an object in a web browser such that a web user can automatically designate entities to be followed and receive alerts at predetermined temporal intervals when new information regarding such designated entities becomes available. A web entity engine of the present invention is designed to discover trending entities at any given time while generating output activity (i.e., signal) streams for this entity.
    Type: Application
    Filed: December 1, 2010
    Publication date: June 7, 2012
    Applicant: Microsoft Corporation
    Inventors: Zhaowei Jiang, Xavier Legros, Ronald H. Jones, Jr., Ryan Panchadsaram
  • Patent number: 8195638
    Abstract: Computer implemented methods and systems are provided for web log filtering. A uniform resource locator (URL) is identified for a resource requested by an identified device. The URL is stored unless the URL has a reference to an advertisement or an extension that matches any of a list of extensions specified for storage exclusion. The stored URL is categorized based on either the stored URL or an included domain name, depending on whether the included domain name matches any of the list of domain names that are associated with multiple categories. A count is incremented in a web log category associated with the identified device based on the categorized stored URL.
    Type: Grant
    Filed: April 5, 2011
    Date of Patent: June 5, 2012
    Assignee: Sprint Communications Company L.P.
    Inventors: James D. Barnes, Dan O'Connor, Dora Potluri
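The filtering and categorization flow in the abstract for patent 8195638 might look something like the sketch below. It is an illustration under stated assumptions, not the patented implementation: every lookup table, the ad markers, the `.example` domains, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse
import posixpath

# All tables below are hypothetical illustration data.
EXCLUDED_EXTENSIONS = {".gif", ".jpg", ".png", ".css", ".js"}
AD_MARKERS = ("/ads/", "doubleclick")
MULTI_CATEGORY_DOMAINS = {"portal.example"}
DOMAIN_CATEGORIES = {"news.example": "news", "shop.example": "shopping"}
URL_PREFIX_CATEGORIES = {
    "portal.example/sports": "sports",
    "portal.example/finance": "finance",
}


def should_store(url):
    """Reject URLs with an excluded extension or an advertisement reference."""
    ext = posixpath.splitext(urlparse(url).path)[1].lower()
    if ext in EXCLUDED_EXTENSIONS:
        return False
    return not any(marker in url for marker in AD_MARKERS)


def categorize(url):
    """Use the full URL for multi-category domains, else the domain alone."""
    parsed = urlparse(url)
    domain = parsed.hostname or ""
    if domain in MULTI_CATEGORY_DOMAINS:
        key = domain + parsed.path
        for prefix, category in URL_PREFIX_CATEGORIES.items():
            if key.startswith(prefix):
                return category
        return "uncategorized"
    return DOMAIN_CATEGORIES.get(domain, "uncategorized")


def update_web_log(web_log, device_id, url):
    """Increment the per-device category count; return True if the URL was stored."""
    if not should_store(url):
        return False
    web_log[device_id][categorize(url)] += 1
    return True


# Per-device category counts, keyed by a device identifier.
web_log = defaultdict(Counter)
update_web_log(web_log, "device-1", "http://portal.example/sports/scores")
```

Keeping the counts in a `Counter` per device means a category that has never been seen simply reads as zero, which avoids a separate initialization step.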
  • Patent number: 8195630
    Abstract: What is provided is a spatially-enabled content management system in which unstructured information is data mined for location or spatial references, with the search query including not only the spatial reference that has been provided by the data mining but also other search query terms, thus to provide an analyst with rapid geo-searching for unstructured information management.
    Type: Grant
    Filed: October 29, 2007
    Date of Patent: June 5, 2012
    Assignee: BAE Systems Information Solutions Inc.
    Inventors: John R. Ellis, Michael T. Hornbeek, Mark Meadows
  • Patent number: 8185513
    Abstract: A method, apparatus, article of manufacture for generating a media program database having a plurality of media programs is disclosed. In one embodiment, the method comprises the steps of receiving first media program metadata from a first source, searching the Internet to find second media program metadata from a second source distinct from the first source, determining if the first media program metadata and the second media program metadata are associated with the same media program, merging the first media program metadata and the second media program metadata if the first media program metadata and the second media program metadata are associated with the same media program, and storing the merged first media program metadata and second media program metadata in the media program database.
    Type: Grant
    Filed: December 31, 2008
    Date of Patent: May 22, 2012
    Assignee: Hulu LLC
    Inventors: Zhibing Wang, Yizhe Tang, Qian Chang, Ting-hao Yang
  • Patent number: 8185515
    Abstract: Information regarding the structure of information in a content database is maintained in a structure database. The structure database is used to correlate the data structure of a query to the structure of the content database, in order to determine that information in the content database which needs to be provided to a searcher in response to the query. In one embodiment, this search method is used in an online forum, and the forum maintains a reputation score for users with respect to given subject matter. The reputation score is dependent upon the quality of a user's participation in the forum. A user's reputation score depends upon the evaluation by others of information he posts and upon the user evaluating information posted by others.
    Type: Grant
    Filed: December 1, 2008
    Date of Patent: May 22, 2012
    Assignee: Transparensee Systems, Inc.
    Inventor: Steven David Lavine
  • Publication number: 20120117053
    Abstract: Techniques for identifying knowledge use a graphical user interface for inputting one or more terms to be explored for additional knowledge. Then a search is conducted across one or more sources of information to identify resources containing information about or information associated with said terms. The resources are decomposed into elemental units of information and stored in data structures called nodes. A group of nodes are stored in a node pool and, from the node pool, correlations of nodes are constructed that represent knowledge.
    Type: Application
    Filed: December 14, 2011
    Publication date: May 10, 2012
    Applicant: MAKE SENCE, INC.
    Inventors: Mark Bobick, Carl Wimmer
  • Patent number: 8171012
    Abstract: A document management apparatus that searches at least one document group saved in advance for a document group having attributes that correspond to a search condition. The apparatus includes an updating unit configured to update the attributes of the document group in accordance with an operation performed by a user on a document in the document group, and a search unit configured to search for a document group having attributes that correspond to user information inputted from the exterior.
    Type: Grant
    Filed: February 9, 2009
    Date of Patent: May 1, 2012
    Assignee: Canon Kabushiki Kaisha
    Inventor: Masaya Soga
  • Publication number: 20120095985
    Abstract: A system, media, and method for selecting future queries are provided. The selected future queries are used to transmit appropriate online advertising to a user that issues queries to a search engine. The search engine is coupled to a prediction component that predicts what subject the user is going to be interested in and when the user will be interested in the subject. The prediction component returns a future query using statistical language models representing a query history of the user and aggregate query histories for a community of users.
    Type: Application
    Filed: December 22, 2011
    Publication date: April 19, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Dou Shen, Ying Li
  • Publication number: 20120089591
    Abstract: A new network communications tool that allows third parties (companies, individuals, and others) to recognize if a certain user is looking for products or services that they might offer. Providing that the user expressly authorizes it, this new communication system will send an automated communication to various companies or individuals indicating that the user is looking for products or services offered by them and that the user is willing to receive direct information from them.
    Type: Application
    Filed: October 6, 2011
    Publication date: April 12, 2012
    Inventor: Abraham Stern
  • Patent number: 8156104
    Abstract: Systems and methods for managing data, such as metadata. In one exemplary method, metadata from files created by several different software applications are captured, and the captured metadata is searched. The type of information in metadata for one type of file differs from the type of information in metadata for another type of file. Other methods are described and data processing systems and machine readable media are also described.
    Type: Grant
    Filed: March 26, 2009
    Date of Patent: April 10, 2012
    Assignee: Apple Inc.
    Inventors: Yan Arrouye, Dominic Giampaolo, Bas Ording, Gregory Christie, Stephen Olivier Lemay, Marcel van Os, Imran Chaudhri, Kevin Tiene, Pavel Cisler
  • Patent number: 8156096
    Abstract: A supplier identification and locator system that allows a user to identify a supplier of goods or services over the Internet; the system includes at least one directory Web site having a domain name that is at least partially descriptive of a class of goods or services. The directory Web site has a plurality of links that access suppliers' Web sites; a supplier descriptive portion located substantially adjacent to the link; a descriptive title portion substantially corresponding to the class of goods or services described in the domain name; a rollover window that displays information about at least one supplier; and an input receiving area where a user inputs data and ranked search results are displayed.
    Type: Grant
    Filed: September 23, 2011
    Date of Patent: April 10, 2012
    Inventor: Michael Meiresonne
  • Patent number: 8156103
    Abstract: A computer-related and/or business type method is presented for embedding one or more media hotspots within a digital media file and, in response to interaction from a separate target entity, such as via an associating request, associating one or more resultant actions with the media hotspot(s). In exchange for associating the one or more resultant actions with the media hotspot(s), an interactive media service entity being affiliated with a web site displaying the digital media file and/or a user being affiliated with the digital media file itself is compensated based upon at least one compensation plan.
    Type: Grant
    Filed: February 8, 2011
    Date of Patent: April 10, 2012
    Assignee: Clayco Research Limited Liability Company
    Inventor: Leigh Rothschild
  • Patent number: 8156125
    Abstract: A data handling method combines search capabilities with analytical functionality. The invention provides advantages when dealing with structured documents (such as electronic catalogs, XML documents, text documents, HTML documents, Internet documents, etc.) and other data stored in a computer system. Various embodiments include simplified ways to express search/analysis requests of a data set and also to express results to such requests.
    Type: Grant
    Filed: February 19, 2008
    Date of Patent: April 10, 2012
    Assignee: Oracle International Corporation
    Inventors: Thomas M. Annau, Joseph Sill
  • Patent number: 8150831
    Abstract: Systems and methods facilitate a search and identify documents and associated metadata reflecting content of the documents. In one implementation, a method receives a query comprising a set of search terms, identifies a stored document in response to the query, and determines a score value for the retrieved document based on a similarity between one or more of the query search terms and metadata associated with the identified document. The method locates the identified document in a citation network of baseline query results, the citation network comprising a first set of documents that cite to the identified document and a second set of documents cited to by the identified document. The method further determines a new score value of the identified document as a function of the score value and a quantity and a quality of documents within the first and second set of documents.
    Type: Grant
    Filed: April 15, 2009
    Date of Patent: April 3, 2012
    Assignee: LexisNexis
    Inventors: Ling Qin Zhang, Harry R. Silver
  • Patent number: 8145618
    Abstract: A system and method for scoring documents is described. One or more documents are identified responsive to a search criteria. A text match score indicating a quality of match of the identified documents is determined. A category match score is determined over categories. A document-categories score is determined indicating a quality of match between an identified document and a plurality of categories. A search criteria-categories score is determined indicating a quality of match between the search criteria and the categories. An overall score is determined based on the text match score and the category match score.
    Type: Grant
    Filed: October 11, 2010
    Date of Patent: March 27, 2012
    Assignee: Google Inc.
    Inventors: Karl Pfleger, Brian Larson
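One way to read the scoring in the abstract for patent 8145618 is sketched below. The abstract does not say how the scores are combined, so the dot product over shared categories and the linear weighting are assumptions chosen for illustration, as are the function and parameter names.

```python
def category_match_score(doc_categories, query_categories):
    """Combine document-categories and search criteria-categories scores.

    Assumed combination: a dot product over the categories the two share.
    """
    return sum(
        score * query_categories.get(category, 0.0)
        for category, score in doc_categories.items()
    )


def overall_score(text_match, doc_categories, query_categories, weight=0.5):
    """Blend the text match score with the category match score.

    `weight` is a hypothetical tuning parameter in [0, 1].
    """
    category_match = category_match_score(doc_categories, query_categories)
    return (1.0 - weight) * text_match + weight * category_match


# Example: a document split evenly between two categories, and a
# query that belongs entirely to one of them.
doc = {"sports": 0.5, "news": 0.5}
query = {"sports": 1.0}
score = overall_score(1.0, doc, query)
```

With `weight=0.0` the ranking falls back to pure text matching, which makes it easy to compare results with and without the category signal.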
  • Patent number: 8145619
    Abstract: A method for identifying companies with specific business objectives that includes using existing sources of company firmographic data to identify a broad set of companies and associated websites, crawling the websites associated with the identified companies and indexing web site content for each of the identified companies with the specific business objective to realize indexed web content. The method further includes joining the company firmographic data with the indexed web content using a business objective common identifier to generate a store of joined structured firmographic data and indexed web content and presenting a display image representation of the store of joined structured firmographic data and indexed web content for user review. The display image further receives user input to score each of said companies identified therein, and using a search interface, querying the store of scored, joined structured firmographic data and indexed web content.
    Type: Grant
    Filed: February 11, 2008
    Date of Patent: March 27, 2012
    Assignee: International Business Machines Corporation
    Inventors: Timothy R. Bowden, Upendra Chitnis, Ildar K. Khabibrakhmanov, Richard D. Lawrence, Yan Liu, Prem Melville
  • Publication number: 20120072409
    Abstract: A method and system is provided that in a fully automated manner crawls web sites and identifies specific types of web pages, then extracts targeted data from those web pages. One or more text nodes containing product-related information on a first web page are first identified, and the locations of those text nodes are described using one or more vectors. The vectors are then analyzed to identify one or more patterns and to generate a model from those patterns that discriminates between text nodes that contain product-related information and text nodes that do not contain product-related information on a second web page. The model can then be used to crawl web sites to identify and extract targeted data, or the model can be installed on a user's computer to identify and extract targeted information from web sites as the user is browsing.
    Type: Application
    Filed: March 18, 2011
    Publication date: March 22, 2012
    Inventors: Bradley John Perry, Nancy Ann Perry, Daniel Carl Marriott
  • Publication number: 20120066201
    Abstract: The described embodiments relate generally to methods and systems for generating a search on an electronic device. The method includes receiving a multimedia object, determining metadata corresponding to the multimedia object and generating a search query corresponding to the metadata. The system includes an input interface configured to select a multimedia object, a metadata determining module configured to determine metadata associated with the multimedia object and a query generating module configured to generate a search query corresponding to the metadata. The electronic device may include a portable device where the search query is generated upon detection of the movement of the portable device.
    Type: Application
    Filed: September 15, 2010
    Publication date: March 15, 2012
    Applicant: RESEARCH IN MOTION LIMITED
    Inventors: Michael Williams Suman, Bhavuk Kaul
  • Publication number: 20120066202
    Abstract: An information search method extends to a social networking site to search information by a contact name and other elements such as location-based information, relationship, product, etc. The method includes the steps of receiving a search term specified by a searcher; detecting whether WHO is specified in the search term to determine whether the search should extend to human contacts in social networking services; identifying WHAT from the search term and determining a nature of WHAT from the search term; sending the search term to a social networking service to retrieve relevant information; and presenting the retrieved information arranged in a predetermined order on a display or by audible sound and optionally a geographic information to retrieve information.
    Type: Application
    Filed: July 26, 2011
    Publication date: March 15, 2012
    Inventors: Mari Hatazawa, Mike Iao, Andrew De Silva
  • Publication number: 20120059816
    Abstract: A search engine receives user-submitted queries, determines web pages that are relevant to those queries, and returns relevance-ranked lists of references to the relevant web pages. Additionally, the search engine adds each query's terms to a query log. An automated process asynchronously examines the log and locates questions therein. For each question so located, the process determines whether that question already is contained in a database of questions maintained by an online question-and-answer system that is separate from the search engine. For each such question that is not already contained in the stored database of questions, the process automatically adds that question to the question database. As a result, the set of questions used by the online question-and-answer system grows even in the absence of any further direct question submissions by users of the system.
    Type: Application
    Filed: September 7, 2010
    Publication date: March 8, 2012
    Inventors: Priyesh Narayanan, Ashvin Agrawal
  • Patent number: 8126876
    Abstract: A system ranks results. The system may receive a list of links. The system may identify a source with which each of the links is associated and rank the list of links based at least in part on a quality of the identified sources.
    Type: Grant
    Filed: July 10, 2009
    Date of Patent: February 28, 2012
    Assignee: Google Inc.
    Inventors: Michael Curtiss, Krishna Bharat, Michael Schmitt
  • Publication number: 20120047123
    Abstract: The present invention is directed to a method and computer system for representing a dataset comprising N documents by computing a diffusion geometry of the dataset comprising at least a plurality of diffusion coordinates. The present method and system store a number of diffusion coordinates, wherein the number is linear in proportion to N.
    Type: Application
    Filed: November 3, 2011
    Publication date: February 23, 2012
    Inventors: Ronald R. Coifman, Andreas C. Coppi, Frank Geshwind, Stephane S. Lafon, Ann B. Lee, Mauro M. Maggioni, Frederick J. Warner, Steven Zucker, William G. Fateley
  • Patent number: 8122005
    Abstract: A training set generator may be configured to input a taxonomy including a hierarchy of categories and a plurality of top-level sites, and to output a training set of categorized data. The training set generator may include a crawler configured to crawl each of the top-level sites to determine at least one lower-level site associated therewith and to store the top-level sites and associated lower-level sites as crawl data. The training set generator also may include an extractor configured to determine, for each of the top-level sites, a corresponding site-specific extraction template associating at least one portion of the corresponding top-level site with at least one category of the hierarchy of categories, and further configured to apply each site-specific extraction template to corresponding crawl data to thereby associate the crawl data with the categories of the hierarchical categories and obtain categorized data of the training set.
    Type: Grant
    Filed: October 22, 2009
    Date of Patent: February 21, 2012
    Assignee: Google Inc.
    Inventors: Philo Juang, Christopher Testa, Nicolaus Mote
  • Patent number: 8112404
    Abstract: Search results are provided for mobile computing devices. Search results are retrieved based on a search term. Each of the search results is assigned to one or more categories. The categories and the assigned search results are provided to the mobile computing device. The mobile computing device is adapted to display each of the categories and a partial list of the search results for each of the categories.
    Type: Grant
    Filed: May 8, 2008
    Date of Patent: February 7, 2012
    Assignee: Microsoft Corporation
    Inventors: Tuan Huynh, Hiromi Kobayashi, Takeshi Tanaka, Hirokazu Sawada, Tsutomu Kagoshima
  • Patent number: 8112412
    Abstract: Attempts by a user to download executable files with unacceptable reputations are detected, and recommendations for similar files with good reputations are made to the user. More specifically, a user's web browsing is tracked, and terms describing software applications are extracted from browsed pages. When a user attempts to download an executable file, a corresponding notification including recently extracted terms is transmitted to a categorization component, which receives such information from many users. The categorization component stores the received information in a database. This maintained database identifies files that are available for download, as well as corresponding extracted terms and reputational scores. If a user initiates a download of an executable file with an unacceptable score, the categorization component identifies executable files in the database with related extracted terms, but with acceptable reputations, to recommend to the user as alternatives.
    Type: Grant
    Filed: June 30, 2008
    Date of Patent: February 7, 2012
    Assignee: Symantec Corporation
    Inventor: Carey Nachenberg
  • Patent number: 8108482
    Abstract: A data relaying apparatus disposed on the preceding stage of a registry server centrally managing meta-information extracts meta-information from a content retrieval result transmitted from the registry server to a client terminal and retains and correlates the meta-information with URI information included in the meta-information. On the other hand, a data relaying apparatus disposed on the preceding stage of a repository server retaining contents receives a content acquisition request transmitted from the client terminal to the repository server to extract URI information from the content acquisition request and transmits the URI information to the data relaying apparatus to acquire meta-information. The meta-information is added to contents transmitted to the client terminal before the contents are relayed.
    Type: Grant
    Filed: October 10, 2008
    Date of Patent: January 31, 2012
    Assignee: Fujitsu Limited
    Inventors: Naoki Matsuoka, Tomohiro Ishihara
  • Patent number: 8108405
    Abstract: In one embodiment, a search space of a corpus is searched to yield results. The corpus comprises documents associated with keywords, where each document is associated with at least one keyword indicating at least one theme of the document. One or more keywords are determined to be irrelevant keywords. The search space is refined according to the irrelevant keywords.
    Type: Grant
    Filed: October 1, 2008
    Date of Patent: January 31, 2012
    Assignee: Fujitsu Limited
    Inventors: David L. Marvit, Jawahar Jain, Stergios Stergiou
  • Patent number: 8103646
    Abstract: An automated mechanism for tagging media files, such as podcasts, blog entries, and videos, with meaningful taxonomy tags. The mechanism provides active (or automated) assistance in assigning appropriate tags to a particular piece of content (or media). Included is a system for automatic tagging of audio streams on the Internet, whether from audio files, or from the audio tracks of audio/video files, using the folksonomy of the Internet. The audio streams may be provided by the media author. For example, the author can make a recording to be posted on a website, and use the system to automatically suggest (via prompted author interaction) folksonomically appropriate tags for the media recording. Alternatively, the system can be used in an automated fashion to develop and assign tags without any intervention by the author.
    Type: Grant
    Filed: March 13, 2007
    Date of Patent: January 24, 2012
    Assignee: Microsoft Corporation
    Inventor: Robert I. Brown
  • Publication number: 20120016862
    Abstract: In one embodiment, a method may include accessing a particular page of a Web application that includes a form having at least one field for entry of data by a user of the Web application, the Web page rendered by the Web application based on code for the Web page. The method may also include analyzing the code. The method may further include generating one or more sets of inputs for the at least one field based on the analysis. The method may additionally include automatically entering, into the at least one field, the one or more sets of inputs. The method may also include automatically submitting the form, including the one or more sets of inputs entered into the at least one field.
    Type: Application
    Filed: July 14, 2010
    Publication date: January 19, 2012
    Inventor: Sreeranga P. Rajan
  • Publication number: 20120016863
    Abstract: Methods for enriching metadata associated with a document that is categorized in a document category are described. Documents are pre-categorized within a document category. Uniform resource locators (URLs) that are related to a document category are identified and linked to the document category. Indications of tokens and relationships between the tokens and the URLs are received. The tokens are linked to the URLs. The tokens are propagated to the document categories and to the documents therein based on linking between the token, URL, and document category. As such, the document category, and documents therein, are provided with metadata that is descriptive thereof. The documents and their associated metadata tokens are useable to generate a searchable index of the documents. The linking between the tokens, URLs, and categories is also useable to identify tokens that are too specific, too general, or documents that are miscategorized.
    Type: Application
    Filed: July 16, 2010
    Publication date: January 19, 2012
    Applicant: MICROSOFT CORPORATION
    Inventors: Daniel Bernhardt, Ian Douglas Hegerty, Tomasz Andrzej Marciniak
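
    Illustrative sketch (not taken from the filing): the token propagation described in the abstract of publication 20120016863 can be pictured as two lookups chained together, from category to linked URLs and from URL to tokens, with the resulting token set then copied to each document in the category. The function name, the dictionary-based data layout, and the example URL below are assumptions made for illustration only; the application itself does not specify a data structure.

```python
from collections import defaultdict


def propagate_tokens(category_urls, url_tokens, category_docs):
    """Hypothetical helper: push tokens from URLs to categories to documents.

    category_urls: dict mapping category name -> list of linked URLs
    url_tokens:    dict mapping URL -> list of tokens linked to that URL
    category_docs: dict mapping category name -> list of document ids
    """
    # Step 1: each category inherits the tokens of every URL linked to it.
    category_tokens = defaultdict(set)
    for category, urls in category_urls.items():
        for url in urls:
            category_tokens[category] |= set(url_tokens.get(url, ()))

    # Step 2: each document inherits the tokens of its category.
    doc_tokens = {}
    for category, docs in category_docs.items():
        for doc in docs:
            doc_tokens[doc] = set(category_tokens.get(category, set()))

    return dict(category_tokens), doc_tokens


# Example with made-up data.
cat_tokens, doc_tokens = propagate_tokens(
    {"cameras": ["http://example.com/dslr"]},
    {"http://example.com/dslr": ["lens", "megapixel"]},
    {"cameras": ["doc1", "doc2"]},
)
```

    Under these assumptions, the per-document token sets could feed a search index, and a token that ends up attached to nearly every category (or to only one document) would be a candidate for the "too general" or "too specific" checks the abstract mentions.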