Category Specific Web Crawling Patents (Class 707/710)
-
Patent number: 8250092
Abstract: Methods, apparatus, and systems directed to receiving search queries, retrieving documents, computing the number of categories to present for a given query, computing the number of results to show in each category, computing an ordering of categories, and for all the result pages beyond the first page employing user interface elements that optionally allow the user to quickly zoom in on a specific category and get more results belonging to that category.
Type: Grant
Filed: December 19, 2011
Date of Patent: August 21, 2012
Assignee: Microsoft Corporation
Inventors: Sreenivas Gollapudi, Rakesh Agrawal, Samuel Ieong
-
Publication number: 20120209828
Abstract: The present invention includes: acquiring plural web pages of an identical category into which targets stated in the web pages are classified (S1); acquiring an attribute-related term related to an attribute of the targets stated in the web pages or an attribute description pattern used to describe the attribute of the targets as initial data (S2); extracting the attribute-related term of the attribute matching the attribute description pattern from the plural web pages (S3); and extracting an attribute description pattern matching the attribute-related term from plural web pages (S4).
Type: Application
Filed: February 28, 2011
Publication date: August 16, 2012
Applicant: RAKUTEN, INC.
Inventors: Takamasa Takenaka, Satoshi Sekine
-
Publication number: 20120209826
Abstract: An approach is provided for providing location based information according to a predetermined format. A location information manager associates location information with web content. The location information manager also causes, at least in part, publication of the web content and the associated location information according to a predetermined format, wherein the predetermined format facilitates, at least in part, discovery of the location information.
Type: Application
Filed: February 10, 2011
Publication date: August 16, 2012
Applicant: Nokia Corporation
Inventor: Petros Belimpasakis
-
Publication number: 20120209827
Abstract: A method, apparatus, article of manufacture for generating a media program database having a plurality of media programs is disclosed. In one embodiment, the method comprises the steps of receiving first media program metadata from a first source, searching the Internet to find second media program metadata from a second source distinct from the first source, determining if the first media program metadata and the second media program metadata are associated with the same media program, merging the first media program metadata and the second media program metadata if the first media program metadata and the second media program metadata are associated with the same media program, and storing the merged first media program metadata and second media program metadata in the media program database.
Type: Application
Filed: February 29, 2012
Publication date: August 16, 2012
Applicant: HULU LLC
Inventors: Zhibing Wang, Yizhe Tang, Qian Chang, Ting-hao Yang
-
Publication number: 20120209986
Abstract: A method and system are disclosed for monitoring user interactions and generating proactive responses thereto within a social media environment. Social media interactions are monitored, collected, and processed to determine whether they contain content outside of a threshold. If so, they are processed to determine the content causing the content to be outside of the threshold. Once the issues have been determined, proactive actions are performed to counteract the effect of the content.
Type: Application
Filed: February 15, 2011
Publication date: August 16, 2012
Inventors: Shesha Shah, Rajiv Narang
-
Patent number: 8244710
Abstract: Retrieving information from information sources using links. A set of information sources is preprocessed to extract content from text and existing links in the information sources according to some predetermined criteria. A set of search results is generated from amongst the preprocessed information sources in response to a received search query.
Type: Grant
Filed: August 6, 2007
Date of Patent: August 14, 2012
Assignee: Oracle International Corporation
Inventors: Ajay Kumar Singh, Madhu Syamala
-
Publication number: 20120203759
Abstract: The present invention relates to a method for writing a newly recognized image. The method includes the steps of: (a) comparing pre-stored image in the image database with a queried image; (b) storing the queried image onto a database for unrecognized images if there is no image similar to the queried image; (c) grouping the images in the database for unrecognized images based on degrees of similarity thereamong; and (d) comparing, if a specific image and its tag information are inputted, the specific image with some images included in a specific set of images among the organized sets of the images, determining whether there is any image in the specific set of images which has a degree of similarity exceeding the pre-set value and allowing images determined to have degrees of similarity exceeding the pre-set value with the tag information to be automatically written onto the image database.
Type: Application
Filed: November 17, 2011
Publication date: August 9, 2012
Applicant: OLAWORKS, INC.
Inventors: Tae Hoon Kim, Min Je Park, Song Ki Choi
-
Publication number: 20120203760
Abstract: Techniques for obtaining geographically-relevant product inventory information, in real-time, from heterogeneous data sources are described. Product inventory information, including the volume of available products in specific geographical locations, is obtained from at least three different sources. First, one or more data feeds may be received. Second, a data obtaining module uses one or more APIs to obtain product inventory information from one or more third-party inventory management systems. Finally, a structured data mining module uses a web crawler, at the direction of a crawler configuration, to systematically obtain product inventory information from various third-party websites. Accordingly, a user's search query is processed to provide geographically relevant product inventory information in near real time.
Type: Application
Filed: February 6, 2012
Publication date: August 9, 2012
Applicant: eBay Inc.
Inventors: Jack Phillip Abraham, Aaron Adelson, Matthew Barto, Theodore James Dziuba, John Evans, Neville Newey, Justin Van Winkle
-
Patent number: 8239361
Abstract: Disclosed is a method and system for user-centered information search. The user-centered information search may include generating an object as a classification unit of an information search structure and a property of the object, generating a class and determining a property of the class using the object; and detecting a search result corresponding to an information request from a user using at least one of the object, property, and class.
Type: Grant
Filed: June 16, 2008
Date of Patent: August 7, 2012
Assignee: NHN Corporation
Inventors: Seok Ho Kang, Dohwan Kang
-
Patent number: 8239367
Abstract: A system receives a search query from a user and searches a repository of documents based on the search query to obtain search results. The system provides the search results to the user and automatically bookmarks one or more of the search results without the user explicitly requesting that the one or more search results be bookmarked.
Type: Grant
Filed: January 31, 2007
Date of Patent: August 7, 2012
Assignee: Google Inc.
Inventors: Oren Zamir, Jeffrey Korn
-
Patent number: 8239479
Abstract: Systems and methods for synchronizing data between endpoints using elements of centralized and decentralized synchronization systems and communication topologies are disclosed. Such systems and methods may in some cases synchronize some subset of data with a centralized endpoint while another subset of data is synchronized in a decentralized fashion directly with other endpoints. Such systems and methods may include a variety of cooperative functionality to assist in the synchronization of data between endpoints.
Type: Grant
Filed: June 22, 2007
Date of Patent: August 7, 2012
Assignee: Microsoft Corporation
Inventors: Akash J. Sagar, George P. Moromisato, Richard Yiu-Sai Chung, Raymond E. Ozzie, Jack E. Ozzie, David Richard Reed, Michael Steven Vernal, Vladimir Dmitri Fedorov, Muthukaruppan Annamalai
-
Patent number: 8239383
Abstract: A method, system and article of manufacture for query execution management and, more particularly, for managing execution of queries against database samples. One embodiment provides a computer-implemented method for managing execution of a query against a database having a multiplicity of data records. The method comprises receiving, from a requesting entity, a query against the database, and performing an automated execution process, comprising: (i) iteratively executing the query against samples of the database, each sample including a subset of the multiplicity of data records, (ii) after each iterative execution of the query, determining whether a query result obtained for the iterative execution satisfies a predefined condition, and (iii) if the predefined condition is not satisfied, performing a predefined action.
Type: Grant
Filed: June 15, 2006
Date of Patent: August 7, 2012
Assignee: International Business Machines Corporation
Inventor: John M. Santosuosso
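The iterative execution process in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `run_on_samples`, the use of random sampling, the growing sample sizes, and the choice to stop as soon as the condition holds (the abstract only says what happens when the condition is not satisfied).

```python
import random


def run_on_samples(records, query, is_satisfied, sample_sizes, on_unsatisfied, seed=0):
    """Run `query` on successive samples until `is_satisfied` accepts a result.

    All parameter names are hypothetical; the abstract does not name them.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    result = None
    for size in sample_sizes:
        # Each sample is a subset of the full set of data records.
        sample = rng.sample(records, min(size, len(records)))
        result = query(sample)
        if is_satisfied(result):
            return result  # assumption: stop once the condition holds
        on_unsatisfied(size)  # the "predefined action" for a failed iteration
    return result


if __name__ == "__main__":
    records = list(range(100))
    attempts = []
    evens = run_on_samples(
        records,
        query=lambda sample: [r for r in sample if r % 2 == 0],
        is_satisfied=lambda res: len(res) >= 10,
        sample_sizes=[5, 20, 100],
        on_unsatisfied=attempts.append,
    )
    print(len(evens), attempts)
```

Here the predefined action merely records the failed sample size; a real system might instead enlarge the sample, notify the requester, or abort.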
-
Publication number: 20120197862
Abstract: Method and apparatus for creating an electronic database of disambiguated entity mentions and relations from a corpus of electronic documents. The invention automatically extracts from the corpus of electronic documents mentions about entities (e.g., references to people, organizations or places), parses the entity mentions into “mention objects,” and executes a series of grouping, comparison and hierarchical fuzzy object clustering algorithms to cluster together in an electronic database all of the mention objects referring to the same entity and all of the mention objects (e.g. “people”) associated with each other by a relationship (e.g., “co-authors” or “family members”).
Type: Application
Filed: August 9, 2011
Publication date: August 2, 2012
Applicant: COMSORT, INC.
Inventors: Michael A. Woytowitz, Marshall Wells Hawks
-
Publication number: 20120197863
Abstract: In an example, disclosed is a machine automated method of identifying a set of skills. In some examples, the method includes extracting a plurality of skill seed phrases from a plurality of member profiles of a social networking site, creating a plurality of disambiguated skill seed phrases by disambiguating the plurality of skill seed phrases using one or more computer processors, and de-duplicating the plurality of disambiguated skill seed phrases to create a plurality of de-duplicated skill seed phrases.
Type: Application
Filed: January 24, 2012
Publication date: August 2, 2012
Applicant: Linkedln Corporation
Inventors: Peter N. Skomoroch, Matthew T. Hayes, Abhishek Gupta, Dhanurjay A.S. Patil
-
Patent number: 8234584
Abstract: Provided is a computer system including an information providing server and a computer which is coupled to the information providing server, and which collects information, the computer being configured to: record status histories including a history of an operation to a screen which shows a status of the computer, and which is displayed on the computer in chronological order to constitute a set of the status histories; and divide, in a case where a history of an operation of switching the screen is detected from the set of the status histories, based on the history of the operation of switching the screen, the set of the status histories. Accordingly, when a user collects information, navigation information is provided by taking the fact that the user has actually reached useful information into consideration.
Type: Grant
Filed: February 18, 2009
Date of Patent: July 31, 2012
Assignee: Hitachi, Ltd.
Inventors: Masahiro Motobayashi, Toshio Okochi, Michiko Sakai, Maki Hayashi, Akio Azuma
-
Patent number: 8234706
Abstract: A method for enabling access to software security data is provided. The method includes accessing data associated with software vulnerabilities from a plurality of on-line sources. The method further includes aggregating the data from the plurality of on-line sources and identifying attributes associated with the data. The method also includes enabling access to the aggregated data through a graphical user interface that can be used to analyze the data according to the attributes.
Type: Grant
Filed: June 20, 2007
Date of Patent: July 31, 2012
Assignee: Microsoft Corporation
Inventors: Dongmei Zhang, Yingnong Dang, Xiaohui Hou, Song Huang, Jian Wang
-
Patent number: 8229911
Abstract: An Internet infrastructure that supports searching of web links selects search results by processing browser activity information along with one or more of favorite lists, and related metadata, user profiles, and trends based on browser activity behavior and favorite behavior. The Internet infrastructure consists of a plurality of web browsers located on client devices. The web browsers are incorporated with a browser activity-monitoring module that tracks user's Internet usage, processes this information, and sends this information periodically or upon user request to the server to aid in improving search operation results. The search engine server is communicatively coupled to the plurality of web browsers and supports delivery of search results/web links to the client device based upon a search string, browser activity information, and possibly the favorite lists and related metadata.
Type: Grant
Filed: March 31, 2009
Date of Patent: July 24, 2012
Assignee: Enpulz, LLC
Inventor: James D. Bennett
-
Patent number: 8214342
Abstract: A method to identify a supplier of goods or services over the Internet by providing a home page with at least one link to a directory Web site for a class of goods or services. The directory Web site includes a directory Web site domain that at least partially describes a class of goods or services. The directory Web site also contains at least one supplier link to a corresponding supplier Web site and a rollover window. The home page and the directory Web site are configured to allow a user to access the home page; select a directory Web site based at least in part on the directory Web site domain name; activate the link to the selected directory Web site; and select and activate the supplier link for a supplier of goods or services.
Type: Grant
Filed: August 23, 2001
Date of Patent: July 3, 2012
Inventor: Michael Meiresonne
-
Patent number: 8209320
Abstract: A computer-implemented system and method for keyword extraction are disclosed. The system in an example embodiment includes a keyword extraction component to extract relevant keywords from content of a web page, to identify items relevant to the extracted keywords, and to rank the relevant items.
Type: Grant
Filed: December 27, 2006
Date of Patent: June 26, 2012
Assignee: eBay Inc.
Inventors: Alec Reitter, Barb Chang, Ken Sun, Raghav Gupta, Alvaro Bolivar, Alan Lewis
-
Patent number: 8200650
Abstract: From when any one of several DMSs 2 is selected, until contents are downloaded from that DMS 2, a DMP 1 stores the search information which has been specified for that DMS 2. And if some other DMS 2 is selected before contents have been downloaded from the DMS 2 which was first selected, then the DMP 1 specifies this search information which is stored to that other DMS 2. Accordingly, it is possible greatly to enhance the ease of use when it is not known upon which DMS 2 the desired contents is stored.
Type: Grant
Filed: June 9, 2009
Date of Patent: June 12, 2012
Assignee: Funai Electric Co., Ltd.
Inventor: Junya Senoo
-
Publication number: 20120143845
Abstract: The present invention outlines a genuine entity following system that also addresses data source limitation. When reviewing entity-related objects in web content, a web user designates one or more entities to follow in real time. More particularly, the present invention is directed through strategic deployment of a dynamic crawler upon selection of a “follow” pointer over an object in a web browser such that a web user can automatically designate entities to be followed and receive alerts at predetermined temporal intervals when new information regarding such designated entities becomes available. A web entity engine of the present invention is designed to discover trending entities at any given time while generating output activity (i.e., signal) streams for this entity.
Type: Application
Filed: December 1, 2010
Publication date: June 7, 2012
Applicant: Microsoft Corporation
Inventors: Zhaowei Jiang, Xavier Legros, Ronald H. Jones, JR., Ryan Panchadsaram
-
Patent number: 8195638
Abstract: Computer implemented methods and systems are provided for web log filtering. A uniform resource locator (URL) is identified for a resource requested by an identified device. The URL is stored unless the URL has a reference to an advertisement or an extension that matches any of a list of extensions specified for storage exclusion. The stored URL is categorized based on either the stored URL or an included domain name, depending on whether the included domain name matches any of the list of domain names that are associated with multiple categories. A count is incremented in a web log category associated with the identified device based on the categorized stored URL.
Type: Grant
Filed: April 5, 2011
Date of Patent: June 5, 2012
Assignee: Sprint Communications Company L.P.
Inventors: James D. Barnes, Dan O'Connor, Dora Potluri
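The filtering and categorization steps in this abstract might be sketched roughly as follows. The exclusion list, ad markers, domain tables, and all function names are hypothetical placeholders chosen for illustration; the abstract does not specify any of them.

```python
from collections import Counter
from urllib.parse import urlparse

# Illustrative assumptions only; none of these values come from the patent.
EXCLUDED_EXTENSIONS = (".gif", ".css", ".js")
AD_MARKERS = ("/ads/", "doubleclick")
MULTI_CATEGORY_DOMAINS = {"portal.example"}
DOMAIN_CATEGORIES = {"news.example": "news", "shop.example": "shopping"}
URL_PREFIX_CATEGORIES = {
    "portal.example/sports": "sports",
    "portal.example/finance": "finance",
}


def should_store(url):
    """Reject URLs that reference an advertisement or end in an excluded extension."""
    if any(marker in url for marker in AD_MARKERS):
        return False
    return not urlparse(url).path.lower().endswith(EXCLUDED_EXTENSIONS)


def categorize(url):
    """Use the full URL for multi-category domains, otherwise the domain name alone."""
    parsed = urlparse(url)
    domain = parsed.netloc.lower()
    if domain in MULTI_CATEGORY_DOMAINS:
        full = domain + parsed.path.lower()
        for prefix, category in URL_PREFIX_CATEGORIES.items():
            if full.startswith(prefix):
                return category
        return "uncategorized"
    return DOMAIN_CATEGORIES.get(domain, "uncategorized")


def update_web_log(log, device_id, url):
    """Increment the per-device category count for a URL that passes the filter."""
    if should_store(url):
        log.setdefault(device_id, Counter())[categorize(url)] += 1


if __name__ == "__main__":
    log = {}
    for u in ("http://news.example/story.html",
              "http://news.example/logo.gif",
              "http://portal.example/sports/scores",
              "http://shop.example/ads/banner"):
        update_web_log(log, "device-1", u)
    print(dict(log["device-1"]))
```

The image and the ad URL are dropped before categorization, so only two counts are incremented for the device.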
-
Patent number: 8195630
Abstract: What is provided is a spatially-enabled content management system in which unstructured information is data mined for location or spatial references, with the search query including not only the spatial reference that has been provided by the data mining but also other search query terms, thus to provide an analyst with rapid geo-searching for unstructured information management.
Type: Grant
Filed: October 29, 2007
Date of Patent: June 5, 2012
Assignee: BAE Systems Information Solutions Inc.
Inventors: John R. Ellis, Michael T. Hornbeek, Mark Meadows
-
Patent number: 8185513
Abstract: A method, apparatus, article of manufacture for generating a media program database having a plurality of media programs is disclosed. In one embodiment, the method comprises the steps of receiving first media program metadata from a first source, searching the Internet to find second media program metadata from a second source distinct from the first source, determining if the first media program metadata and the second media program metadata are associated with the same media program, merging the first media program metadata and the second media program metadata if the first media program metadata and the second media program metadata are associated with the same media program, and storing the merged first media program metadata and second media program metadata in the media program database.
Type: Grant
Filed: December 31, 2008
Date of Patent: May 22, 2012
Assignee: Hulu LLC
Inventors: Zhibing Wang, Yizhe Tang, Qian Chang, Ting-hao Yang
-
Patent number: 8185515
Abstract: Information regarding the structure of information in a content database is maintained in a structure database. The structure database is used to correlate the data structure of a query to the structure of the content database, in order to determine that information in the content database which needs to be provided to a searcher in response to the query. In one embodiment, this search method is used in an online forum, and the forum maintains a reputation score for users with respect to given subject matter. The reputation score is dependent upon the quality of a user's participation in the forum. A user's reputation score depends upon the evaluation by others of information he posts and upon the user evaluating information posted by others.
Type: Grant
Filed: December 1, 2008
Date of Patent: May 22, 2012
Assignee: Transparensee Systems, Inc.
Inventor: Steven David Lavine
-
Publication number: 20120117053
Abstract: Techniques for identifying knowledge use a graphical user interface for inputting one or more terms to be explored for additional knowledge. Then a search is conducted across one or more sources of information to identify resources containing information about or information associated with said terms. The resources are decomposed into elemental units of information and stored in data structures called nodes. A group of nodes are stored in a node pool and, from the node pool, correlations of nodes are constructed that represent knowledge.
Type: Application
Filed: December 14, 2011
Publication date: May 10, 2012
Applicant: MAKE SENCE, INC.
Inventors: Mark Bobick, Carl Wimmer
-
Patent number: 8171012
Abstract: A document management apparatus that searches at least one document group saved in advance for a document group having attributes that correspond to a search condition. The apparatus includes an updating unit configured to update the attributes of the document group in accordance with an operation performed by a user on a document in the document group, and a search unit configured to search for a document group having attributes that correspond to user information inputted from the exterior.
Type: Grant
Filed: February 9, 2009
Date of Patent: May 1, 2012
Assignee: Canon Kabushiki Kaisha
Inventor: Masaya Soga
-
Publication number: 20120095985
Abstract: A system, media, and method for selecting future queries are provided. The selected future queries are used to transmit appropriate online advertising to a user that issues queries to a search engine. The search engine is coupled to a prediction component that predicts what subject the user is going to be interested in and when the user will be interested in the subject. The prediction component returns a future query using statistical language models representing a query history of the user and aggregate query histories for a community of users.
Type: Application
Filed: December 22, 2011
Publication date: April 19, 2012
Applicant: MICROSOFT CORPORATION
Inventors: DOU SHEN, Ying Li
-
Publication number: 20120089591
Abstract: A new network communications tool that allows third parties (companies, individuals, and others) to recognize if a certain user is looking for products or services that they might offer. Providing that the user expressly authorizes it, this new communication system will send an automated communication to various companies or individuals indicating that the user is looking for products or services offered by them and that the user is willing to receive direct information from them.
Type: Application
Filed: October 6, 2011
Publication date: April 12, 2012
Inventor: Abraham Stern
-
Patent number: 8156104
Abstract: Systems and methods for managing data, such as metadata. In one exemplary method, metadata from files created by several different software applications are captured, and the captured metadata is searched. The type of information in metadata for one type of file differs from the type of information in metadata for another type of file. Other methods are described and data processing systems and machine readable media are also described.
Type: Grant
Filed: March 26, 2009
Date of Patent: April 10, 2012
Assignee: Apple Inc.
Inventors: Yan Arrouye, Dominic Giampaolo, Bas Ording, Gregory Christie, Stephen Olivier Lemay, Marcel van Os, Imran Chaudhri, Kevin Tiene, Pavel Cisler
-
Patent number: 8156096
Abstract: A supplier identification and locator system that allows a user to identify a supplier of goods or services over the Internet; the system includes at least one directory Web site having a domain name that is at least partially descriptive of a class of goods or services. The directory Web site has a plurality of links that access suppliers' Web sites; a supplier descriptive portion located substantially adjacent to the link; a descriptive title portion substantially corresponding to the class of goods or services described in the domain name; a rollover window that displays information about at least one supplier; and an input receiving area where a user inputs data and ranked search results are displayed.
Type: Grant
Filed: September 23, 2011
Date of Patent: April 10, 2012
Inventor: Michael Meiresonne
-
Patent number: 8156103
Abstract: A computer-related and/or business type method is presented for embedding one or more media hotspots within a digital media file and, in response to interaction from a separate target entity, such as via an associating request, associating one or more resultant actions with the media hotspot(s). In exchange for associating the one or more resultant actions with the media hotspot(s), an interactive media service entity being affiliated with a web site displaying the digital media file and/or a user being affiliated with the digital media file itself is compensated based upon at least one compensation plan.
Type: Grant
Filed: February 8, 2011
Date of Patent: April 10, 2012
Assignee: Clayco Research Limited Liability Company
Inventor: Leigh Rothschild
-
Patent number: 8156125
Abstract: A data handling method combines search capabilities with analytical functionality. The invention provides advantages when dealing with structured documents (such as electronic catalogs, XML documents, text documents, HTML documents, Internet documents, etc.) and other data stored in a computer system. Various embodiments include simplified ways to express search/analysis requests of a data set and also to express results to such requests.
Type: Grant
Filed: February 19, 2008
Date of Patent: April 10, 2012
Assignee: Oracle International Corporation
Inventors: Thomas M. Annau, Joseph Sill
-
Patent number: 8150831
Abstract: Systems and methods facilitate a search and identify documents and associated metadata reflecting content of the documents. In one implementation, a method receives a query comprising a set of search terms, identifies a stored document in response to the query, and determines a score value for the retrieved document based on a similarity between one or more of the query search terms and metadata associated with the identified document. The method locates the identified document in a citation network of baseline query results, the citation network comprising a first set of documents that cite to the identified document and a second set of documents cited to by the identified document. The method further determines a new score value of the identified document as a function of the score value and a quantity and a quality of documents within the first and second set of documents.
Type: Grant
Filed: April 15, 2009
Date of Patent: April 3, 2012
Assignee: LexisNexis
Inventors: Ling Qin Zhang, Harry R. Silver
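One simple way to picture the rescoring step is a weighted sum over the two citation sets, shown below as a hedged sketch. The linear form, the weights, and the name `rescore` are assumptions; the abstract only states that the new score is a function of the original score and of the quantity and quality of documents in both sets.

```python
def rescore(base_score, citing_scores, cited_scores,
            citing_weight=0.1, cited_weight=0.05):
    """Combine a base score with citation-network evidence.

    Summing per-document quality scores reflects both how many documents
    are in each set (quantity) and how good they are (quality).
    The weights are arbitrary illustrative values.
    """
    return (base_score
            + citing_weight * sum(citing_scores)
            + cited_weight * sum(cited_scores))


if __name__ == "__main__":
    # Two documents cite the result; the result cites one document.
    print(rescore(1.0, citing_scores=[0.5, 0.5], cited_scores=[1.0]))
```

Giving inbound citations a larger weight than outbound ones is a design choice of this sketch, not something the abstract prescribes.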
-
Patent number: 8145618
Abstract: A system and method for scoring documents is described. One or more documents are identified responsive to a search criteria. A text match score indicating a quality of match of the identified documents is determined. A category match score is determined over categories. A document-categories score is determined indicating a quality of match between an identified document and a plurality of categories. A search criteria-categories score is determined indicating a quality of match between the search criteria and the categories. An overall score is determined based on the text match score and the category match score.
Type: Grant
Filed: October 11, 2010
Date of Patent: March 27, 2012
Assignee: Google Inc.
Inventors: Karl Pfleger, Brian Larson
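As a rough illustration of how the scores in this abstract could fit together, the sketch below treats the document-categories and search criteria-categories scores as per-category weights, takes their dot product as the category match score, and blends it with the text match score. The dot product, the linear blend, and the `category_weight` parameter are all assumptions, not details from the patent.

```python
def category_match_score(doc_categories, query_categories):
    """Dot product of per-category weights for the document and the query."""
    return sum(weight * query_categories.get(category, 0.0)
               for category, weight in doc_categories.items())


def overall_score(text_match, doc_categories, query_categories, category_weight=0.5):
    """Blend the text match score with the category match score (assumed linear)."""
    category_match = category_match_score(doc_categories, query_categories)
    return (1.0 - category_weight) * text_match + category_weight * category_match


if __name__ == "__main__":
    doc = {"sports": 0.5, "news": 0.5}   # document-categories scores
    query = {"sports": 1.0}              # search criteria-categories scores
    print(overall_score(0.8, doc, query))
```

A document that matches the query text equally well but sits in unrelated categories would receive a lower overall score under this blend.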
-
Patent number: 8145619
Abstract: A method for identifying companies with specific business objectives that includes using existing sources of company firmographic data to identify a broad set of companies and associated websites, crawling the websites associated with the identified companies and indexing web site content for each of the identified companies with the specific business objective to realize indexed web content. The method further includes joining the company firmographic data with the indexed web content using a business objective common identifier to generate a store of joined structured firmographic data and indexed web content and presenting a display image representation of the store of joined structured firmographic data and indexed web content for user review. The display image further receives user input to score each of said companies identified therein, and using a search interface, querying the store of scored, joined structured firmographic data and indexed web content.
Type: Grant
Filed: February 11, 2008
Date of Patent: March 27, 2012
Assignee: International Business Machines Corporation
Inventors: Timothy R. Bowden, Upendra Chitnis, Ildar K. Khabibrakhmanov, Richard D. Lawrence, Yan Liu, Prem Melville
-
Publication number: 20120072409
Abstract: A method and system is provided that in a fully automated manner crawls web sites and identifies specific types of web pages, then extracts targeted data from those web pages. One or more text nodes containing product-related information on a first web page are first identified, and the locations of those text nodes are described using one or more vectors. The vectors are then analyzed to identify one or more patterns and to generate a model from those patterns that discriminates between text nodes that contain product-related information and text nodes that do not contain product-related information on a second web page. The model can then be used to crawl web sites to identify and extract targeted data, or the model can be installed on a user's computer to identify and extract targeted information from web sites as the user is browsing.
Type: Application
Filed: March 18, 2011
Publication date: March 22, 2012
Inventors: Bradley John Perry, Nancy Ann Perry, Daniel Carl Marriott
-
Publication number: 20120066201
Abstract: The described embodiments relate generally to methods and systems for generating a search on an electronic device. The method includes receiving a multimedia object, determining metadata corresponding to the multimedia object and generating a search query corresponding to the metadata. The system includes an input interface configured to select a multimedia object, a metadata determining module configured to determine metadata associated with the multimedia object and a query generating module configured to generate a search query corresponding to the metadata. The electronic device may include a portable device where the search query is generated upon detection of the movement of the portable device.
Type: Application
Filed: September 15, 2010
Publication date: March 15, 2012
Applicant: RESEARCH IN MOTION LIMITED
Inventors: Michael Williams Suman, Bhavuk Kaul
-
Method and apparatus for enhancing search results by extending search to contacts of social networks
Publication number: 20120066202
Abstract: An information search method extends to a social networking site to search information by a contact name and other elements as location-based information, relationship, product, etc. The method includes the steps of receiving a search term specified by a searcher, detecting whether WHO is specified in the search term to determine whether search should extend to human contacts in social networking services, identifying WHAT from the search term and determining a nature of WHAT from the search term; sending the search term to a social networking service to retrieve relevant information; and presenting the retrieved information arranged in a predetermined order on a display or by audible sound and optionally a geographic information to retrieve information.
Type: Application
Filed: July 26, 2011
Publication date: March 15, 2012
Inventors: Mari Hatazawa, Mike Iao, Andrew De Silva
-
Publication number: 20120059816
Abstract: A search engine receives user-submitted queries, determines web pages that are relevant to those queries, and returns relevance-ranked lists of references to the relevant web pages. Additionally, the search engine adds each query's terms to a query log. An automated process asynchronously examines the log and locates questions therein. For each question so located, the process determines whether that question already is contained in a database of questions maintained by an online question-and-answer system that is separate from the search engine. For each such question that is not already contained in the stored database of questions, the process automatically adds that question to the question database. As a result, the set of questions used by the online question-and-answer system grows even in the absence of any further direct question submissions by users of the system.
Type: Application
Filed: September 7, 2010
Publication date: March 8, 2012
Inventors: Priyesh Narayanan, Ashvin Agrawal
-
Patent number: 8126876Abstract: A system ranks results. The system may receive a list of links. The system may identify a source with which each of the links is associated and rank the list of links based at least in part on a quality of the identified sources.Type: GrantFiled: July 10, 2009Date of Patent: February 28, 2012Assignee: Google Inc.Inventors: Michael Curtiss, Krishna Bharat, Michael Schmitt
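A minimal Python sketch of ranking links by source quality follows. The abstract does not say how a source is identified or how quality is measured, so this sketch assumes the source is the link's host name and that quality is a precomputed score in a lookup table; `SOURCE_QUALITY`, `source_of`, and `rank_links` are hypothetical names, and the hosts and scores are invented.

```python
from urllib.parse import urlparse

# Hypothetical per-source quality scores; the values are illustrative only.
SOURCE_QUALITY = {
    "example-news.com": 0.9,
    "example-blog.net": 0.4,
}


def source_of(link: str) -> str:
    """Identify the source of a link; assumed here to be its host name."""
    return urlparse(link).netloc.lower()


def rank_links(links, quality=SOURCE_QUALITY, default=0.0):
    """Order links by the quality of their source, best first.

    Links from unknown sources receive the default score. sorted() is
    stable, so links with equal scores keep their input order.
    """
    return sorted(
        links,
        key=lambda link: quality.get(source_of(link), default),
        reverse=True,
    )
```

A real system would likely combine source quality with other signals, which is consistent with the abstract's wording "based at least in part".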
-
Publication number: 20120047123Abstract: The present invention is directed to a method and computer system for representing a dataset comprising N documents by computing a diffusion geometry of the dataset comprising at least a plurality of diffusion coordinates. The present method and system stores a number of diffusion coordinates, wherein the number is linear in proportion to N.Type: ApplicationFiled: November 3, 2011Publication date: February 23, 2012Inventors: RONALD R. COIFMAN, Andreas C. COPPI, Frank GESHWIND, Stephane S. LAFON, Ann B. LEE, Mauro M. MAGGIONI, Frederick J. WARNER, Steven ZUCKER, William G. FATELEY
-
Patent number: 8122005Abstract: A training set generator may be configured to input a taxonomy including a hierarchy of categories and a plurality of top-level sites, and to output a training set of categorized data. The training set generator may include a crawler configured to crawl each of the top-level sites to determine at least one lower-level site associated therewith and to store the top-level sites and associated lower-level sites as crawl data. The training set generator also may include an extractor configured to determine, for each of the top-level sites, a corresponding site-specific extraction template associating at least one portion of the corresponding top-level site with at least one category of the hierarchy of categories, and further configured to apply each site-specific extraction template to corresponding crawl data to thereby associate the crawl data with the categories of the hierarchical categories and obtain categorized data of the training set.Type: GrantFiled: October 22, 2009Date of Patent: February 21, 2012Assignee: Google Inc.Inventors: Philo Juang, Christopher Testa, Nicolaus Mote
-
Patent number: 8112404Abstract: Search results are provided for mobile computing devices. Search results are retrieved based on a search term. Each of the search results is assigned to one or more categories. The categories and the assigned search results are provided to the mobile computing device. The mobile computing device is adapted to display each of the categories and a partial list of the search results for each of the categories.Type: GrantFiled: May 8, 2008Date of Patent: February 7, 2012Assignee: Microsoft CorporationInventors: Tuan Huynh, Hiromi Kobayashi, Takeshi Tanaka, Hirokazu Sawada, Tsutomu Kagoshima
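The grouping described above, categories with a partial list of results each, might look like the following Python sketch. The input shape (pairs of a title and its categories), the `per_category` cutoff, and the function name `group_by_category` are assumptions for illustration and are not taken from the patent.

```python
from collections import defaultdict


def group_by_category(results, per_category=3):
    """Group search results by category and keep a short preview of each.

    results      -- iterable of (title, categories) pairs; one result may
                    belong to several categories
    per_category -- how many results to show per category (assumed cutoff)
    Returns {category: {"preview": [titles...], "total": count}}.
    """
    grouped = defaultdict(list)
    for title, categories in results:
        for category in categories:
            grouped[category].append(title)
    return {
        category: {"preview": titles[:per_category], "total": len(titles)}
        for category, titles in grouped.items()
    }
```

Returning the total alongside the preview lets a small-screen client show how many results remain in a category without receiving all of them.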
-
Patent number: 8112412Abstract: Attempts by a user to download executable files with unacceptable reputations are detected, and recommendations for similar files with good reputations are made to the user. More specifically, a user's web browsing is tracked, and terms describing software applications are extracted from browsed pages. When a user attempts to download an executable file, a corresponding notification including recently extracted terms is transmitted to a categorization component, which receives such information from many users. The categorization component stores the received information in a database. This maintained database identifies files that are available for download, as well as corresponding extracted terms and reputational scores. If a user initiates a download of an executable file with an unacceptable score, the categorization component identifies executable files in the database with related extracted terms, but with acceptable reputations, to recommend to the user as alternatives.Type: GrantFiled: June 30, 2008Date of Patent: February 7, 2012Assignee: Symantec CorporationInventor: Carey Nachenberg
-
Patent number: 8108482Abstract: A data relaying apparatus disposed on the preceding stage of a registry server centrally managing meta-information extracts meta-information from a content retrieval result transmitted from the registry server to a client terminal and retains and correlates the meta-information with URI information included in the meta-information. On the other hand, a data relaying apparatus disposed on the preceding stage of a repository server retaining contents receives a content acquisition request transmitted from the client terminal to the repository server to extract URI information from the content acquisition request and transmits the URI information to the data relaying apparatus to acquire meta-information. The meta-information is added to contents transmitted to the client terminal before the contents are relayed.Type: GrantFiled: October 10, 2008Date of Patent: January 31, 2012Assignee: Fujitsu LimitedInventors: Naoki Matsuoka, Tomohiro Ishihara
-
Patent number: 8108405Abstract: In one embodiment, a search space of a corpus is searched to yield results. The corpus comprises documents associated with keywords, where each document is associated with at least one keyword indicating at least one theme of the document. One or more keywords are determined to be irrelevant keywords. The search space is refined according to the irrelevant keywords.Type: GrantFiled: October 1, 2008Date of Patent: January 31, 2012Assignee: Fujitsu LimitedInventors: David L. Marvit, Jawahar Jain, Stergios Stergiou
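One possible reading of the refinement step is sketched below in Python. The abstract does not state how the search space is refined, so this sketch assumes that any document associated with an irrelevant keyword is dropped from the results; the corpus layout (a mapping from document ID to a set of keywords) and the names `search` and `refine` are hypothetical.

```python
def search(corpus, query_keyword):
    """Return the IDs of documents associated with query_keyword.

    corpus -- dict mapping document ID to the set of its keywords
    """
    return {doc for doc, keywords in corpus.items() if query_keyword in keywords}


def refine(corpus, results, irrelevant):
    """Drop results associated with any keyword marked irrelevant.

    Assumption: refinement means exclusion. Other schemes, such as
    down-weighting, would also fit the abstract's wording.
    """
    irrelevant = set(irrelevant)
    return {doc for doc in results if not (corpus[doc] & irrelevant)}
```

For example, a search for "jaguar" could be narrowed to the car sense by marking "animal" as irrelevant.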
-
Patent number: 8103646Abstract: An automated mechanism of automatically tagging media files such as podcasts, blog entries, and videos, for example, with meaningful taxonomy tags. The mechanism provides active (or automated) assistance in assigning appropriate tags to a particular piece of content (or media). Included is a system for automatic tagging of audio streams on the Internet, whether from audio files, or from the audio tracks of audio/video files, using the folksonomy of the Internet. The audio streams may be provided by the media author. For example, the author can make a recording to be posted on a website, and use the system to automatically suggest (via prompted author interaction) folksonomically appropriate tags for the media recording. Alternatively, the system can be used in an automated fashion to develop and assign tags without any intervention by the author.Type: GrantFiled: March 13, 2007Date of Patent: January 24, 2012Assignee: Microsoft CorporationInventor: Robert I. Brown
-
Publication number: 20120016862Abstract: In one embodiment, a method may include accessing a particular page of a Web application that includes a form having at least one field for entry of data by a user of the Web application, the Web page rendered by the Web application based on code for the Web page. The method may also include analyzing the code. The method may further include generating one or more sets of inputs for the at least one field based on the analysis. The method may additionally include automatically entering, into the at least one field, the one or more sets of inputs. The method may also include automatically submitting the form, including the one or more sets of inputs into the at least one field.Type: ApplicationFiled: July 14, 2010Publication date: January 19, 2012Inventor: Sreeranga P. Rajan
-
Publication number: 20120016863Abstract: Methods for enriching metadata associated with a document that is categorized in a document category are described. Documents are pre-categorized within a document category. Uniform resource locators (URLs) that are related to a document category are identified and linked to the document category. Indications of tokens and relationships between the tokens and the URLs are received. The tokens are linked to the URLs. The tokens are propagated to the document categories and to the documents therein based on linking between the token, URL, and document category. As such the document category, and documents therein, are provided with metadata that is descriptive thereof. The documents and their associated metadata tokens are useable to generate a searchable index of the documents. The linking between the tokens, URLs, and categories is also useable to identify tokens that are too specific, too general, or documents that are miscategorized.Type: ApplicationFiled: July 16, 2010Publication date: January 19, 2012Applicant: MICROSOFT CORPORATIONInventors: DANIEL BERNHARDT, IAN DOUGLAS HEGERTY, TOMASZ ANDRZEJ MARCINIAK
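The token propagation described above (token to URL, URL to category, category to documents) can be sketched in Python as follows. The three mappings and the function name `propagate_tokens` are assumptions for illustration; the application does not prescribe these data structures.

```python
def propagate_tokens(category_urls, url_tokens, category_docs):
    """Propagate tokens from URLs to categories and on to their documents.

    category_urls -- dict: category -> iterable of URLs linked to it
    url_tokens    -- dict: URL -> iterable of tokens linked to it
    category_docs -- dict: category -> iterable of document IDs in it
    Returns dict: document ID -> set of tokens inherited via its categories.
    """
    doc_tokens = {}
    for category, urls in category_urls.items():
        # Collect every token linked to any URL of this category.
        tokens = set()
        for url in urls:
            tokens.update(url_tokens.get(url, ()))
        # Every document in the category inherits the category's tokens.
        for doc in category_docs.get(category, ()):
            doc_tokens.setdefault(doc, set()).update(tokens)
    return doc_tokens
```

The resulting mapping from documents to tokens is the kind of metadata that could feed a searchable index, as the abstract notes.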