Patents Assigned to COLLECTIVE INTELLECT, INC. - Justia Patents Search

IDENTIFYING SOURCES OF MEDIA CONTENT HAVING A HIGH LIKELIHOOD OF PRODUCING ON-TOPIC CONTENT

Publication number: 20080114755

Abstract: Methods and systems are provided for identifying on-topic sources of media content. According to one embodiment, candidate seed sites are identified from which current seeds are selected for deep crawling. The current seeds are identified by correlating relevancy scores or keyword search results from multiple search engines and selecting the current seeds based on on-topic scores of the candidate seeds. Periodically, a topic net associated with the topic area of interest is executed to locate relevant sources of media content by (i) building a graph in which nodes represent pages and edges represent links among pages by performing an iterative 360 crawl starting from the seeds; (ii) assigning initial node graph scores; (iii) computing final node graph scores by performing link analysis; (iv) computing site graph scores by aggregating and averaging corresponding node graph scores; and (v) configuring sites with the highest site graph scores to be scraped.
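Steps (ii) through (v) of the abstract can be illustrated with a minimal Python sketch. The abstract does not say which link analysis is used, so this sketch assumes a personalized PageRank-style iteration seeded with the initial node scores; it should not be read as the method claimed in the application. The function names (link_analysis, site_scores, top_sites), the damping factor, the iteration count, and the example URLs are all hypothetical, and the crawl of step (i) is replaced by a hand-built link map.

```python
from collections import defaultdict
from urllib.parse import urlparse


def link_analysis(links, initial, damping=0.85, iterations=50):
    """Assumed link analysis: PageRank-style iteration biased by initial scores.

    links   -- dict mapping a page URL to the list of page URLs it links to
    initial -- dict mapping a page URL to its initial node graph score
    """
    nodes = set(initial)
    for src, targets in links.items():
        nodes.add(src)
        nodes.update(targets)

    # Normalise the initial scores into a prior distribution over pages.
    total = sum(initial.get(n, 0.0) for n in nodes) or 1.0
    prior = {n: initial.get(n, 0.0) / total for n in nodes}

    scores = dict(prior)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * prior[n] for n in nodes}
        for src in nodes:
            targets = links.get(src, [])
            if targets:
                share = damping * scores[src] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Pages without outgoing links hand their score back to the prior.
                for n in nodes:
                    nxt[n] += damping * scores[src] * prior[n]
        scores = nxt
    return scores


def site_scores(node_scores):
    """Average the final node scores of all pages that share a host name."""
    by_site = defaultdict(list)
    for url, score in node_scores.items():
        by_site[urlparse(url).netloc].append(score)
    return {site: sum(vals) / len(vals) for site, vals in by_site.items()}


def top_sites(site_score_map, k):
    """Return the k sites with the highest site graph scores."""
    return sorted(site_score_map, key=site_score_map.get, reverse=True)[:k]


# Hypothetical page graph and initial on-topic scores, for illustration only.
example_links = {
    "http://a.example/1": ["http://a.example/2", "http://b.example/1"],
    "http://a.example/2": ["http://a.example/1"],
    "http://b.example/1": ["http://a.example/1"],
    "http://c.example/1": ["http://a.example/1"],
}
example_initial = {
    "http://a.example/1": 1.0,
    "http://a.example/2": 0.5,
    "http://b.example/1": 0.2,
    "http://c.example/1": 0.0,
}

final_scores = link_analysis(example_links, example_initial)
ranked = top_sites(site_scores(final_scores), k=2)
print(ranked)
```

Averaging per host, rather than summing, keeps a site with many weakly relevant pages from outranking a smaller site whose pages are consistently on topic, which matches the "aggregating and averaging" wording of step (iv). The sites returned by top_sites would then be the ones configured for scraping in step (v).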

Type: Application

Filed: November 12, 2007

Publication date: May 15, 2008

Applicant: COLLECTIVE INTELLECT, INC.

Inventors: Timothy J. Wolters, Mehrshad Setayesh