Method and apparatus for query-specific bookmarking and data collection
A computer-implemented method and system providing both query-specific bookmarking and query-specific data collection. These features allow users to search more efficiently on the WWW by allowing users to explicitly maintain their search context. In addition, users can also collect query-specific relevance and usage data. User marked search results are saved as leads, which are displayed on a user interface such as a SearchPad. One embodiment of the invention involves extending HTML to include a Query attribute that saves the query context and returns it with a search result page. Another embodiment involves the use of cookies to save the query context. Saved query-specific bookmarks and query-specific data can be periodically saved to a central site, although not all embodiments perform this task.
This application is a continuation application and claims priority from U.S. patent application Ser. No. 09/444,524, filed on Nov. 22, 1999, which is hereby incorporated by reference herein.
BACKGROUND OF THE INVENTION
The present invention relates generally to software programs and, more specifically, to search engines that search large numbers of documents.
The World Wide Web (WWW) has grown phenomenally in recent years. At the beginning of the web's history, there were hundreds or thousands of web pages in existence. At the present time, there are millions of web pages, and the number is increasing daily. The rapid increase in the number of web pages has increased the difficulty of finding information on the web. Even though the information that a person wants may be available on the web, it is sometimes difficult to locate the page or site that contains the information. If a person performs many searches, it can be difficult to organize the search results and remember how the search results were obtained.
A user searching on the WWW may search on many unrelated topics. More than one browser can be used by a user over time. Users tend to search over many sessions and may terminate and restart their browser between sessions. For each topic searched by a user, the user will probably search on many queries. Users use many search services and may look at more than one search result page. When a user finds a useful result, he is often unsure whether the information found is the best available or whether he should search further. Finding information on the WWW is currently difficult for users because they encounter a large amount of information and have no easy way to keep track of it.
Prior approaches to bookmarking (for example, Netscape Navigator's “Bookmarks” facility and Microsoft Internet Explorer's “Favorites” facility) allow users to save useful hyperlinks in a “bookmarks” list. Users are allowed to group a set of links under a topic, by creating a folder, giving the folder a name, and placing links within the folder. This system is well suited for links that need to be remembered for a long time (i.e., for links corresponding to the user's long-term memory of web pages).
For links that are of temporary interest, such as tentative leads found on a search engine result page (i.e., corresponding to the user's short-term memory of web pages), it takes too much effort to create a folder and give it a name. Hence, users usually do not bookmark tentative information.
Prior approaches to query-specific data collection have required communication with the logging site (usually the search service) each time a result page was visited. Certain conventional approaches extend the web browser to show users statistics about the pages they visit and, in the process, log the pages they visit. Such approaches are not query-specific. Other approaches redirect accesses to result pages through a logging site (usually the search site itself). This approach logs the result pages viewed by the user for each query. However, it also causes a delay in accessing the result page, and increases the network traffic both for the user and the logging site, without providing any additional value to the user. In addition, the described conventional approaches fail to record which result links the user actually found to be relevant to the query.
What is needed is a way to easily keep track of tentative search results and to remember which queries were used to obtain the results.
SUMMARY OF THE INVENTION
The described embodiments of the present invention provide both query-specific bookmarking and query-specific data collection. These features allow users to search more efficiently on the WWW by allowing users to explicitly maintain their search context. A user's search context includes queries recently deployed by the user, along with some or all of the hyperlinks the user looked at and/or liked in the context of each query.
In addition, users can also collect query-specific relevance and usage data. Specifically, the described embodiment can log information including but not limited to: queries that were issued; result pages viewed for each query; result hyperlinks considered relevant for each query; the order in which result pages were viewed; and whether a result hyperlink considered relevant was actually viewed by the user. This type of information can be used, for example, to statistically compare two ranking algorithms or two search services. It can also be used to compute the relevance of pages to queries, which in turn can be used to improve the ranking of search services.
The described embodiments of the present invention arise in the context of WWW search services. They apply both to general-purpose search engines, which facilitate searches over the entire Web, and to specialized search services, which permit searches over private databases. Any service that returns a list of URLs or hypertext addresses in response to a search query can benefit from this invention.
In accordance with the present invention, as described and presented herein, there is provided a computer-implemented method of query-specific bookmarking in a network, comprising: maintaining, on a client-side computer, lead information about a previously performed search, the lead information including the query used in the search and at least some of the resulting links returned by the search; displaying the query used in the search and the resulting links; receiving a user-selection of a displayed resulting link; and displaying the document corresponding to the selected query-specific link.
In further accordance with the present invention, as described and presented herein, there is provided a computer-implemented method of query-specific bookmarking in a network, comprising: maintaining, on a client-side computer, lead information about a previously performed search, the lead information including the query used in the search and at least some of the resulting links returned by the search; displaying the query used in the search and the resulting links; receiving a user-selection of a displayed query; and re-submitting the selected query to a search engine.
In further accordance with the present invention, as described and presented herein, there is provided a computer-implemented method for query bookmarking on a client machine, comprising: receiving a result, including a plurality of links, for a query from a search engine; allowing the user to mark one of the plurality of links; and saving the marked link and the query as a query-specific lead.
In further accordance with the present invention, as described and presented herein, there is provided a computer-implemented method of bookmarking a query, comprising: receiving a result for a query, the result including a plurality of links, each link having an associated executable; allowing the user to mark one of the plurality of links; executing the executable associated with the marked link to store the query and the marked link as a query-specific lead in a cookie accessible by a browser.
In further accordance with the present invention, as described and presented herein, there is provided a computer-implemented method of displaying a bookmarked query, comprising: retrieving a cookie maintained on a client-side computer, the cookie including lead information about a previously performed search, the lead information including the query used in the search and at least some of the resulting links returned by the search; displaying the retrieved query and the retrieved resulting links; receiving a selection of a displayed resulting link; and causing the document corresponding to the selected link to be displayed.
The invention includes comparable apparatus and computer readable media containing instructions executable by a data processor.
BRIEF DESCRIPTION OF THE DRAWINGS
FIGS. 2(a) and 2(b) show respective embodiments of SearchPad interfaces in accordance with the present invention.
FIGS. 4(a) and 4(b) show respective embodiments of systems displaying query-specific leads in accordance with the present invention.
FIGS. 7(a) and 7(b) show respective embodiments of systems performing lead marking in accordance with the present invention.
FIGS. 11(a)-11(d) are flow charts showing respective examples of how a user can mark leads.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
I. General Discussion
The described embodiments of the present invention allow a user to specify and use query-specific bookmarks and information. The described embodiment of the present invention aids the user in his search process by explicitly maintaining the user's search context, including queries recently deployed by the user, along with the hyperlinks the user looked at and/or liked in the context of each query. The following sections will discuss, first, use of such information and, second, creation and specification of such information.
The described embodiments of the present invention allow users to mark queries and promising results (known as leads) associated with queries. The logged information is preferably shown in a convenient manner in a separate window, allowing users to view marked pages and to re-issue marked and recent queries. The data collected may include: recent queries; results the user considered relevant in association with his query; result pages the user looked at and time spent; and the order in which events occurred. This data preferably is periodically transmitted to a server, thus achieving the goal of query-specific data collection.
The described embodiments of the present invention provide an alternative to conventional bookmarks for remembering promising links found during searching. Unlike bookmarks, in the case of leads, the system automatically derives a meaningful category name from the query and places the leads within it. This allows promising leads found in response to a query, even when they come from various search services, to be filed together, labeled by the query they were found with (or a generalization thereof). Furthermore, unlike conventional bookmarks, the described query-specific bookmarks allow for query reuse. Since queries are remembered, they can also be issued again to the same search service or to other search services, in order to continue with the search. Lastly, at least one embodiment of the present invention can be implemented without extending the user's web browser or the HTML specification. This allows implementation by search services and third parties.
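The filing of leads under a category derived from the query can be illustrated with a minimal sketch. The class name `LeadStore`, the function `normalize_query`, and the normalization rule (collapse whitespace, lowercase) are assumptions made for illustration; the description does not specify how the category name is derived.

```python
from collections import OrderedDict


def normalize_query(query: str) -> str:
    """Derive a category name from a query.

    Assumed rule for this sketch: collapse whitespace and lowercase.
    """
    return " ".join(query.split()).lower()


class LeadStore:
    """Hypothetical client-side store that files leads under their query."""

    def __init__(self):
        # category name -> list of (title, url, search service)
        self._leads = OrderedDict()

    def mark(self, query, title, url, service):
        """File a lead under the category derived from its query."""
        category = normalize_query(query)
        leads = self._leads.setdefault(category, [])
        # Leads from different services share one category; skip repeat URLs.
        if all(existing_url != url for _, existing_url, _ in leads):
            leads.append((title, url, service))

    def leads_for(self, query):
        """Return the leads filed under the category for this query."""
        return list(self._leads.get(normalize_query(query), []))

    def queries(self):
        """Return the remembered categories, which can be re-issued as queries."""
        return list(self._leads)
```

Because the category is keyed by the query rather than by the search service, leads marked for "genetic code" on two different services end up in the same list, and the stored category can be sent again to any service.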
Thus, the described embodiment imposes less of a burden on the user than previous schemes. The described embodiments actually provide a search service while collecting data. The described embodiments of the invention generally include two parts:
- 1) A client-side window called SearchPad to display the user's search context and query-specific leads.
- 2) A direct or indirect extension of search services to support the communication of leads to SearchPad and the collection of usage data.
II. Viewing Query-Specific Leads
The described embodiments of the present invention support query-specific bookmarking, reducing communication overhead and improving the quality of the data collected. To reduce communication overhead, the described embodiments preferably can periodically transmit a log to the server instead of a communication for every result page visited. Certain of the described embodiments log which pages a user found relevant to a query. This information is of high relevance, since users tend to view many more result pages than those actually relevant to them.
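The reduction in communication overhead can be sketched as a buffer that accumulates events locally and sends them in one transmission. The class `BatchedLog`, the `send` callback, and the use of an event-count threshold in place of a timer are assumptions for illustration only.

```python
import json


class BatchedLog:
    """Hypothetical event buffer that transmits many events in one message."""

    def __init__(self, send, max_events=20):
        self._send = send              # callable that delivers one serialized batch
        self._max_events = max_events  # assumed stand-in for a periodic timer
        self._pending = []

    def record(self, event):
        """Store an event locally; transmit only when the batch is full."""
        self._pending.append(event)
        if len(self._pending) >= self._max_events:
            self.flush()

    def flush(self):
        """Send all pending events as a single message, if there are any."""
        if self._pending:
            self._send(json.dumps(self._pending))
            self._pending = []
```

With this arrangement, visiting a result page costs no extra network round trip; the server receives one message per batch rather than one per page.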
FIGS. 2(a) and 2(b) show respective embodiments of SearchPads in accordance with the present invention. In
For example,
A SearchPad can display two types of queries:
- (1) All queries for which the user has marked a lead, and/or
- (2) Recent queries (for example, the last 5 queries).
Certain embodiments allow the user to specify which of these types of queries are to be displayed, thus displaying one or both types in accordance with the user's instructions. Other embodiments always display one type, or both.
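The selection of queries for display can be sketched as follows. The function name, its parameters, and the ordering (marked queries first, then recent queries newest first) are assumptions for illustration; the description fixes only the two types of queries and the example count of 5.

```python
def queries_to_display(history, marked, show_marked=True, show_recent=True,
                       recent_count=5):
    """Pick the queries a SearchPad-like window would list.

    history: all queries issued, oldest first.
    marked:  set of queries for which the user has marked at least one lead.
    """
    selected = []
    if show_marked:
        for query in history:
            if query in marked and query not in selected:
                selected.append(query)
    if show_recent:
        recent = []
        for query in reversed(history):  # newest first
            if len(recent) >= recent_count:
                break
            if query not in recent:
                recent.append(query)
        for query in recent:
            if query not in selected:
                selected.append(query)
    return selected
```

Setting `show_marked` or `show_recent` to `False` corresponds to an embodiment, or a user preference, in which only one of the two types is displayed.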
In the example shown in
In this example, clicking the small triangle next to a query exposes or hides the leads associated with that query. In the figure, the leads associated with the queries “melissa virus” 220 and “Genetic Links” 224 are shown as currently hidden. Such an “outline view” of list data is common in many graphical applications.
The exemplary SearchPad 202 also provides a mechanism for the user to select a query and a search service, and ask for the query to be sent to the search service. In the example in
Other embodiments are possible. For example, the search engine bar 204 could be replaced by a user-configured choice of a default search engine or a group of search engines.
In the embodiment shown in
In
FIGS. 4(a) and 4(b) show respective embodiments of systems displaying query-specific leads in accordance with the present invention.
The methods described herein are preferably implemented as software instructions stored in memory of system 404 and executed by a processor. These instructions can also be stored on a computer-readable medium, such as a disk drive, memory, CD-ROM, or DVD.
III. Extended Search Service
There are several possible ways to create and specify query-specific information.
FIGS. 7(a) and 7(b) show respective embodiments of systems performing lead marking in accordance with the present invention.
FIGS. 7(a) and 7(b) show that the marked leads are saved. In a preferred embodiment, the query and query-specific leads are saved to disk, then retrieved and displayed. The query and leads can also be sent directly to query-specific display software, such as SearchPad software.
i. HTML Modification Implementation
- <A HREF=url-of-hyperlink>hyperlink-anchor</A>
To support the implementation of
As shown in
- <A HREF=url-of-result-page-hyperlink QUERY="genetic code">hyperlink-anchor</A>
The use of the QUERY attribute tells the browser (or a software extension to the browser or lead marking software) that the current hyperlink is available for marking by the user, and that the hyperlink needs to be remembered in association with the specified query (“genetic code” in the above example). Thus, the QUERY attribute allows the browser to ascertain for which query the page was a result. When the marking operation is invoked, the browser software recognizes that this link can be exported as a lead and takes the corresponding action.
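How lead marking software might recognize markable hyperlinks can be sketched with Python's standard `html.parser` module. The class name `QueryLinkParser` and the example URLs in the usage are hypothetical; the sketch only shows that an anchor carrying both HREF and QUERY yields a (link, query) pair, while an ordinary anchor is ignored.

```python
from html.parser import HTMLParser


class QueryLinkParser(HTMLParser):
    """Collect hyperlinks that carry the proposed QUERY attribute."""

    def __init__(self):
        super().__init__()
        self.markable = []  # list of (url, query) pairs available for marking

    def handle_starttag(self, tag, attrs):
        # html.parser reports tag and attribute names in lowercase.
        if tag != "a":
            return
        attributes = dict(attrs)
        url = attributes.get("href")
        query = attributes.get("query")
        if url and query:
            self.markable.append((url, query))
```

Feeding a result page to the parser leaves in `markable` exactly those links that the search service has tagged with a query, each paired with the query for which it was a result.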
As shown in
A shortcoming of this approach is that it requires a change to the syntax of HTML (to add the QUERY attribute) and, possibly, a change to browser software to recognize the query attribute. This will limit the popularity of this approach. Also, this implementation requires a consensus between browser manufacturers and search services on the syntax of URLs.
ii. Cookie-Based Implementation
We next describe an alternate implementation that does not require browser manufacturers to change their software or search services to cooperate with browser manufacturers. This implementation requires neither a browser extension nor client-side software installation. It can be deployed both by search services themselves and by third parties who wish to add value to one or more search services.
Embedded scripts tend to be subject to many restrictions by the browser, both in terms of access (i.e., limited access to other windows) and storage (no access to the file system) in the normal mode of operation. In some web browsers, the embedded scripts can ask the user for more access to the web browser's state. Nonetheless, this is not useful because many users will refuse such a request, since it represents a security risk.
In the described embodiment, the log of the user's search activity is maintained in a set of cookies associated with the web site. The log could also be maintained on the web site server or at some other third-party machine.
All the information collected above resides in a set of cookies associated with the originating web site, and is available to scripts executing within other pages downloaded from the same site. In particular, it is visible to SearchPad, which is an HTML document containing an embedded script (for example, in JavaScript, Java, or VBScript). All the data needed by SearchPad to display marked queries and leads to the user is available in the cookie access log. When the cookie access log is updated due to a new event, SearchPad reads the cookies and changes its display to reflect the new state. For this purpose, when the log is updated, either the code in the result page can signal SearchPad that the state has changed, or SearchPad can periodically examine the cookie log to see if new leads have been added.
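A minimal sketch of a cookie access log follows, modeling the cookies as a dictionary of name-value pairs. The cookie names (`log0`, `log1`, ...), the JSON encoding, and the size and count budgets are assumptions chosen for illustration; actual limits depend on the browser.

```python
import json
from urllib.parse import quote, unquote

MAX_COOKIE_BYTES = 4000  # assumed per-cookie budget
MAX_COOKIES = 20         # assumed number of cookies available to the site


def write_log(events):
    """Serialize events into cookie name-value pairs; raise when space runs out."""
    # quote() yields ASCII only, so character count equals byte count.
    payload = quote(json.dumps(events, separators=(",", ":")))
    chunks = [payload[i:i + MAX_COOKIE_BYTES]
              for i in range(0, len(payload), MAX_COOKIE_BYTES)]
    if len(chunks) > MAX_COOKIES:
        raise OverflowError("cookie access log is full")
    return {f"log{i}": chunk for i, chunk in enumerate(chunks)}


def read_log(cookies):
    """Rebuild the event list, as a polling SearchPad-like script might."""
    parts = []
    index = 0
    while f"log{index}" in cookies:
        parts.append(cookies[f"log{index}"])
        index += 1
    return json.loads(unquote("".join(parts))) if parts else []
```

A result page would call something like `write_log` after each new event, and the display script would call `read_log` either when signaled or on a timer, then redraw if the list has changed.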
In time, the storage available in the cookie access log will be exhausted. At this point either the user can be prevented from marking any more leads (unless some are deleted), or SearchPad can compress the data.
To compress data in the cookie access log, SearchPad merely reloads itself. The cookies comprising the cookie access log are configured so that they are transmitted to the web server every time SearchPad is reloaded. This has the effect that all the data in the log can be saved at the server. Some of it is used for the data collection task. The data in the cookies is cleared, except for a pointer to where the previous data is stored on the server.
In at least one embodiment, some of the access log data, stored on the server, is hardwired into the SearchPad script in future transmissions of SearchPad's code to that particular user. This is used to display marked queries and leads. Because the data resides in the SearchPad script and not the cookie access log, the access log has more space for logging future events.
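The compression step can be sketched as a server-side handler that archives whatever the cookies carry and hands back cookies holding only a pointer. The class `LogServer`, the cookie keys `pointer` and `events`, and the pointer format are hypothetical, and the cookies are modeled as a plain dictionary rather than real HTTP headers.

```python
class LogServer:
    """Hypothetical server that archives the cookie access log on each reload."""

    def __init__(self):
        self.archive = {}   # pointer -> all events saved so far for that user
        self._next_id = 0

    def reload_searchpad(self, cookies):
        """Save the events carried by the cookies; return the cleared cookies."""
        pointer = cookies.get("pointer")
        if pointer is None:
            # First reload for this user: allocate a storage location.
            pointer = f"user{self._next_id}"
            self._next_id += 1
        self.archive.setdefault(pointer, []).extend(cookies.get("events", []))
        # Only the pointer survives, freeing cookie space for future events.
        return {"pointer": pointer, "events": []}
```

The archived events for a pointer are what a server could then embed in the SearchPad script it returns, so that marked queries and leads remain displayable after the cookies are cleared.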
To ensure timely data collection, SearchPad can be configured to periodically reload itself, thus logging the user's activity periodically. Furthermore, to avoid transmitting the cookies to the server during other communications, the cookies are configured so that they will be transmitted only when SearchPad is reloaded.
iii. Methods of Marking Leads
FIGS. 11(a)-11(d) are flow charts showing respective examples of how a user marks leads in various embodiments of the present invention.
In
In
In
In
In general, a saved lead includes at least a query and a link resulting from the query (or a series of links can be saved in association with a single query). For example, the following can be logged for each “mark” action:
- The query
- The Title, URL, and rank of the result being marked
- The time at which the event occurred
Similarly, when a result's hyperlink is clicked to view the result page, we can log the same type of information in association with the “view” event. When the user returns to the page containing search results after viewing a result page, the “return” event can be logged as well, with a timestamp. When a “return” event follows a “view” event, the time difference provides an estimate of the time spent viewing the result page.
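The logged events and the viewing-time estimate can be sketched as follows. The `Event` record and the function `viewing_times` are hypothetical names; the fields mirror the items listed above (query, title, URL, rank, and time), with timestamps assumed to be in seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    kind: str                   # "mark", "view", or "return"
    timestamp: float            # time of the event, in seconds (assumed unit)
    query: str = ""
    title: str = ""
    url: str = ""
    rank: Optional[int] = None  # position of the result in the result list


def viewing_times(events):
    """Estimate time spent on each result page.

    A "return" event that directly follows a "view" event gives the
    difference of the two timestamps as the estimate.
    """
    times = []
    for previous, current in zip(events, events[1:]):
        if previous.kind == "view" and current.kind == "return":
            times.append((previous.url, current.timestamp - previous.timestamp))
    return times
```

A "view" that is never followed by a "return" (for example, because the user closed the window) produces no estimate in this sketch.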
From the above description, it will be apparent that the invention disclosed herein provides a novel and advantageous system and method of searching large numbers of hypertext documents, such as the hypertext documents of the World Wide Web.
Claims
1. A computer-implemented method of query-specific bookmarking in a network, comprising:
- maintaining, on a client-side computer, lead information about a previously performed search, the lead information including the query used in the search and at least some of the resulting links returned by the search;
- displaying the query used in the search and the resulting links;
- receiving a user-selection of a displayed resulting link;
- displaying the document corresponding to the selected query-specific link; and
- marking the displayed document if the displayed document is displayed for more than a threshold amount of time.
Type: Application
Filed: Oct 12, 2004
Publication Date: May 26, 2005
Applicant: Hewlett-Packard Development Company, L.P. (Houston, TX)
Inventor: Krishna Bharat (Santa Clara, CA)
Application Number: 10/962,639