Web page content translator
A system, method, and computer readable medium for reformatting web content into a format readable on one or more mobile devices is provided. A user generates a user request for a web page from a mobile device to a proxy server. The proxy server forwards the user request to an origin web server, which returns the requested web page to the proxy server. A conversion engine within the proxy server extracts the desired content from the web page, and reformats the content in accordance with one or more predefined transform methods associated with the one or more mobile devices before transmitting the transformed web page with the desired content to the one or more mobile devices. Secure or unsecure connection provided via a decorated uniform resource locator can be used to connect a mobile device, the proxy server, and an origin web server.
This application claims priority to application Ser. No. 60/______, filed Nov. 6, 2000, entitled “Web Page Content Translator”, which is assigned to the assignee of this application. The disclosure of application Ser. No. 60/______ is incorporated herein by reference.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to dynamically extracting and reformatting existing web page content and, more particularly, to dynamically extracting a portion of content from a web page and reformatting the extracted content for viewing on a mobile device.
2. Background Description
Organizations of all sizes are reliant on the Internet to conduct business. Because of the explosion of mobile enterprise solutions, users of wireless devices now demand that businesses deliver web content for viewing on desktop, or mobile/portable (e.g., handheld) devices. Whether organizations are creating new web applications, or extending existing infrastructure, the new Internet powered world demands that users have access to the applications and information they need when they need it to speed business, remain flexible and competitive, and drive stronger customer relationships.
Before now, the solutions available for delivering web content to mobile devices generally required organizations to develop and maintain multiple sets of content, one for viewing in a desktop environment, and others for viewing on each individual type of mobile device. Further, secure connections between mobile devices and web servers via a proxy server that reformats original web page content were generally unavailable via a decorated uniform resource locator (URL) connection. A need exists, therefore, for a standards based, create once, deliver everywhere approach to web enabling mobile devices.
SUMMARY OF THE INVENTION
It is a feature and advantage of the present invention to provide a system and method that dynamically extracts a portion of web content for viewing on mobile devices;
It is another feature and advantage of the present invention to provide a system and method that reformats extracted web content for viewing on mobile devices using transforms that, for example, add meta tag information to the header of a page, add a specific attribute and attribute value to a specific tag, ignore previously specified global conversions, and/or insert text from a specified file;
It is yet another feature and advantage of the present invention to manage and maintain a single set of web content that can be used in both desktop and mobile device environments;
It is still another feature and advantage of the present invention to eliminate the need to maintain multiple sets of web content, one for each device type, for delivery to a plurality of mobile devices.
It is another feature and advantage of the present invention to optionally provide a secure connection via a decorated URL between a mobile device and/or a client (e.g., a Wireless Application Protocol (WAP) gateway) which forwards a secure request from the mobile device, a proxy server that reformats original web page content, and a web page server.
It is another feature and advantage of the present invention to provide a secure connection between, for example, a proxy server that reformats an original web page and an origin web page server by setting up the proxy server in a mobile device browser (i.e., enabling the mobile device to communicate with the origin web page server via the proxy server).
To achieve these features and advantages, the present invention provides a system, method, and medium that extracts a portion of web page content and reformats the extracted content for delivery to one or more wireless devices. Embodiments of the present invention contemplate that web-based content can be created one time, whereafter at least a portion thereof is extracted, reformatted, and transmitted to, for example, one or more handheld device browsers (e.g., the Palm Web Clipping Browser from Palm, Inc., Santa Clara, Calif., Pocket Internet Explorer from Microsoft Corp., Redmond, Wash., and Wireless Application Protocol (WAP) on smart phones, etc.). This create-once, deliver-everywhere approach eliminates the need to build and maintain separate web pages for different devices and browsers, as well as the need to install proprietary browser software on the handheld device(s). In one embodiment contemplated by the present invention, a user generates a request from a mobile device to a proxy server. The proxy server then forwards the user request to an origin web server (having a first file format), whereafter the requested web page is returned to the proxy server. The appropriate components (e.g., tags, etc.) of, for example, the HTML or Wireless Markup Language (WML) source code, are extracted from the web page. The extracted web page contents are then reformatted to place them in a format that is viewable by one or more mobile devices. The reformatted web page is then transmitted from the proxy server to the requesting mobile device. This may be done for at least one of groups of devices, individual devices, web site-specific conversions, or for all web sites. The method of the present invention also provides a secure connection via a decorated URL between a mobile device and/or client (e.g., a WAP gateway) which forwards a secure request from the mobile device, a proxy server that reformats original web page content, and a web page server.
Before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not limited in its application to the details of construction and to the arrangements of the components set forth in the following description or illustrated in the drawings. The invention is capable of other embodiments and of being practiced and carried out in various ways. As such, those skilled in the art will appreciate that the conception, upon which this disclosure is based, may readily be utilized as a basis for the designing of other structures, methods and systems for carrying out the several purposes of the present invention. It is important, therefore, that the claims be regarded as including such equivalent constructions insofar as they do not depart from the spirit and scope of the present invention.
The Detailed Description will be best understood when read in reference to the accompanying figures wherein:
The proxy server 110 of the present invention “sits” between the clients (e.g., the mobile device(s) 108) and the origin web server(s) 102 that provide Web content 104. If necessary, a gateway 109 may optionally be provided that interfaces between, for example, the mobile device 108 and the proxy server 110. In this context, an origin web server 102 is a web server that contains the original web page 104. It is different from, for example, the proxy server 110 because the origin web server 102 is maintained and updated by the individual(s) and/or organization(s) that hosts the web site for the web page 104. Unlike other proxy servers, however, the present invention obtains the code (e.g., HTML) of the user requested web page 104, and modifies it at the proxy server 110 (once obtained) in accordance with at least one novel method described herein. Embodiments contemplated by the present invention also envision reformatting and/or converting an entire web site or requesting specific Web content that is needed from the origin web server 102.
In general, embodiments of the present invention contemplate that architecture 100 comprises one or more servers 102 having web content (in the form of at least one web page 104), a proxy server 110, at least one mobile device 108, and at least one network 114. For simplicity, only one of each of the mobile device 108, server 102, and network 114 are shown in
Upon a request 106 from a mobile device 108 for a given web page 104, typically made through a HyperText Transfer Protocol (HTTP) or HyperText Transfer Protocol Secure (HTTPS) request from the resident mobile device 108 browser, the process for providing a reformatted/converted and extracted web page 104 begins. It should be understood that the gateway 109 will receive such an HTTP or HTTPS request if a gateway 109 is used. Upon determining that the requested page 104 resides at the server 102, the proxy server 110 makes a request 112 for the page. As shown, the proxy server 110 and the server 102 are connected via a network 114 such as the Internet. It should be understood that the present invention can be used with networks other than the Internet where visual content is involved. Therefore, depending on the network(s) being utilized in conjunction with the present invention, it should be apparent that the visual content from such network(s) may be other than “web” content. The server 102 responsively returns 116, via the network 114, the requested web page 104 to the proxy server 110. The web page 104 is typically an HTML file with references to any component .wav, .mov, and/or Joint Photographic Experts Group (JPEG) files, which together comprise the web page 104. It should also be understood that other file formats (e.g., .gif (graphics interchange format)) and web page components (e.g., Java applets) can also be accessed and returned to the proxy server 110.
The conversion engine 118, preferably residing within the proxy server 110, retrieves and/or accesses the stored web page 104, and accesses a predefined extraction and conversion file associated with the mobile device 108 and/or web page 104, preferably via the extraction and conversion file database 120. Extraction and conversion files may, for example, contain the following information:
- Variables: These are variables that are defined when a site-mining expression is created. A variable is essentially a place holder for the information to “go” after the conversion engine 118 extracts it from a web page 104 preferably residing on server 102. A site-mining expression is a command that tells the conversion engine 118 where to locate the desired web content (in, for example, the HTML code) in the web page 104. One embodiment of the present invention uses software such as Spyglass Prism from OpenTV, Inc., Mountain View, Calif. to extract data from a designated web page 104. Methods that can be used in site-mining expressions in accordance with the Spyglass Prism product are contained in, for example, the Exemplary Site Mining Methods section and other sections contained herein.
- System variables: Embodiments of the present invention contemplate that there are preferably at least three system variables that can be added to extraction and conversion files, including:
- a) &&Host;
- Identifies the name of the server 102 that the proxy server 110 is connecting to for a particular request 106.
- b) &&User-Agent (UA);
- Identifies the type of mobile device 108 and the web browser of the mobile device 108.
- c) &&URLLink;
- Identifies the address of the server 102 that the proxy server 110 is connecting to.
It should be understood that the variables and delimiters are illustrative only and that different variables and/or delimiters can be used to achieve the same functionality.
- Transforms: Transforms are conversion tags that are used to re-format the content extracted from a web page 104. These special tags (also called sw-transforms) provide the power to display the Web content in the desired format 122 for a mobile device(s) 108. The present invention provides transforms that, for example: add meta tag information to the header of a page, add a specific attribute and attribute value to a specific tag, ignore previously specified global conversions, insert text from a specified file, remove a specific attribute from all tags, remove a specific attribute from a specific tag, remove the comments tag from a web site or specific web content, remove a specific tag from a web site or web content, remove a specific tag and all the information that appears within the tag from a web site or web content, replace one tag with another, set a specific value of a specific attribute of a specific tag, stop processing of all subsequent reformatting commands, substitute one sequence of text for another sequence of text, and remove table formatting. The sw-transforms used to reformat the extracted web content are contained in the SW-Transforms section contained herein.
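A few of the transforms listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the regular-expression approach is a simplification chosen for brevity, and nothing here reproduces the actual sw-transform syntax.

```python
import re


def remove_tag(html: str, tag: str) -> str:
    """Remove the opening and closing forms of a tag, keeping its content."""
    return re.sub(rf"</?{re.escape(tag)}\b[^>]*>", "", html, flags=re.IGNORECASE)


def remove_tag_and_content(html: str, tag: str) -> str:
    """Remove a tag together with everything that appears inside it."""
    pattern = rf"<{re.escape(tag)}\b[^>]*>.*?</{re.escape(tag)}\s*>"
    return re.sub(pattern, "", html, flags=re.IGNORECASE | re.DOTALL)


def remove_attribute(html: str, attribute: str) -> str:
    """Remove one attribute (and its value) wherever it occurs.

    Simplification: the pattern is not restricted to text inside tags.
    """
    pattern = rf"""\s+{re.escape(attribute)}\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)"""
    return re.sub(pattern, "", html, flags=re.IGNORECASE)


def substitute_text(html: str, old: str, new: str) -> str:
    """Substitute one sequence of text for another."""
    return html.replace(old, new)
```

For instance, calling remove_attribute on a fragment containing a FONT tag with a color attribute returns the same fragment with a bare FONT tag. A production conversion engine would more likely operate on a parsed document than on raw text.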
Mobile devices 108 generally come in a variety of different sizes and have a variety of different screen interfaces. As will be discussed in further detail herein, the present invention provides the ability to uniquely tailor web page 104 content for specific mobile devices 108. A particular organization may know that its users will always be retrieving web pages using, for example, Palm operating system devices (e.g., Palm III, X, VIIs, etc.). Formatting rules may be set up that apply to the group of Palm operating system devices generally, as well as, for example, the Palm VII in particular. To create a device profile, both the specific version of a mobile device 108 and its web browser (e.g., a Palm V with browser A, etc.) are needed. In one embodiment, device profiles are registered with the proxy server 110 by using HTTP request headers, as will be explained in detail herein.
It should also be understood that in addition to providing a device profile for an individual mobile device 108, device profiles can also be created for a group of mobile devices 108, as well as all mobile devices 108. In the latter case, suppose, for example, an organization's mobile device 108 users will often be visiting graphic-intensive web sites, and that the users will be viewing these sites on, say, smart phones (i.e., a cellular telephone that provides voice, limited Internet access, e-mail, pager and/or facsimile service, and typically has a small screen and little memory). A global extraction and conversion file can be created that, for example, instructs the conversion engine 118 to replace all graphics with text links. This conversion would affect all web content that is delivered to all mobile device 108 users. Users can therefore access web sites they need without the inconvenience of waiting online for graphics to download onto their mobile devices 108.
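The global conversion described above, replacing graphics with text links, might be sketched as follows. The function name, the use of the alt attribute as the link label, and the fallback label "image" are assumptions made for this illustration only.

```python
import re

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
ATTRIBUTE = re.compile(r"""(\w+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")


def images_to_text_links(html: str) -> str:
    """Replace every <img> tag with a text link that points at the image source."""

    def replace(match: re.Match) -> str:
        # Collect the quoted attributes of this one tag into a dictionary.
        attrs = {
            name.lower(): (double_quoted or single_quoted)
            for name, double_quoted, single_quoted in ATTRIBUTE.findall(match.group(0))
        }
        source = attrs.get("src")
        if not source:
            return ""  # nothing to link to, so drop the tag
        label = attrs.get("alt") or "image"  # assumed fallback label
        return f'<a href="{source}">[{label}]</a>'

    return IMG_TAG.sub(replace, html)
```

With this approach the mobile device downloads only the markup, and the user can still follow the link if the graphic is needed.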
Thus, using an extraction and conversion file within the storage 120 corresponding to the particular mobile device(s) 108, the conversion engine 118 uses the expressions defined therein, and performs the designated operations upon, for example, the HTML code contained in the web page 104. The designated HTML content of the retrieved web page 104 is extracted and/or converted, and subsequently transmitted as a reformatted web page 122 to the mobile device 108.
To improve performance (e.g., speed), web page 104 can optionally be cached in a local cache 124 on the proxy server 110. Caching enables web pages 104 to be processed faster, thereby reducing the time mobile device 108 users spend online. When the proxy server 110 receives a web page 104 over, say, the Internet 114, the conversion engine 118 can convert and store the web page 122 in a web page 122 specific cache. When a user of a mobile device 108 wants to access the web page 122, the conversion engine 118 retrieves it from the cache 124, which reduces online time.
It should also be understood that the proxy server 110 can also cache unconverted web page 104 content. Unconverted web page 104 content is content received from the server 102, which has not yet been converted and/or extracted by the conversion engine 118. Cache settings can therefore be configured to meet specific needs (e.g., when caching will begin, the length of time pages will be kept in cache, and the amount of disk space and memory that will be allocated to caching).
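One way to picture the caching behavior is a small in-memory store keyed by URL and device profile, with a configurable retention time. This is a minimal sketch under assumptions: the class name, the key layout, and the convention of using None as the profile for unconverted content are all hypothetical, and disk-space and memory limits are omitted.

```python
import time


class PageCache:
    """Keeps converted or unconverted pages for a limited length of time."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self._clock = clock  # injectable so the expiry logic can be exercised
        self._entries = {}   # (url, device_profile) -> (stored_at, body)

    def put(self, url, device_profile, body):
        # device_profile=None is used here to mean "unconverted content".
        self._entries[(url, device_profile)] = (self._clock(), body)

    def get(self, url, device_profile):
        key = (url, device_profile)
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, body = entry
        if self._clock() - stored_at > self.max_age_seconds:
            del self._entries[key]  # expired: the caller must fetch again
            return None
        return body
```

Keying on the device profile as well as the URL lets one origin page be cached once in unconverted form and once per converted form.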
Security Features
The proxy server 110 according to the present invention envisions the use of various types of security to ensure that the transmission of information is secure (e.g., that unauthorized parties cannot intercept and/or access the information). An example envisioned by embodiments of the present invention provides the ability to support the secure socket layer (SSL) protocol developed by Netscape Corp. (now merged with America Online, Inc., Dulles, Va.). When this feature is utilized, users can make secure connections (from, for example, a mobile device 108 to the proxy server 110 and from the proxy server 110 to the origin web server 102 and back), as well as partially secure connections. In embodiments contemplated by the present invention, the proxy server 110 will load an SSL library, which includes, for example, SSL 2.0 and SSL 3.X protocol specification modules.
If the SSL feature of the present invention is utilized, a digital certificate from a certificate authority (e.g., VeriSign) must be obtained. The digital certificate is sent to a client (e.g., a mobile device 108 or a gateway 109) to authenticate the proxy server 110, in order to establish a secure connection between the client and the proxy server 110.
Embodiments of the present invention contemplate that there are at least four types of SSL settings that can be configured for the proxy server 110, including, for example:
- SSLON: This setting determines whether SSL is turned on or off. The default setting is on (as indicated by, for example, SSLON=1).
- SSL port: This setting determines which secure port the server listens on.
- SSL Certificate Settings: SSL certificate request settings are located in the proxy server .ini file. Changes to the default public certificate file name or the password for the certificate key (e.g., password), are made via the proxy server 110 .ini file.
- SSLOutboundPort: This port is a listening port the proxy server 110 uses for outbound SSL requests.
It is envisioned that if the defaults are accepted for each of the above four SSL settings, the SSL settings in, for example, the proxy server 110 .ini file do not have to be configured.
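The idea that defaults apply unless the .ini file overrides them can be sketched with Python's standard configparser module. Only the names SSLON and SSLOutboundPort come from the description above. The section name, the other key names, and every default value other than SSLON=1 are assumptions made for illustration.

```python
import configparser

# Defaults used when the .ini file does not mention a setting.
DEFAULTS = {
    "SSLON": "1",                         # on by default, per the description
    "SSLPort": "443",                     # assumed key name and value
    "SSLOutboundPort": "8443",            # assumed value
    "SSLCertificateFile": "server.pem",   # assumed key name and value
}


def load_ssl_settings(ini_text: str) -> dict:
    """Merge the [SSL] section of an .ini file over the defaults."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(ini_text)
    section = dict(parser["SSL"]) if parser.has_section("SSL") else {}
    merged = {**DEFAULTS, **section}
    return {
        "enabled": merged["SSLON"] == "1",
        "port": int(merged["SSLPort"]),
        "outbound_port": int(merged["SSLOutboundPort"]),
        "certificate_file": merged["SSLCertificateFile"],
    }
```

An empty file therefore yields the default configuration, which matches the statement that the settings need not be configured when the defaults are acceptable.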
Embodiments of the present invention contemplate that in the event that a particular mobile device 108 does not enable a proxy server 110 to be used, the present invention enables mobile devices to be redirected to proxy through the proxy server 110 by typing (or in some way, entering or implementing) the proxy server 110 address (location) and the URL of the desired web site 104 in, for example, an address field of the mobile device 108 browser. This type of address is commonly referred to as a “decorated URL”. A decorated URL contains, for example, the proxy server 110 host name and port number, and the web address of the web site 104 that is to be accessed.
With the present invention, the above-identified URL decoration can be used to make an un-secure (http://) request or a secure (https://) request. Regardless of what type of connection is made, after an initial decorated URL is typed in, the proxy server retrieves the desired web page 104. For example, the proxy server automatically re-writes all of the other links within the site that are initially proxied to via the decorated URL. That is, if one or more links are embedded within a first web site, and a user clicks on one of the links, the link that the user clicks on will also be “rewritten” in decorated form in accordance with the invention. Also, if a connection is made to an un-secure site, and a user subsequently clicks a secure link within that web site, the proxy server 110 automatically re-decorates the URL, to provide the secure connection. Therefore, a user only has to type the decorated URL once each time a new Web site is accessed.
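The link rewriting described above can be illustrated with a short sketch. The function name and the proxy address are hypothetical, only double-quoted href attributes are handled, and the target address is appended without percent-encoding, mirroring the decorated form used in this description.

```python
import re
from urllib.parse import urljoin

HREF = re.compile(r'(href\s*=\s*")([^"]*)(")', re.IGNORECASE)


def rewrite_links(html: str, page_url: str, proxy_base: str) -> str:
    """Rewrite each link in a page so that following it goes through the proxy."""

    def decorate(match: re.Match) -> str:
        # Resolve relative links against the page they were found on.
        absolute = urljoin(page_url, match.group(2))
        return f"{match.group(1)}{proxy_base}/?url={absolute}{match.group(3)}"

    return HREF.sub(decorate, html)
```

Because the scheme of each target is preserved, a secure link found in an un-secure page remains a secure target after rewriting. How the proxy re-decorates the client-side portion of the URL in that case is not modeled here.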
As indicated above, embodiments of the present invention contemplate that a decorated URL contains the proxy server 110 host name and port number, and the web address of the site that is to be accessed. For example, a decorated URL for the Aether Software web site (www.aethersystems.com) may look like:
http://yourscoutweb_domain_name:port_number/?url=http://www.aethersystems.com.
Secure requests made either to and/or from the proxy server 110 can be done by using a decorated URL. If a decorated URL is being used, there are three options. First, a completely secure connection can be made from the mobile device 108, and optionally forwarded by the gateway 109 (if one is used), to proxy server 110, and from the proxy server 110 to the origin web server 102 (i.e., the user makes a secure request to the proxy server 110 and a secure request to the web page 104. The origin server 102 then encrypts the user request and the web page 104 that is returned to the mobile device.). This type of connection provides end-to-end security and is useful, for example, for e-commerce sites or for secure Web-based applications. This type of decorated URL may look as follows:
https://yourscoutweb_domain_name:port_number/?url=https://www.aethersystems.com.
Second, a secure connection can be made between the mobile device 108, including the gateway 109 (if one is used), and the proxy server 110. A business or content provider, for example, that uses the proxy server at the bridge of a firewall may want users to make a secure connection to the proxy server 110. This is a partially secure connection. This embodiment contemplates a secure connection between, for example, at least a gateway to which the mobile device 108 interfaces, and the proxy server 110. This type of decorated URL may look, for example, as follows:
https://yourscoutweb_domain_name:port_number/?url=http://www.aethersystems.com.
Finally, a secure connection can be made between the proxy server 110 and the origin web server 102. For example, a wireless ISP may want to have users make a regular un-secure request in order to access the proxy server 110, and then make a secure connection between the proxy server 110 and the origin server 102. In this case, the decorated URL may look, for example, as follows:
http://yourscoutweb_domain_name:port_number/?url=https://www.aethersystems.com.
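The three decorated URL forms differ only in the scheme used on each leg of the connection, which the following sketch makes explicit. The function names and the proxy host and port are hypothetical, and the parsing assumes the "/?url=" layout shown in the examples above.

```python
from urllib.parse import urlsplit


def build_decorated_url(proxy_host, proxy_port, target_url, secure_to_proxy=False):
    """Compose a decorated URL from the proxy address and the target address."""
    scheme = "https" if secure_to_proxy else "http"
    return f"{scheme}://{proxy_host}:{proxy_port}/?url={target_url}"


def describe_security(decorated_url):
    """Return (client_leg_secure, origin_leg_secure) for a decorated URL."""
    client_leg, _, target_url = decorated_url.partition("/?url=")
    client_secure = urlsplit(client_leg).scheme == "https"
    origin_secure = urlsplit(target_url).scheme == "https"
    return client_secure, origin_secure
```

A result of (True, True) corresponds to the end-to-end case, (True, False) to a secure connection only between the client and the proxy server, and (False, True) to a secure connection only between the proxy server and the origin web server.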
It will be understood that, depending on the mobile device, the menus that a user will need to access on the mobile device to set up the decorated URL will vary. For example, on a PALM V device, the HandWeb icon in the applications launcher can be tapped to display the last page accessed. Then, the menu button can be tapped to display the Options menu, in which the Open Location menu item may be tapped. The decorated URL may then be typed in the URL field, and the web site corresponding to the URL is then accessed.
One method of performing step 201 as contemplated by embodiments of the present invention is shown in
In step 304, in accordance with the HTTP (and HTTPS) specification, UA-color information is provided via a header that specifies, for example, the screen color of the mobile device 108. In various embodiments, if a mobile device 108 sends this header, the conversion engine 118 performs color reduction automatically, taking into consideration, for example, the screen color of the mobile device 108. If the mobile device does not send the UA-color header, a color reduction conversion for the mobile device 108 can be configured. If the mobile device 108 does send a UA-color header, the value sent in the header can optionally be overridden to improve device performance by, for example, reducing colors to a level lower than the device's maximum capability.
The UA-connection type (e.g., HTTP or HTTPS) between the mobile device 108 and the proxy server 110, and/or the proxy server 110 and the origin web server 102 is specified in step 306. The UA-connection field is used to specify the type of connection that the mobile device 108 uses. In step 308, the UA-CPU is specified. This information is used to specify the CPU (e.g., the manufacturer, model, and/or clock speed) of the mobile device 108. In step 310, the UA-display field is used to specify the display type of the user agent. In step 312, the UA-HTML field is used to specify the version of HTML of the mobile device 108. In step 314, the UA-input field 414 is used to specify the types of input fields (e.g., password field, text box, image, file, etc.) of the mobile device 108 screen display. In step 316, the UA-language field is used to specify the language of the mobile device 108 (e.g., English, Spanish, etc.). In step 316, other fields, such as UA-operating system can be used to specify, e.g., the operating system of the mobile device 108, as well as other items of information. It will be recognized that not all of the above-identified steps will be applicable to every type of mobile device 108.
Mobile device 108 profiles, as previously noted, can comprise a series of device headers based on fields 402, 402, 406, 408, 410, 414, 416 (e.g., HTTP request headers). When a user requests a web page 104 using a mobile device 108, the mobile device 108 sends associated HTTP request headers, which uniquely identify the mobile device 108 to the proxy server 110. A mobile device 108 is generally defined by both the hardware and browser. The device headers thus serve to identify the mobile device 108 to the proxy server 110.
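Matching a request against registered device profiles might look like the following sketch. The header spellings, the header values, the file names, and the first-match rule are assumptions for illustration. The description above states only that the request headers identify the mobile device to the proxy server.

```python
# Request headers treated as part of a device profile (names assumed).
PROFILE_HEADERS = (
    "user-agent", "ua-color", "ua-connection", "ua-cpu",
    "ua-display", "ua-html", "ua-input", "ua-language",
)


def device_profile(request_headers: dict) -> dict:
    """Keep only the profile-related headers, with lower-cased names."""
    lowered = {name.lower(): value.strip() for name, value in request_headers.items()}
    return {name: lowered[name] for name in PROFILE_HEADERS if name in lowered}


def select_conversion_file(profile: dict, registered: list):
    """Return the file of the first registered profile whose headers all match.

    `registered` is a list of (required_headers, file_name) pairs. An empty
    required_headers dictionary matches every device.
    """
    for required, file_name in registered:
        if all(profile.get(name) == value for name, value in required.items()):
            return file_name
    return None
```

Listing the most specific profiles first and a catch-all entry last gives device-specific files priority over a general one.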
In step 202, site-mining is performed, where web page content is extracted from a web page 104. The site-mining 202 process as contemplated by embodiments of the present invention is shown in
Referring to
In step 506, objects are extracted from the web page 104 in accordance with the DOM 600. For example, the name of the tag the content resides in must be determined (e.g., table), as well as the tag number (e.g., is it the first table? Second?). Also, whether the content is in a sub-tag or child of a parent tag should be determined. If so, what number is it? (e.g., if it is a table row that is desired, then what number is it?)
In step 508, the site-mining expressions, contained in an extraction and conversion file 512, are performed on the extracted object(s) of step 506. Embodiments of the present invention contemplate that one or more methods having the functionality of the methods provided in, for example, the Exemplary Site Mining Methods section will be used to extract the desired content. In accordance with the present invention, one or more of the conversion methods shown in the Site-Mining Expressions Examples section is used to convert/reformat one or more of the extracted objects to facilitate display on one or more mobile devices 108. Then, in step 510, after applying the operations contained in the extraction and conversion file 512 to the web page 104, the transformed web page 122 is provided and transmitted to one or more mobile devices 108. (However, prior to the transmission to one or more mobile devices, other types of conversions can also be performed, as indicated in
The following three examples of site-mining expressions are provided. The expressions utilize methods defined in, for example, the Exemplary Site-Mining Methods section. As shown by the following examples, a site-mining expression specifies a path to an object (HTML element) from the DOM (e.g., web page 104) using methods defined in the site-mining expression language as shown in the Exemplary Site-Mining Methods section. Each of the methods contained in the Exemplary Site-Mining Methods section performs an action on a collection of objects in the object hierarchy. Each method returns a list of one or more objects. It should be understood the following examples are illustrative only, and that the potential number of HTML source code and expression combinations are virtually infinite.
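The notion of a path that selects the n-th element with a given tag name and then returns either its markup or its text can be approximated with Python's standard html.parser module. This sketch is not the site-mining expression language itself and not the Spyglass Prism implementation. The class and function names are hypothetical, and closing tags are re-serialized in lower case.

```python
from html.parser import HTMLParser


class Element:
    """One element of the page, with its markup and its text content."""

    def __init__(self, tag, start_text):
        self.tag = tag
        self._markup = [start_text]
        self._text = []

    @property
    def html(self):
        return "".join(self._markup)

    @property
    def text(self):
        return "".join(self._text).strip()


class Collector(HTMLParser):
    """Records every element in document order."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.elements = []
        self._open = []  # elements whose closing tag has not been seen yet

    def handle_starttag(self, tag, attrs):
        raw = self.get_starttag_text()
        for element in self._open:
            element._markup.append(raw)
        element = Element(tag, raw)
        self.elements.append(element)
        self._open.append(element)

    def handle_data(self, data):
        for element in self._open:
            element._markup.append(data)
            element._text.append(data)

    def handle_endtag(self, tag):
        for element in self._open:
            element._markup.append(f"</{tag}>")
        # Close the most recently opened element with this tag name.
        for index in range(len(self._open) - 1, -1, -1):
            if self._open[index].tag == tag:
                del self._open[index:]
                break


def tags(document_html, name):
    """Return all elements with the given tag name, in document order."""
    collector = Collector()
    collector.feed(document_html)
    collector.close()
    return [element for element in collector.elements if element.tag == name.lower()]
```

Under these assumptions, tags(page, "TITLE")[0].text plays the role of an expression that selects the first TITLE element and returns its text, and tags(page, "TR")[1] selects the second table row.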
Site-Mining Expression Examples
Site mining expression #1:
titleEx1=document.all.tags("TITLE").item(0).html; produces:
<TITLE> Title of the Page </TITLE>
Site mining expression #2:
titleEx2=document.all.tags("TITLE").item(0).text; produces:
Title of the Page
Site mining expression #3:
In step 204, group conversions can be optionally performed. Group conversions are conversion rules that are applied to a group of similar mobile devices. For example, a group conversion can be applied to all Palm operating system devices. After such group conversion rules have been created, these rules can be applied to every new version of a Palm operating system device that is added to the system. Using group conversions saves time, for example, because it saves from having to keep reentering the same conversion rule(s) for all versions of a device (e.g., Palm 3x, Palm VII, etc.). Thus, e.g., one or more predefined conversions can be applied to any device(s) using a given operating system (e.g., Palm OS).
A method (as contemplated by embodiments of the present invention) of performing step 204 is shown in
In step 708, the resolution of the graphics in the web page 104 being converted can be reduced. In general, the percentage can be entered as an integer between 1 and 100. In step 710, interpolation can optionally be used to improve image quality. This feature allows tuning of the quality of the scaled images. For example, an individual can select “yes” if he or she is scaling images and wants to improve quality. In step 712, the file format of images can be selected (e.g., .gif or JPEG).
In step 714, the quality of JPEG conversion is specified. Here, an individual can enter, for example, an integer between 0 and 100, where 100 indicates the highest quality. In step 716, images larger than a prespecified size can optionally be removed (i.e., from being sent to a mobile device 108). An individual can, for example, enter the number of the maximum file size to be sent to a mobile device 108 in kilobytes.
In step 718, an “always send converted image” feature can optionally be invoked. This will send a converted image even, for example, if the file size is larger than that specified in step 716. In step 720, the conversion engine 118 can be directed to send an error message to the corresponding mobile device 108 if an image conversion fails. In step 722, the conversion engine can be directed to remove comments (e.g., either in <COMMENT> tags or in <!--comment--> format) from the content. An individual can select “Yes”, for example, to remove all comments, or “No”, for example, to leave them in the content.
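The interplay of the scaling percentage, the maximum file size, and the “always send converted image” option can be sketched as follows. The function names are hypothetical. The sketch also assumes that the percentage expresses the portion of the original dimensions to keep, which the description does not state explicitly.

```python
def scaled_size(width: int, height: int, percent: int):
    """Scale image dimensions to the given percentage (an integer from 1 to 100)."""
    if not 1 <= percent <= 100:
        raise ValueError("percent must be an integer between 1 and 100")
    return max(1, width * percent // 100), max(1, height * percent // 100)


def image_action(converted_kb: float, max_kb: float, always_send_converted: bool = False) -> str:
    """Decide whether a converted image is sent to the device or removed."""
    if converted_kb <= max_kb or always_send_converted:
        return "send"
    return "remove"
```

For example, a 40-kilobyte converted image with a 32-kilobyte limit is removed unless the always-send option is enabled.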
It should also be understood that additional settings regarding group conversions can optionally be provided in accordance with the present invention that allow users to, for example, translate tables, remove a tag, remove tag and content, replace tag, remove an attribute from a tag, add an attribute to a tag, set an attribute value, scale an attribute value, and/or set an attribute minimum and/or maximum. Illustrative examples of these features are provided in the Site-Mining Expression Examples section.
A representative screen display corresponding to the method shown in
Returning to
In step 906, conversions can be created for color devices. Such conversions for color devices will depend, for example, on goals. If the goal is to optimize the performance of the server 102, then either reducing images to gray scale by specifying a color depth (e.g., mono8 or lower), or reducing the resolution to decrease file size (e.g., enter 90 percent or lower) is generally recommended. In step 908, the effect of removing background images can be considered. Many web pages 104 use light text and a dark background. If the background images or color are removed, the light text becomes unreadable. Therefore, a conversion to remove the color attribute of the <FONT> tag can be provided. However, in one embodiment, it is not recommended that the width and height attributes of the <IMG> tag be removed. These attributes enable the mobile device 108 browser to format the display faster. Finally, it should also be understood that image conversions will slow down the performance of the proxy server 110. If the mobile device 108 web browser converts images fairly well (e.g., work as a clipper), it may be preferable to let the browser handle image conversions. It should be understood that in one embodiment a device specific conversion will override, for example, corresponding parameters and/or values specified in group conversions.
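The override relationship between conversion levels can be expressed as a layered merge. This is a sketch only: the setting names and values are hypothetical, and placing global conversions beneath group conversions is an assumption, since the description above states only that a device-specific conversion overrides the corresponding group values.

```python
def effective_conversions(global_rules: dict, group_rules: dict, device_rules: dict) -> dict:
    """Combine conversion settings so that more specific levels win."""
    merged = dict(global_rules)   # least specific (assumed lowest priority)
    merged.update(group_rules)    # group settings override global ones
    merged.update(device_rules)   # device settings override group ones
    return merged
```

Settings that a device profile does not mention are inherited from the group, so a new device in an existing group needs only the values that differ.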
Of course, it should be understood that various other conversion types in addition to, and/or in lieu of, those mentioned in
Returning to
In step 210 of
Referring now to
This section contains a list of all of the site-mining methods. Each method reference contains the following information:
- Method name
- Description: A brief description of the method including a list of all objects the method can extract from an HTML page.
- Preceded by: For methods that have to come after a particular method.
- Followed by: A list of other methods that you can use in the site-mining expression following the object specified by this method. Because a site-mining expression drills down to a specific object in the DOM hierarchy, these methods must be provided in a specific order.
- Example: Excerpts from an example site-mining expression used to extract content from a web site.
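The reason the methods must be provided in a specific order can be illustrated with a minimal sketch of drilling down a document hierarchy. The function drill_down and its (tag, index) step format are hypothetical and do not reproduce the actual site-mining methods or expression syntax; the sketch also assumes well-formed markup so that the standard xml.etree.ElementTree parser can be used.

```python
import xml.etree.ElementTree as ET


def drill_down(root, steps):
    """Follow (tag, index) steps from root toward a single target object.

    Each step selects the index-th child with the given tag name among the
    children of the object selected by the previous step. None is returned
    as soon as a step cannot be satisfied.
    """
    node = root
    for tag, index in steps:
        matches = [child for child in node if child.tag == tag]
        if index >= len(matches):
            return None
        node = matches[index]
    return node


page = ET.fromstring(
    "<html><body><table><tr><td>A</td><td>B</td></tr></table></body></html>"
)
# Steps given in hierarchy order reach the second cell of the first row.
cell = drill_down(page, [("body", 0), ("table", 0), ("tr", 0), ("td", 1)])
# The same steps given out of order do not match the hierarchy.
missing = drill_down(page, [("table", 0), ("body", 0)])
```

Because each step is evaluated relative to the object selected by the preceding step, an expression whose steps are out of order fails to locate the desired object.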
The following table summarizes all of the site-mining methods that can be used to build site-mining expressions.
- Transform name
- Description: a brief description of the method
- Syntax: how the transform would appear in your template (either a site-specific template or the globals template)
- Example: an example of the sw-transforms
The techniques of the present invention may be implemented on a computing unit such as that depicted in
Viewed externally in
The computer system also has an optional display 1108 upon which information, such as the screens illustrated in
Although computer system 1100 is illustrated having a single processor, a single hard disk drive and a single local memory, the system 1100 is optionally suitably equipped with any multitude or combination of processors or storage devices. Computer system 1100 is, in point of fact, able to be replaced by, or combined with, any suitable processing system operative in accordance with the principles of the present invention, including hand-held, laptop/notebook, mini, mainframe and super computers, as well as processing system network combinations of the same.
A display interface 1218 interfaces display 1108 and permits information from the bus 1202 to be displayed on the display 1108. Again as indicated, display 1108 is also an optional accessory. For example, display 1108 could be substituted or omitted. Communications with external devices, for example, the other components of the system described herein, occur utilizing communication port 1516. For example, optical fibers and/or electrical cables and/or conductors and/or optical communication (e.g., infrared, and the like) and/or wireless communication (e.g., radio frequency (RF), and the like) can be used as the transport medium between the external devices and communication port 1516. Peripheral interface 1520 interfaces the keyboard 1410 and the mouse 1412, permitting input data to be transmitted to the bus 1202.
In alternate embodiments, the above-identified CPU 1204 may be replaced by or combined with any other suitable processing circuits, including programmable logic devices, such as PALs (programmable array logic) and PLAs (programmable logic arrays), DSPs (digital signal processors), FPGAs (field programmable gate arrays), ASICs (application specific integrated circuits), VLSIs (very large scale integrated circuits) or the like.
One of the implementations of the invention is as sets of instructions resident in the random access memory 1208 of one or more computer systems 1100 configured generally as described above. Until required by the computer system, the set of instructions may be stored in another computer readable memory, for example, in the hard disk drive 1214, or in a removable memory such as an optical disk for eventual use in the CD-ROM 1212 or in a floppy disk for eventual use in a floppy disk drive 1104, 1106. Further, the set of instructions (such as those written in the Java programming language) can be stored in the memory of another computer and transmitted in a transmission means such as a local area network or a wide area network such as the Internet 114 when desired by the user. One skilled in the art knows that storage or transmission of the computer program product changes the medium electrically, magnetically, or chemically so that the medium carries computer readable information.
The many features and advantages of the invention are apparent from the detailed specification, and thus, it is intended by the appended claims to cover all such features and advantages of the invention which fall within the true spirit and scope of the invention. Further, since numerous modifications and variations will readily occur to those skilled in the art, it is not desired to limit the invention to the exact construction and operation illustrated and described, and accordingly, all suitable modifications and equivalents may be resorted to, falling within the scope of the invention. While the foregoing invention has been described in detail by way of illustration and example of preferred embodiments, numerous modifications, substitutions, and alterations are possible without departing from the scope of the invention defined in the following claims.
Claims
1-105. (canceled)
106. A system for extracting and reformatting web page content into a format readable on a mobile device, comprising:
- a receiving module to receive a user request from said mobile device for web page content having a first format;
- a physical proxy server to retrieve said requested web page content from a physical origin web server, to access a predefined extraction and conversion file associated with said requested web page content, and to convert said requested web page content in accordance with said predefined extraction and conversion file associated with said requested web page content.
107. The system according to claim 106, wherein said physical proxy server comprises:
- a storage repository that contains at least one data file associated with each of said mobile device; and
- a conversion engine that receives said requested web page content and site-mines and reformats at least a portion of said requested web page content from said requested web page content having said first format, for transmission to said mobile device, in accordance with one or more predetermined instructions in each of said at least one data file associated with each of said mobile device.
108. The system according to claim 106, further comprising:
- a cache that stores said requested web page content prior to transmitting said extracted and reformatted web content to said mobile device.
109. The system according to claim 106, wherein:
- at least one of said data files is defined for at least two mobile devices having a predefined common characteristic.
110. The system according to claim 109, wherein said predefined characteristic is at least one of:
- a type of operating system,
- a type of browser, and
- a manufacturer.
111. The system according to claim 106, wherein:
- at least one of said data files is defined for a particular type of said mobile device.
112. The system according to claim 111, wherein:
- said particular type of said mobile device is defined by said manufacturer and model.
113. The system according to claim 106, further comprising:
- a secure connection between said mobile device and said physical proxy server provided by a secure socket layer connection.
114. The system according to claim 106, further comprising:
- a secure connection provided between said physical proxy server and said physical origin web server.
115. The system according to claim 114, wherein:
- said secure connection is a secure socket layer connection.
116. A method of extracting and reformatting web page content into a format readable on a mobile device, comprising:
- receiving a user request from said mobile device for web page content having a first format; and
- retrieving said requested web page content from a physical origin web server;
- accessing a predefined extraction and conversion file associated with said requested web page content; and
- converting said requested web page content in accordance with said predefined extraction and conversion file associated with said requested web page content.
117. The method according to claim 116, further comprising:
- providing a storage repository that contains at least one data file associated with each of said mobile device; and
- providing a conversion engine that receives said requested web page content and site-mines and reformats at least a portion of said requested web page content from said requested web page content having said first format, for transmission to said mobile device, in accordance with one or more predetermined instructions in each of said at least one data file associated with each of said mobile device.
118. The method according to claim 116, further comprising:
- caching said requested web page content prior to transmitting said extracted and reformatted web content to said mobile device.
119. The method according to claim 116, wherein:
- at least one of said data files is defined for at least two mobile devices having a predefined common characteristic.
120. The method according to claim 119, wherein said predefined characteristic is at least one of:
- a type of operating system,
- a type of browser, and
- a manufacturer.
121. The method according to claim 116, wherein:
- at least one of said data files is defined for a particular type of said mobile device.
122. The method according to claim 121, wherein:
- said particular type of said mobile device is defined by said manufacturer and model.
123. The method according to claim 116, further comprising:
- establishing a secure connection between said mobile device and said physical proxy server through a secure socket layer connection.
124. The method according to claim 116, further comprising:
- establishing a secure connection provided between said physical proxy server and said physical origin web server.
125. The method according to claim 124, wherein:
- said secure connection is a secure socket layer connection.
126. Apparatus for extracting and reformatting web page content into a format readable on a mobile device, comprising:
- means for receiving a user request from said mobile device for web page content having a first format; and
- means for retrieving said requested web page content from a physical origin web server;
- means for accessing a predefined extraction and conversion file associated with said requested web page content; and
- means for converting said requested web page content in accordance with said predefined extraction and conversion file associated with said requested web page content.
127. The apparatus according to claim 126, further comprising:
- means for storage repository that contains at least one data file associated with each of said mobile device; and
- means for site-mining that receives said requested web page content and site-mines and reformats at least a portion of said requested web page content from said web page content having said first format, for transmission to said mobile device, in accordance with one or more predetermined instructions in each of said at least one data file associated with each of said mobile device.
Type: Application
Filed: Jul 1, 2009
Publication Date: Jan 21, 2010
Inventors: Yin Cheng (Vienna, VA), Wilfredo Padin (Ashburn, VA), Rongli Jiang (Herndon, VA), Andrew Fedorchek (Centreville, VA)
Application Number: 12/458,153
International Classification: G06F 15/16 (20060101);