Systems and methods for efficiently capturing high-quality scans of multi-page documents with hand-held devices
Capturing a sequence of images of a multi-page printed document is performed by a handheld device, such as a Smartphone. The device has one or more processors, memory, and a digital image sensor. The device monitors preview images of a first printed page of the multi-page printed document, where the preview images are generated by the digital image sensor. Without user indication of when to capture an image, the device captures a still image of the first printed page when a first quality metric of the preview images exceeds a first quality threshold. The device repeats the monitoring and capturing of additional pages until receiving indication from the user that capturing images is complete. In response to receiving indication from the user that capturing images is complete, the device concatenates the captured still images into a single digital document and stores the single digital document.
This application is related to U.S. patent application Ser. No. 13/586,784, filed Aug. 15, 2012, entitled “Smart Document Capture based on Estimated Scanned-Image Quality,” which is hereby incorporated by reference in its entirety.
TECHNICAL FIELD
The disclosed implementations relate generally to the capture of digital images, and more specifically to capturing a sequence of digital images using a handheld device.
BACKGROUND
There are applications that enable people to use their mobile phones as hand-held document scanners. The typical process requires a user to simultaneously hold a document in place, position the mobile phone/device, review the preview images, and press a button when the user believes the image is clear and properly positioned.
Capturing a high-quality scan of a multi-page document with a hand-held device is particularly difficult. Users must pay attention to the camera preview while simultaneously positioning the device (for proper framing) and steadying the device (to maximize sharpness). The user manually snaps and reviews each photo before moving on to the next page. Finally, users must divert cognitive and physical resources to turn pages, while attempting to minimize the impact of page-turning on the camera positioning and steadying that they performed previously. This process is slow, cumbersome, and error prone.
SUMMARY
The present invention reduces the cognitive and physical demands on users scanning multi-page documents with handheld devices. From a user's perspective, once the device is initially positioned and steadied, the experience is similar to shooting a video of the document while turning pages. Disclosed implementations do what is necessary to ensure a high-quality scan. This allows users to focus their cognitive and physical resources on a simpler task: holding the device steady with one hand while turning pages with the other hand.
In some implementations, during the capture process, the application: (a) analyzes image quality of preview frames to enable auto-shooting; (b) analyzes image quality of captured images to determine when re-shooting is necessary; and (c) signals to the user (e.g., via sound) when the user can turn the page. After capture, some implementations provide streamlined user interfaces that leverage image quality to direct a user's attention to just those images most in need of review, thereby reducing the need to review all images. Some implementations support burst-mode capture (e.g., capturing more than one photo of each page, then automatically selecting the best one for each page) as well as streamlined 2-sided document capture and review.
As described below, the user does not need to push a button to capture an image. Instead, capture is performed without requiring the user to decide when the image quality is good enough. Disclosed implementations address capturing images of multi-page documents, including: methods for handling when to capture; retaking a shot of poor quality (e.g., blurry or not framed well); and reordering captured images. In addition, some implementations support taking multiple shots of a single page, which can improve the quality of saved images when performing multi-page capture.
To support optimal capture of entire pages, some implementations automatically identify the edges of a page, allowing for documents where the corner is stapled, and preventing the capture of images where portions of a page are obscured by extraneous objects, such as fingers.
According to some implementations, a method of capturing a sequence of images of a multi-page printed document is performed by a handheld device, such as a Smartphone. The device has one or more processors, memory, and a digital image sensor. The device monitors preview images of a first printed page of the multi-page printed document, where the preview images are generated by the digital image sensor. Without user indication of when to capture an image, the device captures a still image of the first printed page when a first quality metric of the preview images exceeds a first quality threshold. In some implementations, the user is notified of the image capture, and thus the user knows to turn to the next page in the printed document. The device repeats the monitoring and capturing of additional pages until receiving indication from the user that capturing images is complete. The indication can be active (e.g., explicitly pressing a “Finished” button or closing the scanning application) or passive (e.g., lack of movement or device inactivity for a certain period of time). In response to receiving indication from the user that capturing images is complete, the device concatenates the captured still images into a single digital document and stores the single digital document.
According to some implementations, capturing an image of a respective printed page further comprises after capturing the still image: evaluating at least a portion of the captured still image at full resolution according to a second quality metric; and when the second quality metric is below a second quality threshold, discarding the captured still image, notifying the user of the discarded image, and repeating the monitoring of preview images of the respective printed page.
According to some implementations, capturing an image of a respective printed page further comprises, after capturing the still image: evaluating at least a portion of the captured still image at full resolution according to a second quality metric; comparing the second quality metric to a second quality threshold; when the second quality metric is below the second quality threshold, repeating the monitoring of preview images of the respective printed page; and when the second quality metric is at or above the second quality threshold, notifying the user that the still image has been captured.
According to some implementations, in response to receiving indication from the user that capturing images is complete, the device performs some additional operations. The device evaluates each of the captured still images, and assigns a respective second quality score to each respective captured still image. The device sorts the captured still images according to their assigned second quality scores. Then the device prompts the user to review the captured still images in the sorted order, and repeats the monitoring and capturing operations for images selected by the user. In this review process, the captured still images selected by the user are replaced with higher quality captured images.
Like reference numerals refer to corresponding parts throughout the drawings.
DESCRIPTION OF IMPLEMENTATIONS
Above the document is a handheld device 102, held by a person's right hand 110. The display screen 104 shows a preview image from the digital image sensor 252, which is on the opposite side of the device 102 (see
- an operating system 216 that includes procedures for handling various basic system services and for performing hardware dependent tasks;
- a communication module 218 that is used for connecting the handheld device 102 to other computer systems via the one or more communication interfaces 204 (wired or wireless) and one or more communication networks, such as the Internet, other wide area networks, local area networks, metropolitan area networks, and so on;
- a user interface module 220 that receives commands from the user via the input devices 210 and generates user interface objects in the display device 208;
- a web browser 222 that enables a user to access resources, web pages, and web applications over a communication network;
- a scanning application 224 that enables a user to efficiently create digital images of multi-page printed documents, as described with respect to FIGS. 3A-15C below. Included in the scanning application 224 are various modules, parameters, and data, including:
  - a sharpness calculation module 226, using techniques such as those described in U.S. patent application Ser. No. 13/586,784;
  - one or more sharpness thresholds 228, which are used to determine when to capture and/or when to discard a captured photo 236;
  - a photo framing module 230, which facilitates properly framing a printed page 108 with respect to the digital image sensor 252 (described in more detail with respect to FIGS. 3A-3C);
  - an image detection module 232, which detects and alerts a user when an object (such as a finger) is protruding over a printed page (described in greater detail with respect to FIGS. 13 and 14);
  - one or more preview images 234 (or a preview image buffer), which are low resolution images provided by the digital image sensor 252 to the sharpness calculation module 226 to determine when to capture a photo;
  - one or more stored (captured) images 236, which are taken at full resolution of the digital image sensor 252; and
  - zero or more user preferences 238, which can specify sharpness threshold(s) 228, how long to wait after capturing a photo before evaluating the preview images 234 for the next page (e.g., half a second), default format for saved documents (e.g., PDF), default directory location for saved documents, layout of elements in the user interface, and so on;
- one or more digital documents 240, built by the scanning application 224 as a concatenation of captured images 236.
Each of the above identified elements may be stored in one or more of the previously mentioned memory devices, and corresponds to a set of instructions for performing a function described above. The set of instructions can be executed by one or more processors (e.g., the CPUs 202). The above identified modules or programs (i.e., sets of instructions) need not be implemented as separate software programs, procedures or modules, and thus various subsets of these modules may be combined or otherwise re-arranged in various implementations. In some implementations, memory 214 may store a subset of the modules and data structures identified above. Furthermore, memory 214 may store additional modules and data structures not described above.
Although
Each of the methods described herein may be governed by instructions that are stored in a computer readable storage medium and that are executed by one or more processors of handheld device 102. Each of the operations shown in figures (e.g.,
In the implementation illustrated in
In some implementations, the same feedback provided by the inner preview image frame is used in the outer preview image frame (314A, 314B, and 314C). In other implementations, an inner preview image frame is not used at all, and the framing feedback is provided just by the outer preview image frame (314A, 314B, and 314C).
Some implementations use the inner and outer framing rectangles to provide different feedback, as illustrated in
In the implementation illustrated in
Some implementations provide user configurable options 238 regarding how the framing feedback is provided. For example, some implementations have user preferences to determine which framing rectangles to display, what colors or shading patterns are used, and what the colors or patterns indicate for each of the framing rectangles.
Some implementations include one or more controls or informational icons at the bottom of the display screen 104.
In some implementations, the scanning application 224 provides additional information, such as visual display indicators 310 and 312 in the user interface in the display screen 104. In some implementations, the icon 312 indicates a minimum threshold for quality of the preview images, and bar 310 indicates the quality of the current preview images (the higher the quality, the further the bar extends to the right).
When the user initiates (402) the scanning of a multi-page printed document, the application 224 begins a looping process of scanning pages until the user indicates (404) that the process is complete. One of skill in the art recognizes that the user can initiate (402) scanning in many ways, such as opening the scanning application 224, or pressing a designated button (e.g., “Start Scan”) in the user interface of the scanning application 224. When the user indicates (404) completion, the process finishes (422) by saving the captured images in the form of a digital document. One of skill in the art recognizes that a user can indicate (404) completion of a scan in many ways, such as pressing a designated button (e.g., “Fin” or “Finished”), closing the scanning application 224, setting down the device 102 (e.g., detecting lack of movement), or inactivity for a designated timeout period.
The scanning application 224 executing on the device 102 receives (406) preview images 234 taken by the digital image sensor 252. Typically the preview images are taken at a low resolution. Even at a low resolution, the preview images are sufficient to determine whether a page 108 is framed properly, and they allow the application 224 to compute one or more image quality measures, such as how sharp the image is. In some implementations with multiple quality measures, the quality measures are combined in a single quality metric. When the individual quality measures of the preview images or the combined quality measures of the preview images are (410) below a predefined quality threshold, the scanning application provides (412) feedback to the user about the quality, and continues to get (406) preview images. In some implementations the feedback is presented on the display screen 104 (e.g., bar 310) or as an audible indication. This feedback regarding poor quality images may continue for several seconds while the user steadies the device 102 and/or reorients the device 102 to provide better focus.
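The decision at step 410 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the weighted average used to combine measures, the measure names, and the function names are all assumptions, and each measure is assumed to be normalized to the range 0 to 1.

```python
from typing import Dict


def combined_quality(measures: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine individual quality measures into one metric.

    Assumption: a weighted average; the source does not say how measures are combined.
    """
    total_weight = sum(weights[name] for name in measures)
    return sum(measures[name] * weights[name] for name in measures) / total_weight


def should_capture(measures: Dict[str, float],
                   weights: Dict[str, float],
                   quality_threshold: float) -> bool:
    """Return True when the combined metric is at or above the threshold.

    False corresponds to giving feedback and fetching more preview images.
    """
    return combined_quality(measures, weights) >= quality_threshold


# Hypothetical preview-frame measures and weights.
preview = {"sharpness": 0.9, "framing": 0.7}
weights = {"sharpness": 2.0, "framing": 1.0}
ready = should_capture(preview, weights, quality_threshold=0.8)
```

A single combined metric keeps the capture decision to one comparison, at the cost of letting one strong measure mask a weak one; checking each measure against its own threshold is the alternative the text also allows.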
One method of computing image quality is disclosed in U.S. patent application Ser. No. 13/293,845. Some implementations also utilize accelerometer readings to determine if the device is moving. Although a stationary device is good for sharpness, lack of movement is not sufficient to guarantee a high-quality image. For example, disclosed implementations consider focus, framing, rotation, shadows, lighting, and depth variation.
To compute the sharpness of a preview image 234 or captured image 236, some implementations calculate a difference in grayscale values between pixels at an edge of an image. The calculation may involve determining a slope of grayscale values between two pixels in an edge window and estimating the edge sharpness based on the slope, multiple slopes, or differences in the slopes.
In some implementations, the sharpness calculation module 226 takes an image (either a preview 234 or captured 236), and creates a smoothed copy of the image (smoothed in the x direction, y direction, or both). The smoothing may be performed by any smoothing technique or a filter such as a median filter. The sharpness module 226 then identifies a first window with a first plurality of pixels around an edge pixel of the smoothed image and identifies a second window corresponding to the first window, where the second window is from the original (non-smoothed) image. The second window has a second plurality of pixels around an edge pixel of the original image and the second plurality of pixels correspond to the first plurality of pixels. In some implementations, the smoothed image is generated on the fly for the pixels in the identified window of the original image so that only the pixels of the window are smoothed. The sharpness calculation module estimates edge sharpness based on determining differences in grayscale value between first pairs of pixels from the first window and second pairs of pixels from the second window over at least one of an x-axis direction and a y-axis direction of the first window and the second window.
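As a rough illustration of comparing an original window with its smoothed counterpart, the sketch below works on a one-dimensional row of grayscale values (the x direction only). Several choices are assumptions made for this example rather than details from the source: a 3-tap box filter is used as the smoothing step (a median filter would leave an ideal step edge unchanged in this toy case), and the score is defined as one minus the ratio of the largest adjacent-pixel difference after smoothing to the largest difference before smoothing.

```python
from typing import List, Sequence


def box_smooth(row: Sequence[float]) -> List[float]:
    """Smooth a row with a 3-tap box filter, replicating the end pixels."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0 for i in range(len(row))]


def max_pair_difference(row: Sequence[float]) -> float:
    """Largest absolute grayscale difference between adjacent pixel pairs."""
    return max(abs(b - a) for a, b in zip(row, row[1:]))


def relative_edge_sharpness(window: Sequence[float]) -> float:
    """Hypothetical score: near 0 for an already-blurry edge, larger for a sharp one.

    Smoothing changes a sharp edge a lot and a blurry edge very little, so the
    drop in the steepest slope after smoothing indicates how sharp the edge was.
    """
    original = max_pair_difference(window)
    if original == 0:
        return 0.0  # flat window: no edge to measure
    smoothed = max_pair_difference(box_smooth(window))
    return 1.0 - smoothed / original


sharp_edge = [10, 10, 10, 200, 200, 200]    # abrupt step
blurry_edge = [10, 48, 86, 124, 162, 200]   # gradual ramp
sharp_score = relative_edge_sharpness(sharp_edge)
blurry_score = relative_edge_sharpness(blurry_edge)
```

Because only the pixels of the window are smoothed, this matches the on-the-fly variant described above, in which a full smoothed copy of the image is never built.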
When the individual quality measures of the preview images or the combined quality measures of the preview images are (410) at or above the quality threshold, the scanning application proceeds with taking a full-resolution photo. This includes providing (414) feedback that the image quality measurements are good and capturing (416) the actual photo of the page 108. The feedback regarding satisfactory quality can include the same visual means used to provide feedback on poor quality images (e.g., bar 310), but may also include other visual or auditory indications as well. In the implementation of
The process flow in
The scanning application 224 then compares the quality measures of the photo (or a single combined quality measure) to a photo quality threshold, such as sharpness threshold 228. If the quality measures are (504) not adequate, the scanning application 224 provides (412) feedback about the quality measures and gets (406) new preview images for the same page 108. In some implementations, the scanning application 224 provides auditory feedback (such as a “beep”) so that the user knows not to turn to the next page of the printed document. On the other hand, if the quality measures of the photo are (504) adequate, the scanning application 224 proceeds to store (418) the photo (e.g., saved to a non-volatile storage medium).
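The capture-then-verify loop for a single page might look like the sketch below. The `capture` and `evaluate` callables stand in for the camera and the quality computation and are hypothetical, as is the `max_attempts` cap; the source describes retrying without stating a limit.

```python
from typing import Callable, Optional, Tuple, TypeVar

Photo = TypeVar("Photo")


def capture_until_adequate(capture: Callable[[], Photo],
                           evaluate: Callable[[Photo], float],
                           photo_threshold: float,
                           max_attempts: int = 5) -> Tuple[Optional[Photo], int]:
    """Capture photos of one page until one meets the photo quality threshold.

    Returns (photo, attempts_used); photo is None if every attempt fell short.
    """
    for attempt in range(1, max_attempts + 1):
        photo = capture()
        if evaluate(photo) >= photo_threshold:
            return photo, attempt  # adequate: the caller stores the photo
        # Not adequate: a real application would signal the user (e.g., a beep)
        # and return to monitoring preview images before the next capture.
    return None, max_attempts


# Simulated photos whose "quality" is just the number itself.
simulated = iter([0.4, 0.6, 0.9])
photo, attempts = capture_until_adequate(lambda: next(simulated), lambda p: p, 0.8)
```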
Even when the quality measures used to evaluate a preview image are the same as the quality measures used to evaluate a captured photo, the threshold may be different. For example, a user may set the quality threshold for photos higher. In some implementations, the quality thresholds are user-configurable.
The process flow in
In
The scanning application 224 then identifies (706) the set of all stored photos whose quality measures are below the photo quality threshold. If this set is (708) empty, the review process is (712) complete. Otherwise, the scanning application alerts the user of the need to retake certain photos, and prompts (710) the user to retake each such photo. The display screen 104 indicates which photos need to be retaken. Retaking a photo uses the same capture process used earlier in the image capture process. In some implementations, the review process repeats after retakes are complete. In some implementations, after the initial batch review process, the individual photos are evaluated one by one, as illustrated in steps 502 and 504 of
Similar to the review process in
In the process flow of
The illustrated user interface includes a “Retry” button 98 for each captured photo, which allows the user to retry capturing a photo of the same page. When pressed, the scanning application 224 discards the current photo for the corresponding page, and initiates the photo capture process as illustrated in
Some implementations include an “Add” button 912, which allows a user to capture additional images. For example, if the user inadvertently missed a page during the original scan of the printed document, the user may press the “Add” button 912 to initiate capture of the missed page. Some implementations include a button 914 to save the currently captured photos as a document. In
The area between the preview image frames 1102 and 1104 is broken into tiles, which are either corner tiles 1108 or non-corner tiles 1106. In
The purpose of framing is to capture an entire page, including edges. Extraneous background can be cropped later. Some implementations use functions in the Open Source Computer Vision library (OpenCV) to identify edges using a Canny edge detector: find contours, compute the bounding box of each contour, and then identify one or more of the largest contours as candidates for the page edge. Large contours can be selected based on contour length, the area covered by a contour, or the area of a bounding box enclosing the contour.
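The selection of large contours by bounding-box area can be sketched in plain Python. The example assumes the contours have already been extracted (for instance with an edge detector and contour finder such as those in OpenCV) and are available as lists of (x, y) points; the function names are illustrative only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def bounding_box(contour: Sequence[Point]) -> Tuple[int, int, int, int]:
    """Axis-aligned bounding box of a contour as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    return min(xs), min(ys), max(xs), max(ys)


def bounding_box_area(contour: Sequence[Point]) -> int:
    """Area of the contour's bounding box, in square pixels."""
    min_x, min_y, max_x, max_y = bounding_box(contour)
    return (max_x - min_x) * (max_y - min_y)


def largest_contours(contours: Sequence[Sequence[Point]], count: int = 1) -> List[Sequence[Point]]:
    """Return the `count` contours with the largest bounding boxes as page-edge candidates."""
    return sorted(contours, key=bounding_box_area, reverse=True)[:count]


page = [(10, 10), (310, 10), (310, 410), (10, 410)]   # large, page-like contour
speck = [(50, 50), (55, 50), (55, 54)]                # small noise contour
candidates = largest_contours([speck, page], count=1)
```

Contour length or enclosed area could be used as the sort key instead, as the paragraph above notes.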
Each contour is then checked to determine whether it fits between the framing rectangles 1102 and 1104. Each tile, except for the corner tiles, is checked for whether it contains a minimum (and optionally a maximum) number of contour points. For example, some implementations identify an edge when each tile contains at least (0.8*height) contour points, where height is measured in pixels. This allows for some missing edge points due to “noise” in the captured image.
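A per-tile check along these lines is sketched below for a tile on the left or right side of the frame, where the page edge runs vertically and the tile height is the relevant length. The tile representation as (x0, y0, x1, y1) and the function names are assumptions for this example.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int]
Tile = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive


def count_points_in_tile(points: Sequence[Point], tile: Tile) -> int:
    """Count contour points that fall inside the tile."""
    x0, y0, x1, y1 = tile
    return sum(1 for x, y in points if x0 <= x < x1 and y0 <= y < y1)


def tile_contains_edge(points: Sequence[Point], tile: Tile, min_fraction: float = 0.8) -> bool:
    """True when the tile holds at least min_fraction * height contour points."""
    _, y0, _, y1 = tile
    height = y1 - y0
    return count_points_in_tile(points, tile) >= min_fraction * height


side_tile = (0, 0, 20, 100)
noisy_edge = [(10, y) for y in range(100) if y % 10 != 0]   # 90 points, a few missing
sparse_edge = [(10, y) for y in range(0, 100, 2)]           # only 50 points
```

With these inputs the noisy edge passes the 0.8 * height test (90 of 100 rows covered) while the sparse one does not, which is the tolerance to noise the text describes.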
The corner tiles 1108 are not included because the number of contour pixels depends on how close a page corner is to a corner of an inner framing rectangle (which could be a very small number). In addition, if a document is stapled, the contour at the stapled corner will include part of the adjacent page. This is illustrated in the upper right corner in
Some implementations identify “page intrusions,” such as a finger or other object covering a portion of an image. This technique is typically applied to preview images, but can also be applied to captured images.
Some implementations identify lines formed by the contour points within each non-corner tile 1106 (e.g., using a Hough Transform) and then check whether the distance between contour points and the identified lines has significantly greater variation in one or more of the edge tiles 1106.
In some implementations, the scanning application 224 finds turning points of contour direction and finds 4 document edge lines. Turning points are identified by computing a vector between each pair of adjacent contour points and identifying those points where the direction changes more than a defined amount (e.g., 80 degrees). Some implementations sample the contour points for more efficient computation. The edge lines can be computed using standard techniques, such as the Hough transform. Because the document edge lines 1310 may be curved in the image, some implementations approximate the document edge lines 1310 with a bounding quadrilateral 1306 (illustrated in gray). In some implementations, the bounding quadrilateral 1306 is selected as a rectangle with axes parallel to the sides of the image. When some turning points are inside the bounding quadrilateral 1306, there are some extraneous objects.
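The turning-point test can be illustrated with a short sketch. It treats the contour as a closed loop of sampled points, measures the change in direction at each point from the vectors to its neighbors, and then reports turning points that lie strictly inside an axis-parallel bounding rectangle. The function names and the closed-loop treatment are assumptions for this example; the Hough-based line fitting is omitted.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def turning_points(contour: Sequence[Point], min_angle_degrees: float = 80.0) -> List[Point]:
    """Points of a closed contour where the direction changes by more than the given angle."""
    n = len(contour)
    result = []
    for i in range(n):
        px, py = contour[i - 1]
        cx, cy = contour[i]
        nx, ny = contour[(i + 1) % n]
        ax, ay = cx - px, cy - py      # incoming vector
        bx, by = nx - cx, ny - cy      # outgoing vector
        # Signed angle between the vectors from their cross and dot products.
        angle = abs(math.degrees(math.atan2(ax * by - ay * bx, ax * bx + ay * by)))
        if angle > min_angle_degrees:
            result.append(contour[i])
    return result


def intruding_points(points: Sequence[Point], rect: Tuple[float, float, float, float]) -> List[Point]:
    """Turning points strictly inside the bounding rectangle (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = rect
    return [(x, y) for x, y in points if x0 < x < x1 and y0 < y < y1]


# A page outline with a notch cut into one side, as a finger might produce.
notched = [(0, 0), (10, 0), (10, 10), (6, 10), (6, 7), (4, 7), (4, 10), (0, 10)]
corners = turning_points(notched)
intrusions = intruding_points(corners, (0, 0, 10, 10))
```

A clean rectangular page yields only turning points on the bounding rectangle itself, so an empty result from `intruding_points` corresponds to no extraneous objects.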
This is illustrated in
In
A process for identifying document intrusions is shown in the flow chart of
The scanning application 224 on the device 102 monitors (1508) preview images 234 of a first printed page 108 of the multi-page printed document. The preview images 234 are generated (1510) by the digital image sensor. Typically the preview images 234 are not full-resolution. The scanning application 224 computes a first quality metric, which includes a framing quality component and/or a sharpness quality component. The first quality metric is compared to a first quality threshold (a sharpness threshold 228 and/or framing quality threshold), and when the first quality metric of the first page exceeds the first quality threshold, the scanning application captures (1512) a still image of the first printed page. The capture occurs (1514) without user indication of when to capture the image (i.e., automatically when the conditions for capture are met). In some implementations, the scanning application 224 notifies (1516) the user of the capture so that the user knows to turn to the next page in the document.
In some implementations, the captured image is evaluated immediately after capture to verify that the captured image 236 is of sufficient quality before going on to the next page of the document. In these implementations, at least a portion of the captured still image 236 is evaluated (1518) at full resolution according to a second quality metric. In some implementations, when the second quality metric is below a second quality threshold, the scanning application 224 repeats (1520) the monitoring of preview images 234 of the same printed page. In some of these implementations, the scanning application 224 discards the captured images 236 whose second quality metrics fall below the second quality threshold. The process of monitoring preview images 234, capturing a still image 236, and evaluating the captured still image 236 may occur multiple times for one page if the second quality metric of images of the page continues to be below the second quality threshold.
In some implementations, the scanning application notifies (1522) the user that a still image has been captured when the second quality metric for the captured image is at or above the second quality threshold. In these implementations, the user is notified of the capture only after the evaluation process so that the user does not prematurely turn to the next page of the document. Even though the handheld device could be capturing and evaluating several (or many) images, the user should keep the device steady and in place without turning the page until a quality image has been captured.
In alternative implementations that notify the user when an image is captured, the scanning application 224 notifies (1524) the user again when the quality of the captured image is below the second quality threshold, thus alerting the user that the capture process must be repeated. In some implementations, the user preferences 238 include the first and/or second quality thresholds, enabling a user to control the quality and speed of the photo capture. Some implementations also include a user preference 238 to specify whether image capture notifications occur before or after the image evaluation process. As described in more detail below, some implementations have a batch image evaluation process that occurs after all of the images have been captured.
Evaluating captured images immediately after capture has some weaknesses. For example, some printed pages may be inherently difficult to capture at very high quality (e.g., if the “original” printed document is blurry), so it may be difficult or impossible for the scanning application to capture a sufficiently good image. In another example, a captured image 236 may exceed the second quality threshold, and thus prompt the user to move on to the next page. However, this overlooks the possibility that an even better quality image could be captured. Some implementations address these and other issues by supporting a burst mode of image capture. In a burst mode, the scanning application 224 captures a plurality of images of the same printed page in a burst, providing feedback to the user (e.g., a “click” sound after each capture or a sound indicating that a certain number of images (e.g., 5 or 10) have been captured). When burst mode is used, the duplicate images are evaluated as a group, and the best one in each group is selected. Note that a user could also inadvertently capture more than one image of the same page. Both intentional and accidental capture of multiple images are addressed below.
The scanning application 224 repeats (1526) the monitoring and capturing of additional pages of the multi-page printed document until receiving indication from the user that capturing images is complete. In some instances, the indication from the user is active, such as pressing (1528) a button (e.g., “Done” or “Finished”) in a user interface on the device 102, or receiving (1530) a voice command from the user. In other instances, the indication from the user is passive, such as a predetermined span of inactivity 1532 or a predetermined span of time 1534 in which no motion of the handheld device 102 is detected. The passive indications include setting the device 102 down for a period of time (both inactivity and no motion) or placing the device in a pocket. Even though a device in a pocket might be moving, the preview images 234 inside a pocket should not meet the first quality threshold, and thus not result in captured images (i.e., inactivity). Typical implementations support both active and passive methods of indicating when capturing images is complete.
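One way to combine the active and passive indications is sketched below. The parameter names and the 30-second default timeouts are assumptions for illustration; the source only says the spans are predetermined.

```python
def capture_complete(done_pressed: bool,
                     seconds_since_last_capture: float,
                     seconds_since_last_motion: float,
                     inactivity_timeout: float = 30.0,
                     no_motion_timeout: float = 30.0) -> bool:
    """Decide whether the user has indicated that capturing images is complete.

    Active indication: a "Done"/"Finished" button press or voice command (done_pressed).
    Passive indication: no captures, or no device motion, for a predetermined span.
    """
    if done_pressed:
        return True
    if seconds_since_last_capture >= inactivity_timeout:
        return True  # e.g., device in a pocket: it moves, but no preview passes the threshold
    return seconds_since_last_motion >= no_motion_timeout  # e.g., device set down


still_scanning = capture_complete(False, 4.0, 1.5)
in_pocket = capture_complete(False, 45.0, 0.2)
```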
Some implementations support capturing (1536) still images of the multi-page document in an order not corresponding to the order of the pages in the multi-page printed document. For two-sided documents, this was illustrated previously with respect to
There are some operations that can be performed either after all of the images are captured and the user has indicated that the capture process is complete, or in parallel with the image capture process. In some implementations, both modes are supported, and parallel processing occurs when there are sufficient processing resources (e.g., CPU capacity, memory availability). In some implementations, one or more of the following operations are performed (1538) in parallel with the monitoring and capturing operations 1508, 1512, and 1526 identified above:
- Evaluate (1540) each of the respective captured still images 236, and assign a second quality score to each respective captured still image;
- Sort (1542) the captured still images 236 according to their assigned second quality scores;
- Identify (1544) a set of captured still images 236 whose corresponding second quality scores are below a second quality threshold; and/or
- Determine (1546) that two or more captured still images 236 correspond to a single printed page of the multi-page printed document. When this determination occurs, select (1548) one of the captured still images 236 corresponding to the single printed page (typically the one with the highest quality score), and discard (1550) all of the other captured still images 236 corresponding to the single printed page (i.e., keep the selected one, and discard the rest). In some implementations, when the images to discard are in non-volatile memory (e.g., a hard disk), they are physically deleted. In other implementations, when the images to discard are in volatile memory (e.g., RAM), the physical space allocated for “discarded” images is not necessarily erased or overwritten immediately.
When the scanning application 224 receives the indication from the user that capturing images is complete, the scanning application performs (1552) some additional operations. The review process was described previously with respect to
In some implementations, the scanning application sorts (1556) the captured still images 236 according to their assigned second quality scores. In some implementations, the scanning application 224 prompts (1558) the user to review the captured still images in the sorted order, as illustrated in
In other implementations, the scanning application automatically identifies (1560) a set of captured still images 236 whose corresponding second quality scores are below a second quality threshold. The scanning application then repeats (1562) the monitoring and capturing operations for each of the captured still images in the identified set, thereby replacing the captured still images in the identified set with higher quality captured images.
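Both review styles reduce to simple operations on the per-image quality scores, as the sketch below shows. It assumes scores are kept in a dictionary keyed by page number and that the sorted review presents the lowest-scoring images first; the source does not state the sort direction, so that ordering is an assumption.

```python
from typing import Dict, List


def review_order(scores: Dict[int, float]) -> List[int]:
    """Page numbers sorted so the lowest-quality images are reviewed first (assumed order)."""
    return sorted(scores, key=lambda page: scores[page])


def pages_to_retake(scores: Dict[int, float], second_quality_threshold: float) -> List[int]:
    """Page numbers whose second quality score is below the threshold, in page order."""
    return [page for page in sorted(scores) if scores[page] < second_quality_threshold]


# Hypothetical second quality scores for a three-page scan.
scores = {1: 0.9, 2: 0.4, 3: 0.7}
manual_review = review_order(scores)
automatic_retakes = pages_to_retake(scores, second_quality_threshold=0.75)
```

Raising the threshold preference trades scanning speed for quality: more pages land in the retake set.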
In some implementations, the second quality threshold is a user configurable preference 238, so the user can decide what quality level is satisfactory. Some implementations support both manual review of captured images 236 (typically sorted, as illustrated in
In some implementations, the scanning application determines (1564) that two or more captured still images 236 correspond to a single printed page of the multi-page printed document. When this determination occurs, the scanning application selects (1566) one of the captured still images 236 corresponding to the single printed page (typically the one with the highest quality score), and discards (1568) all of the other captured still images 236 corresponding to the single printed page (i.e., keep the selected one, and discard the rest). In some implementations, when the images to discard are in non-volatile memory (e.g., a hard disk), they are physically deleted. In other implementations, when the images to discard are in volatile memory (e.g., RAM), the physical space allocated for “discarded” images is not necessarily erased or overwritten immediately.
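The select-and-discard step can be illustrated with a short sketch. It assumes that each captured image has already been associated with a page identifier by some duplicate-detection step that is not shown; the `Capture` tuple and its `page_id` field are hypothetical names introduced for the example.

```python
from typing import Dict, List, NamedTuple


class Capture(NamedTuple):
    # Hypothetical record; page_id is assumed to come from a prior
    # duplicate-detection step that is not shown here.
    page_id: int
    quality_score: float
    filename: str


def keep_best_per_page(captures: List[Capture]) -> List[Capture]:
    """Keep the highest-scoring capture for each page and drop the rest."""
    best: Dict[int, Capture] = {}
    for cap in captures:
        current = best.get(cap.page_id)
        if current is None or cap.quality_score > current.quality_score:
            best[cap.page_id] = cap
    # Dicts preserve insertion order, so pages stay in first-seen order.
    return list(best.values())


if __name__ == "__main__":
    captures = [
        Capture(1, 0.6, "img_001.jpg"),
        Capture(1, 0.8, "img_002.jpg"),  # second capture of page 1
        Capture(2, 0.7, "img_003.jpg"),
    ]
    for cap in keep_best_per_page(captures):
        print(cap.page_id, cap.filename)
```

Choosing the highest score matches the "typically" case in the description; other selection rules would fit the same structure.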
Regardless of which of the review processes occurs in response to receiving indication from the user that capturing images is complete, the scanning application finally concatenates (1570) the captured still images into a single digital document 240. In some implementations, the captured still images are ordered (1572) corresponding to the order of the pages in the multi-page printed document, thus creating a digital document 240 whose pages correspond to the printed document. In some instances, the reordering is done for two-sided documents as illustrated and described with respect to
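As a rough illustration of reordering captures from a two-sided document, the sketch below assumes one particular capture order: all front sides first, then the stack is flipped and the back sides are captured in reverse. That capture order and the function name are assumptions made for the example and are not taken from this description.

```python
from typing import List, Sequence


def reorder_two_sided(captures: Sequence[str]) -> List[str]:
    """Interleave fronts and backs into printed page order.

    Assumed capture order: fronts of sheets 1..n, then backs of sheets n..1
    (as happens when the whole stack is flipped over).
    """
    if len(captures) % 2 != 0:
        raise ValueError("expected one front and one back per sheet")
    half = len(captures) // 2
    fronts = list(captures[:half])
    backs = list(captures[half:])[::-1]  # undo the reversal caused by the flip
    ordered: List[str] = []
    for front, back in zip(fronts, backs):
        ordered.extend([front, back])
    return ordered


if __name__ == "__main__":
    # Pages 1, 3, 5 are fronts; pages 6, 4, 2 are backs captured after the flip.
    print(reorder_two_sided(["p1", "p3", "p5", "p6", "p4", "p2"]))
```

The ordered list would then be concatenated into the single digital document; for one-sided documents the capture order can be used directly.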
The foregoing description, for purpose of explanation, has been described with reference to specific implementations. However, the illustrative discussions above are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Many modifications and variations are possible in view of the above teachings. The implementations were chosen and described in order to best explain the principles of the invention and its practical applications, to thereby enable others skilled in the art to best utilize the invention and various implementations with various modifications as are suited to the particular use contemplated.
Claims
1. A method of capturing a sequence of images of a multi-page printed document, performed by a handheld device with one or more processors, memory, and a digital image sensor, the method comprising:
- monitoring a time sequence of preview images of a first printed page of the multi-page printed document, the preview images generated by the digital image sensor;
- without user indication of when to capture an image, capturing a still image of the first printed page when a first quality metric of the preview images exceeds a first quality threshold;
- repeating the monitoring and capturing of additional pages of the multi-page printed document until receiving indication from the user that capturing images is complete; and
- in response to receiving indication from the user that capturing images is complete:
  - concatenating the captured still images into a single digital document; and
  - storing the single digital document.
2. The method of claim 1, wherein capturing an image of a respective printed page further comprises notifying the user to turn the page after capturing the still image.
3. The method of claim 1, wherein capturing an image of a respective printed page further comprises after capturing the still image:
- evaluating at least a portion of the captured still image at full resolution according to a second quality metric;
- when the second quality metric is below a second quality threshold, notifying the user that capture of the respective page will be repeated, and repeating the monitoring of preview images of the respective printed page.
4. The method of claim 1, wherein capturing an image of a respective printed page further comprises after capturing the still image:
- evaluating at least a portion of the captured still image at full resolution according to a second quality metric;
- when the second quality metric is below a second quality threshold, repeating the monitoring of preview images of the respective printed page; and
- when the second quality metric is at or above the second quality threshold, notifying the user the still image has been captured.
5. The method of claim 1, further comprising:
- evaluating each of the respective captured still images, and assigning a respective second quality score to each respective captured still image;
- sorting the captured still images according to their assigned second quality scores; and
- prompting the user to review the captured still images in the sorted order, and repeating the monitoring and capturing operations for images selected by the user, thereby replacing selected captured still images with higher quality captured images.
6. The method of claim 5, wherein the evaluating, sorting, and prompting are in response to receiving the indication from the user that capturing images is complete.
7. The method of claim 1, further comprising:
- evaluating each of the respective captured still images, and assigning a respective second quality score to each respective captured still image;
- identifying a set of captured still images whose corresponding second quality scores are below a second quality threshold; and
- repeating the monitoring and capturing operations for each of the captured still images in the identified set, thereby replacing the captured still images in the identified set with higher quality captured images.
8. The method of claim 7, wherein the evaluating, identifying, and repeating the monitoring and capturing operations for each of the captured still images in the identified set are in response to receiving the indication from the user that capturing images is complete.
9. The method of claim 1, wherein the indication from the user comprises a button press on a user interface of the handheld device.
10. The method of claim 1, wherein the indication from the user comprises a voice command from the user.
11. The method of claim 1, further comprising:
- determining that two or more captured still images correspond to a single printed page of the multi-page printed document;
- selecting one of the captured still images corresponding to the single printed page; and
- discarding all of the captured still images corresponding to the single printed page except the selected one.
12. The method of claim 11, wherein the determining, selecting, and discarding are in response to receiving the indication from the user that capturing images is complete.
13. The method of claim 1, further comprising:
- receiving indication from the user that the multi-page printed document is a two-sided document; and
- capturing still images of the multi-page printed document in an order not corresponding to the order of the pages in the multi-page printed document;
- wherein concatenating the captured still images into a single digital document further comprises ordering the captured still images corresponding to the order of the pages in the multi-page printed document.
14. The method of claim 1, wherein concatenating the captured still images into a single digital document further comprises ordering the captured still images based on the order in which the still images were captured.
15. The method of claim 1, wherein monitoring preview images further comprises:
- identifying four document edge lines in a first preview image;
- determining whether there are any turning points for the first preview image inside a rectangle formed by the four document edge lines; and
- when there are one or more turning points inside the rectangle, notifying the user of an image protrusion.
16. A system for capturing a sequence of images of a multi-page printed document, comprising:
- one or more processors;
- memory; and
- one or more programs stored in the memory, the one or more programs comprising instructions for:
  - monitoring a time sequence of preview images of a first printed page of the multi-page printed document, the preview images generated by the digital image sensor;
  - capturing a still image of the first printed page when a first quality metric of the preview images exceeds a first quality threshold, executed without user indication of when to capture the still image;
  - repeating the monitoring and capturing of additional pages of the multi-page printed document until receiving indication from the user that capturing images is complete; and
  - responding to receiving indication from the user that capturing images is complete, including instructions for:
    - concatenating the captured still images into a single digital document; and
    - storing the single digital document.
17. A computer readable storage medium storing one or more programs configured for execution by a computer, the one or more programs comprising instructions for:
- monitoring a time sequence of preview images of a first printed page of the multi-page printed document, the preview images generated by the digital image sensor;
- capturing a still image of the first printed page when a first quality metric of the preview images exceeds a first quality threshold, executed without user indication of when to capture the still image;
- repeating the monitoring and capturing of additional pages of the multi-page printed document until receiving indication from the user that capturing images is complete; and
- responding to receiving indication from the user that capturing images is complete, including instructions for:
  - concatenating the captured still images into a single digital document; and
  - storing the single digital document.
Type: Grant
Filed: Mar 15, 2013
Date of Patent: Apr 14, 2015
Patent Publication Number: 20140268247
Assignee: Fuji Xerox Co., Ltd. (Tokyo)
Inventors: Hiroshi Sakaida (Los Altos, CA), Scott Carter (Los Altos, CA), John Adcock (San Francisco, CA), David M. Hilbert (Palo Alto, CA), Francine Chen (Menlo Park, CA)
Primary Examiner: Madelein A Nguyen
Application Number: 13/843,409
International Classification: G06F 3/12 (20060101); G06K 15/00 (20060101); H04H 20/71 (20080101); H04H 40/00 (20080101); H04N 1/387 (20060101); H04N 1/00 (20060101); H04N 1/195 (20060101); H04N 101/00 (20060101);