Automatic Stacking Based on Time Proximity and Visual Similarity
Automatic stacking based on time proximity and visual similarity is described, including a method, comprising analyzing a time proximity of a plurality of electronic images, performing a visual similarity analysis on the plurality of electronic images, and stacking the plurality of electronic images based on a result of the time proximity analysis and the visual similarity analysis.
This is a divisional application of, and claims priority to U.S. patent application Ser. No. 11/394,804, filed Mar. 30, 2006, the disclosure of which is incorporated by reference herein.
FIELD OF THE INVENTION
The present invention relates generally to software. More specifically, a method and system for stacking images is described.
BACKGROUND
The advances of conventional digital camera technology have made the process of capturing images easier. However, capturing a good quality image is difficult and problematic using conventional techniques. To work around this problem, many users capture the same image multiple times with the expectation that one of the images may be useable for printing or assembly in a slide show. For example, a feature referred to as “burst mode” or “continuous shooting mode,” may allow a digital camera to take a sequence of images in rapid succession. Some digital camera models allow for an unlimited number of image captures, while other models limit the number of successive images to a single burst.
Capturing multiple images may contribute to image window clutter and unmanageability. In some conventional applications, users can group images in folders or files, such as the filing system in an operating system of a computer. Users may be required to manually select the images, drag them to a folder, and drop them in. This may be very tedious for an avid photographer. Even after the tedious organizational effort, the generic filing systems of operating systems may require the user to locate and select one particular image for printing, viewing, or copying functions.
Image-based applications may provide the concept of a “stack” which may provide a way to group images with some added benefits. The “stack” of images may be treated as if it were one image. That is, one thumbnail image is shown in a display window as a representation of the stack. The other images in the stack are stored and available, but may not have a visible thumbnail. Further, the image selected as the thumbnail may be automatically pre-selected for printing, or viewing. However, similarly to files or folders, users may be required to manually select the images in the stack, which may be tedious and time consuming.
Some applications provide for the grouping of images based on time proximity. That is, the images captured during a selected time interval may be grouped together. However, this method may group images together that have no subject matter commonality. In other words, images taken during a selected time interval may be visually different or disparate. For example, a user may be on a whale watching trip and capture images of a whale breaching when a sea bird passes between the lens and the whale. These images may be grouped by time proximity analysis simply because they were captured during the same time range. Other applications provide for the grouping of images based on visual similarity. That is, the images that share common scenery or common images are grouped together. However, this method may group differing views of the same subject captured years apart that a user may want to keep separate.
Thus, what is needed is a method and system for stacking images without the limitations of conventional techniques.
Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings:
Various embodiments of the invention may be implemented in numerous ways, including as a system, a process, an apparatus, or a series of program instructions on a computer readable medium such as a computer readable storage medium or a computer network where the program instructions are sent over optical or electronic communication links. In general, operations of disclosed processes may be performed in an arbitrary order, unless otherwise provided in the claims.
A detailed description of one or more embodiments is provided below along with accompanying figures. The detailed description is provided in connection with such embodiments, but is not limited to any particular example. The scope is limited only by the claims and numerous alternatives, modifications, and equivalents are encompassed. Numerous specific details are set forth in the following description in order to provide a thorough understanding. These details are provided for the purpose of example and the described techniques may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the embodiments has not been described in detail to avoid unnecessarily obscuring the description.
Techniques for stacking images are described, including receiving a plurality of electronic images, determining one or more groupings for the plurality of electronic images based on a time proximity and a visual similarity, and creating one or more stacks for the plurality of electronic images based on one or more of the groupings. A system for stacking digital images is also described, including a visual similarity engine that may be configured to assess a visual similarity between digital images, a time analyzer configured to analyze a time proximity between digital images, a grouping module configured to group digital images in one or more groups based on the visual similarity and the time proximity, and a user interface configured to display the groups of digital images.
In some examples, raster formats may be based on picture elements, or pixels. A display space or image may be divided up into thousands or millions of pixels. The pixels may be arranged in rows and columns and may be referred to as colored dots. Raster formats differ in the number of bits used per pixel and the compression technique used in storage. One bit per pixel can store black and white images. For example, a bit value of 0 may indicate a black pixel, and a bit value of 1 may indicate a white pixel. As the number of bits per pixel increases, the number of colors a pixel can represent may increase. Example raster formats for image files may include: Graphic Interchange Format (GIF), Joint Photographic Experts Group (JPEG), and Microsoft Windows™ Bitmap (Bitmap). Additional raster formats for image files may include: Windows® Clip Art (CLP), ZSoft Paintbrush™ (DCX), OS/2™ Warp Format (DIB), Kodak FlashPix™ (FPX), GEM Paint™ Format (IMG), JPEG Related Image Format (JIF), MacPaint™ (MAC), MacPaint™ New Version (MSP), Macintosh™ PICT Format (PCT), ZSoft Paintbrush™ (PCX), Portable Pixel Map by UNIX™ (PPM), Paint Shop Pro™ format (PSP), Unencoded image format (RAW), Run-Length Encoded (RLE), Tagged Image File Format (TIFF), and WordPerfect™ Image Format (WPG).
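The relationship between bits per pixel and the number of representable colors can be illustrated with a short sketch. The function name colors_for_bit_depth is hypothetical and is not part of the described system; it only computes the number of distinct values that a given number of bits can encode.

```python
def colors_for_bit_depth(bits_per_pixel: int) -> int:
    """Return how many distinct values one pixel can encode at this bit depth."""
    if bits_per_pixel < 1:
        raise ValueError("bits_per_pixel must be at least 1")
    # Each additional bit doubles the number of representable values.
    return 2 ** bits_per_pixel


# One bit per pixel gives two values (e.g., black and white); 8 and 24 bits
# per pixel are used here only as familiar illustrative depths.
for bits in (1, 8, 24):
    print(bits, "bits per pixel ->", colors_for_bit_depth(bits), "values")
```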
Vector formats are not based on pixels, but on vectors of data stored in mathematical formats, which may allow for curve shaping and improved scalability over raster images. Some example meta/vector image formats may include: CorelDraw™ (CDR), Hewlett-Packard Graphics Language (HGL), Hewlett-Packard Plotter Language (GL/2), Windows™ Metafile (EMF or WMF), Encapsulated PostScript™ (EPS), Computer Graphics Metafile (CGM), Flash™ Animation, Scalable Vector Graphics (SVG), and Macintosh™ graphic file format (PICT).
Thumbnails image 1 through image 24 in
Display window 100 may be a user interface in a file management application such as a digital image management application, a web page, or other application. Although display window 100 displays thumbnails image 1 through image 24, some other embodiments may include many more images to be managed. Management of images may include creating stacks, selecting stacks for slide shows and/or printing, and the like.
In some embodiments, creating a stack refers to the grouping of image files in a common holder represented by a single thumbnail. The stack may preserve the plurality of images without cluttering up the display window or user interface with multiple thumbnails. The image chosen as the stack representative may be printed or used in a slide show by selecting the stack. That is, upon selecting the stack, the representative image is automatically selected.
In some embodiments, grouping module 202 may include time analyzer 206. In some embodiments, time analyzer 206 may analyze the plurality of images for time proximity. Here, time proximity may include the measure of the difference between the time stamps of two or more images and the comparison of the difference to a time range. Time analyzer 206 may include a time range such that images which have time stamps that fall within the time range may be considered to have time proximity. In some embodiments, the time range may be set via user interface 210. In some embodiments, when two or more images possess time proximity and at least a low to medium visual similarity they may be grouped together by grouping module 202.
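The time proximity test described above can be sketched as a comparison of a time stamp difference against a time range. This is a minimal illustration, not the implementation of time analyzer 206: the function name has_time_proximity is hypothetical, and treating a difference exactly equal to the time range as "within" the range is an assumption.

```python
from datetime import datetime, timedelta


def has_time_proximity(stamp_a: datetime, stamp_b: datetime,
                       time_range: timedelta) -> bool:
    """Return True when two time stamps differ by no more than time_range.

    Assumption: the boundary is inclusive (a difference equal to the
    time range still counts as time proximity).
    """
    return abs(stamp_b - stamp_a) <= time_range


# Illustrative values only: two captures three seconds apart, compared
# against a ten-second time range such as a user might set.
first = datetime(2006, 3, 30, 12, 0, 0)
second = first + timedelta(seconds=3)
print(has_time_proximity(first, second, timedelta(seconds=10)))
```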
In some embodiments, grouping module 202 may include visual similarity engine 204. Visual similarity engine 204 may perform visual similarity analysis. That is, visual similarity engine 204 may analyze the plurality of images input into grouping module 202 to determine if they are visually similar. In some embodiments, visual similarity may be the measure of the degree of similarity between two or more images. Here, visual similarity may refer to having a likeness such that the images capture the same scene. In some embodiments, visual similarity engine 204 may select nodal points in an image for correlation to nodal points in another image. Visual similarity engine 204 may include a threshold value used as the measure beyond which visual similarity may be achieved. In some embodiments, visual similarity engine 204 may include more than one threshold value (e.g., minimum/low and maximum/high thresholds or low, medium, and high thresholds) such that the high threshold is greater than the medium threshold and the medium threshold is greater than the low threshold. Multiple threshold values may allow various levels of visual similarity to be determined. For example, if low, medium, and high visual similarity thresholds are set, the various levels of visual similarity may include low visual similarity, medium visual similarity, and high visual similarity. In some embodiments, the threshold(s) may be set via user interface 210. In some embodiments, two or more images that possess time proximity and at least a low visual similarity may be grouped. In some embodiments, two or more images that are determined to have a visual similarity above a “high” or a stricter threshold may be grouped together.
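The use of ordered low, medium, and high thresholds to determine a level of visual similarity can be sketched as follows. The function name similarity_level, the score scale from 0.0 to 1.0, and the default threshold values are all assumptions chosen for illustration; how visual similarity engine 204 computes a score is not shown here.

```python
def similarity_level(score: float, low: float = 0.5,
                     medium: float = 0.7, high: float = 0.9) -> str:
    """Map a similarity score to a level using three ordered thresholds.

    The default threshold values are hypothetical. A score that meets a
    threshold exactly is treated as reaching that level (an assumption).
    """
    if not low < medium < high:
        raise ValueError("thresholds must satisfy low < medium < high")
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    if score >= low:
        return "low"
    return "none"


print(similarity_level(0.75))  # falls between the medium and high thresholds
```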
In some embodiments, the groups may be displayed in user interface 210. Once the groups are displayed in user interface 210 they may be modified. Modification may refer to alteration, adjustment, revision, or change. Group modification or alteration may include moving images from one group to another group, deleting images from a group, and splitting a group into two groups. In some other embodiments, the groups may be sent to stacking module 212 to create stacks based on the groups.
In some embodiments, system 200 may include stacking module 212. Stacking module 212 may create stacks based on the groups. As mentioned previously, creating a stack refers to the grouping of image files in a common holder represented by a single thumbnail. The stack may preserve the plurality of images without cluttering up the display window or user interface with multiple thumbnails. One image in the stack may be chosen as the stack representative. This representative image may be printed or used in a slide show by selecting the stack. That is, upon selecting the stack, the representative image is automatically selected. In some embodiments, the stack representative image may be selected by a user. In some embodiments, the stack representative image may be automatically selected. The resulting stacks may be displayed in user interface 210.
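A stack as described above, a holder for several images represented by one of them, can be sketched with a small data structure. The names Stack and create_stack are hypothetical, images are represented by file names for simplicity, and defaulting to the first image when no representative is chosen is an assumption standing in for whatever automatic selection an embodiment might use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stack:
    images: List[str]      # every image is preserved in the stack
    representative: str    # the single image shown as the thumbnail


def create_stack(group: List[str], chosen: Optional[str] = None) -> Stack:
    """Create a stack from a group of images.

    If the caller supplies `chosen` (e.g., a user's selection), it becomes
    the representative; otherwise the first image is used (an assumption).
    """
    if not group:
        raise ValueError("cannot create a stack from an empty group")
    if chosen is not None and chosen not in group:
        raise ValueError("representative must be one of the stacked images")
    representative = chosen if chosen is not None else group[0]
    return Stack(images=list(group), representative=representative)


stack = create_stack(["img1.jpg", "img2.jpg", "img3.jpg"], chosen="img2.jpg")
print(stack.representative, len(stack.images))
```

Selecting the stack for printing or a slide show would then operate on stack.representative while the remaining images stay available in stack.images.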
If decision block 272 determines that the time range has not been met, then decision block 282 determines if the visual similarity of image A and image B has met a maximum similarity threshold. If the maximum visual similarity is met, process action 276 may group the images. If the maximum, or stricter, similarity threshold is not met, then no grouping occurs. Process 285 flows to decision block 278 where it may be determined if there are more images to group. If there are more images, image B may become image A and the next image in time order becomes image B in process action 280. Then process 285 compares the new image A and image B as described above. The maximum similarity threshold may be stricter than, more stringent than, or greater than the minimum similarity threshold. That is, if the images do not possess time proximity, process 285 allows for a more stringent visual similarity requirement to group the images. In other words, the maximum similarity threshold requires a higher degree of similarity than the minimum similarity threshold. For example, the minimum visual similarity threshold may be 70% similarity and the maximum similarity threshold may be 80%. With these settings, process 285 may group some images that have time proximity and have a visual similarity of at least 70%. Process 285 may also group images that do not have time proximity but do have a visual similarity of 80% or more.
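The pairwise flow described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the implementation of process 285: the function name group_images, the (name, time stamp) image representation, and the caller-supplied similarity function are hypothetical, and the default thresholds simply reuse the 70% and 80% example values. Images are compared in time order, and the threshold applied to each consecutive pair depends on whether the pair possesses time proximity.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Sequence, Tuple

Image = Tuple[str, datetime]  # hypothetical representation: (name, time stamp)


def group_images(images: Sequence[Image],
                 similarity: Callable[[Image, Image], float],
                 time_range: timedelta,
                 min_threshold: float = 0.70,
                 max_threshold: float = 0.80) -> List[List[Image]]:
    """Group consecutive images using a threshold chosen by time proximity."""
    ordered = sorted(images, key=lambda image: image[1])
    groups: List[List[Image]] = []
    for image_b in ordered:
        if groups:
            image_a = groups[-1][-1]  # the previous image in time order
            near = image_b[1] - image_a[1] <= time_range
            # Pairs with time proximity use the minimum threshold; pairs
            # without it must meet the stricter maximum threshold.
            threshold = min_threshold if near else max_threshold
            if similarity(image_a, image_b) >= threshold:
                groups[-1].append(image_b)
                continue
        groups.append([image_b])  # no grouping: image B starts a new group
    return groups
```

Because image B becomes image A for the next comparison, a group grows as a chain of consecutive pairs that each satisfy the applicable threshold.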
The groupings created by the grouping process may, in some embodiments, be modified by a user. The modifications may include splitting a group into two groups, moving images from one group to another group, and deleting images from a group.
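The three modifications listed above can be sketched as operations on a list of groups. The function names split_group, move_image, and delete_image are hypothetical, groups are modeled as lists of image names, and the groups are modified in place; a user interface would call such operations in response to user actions.

```python
from typing import List

Groups = List[List[str]]  # hypothetical model: each group is a list of image names


def split_group(groups: Groups, index: int, position: int) -> None:
    """Split the group at `index` into two groups at `position`."""
    group = groups[index]
    if not 0 < position < len(group):
        raise ValueError("split position must leave two non-empty groups")
    groups[index:index + 1] = [group[:position], group[position:]]


def move_image(groups: Groups, image: str, source: int, target: int) -> None:
    """Move `image` from the group at `source` to the group at `target`."""
    groups[source].remove(image)
    groups[target].append(image)


def delete_image(groups: Groups, image: str, index: int) -> None:
    """Delete `image` from the group at `index`."""
    groups[index].remove(image)


groups = [["a", "b", "c"], ["d"]]
split_group(groups, 0, 1)        # the first group becomes two groups
move_image(groups, "c", 1, 2)    # "c" joins the group containing "d"
delete_image(groups, "b", 1)     # the emptied group is kept in this sketch
print(groups)
```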
In an example, group 336 in
According to some embodiments of the invention, computer system 500 performs specific operations by processor 504 executing one or more sequences of one or more instructions stored in system memory 506. Such instructions may be read into system memory 506 from another computer readable medium, such as static storage device 508 or disk drive 510. In some embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the invention.
The term “computer readable medium” refers to any medium that participates in providing instructions to processor 504 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as disk drive 510. Volatile media includes dynamic memory, such as system memory 506. Transmission media includes coaxial cables, copper wire, and fiber optics, including wires that comprise bus 502. Transmission media can also take the form of acoustic or light waves, such as those generated during radio wave and infrared data communications.
Common forms of computer readable media include, for example, floppy disk, flexible disk, hard disk, magnetic tape, any other magnetic medium, CD-ROM, any other optical medium, punch cards, paper tape, any other physical medium with patterns of holes, RAM, PROM, EPROM, FLASH-EPROM, any other memory chip or cartridge, carrier wave, or any other medium from which a computer can read.
In some embodiments of the invention, execution of the sequences of instructions to practice the invention is performed by a single computer system 500. According to some embodiments of the invention, two or more computer systems 500 coupled by communication link 520 (e.g., LAN, PSTN, or wireless network) may perform the sequence of instructions to practice the invention in coordination with one another. Computer system 500 may transmit and receive messages, data, and instructions, including program, i.e., application code, through communication link 520 and communication interface 512. Received program code may be executed by processor 504 as it is received, and/or stored in disk drive 510, or other non-volatile storage for later execution.
Although the foregoing examples have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed examples are illustrative and not restrictive.
Claims
1. A method, comprising:
- analyzing a time proximity of a plurality of electronic images;
- performing a visual similarity analysis on the plurality of electronic images; and
- stacking the plurality of electronic images based on a result of the time proximity analysis and the visual similarity analysis.
2. The method of claim 1, wherein the stacking of the plurality of electronic images further comprises automatically sorting the plurality of electronic images into stacks having images with a time proximity within a time proximity range and a visual similarity greater than or equal to a first visual similarity threshold.
3. The method of claim 2, wherein the stacking of the plurality of electronic images further comprises automatically sorting the plurality of electronic images into stacks having electronic images with the time proximity outside the time proximity range and the visual similarity greater than a second visual similarity threshold, the second visual similarity threshold being greater than the first visual similarity threshold.
4. The method of claim 1, wherein the analyzing of the time proximity and the performing of the visual similarity analysis are triggered by a download of images from an image capturing device.
5. The method of claim 1, wherein the stacking of the plurality of electronic images further comprises selecting a stack representative image from one of the stacked images.
6. The method of claim 1, wherein the selecting of the stack representative image is based on image quality.
7. A computer readable memory comprising instructions that, responsive to execution by one or more processors, cause the one or more processors to perform operations comprising:
- analyzing a time proximity on a plurality of electronic images to generate a first result;
- performing a visual similarity analysis on the plurality of electronic images to generate a second result; and
- stacking the plurality of electronic images into at least one stack based on both the first result and the second result, the at least one stack being usable to represent a subset of the electronic images via a user interface.
8. The computer readable memory of claim 7, wherein the operations further comprise displaying the at least one stack with an image that represents the at least one stack and which is selected from the at least one stack.
9. The computer readable memory of claim 7, wherein the at least one stack is selectable to open the at least one stack and view the subset of the electronic images.
10. The computer readable memory of claim 7, wherein the stacking of the plurality of electronic images further comprises dividing the plurality of electronic images into stacks having images with a first time proximity within a time proximity range and a first visual similarity greater than or equal to a first visual similarity threshold.
11. The computer readable memory of claim 10, wherein the stacking of the plurality of electronic images further comprises dividing the plurality of electronic images into stacks having images with a second time proximity outside a time proximity range and a second visual similarity greater than or equal to a second visual similarity threshold, the second similarity threshold being stricter than the first similarity threshold.
12. The computer readable memory of claim 7, wherein the analyzing of the time proximity and the performing of the visual similarity analysis are triggered by a download of electronic images from an image capturing device.
13. A computing device comprising:
- a memory;
- one or more processors configured to utilize instructions in the memory to cause the computing device to perform operations comprising: analyzing a time proximity of a plurality of images; performing a visual similarity analysis on the plurality of images; and stacking the plurality of images based on a result of the time proximity analysis and a result of the visual similarity analysis by at least using a visual similarity threshold that is selected based on the time proximity.
14. The computing device of claim 13, wherein the plurality of images are stacked based on whether the result of the visual similarity analysis satisfies the visual similarity threshold.
15. The computing device of claim 13, wherein the operations include automatically sorting the plurality of images into stacks having two or more images with an associated time proximity within a range of time and an associated visual similarity greater than or equal to the visual similarity threshold.
16. The computing device of claim 13, wherein the visual similarity threshold is selected from a plurality of thresholds including a first visual similarity threshold and a second visual similarity threshold, wherein a first visual similarity threshold is selected based on the time proximity being within a range of time, and wherein the second visual similarity threshold is selected based on the time proximity being outside the range of time.
17. The computing device of claim 16, wherein the second visual similarity threshold is relatively higher than the first visual similarity threshold.
18. The computing device of claim 13, wherein stacking the plurality of images includes:
- stacking the plurality of images into one or more stacks that each include a subset of the plurality of images; and
- selecting, for each stack, an image from the stack to represent the stack.
19. The computing device of claim 18, wherein the image that is selected to represent the stack is selected based on one or more of image quality or image histogram.
20. The computing device of claim 13, wherein the visual similarity threshold comprises a user-adjustable threshold that is adjustable via a user interface.
Type: Application
Filed: Dec 10, 2013
Publication Date: Apr 10, 2014
Applicant: Adobe Systems Incorporated (San Jose, CA)
Inventors: Radford Spaeth (Rohnert Park, CA), Michael Slater (Sebastopol, CA)
Application Number: 14/101,718
International Classification: G06F 3/0484 (20060101);