PRINTING QUALITY DETERMINATION BASED ON TEXT ANALYSIS
A method comprising using at least one hardware processor for: analyzing text in a digital document, to identify a text segment referring to a figure of the digital document; mapping said text segment to said figure; identifying, in said text segment, reference to one or more non-grayscale colors of said figure, to determine a level of importance of said one or more colors to legibility of said figure; and printing said digital document in accordance with the level of importance.
Data generated by computing devices is often printed by ink-jet, laser or other types of printers. These printers adhere ink or toner onto a printable medium, such as paper. The ink or toner may be stored, for example, in a cartridge. The cartridge may then be replaced when the ink or the toner is consumed.
In the case of significant ink or toner consumption, the high frequency of replacement of the cartridges results in higher costs. In fact, a significant cost associated with owning a printer is that of replacing used printer cartridges. While the price of printers is currently decreasing, the price of printer cartridges generally does not. Thus a user may be persuaded to buy a printer because it is less expensive, but is then committed to frequent purchases of more expensive printer cartridges.
Color ink, which usually includes cyan (C), magenta (M) and yellow (Y) colors, is sometimes more expensive than black (K) ink. Therefore, many users elect to save on color (CMY) and opt to print color documents only in black. This, however, may degrade the legibility and/or aesthetics of the printed document.
The foregoing examples of the related art and limitations related therewith are intended to be illustrative and not exclusive. Other limitations of the related art will become apparent to those of skill in the art upon a reading of the specification and a study of the figures.
SUMMARY
The following embodiments and aspects thereof are described and illustrated in conjunction with systems, tools and methods which are meant to be exemplary and illustrative, not limiting in scope.
There is provided, in accordance with an embodiment, a method comprising using at least one hardware processor for: analyzing text in a digital document, to identify a text segment referring to a figure of the digital document; mapping said text segment to said figure; identifying, in said text segment, reference to one or more non-grayscale colors of said figure, to determine a level of importance of said one or more non-grayscale colors to legibility of said figure; and printing said digital document in accordance with the level of importance.
There is further provided, in accordance with an embodiment, a printing server comprising a non-transitory computer-readable storage medium having program code embodied therewith, the program code executable by at least one hardware processor of the printing server to: receive a digital document for printing; analyze text in the digital document, to identify a text segment referring to a figure of the digital document; map said text segment to said figure; identify, in said text segment, reference to one or more non-grayscale colors of said figure, to determine a level of importance of said one or more non-grayscale colors to legibility of said figure; and transmit said digital document to one or more printers in accordance with the level of importance.
There is further provided, in accordance with an embodiment, a computer program product for document analysis, the computer program product comprising a non-transitory computer-readable storage medium having program code embodied therewith, the program code executable by at least one hardware processor to: receive a digital document for printing; analyze text in the digital document, to identify a text segment referring to a figure of the digital document; map said text segment to said figure; identify, in said text segment, reference to one or more non-grayscale colors of said figure, to determine a level of importance of said one or more non-grayscale colors to legibility of said figure; and transmit said digital document to one or more printers in accordance with the level of importance.
In some embodiments, said printing comprises printing, in said one or more non-grayscale colors, a page which includes said figure, wherein the page which includes said figure is part of the digital document.
In some embodiments, said printing in said one or more non-grayscale colors comprises polychromatic printing.
In some embodiments, said printing further comprises printing, in grayscale, a page which does not include said figure, wherein the page which does not include said figure is part of the digital document.
In some embodiments, the method further comprises decolorizing said figure, wherein said printing is printing in grayscale.
In some embodiments, the method further comprises: converting said reference to a color-invariant descriptor; and converting said one or more non-grayscale colors of said figure to a color-invariant texture corresponding to the color-invariant descriptor.
In some embodiments, the method further comprises displaying a printing recommendation to a user, based on the level of importance.
In some embodiments, said transmit comprises transmit a page which includes said figure to a polychromatic printer, wherein the page which includes said figure is part of the digital document.
In some embodiments, said transmit further comprises transmit a page which does not include said figure to a monochromatic printer, wherein the page which does not include said figure is part of the digital document.
In some embodiments, the program code is further executable by said at least one hardware processor of the printing server to decolorize said figure, wherein said printing is printing in grayscale.
In some embodiments, the program code is further executable by said at least one hardware processor of the printing server to: convert said reference to a color-invariant descriptor; and convert said one or more non-grayscale colors of said figure to a color-invariant texture corresponding to the color-invariant descriptor.
In some embodiments, the program code is further executable by said at least one hardware processor of the printing server to display a printing recommendation to a user, based on the level of importance.
In some embodiments, the program code is further executable by said at least one hardware processor to decolorize said figure, wherein said printing is printing in grayscale.
In some embodiments, the program code is further executable by said at least one hardware processor to: convert said reference to a color-invariant descriptor; and convert said one or more non-grayscale colors of said figure to a color-invariant texture corresponding to the color-invariant descriptor.
In addition to the exemplary aspects and embodiments described above, further aspects and embodiments will become apparent by reference to the figures and by study of the following detailed description.
Exemplary embodiments are illustrated in referenced figures. Dimensions of components and features shown in the figures are generally chosen for convenience and clarity of presentation and are not necessarily shown to scale. The figures are listed below.
A method for efficient utilization of printing resources is disclosed herein. The method may be embodied as a system, method or computer program product. For example, the method may be embodied in a print server configured to receive print jobs from various computing devices and transmit the print jobs to one or more printers.
In some embodiments thereof, the method includes analyzing the contents of a print job, namely—text appearing in a digital document sent for printing. The analysis includes identification of one or more text segments which refer to one or more figures appearing in the document. For example, such a text segment may read “Figure 4 shows a pie chart, in which the red slice indicates the amount of . . . ”. Once the one or more text segments have been identified, their one or more referenced figures are located, and a map of segments and their associated figures is created. For example, the aforementioned text segment may be mapped to the pie chart by way of providing coordinates of the pie chart on a respective page.
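The identification of text segments described above can be sketched as follows. This is only an illustrative sketch and not the disclosed implementation: the function name `find_figure_segments`, the assumption that figures are named "Figure N", and the naive sentence splitting are all hypothetical choices made for the example.

```python
import re

# Assumption: figures are referred to as "Figure N" (e.g. "Figure 4", "Figure 1.1").
# Real documents may use other naming conventions.
FIGURE_REF = re.compile(r"\bFigure\s+(\d+(?:\.\d+)*)", re.IGNORECASE)


def find_figure_segments(text):
    """Return (sentence, figure_label) pairs for sentences that name a figure."""
    segments = []
    # Naive sentence split after '.', '!' or '?'; an assumption for illustration only.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        for match in FIGURE_REF.finditer(sentence):
            segments.append((sentence, match.group(1)))
    return segments


if __name__ == "__main__":
    sample = (
        "Sales grew steadily. "
        "Figure 4 shows a pie chart, in which the red slice indicates growth. "
        "Nothing else."
    )
    for sentence, label in find_figure_segments(sample):
        print(label, "->", sentence)
```

Each returned pair holds the text segment and the label of the figure it refers to, which a later step could resolve to the figure's coordinates on its page.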
The method continues by identifying, in the one or more text segments, one or more references to one or more non-grayscale colors of the figure. For example, the wording “the red slice” of the segment may be identified, since it refers to a non-grayscale color of the figure (red, in this example).
After these references have been identified, the method proceeds by determining a level of importance of the one or more non-grayscale colors to the legibility of the figure. In a simplistic scenario, the mere appearance of color names in the text segment is sufficient to conclude that the colors are important to the legibility of the figure. In a more complex scenario, contextual analysis may be performed, to deduce whether the color contents of the figure may be safely converted to grayscale or whether their legibility will be significantly harmed as a result of the conversion. For example, the analysis may weigh the density of color names in and/or near the text segment. The density may be computed as the ratio between words being color names and other words. Higher density implies a higher level of importance, and vice versa. The level of importance may be binary, for example “important” and “not important”. Alternatively, the level may be on a more diverse scale, for example a percentage-based scale or the like.
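The density measure described above may be illustrated with a short sketch. The color list `COLOR_NAMES`, the function names, the default threshold of 0.05, and the handling of a segment with no non-color words are assumptions introduced for the example; the description does not specify them.

```python
import re

# Hypothetical, deliberately short list of non-grayscale color names.
COLOR_NAMES = {"red", "green", "blue", "yellow", "orange",
               "purple", "cyan", "magenta", "pink", "brown"}


def color_density(segment):
    """Ratio between words that are color names and all other words."""
    words = re.findall(r"[a-z]+", segment.lower())
    color_words = sum(1 for word in words if word in COLOR_NAMES)
    other_words = len(words) - color_words
    if other_words == 0:
        # Assumption: a segment made only of color names gets density 1.0,
        # and an empty segment gets 0.0.
        return 1.0 if color_words else 0.0
    return color_words / other_words


def is_important(segment, threshold=0.05):
    """Binary level of importance; the threshold value is an assumption."""
    return color_density(segment) > threshold


if __name__ == "__main__":
    print(color_density("the red slice and the blue slice"))
    print(is_important("the chart shows totals"))
```

A percentage-based scale could be obtained in the same way by reporting the density itself rather than comparing it with a threshold.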
Advantageously, the efficient utilization of printing resources is enabled by harnessing the determined level of importance. For example, if it is determined that the level of importance of the one or more non-grayscale colors to the legibility of the one or more figures is above a certain threshold, the printing of the document may be adapted to achieve efficient utilization of printing resources. As one example, the page or pages on which the figure(s) appear may be printed in polychromatic printing, while the rest of the pages of the document may be printed in cheaper, monochromatic (e.g. grayscale) printing. This printing may be carried out using two separate printers, such as a polychromatic printer (also referred to as a “color” printer) and a grayscale printer, or using a single, polychromatic printer having separate cartridges for black ink and ink in one or more other colors.
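The per-page split between polychromatic and grayscale printing can be sketched as below. The function name `plan_pages` and the string labels are hypothetical; how the pages are actually submitted to one or two printers is outside the scope of the sketch.

```python
def plan_pages(page_count, important_figure_pages):
    """Assign a printing mode to every page of a document.

    page_count: total number of pages (pages are numbered from 1).
    important_figure_pages: set of page numbers holding figures whose
        colors were determined to be important to legibility.
    """
    plan = {}
    for page in range(1, page_count + 1):
        if page in important_figure_pages:
            plan[page] = "polychromatic"
        else:
            plan[page] = "grayscale"
    return plan


if __name__ == "__main__":
    # A three-page document whose only important figure is on page 2.
    print(plan_pages(3, {2}))
```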
As another example of efficient utilization of printing resources, the entire document may be printed in monochrome, but the one or more figures may be decolorized using a technique which mitigates or even eliminates any illegibility in the printed outcome. For example, the technique may be that of Wei Hong Lim and Nor Ashidi Mat Isa, “A novel adaptive color to grayscale conversion algorithm for digital images”, Scientific Research and Essays Vol. 7(30), pp. 2718-2730, 2 August, 2012, which is incorporated herein by reference in its entirety.
A further example, similar to the previous one, includes printing the entire document in monochrome while replacing the non-grayscale elements of the one or more figures with a suitable color-invariant texture, such as different types of hatching. Optionally, the text segments which refer to these non-grayscale colors are also amended, to describe the color-invariant texture instead of describing the non-grayscale colors.
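The optional amendment of the text segments may be sketched as a simple substitution of color names by texture descriptors. The mapping `TEXTURE_FOR_COLOR` and the function `amend_segment` are hypothetical and chosen only for illustration; the corresponding re-rendering of the figure itself with hatching is not shown.

```python
import re

# Hypothetical mapping from a color name to a color-invariant descriptor.
TEXTURE_FOR_COLOR = {
    "red": "diagonally hatched",
    "blue": "dotted",
    "green": "cross-hatched",
}

# Match whole words only, so that e.g. "reddish" is left untouched.
_COLOR_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in TEXTURE_FOR_COLOR) + r")\b",
    re.IGNORECASE,
)


def amend_segment(segment):
    """Replace each known color name with its texture descriptor."""
    return _COLOR_PATTERN.sub(
        lambda match: TEXTURE_FOR_COLOR[match.group(1).lower()], segment
    )


if __name__ == "__main__":
    print(amend_segment("the red slice indicates the amount"))
```

The same mapping would need to drive the figure conversion, so that the descriptor in the text and the texture in the printed figure stay consistent.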
While the aforementioned examples may be carried out automatically, such as by a print server receiving a print job, a workstation sending the print job or the printer itself, it is also possible to allow the user to manually elect how to act upon the determination of the level of importance. The user may then manually trigger one or more of the methods discussed in the above examples.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a hardware processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
Reference is now made to
Examples of a digital document include word processing documents, spreadsheet documents, presentation documents, etc.
In a step 104, the text of the digital document is analyzed, to identify one or more text segments which refer to one or more figures being part of the contents of the digital document. In a step 106, it is determined whether such a text segment has been found. If no text segments referring to one or more figures were found, then the method may proceed to a step 120, in which the entire digital document is printed in monochrome, for example in grayscale. That is, if no text is referring to a figure, then either the document does not include any figures and can be safely printed in monochrome, or the document does include one or more figures but the importance of printing them in polychrome is relatively low.
If, however, one or more such text segments were found, the method may proceed to a step 108, in which each found segment is mapped (also “linked” or “associated”) to its respective figure. For example, the text segment “The results are shown in the graph of Figure 32, wherein the red line illustrates the trend.” may be mapped to the location of that Figure 32 in the document. The location of the figure may be specified either by providing its coordinates relative to the page, or using any other method known in the art. For example, the figure may have a unique identifier, so that the text segment may be mapped to this unique identifier. Optionally, the mapping is stored in a memory of the computing device carrying out method 100.
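The mapping of step 108 may be pictured with a small data structure. The class `FigureLocation`, its fields, and the function `map_segments` are assumptions made for this sketch; the description only requires that a segment be associated with a figure's coordinates or unique identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FigureLocation:
    """Hypothetical record locating a figure in the document."""
    figure_id: str   # unique identifier of the figure
    page: int        # page on which the figure appears
    bbox: tuple      # (x0, y0, x1, y1) coordinates relative to the page


def map_segments(segments, figure_index):
    """Pair each text segment with the location of the figure it names.

    segments: list of (segment_text, figure_label) pairs.
    figure_index: dict from figure label (e.g. "32") to FigureLocation.
    Segments naming a figure that is not in the index are skipped.
    """
    mapping = []
    for segment_text, label in segments:
        location = figure_index.get(label)
        if location is not None:
            mapping.append((segment_text, location))
    return mapping


if __name__ == "__main__":
    index = {"32": FigureLocation("fig-32", 7, (72.0, 100.0, 540.0, 400.0))}
    found = [("The results are shown in the graph of Figure 32.", "32")]
    print(map_segments(found, index))
```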
In a step 110, the one or more found text segments are analyzed, in order to identify whether they contain any reference to one or more non-grayscale colors of the mapped figure(s). In the exemplary text segment “The results are shown in the graph of Figure 32, wherein the red line illustrates the trend”, there is a reference to a red color in the figure. The analysis of step 110 may include a search, inside the found text segment(s), for a pre-provided list of names of colors, shades, etc.
In a step 112, it may be determined whether reference to one or more non-grayscale colors has been identified. If no such reference is identified, it may be deduced that any non-grayscale colors appearing in the figures, if such colors exist, are not important. Accordingly, the method may proceed to step 120, in which the entire digital document is printed in monochrome.
However, if it is determined that reference to one or more non-grayscale colors has been identified, the method may proceed to a step 114, in order to determine the level of importance of the identified one or more non-grayscale colors to the legibility of the figure. In a simplistic scenario, the mere appearance of color names in the text segment is sufficient to conclude that the colors are important to the legibility of the figure. For example, a text analysis algorithm may be employed, to detect color names appearing at a certain distance from a name of a figure (e.g. “Figure 32”, “Illustration 1.1”, etc.). For example, the distance may be less than a few dozen words, or even less than a few words. This close distance is indicative that the color names have likely been mentioned to describe the figure.
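The distance-based detection described above can be sketched as a token-distance check. The color list, the set of figure words, the tokenization, and the default `max_distance` of 12 words are all assumptions made for the example rather than values given in the description.

```python
import re

# Hypothetical lists, for illustration only.
COLOR_NAMES = {"red", "green", "blue", "yellow", "orange", "purple"}
FIGURE_WORDS = {"figure", "illustration"}


def colors_near_figure_name(segment, max_distance=12):
    """Return color names found within max_distance words of a figure name."""
    # Tokens are either words or numbers such as "32" or "1.1".
    tokens = re.findall(r"[a-z]+|\d+(?:\.\d+)*", segment.lower())
    # A figure name is a figure word immediately followed by a number.
    figure_positions = [
        i for i, token in enumerate(tokens)
        if token in FIGURE_WORDS
        and i + 1 < len(tokens)
        and tokens[i + 1][0].isdigit()
    ]
    hits = []
    for i, token in enumerate(tokens):
        if token in COLOR_NAMES and any(
            abs(i - position) <= max_distance for position in figure_positions
        ):
            hits.append(token)
    return hits


if __name__ == "__main__":
    text = ("The results are shown in the graph of Figure 32, "
            "wherein the red line illustrates the trend.")
    print(colors_near_figure_name(text))
```

In the exemplary segment, "red" lies four tokens after "Figure", so it is reported for any `max_distance` of four or more.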
In a more complex scenario, contextual analysis may be performed, to deduce whether the color contents of the figure may be safely converted to grayscale or whether their legibility will be significantly harmed as a result of the conversion.
In a step 116 it is determined whether the determined level of importance is higher or lower than a predetermined threshold. If it is lower than the threshold, the method may continue to step 120, in which the entire digital document is printed in monochrome. If, however, the level of importance is higher than the predetermined threshold, the method may continue to a step 118, in which the digital document is printed in accordance with the level of importance.
Optionally, prior to the printing, the user may be prompted with a notification of any figures and/or text segments that were determined to have important color in them. The user may then decide if and how to print the digital document or parts thereof.
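The decision flow of steps 104 through 120 can be summarized in a compact sketch. It is a simplification under stated assumptions: figures are assumed to be named "Figure N", the color list and the default threshold are hypothetical, the level of importance is approximated by color-name density, and the mapping of step 108 and the optional user prompt are omitted.

```python
import re

# Assumptions for illustration: figure naming pattern and color list.
FIGURE_REF = re.compile(r"\bFigure\s+\d+", re.IGNORECASE)
COLOR_NAMES = {"red", "green", "blue", "yellow", "orange", "purple"}


def decide_print_mode(text, threshold=0.05):
    """Return "monochrome" (step 120) or "by_importance" (step 118)."""
    # Steps 104 and 106: look for text segments that refer to a figure.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    segments = [s for s in sentences if FIGURE_REF.search(s)]
    if not segments:
        return "monochrome"

    # Steps 110 and 112: look for non-grayscale color names in those segments.
    words = [w for s in segments for w in re.findall(r"[a-z]+", s.lower())]
    color_words = [w for w in words if w in COLOR_NAMES]
    if not color_words:
        return "monochrome"

    # Step 114: approximate the level of importance by color-name density.
    importance = len(color_words) / max(len(words) - len(color_words), 1)

    # Step 116: compare the level of importance with the threshold.
    return "by_importance" if importance > threshold else "monochrome"


if __name__ == "__main__":
    print(decide_print_mode("Figure 2 shows the totals."))
    print(decide_print_mode(
        "Figure 4 shows a pie chart, in which the red slice indicates the amount."
    ))
```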
Reference is now made to
In an embodiment, method 100 (
Reference is now made to
Reference is now made to
As another option, printer 402 itself may be configured to execute method 100 (
The flowcharts and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims
1. A method comprising using at least one hardware processor for:
- analyzing text in a digital document, to identify a text segment referring to a figure of the digital document;
- mapping said text segment to said figure;
- identifying, in said text segment, reference to one or more non-grayscale colors of said figure, to determine a level of importance of said one or more non-grayscale colors to legibility of said figure; and
- printing said digital document in accordance with the level of importance.
2. The method according to claim 1, wherein said printing comprises printing, in said one or more non-grayscale colors, a page which includes said figure, wherein the page which includes said figure is part of the digital document.
3. The method according to claim 2, wherein said printing in said one or more non-grayscale colors comprises polychromatic printing.
4. The method according to claim 2, wherein said printing further comprises printing, in grayscale, a page which does not include said figure, wherein the page which does not include said figure is part of the digital document.
5. The method according to claim 1, further comprising decolorizing said figure, wherein said printing is printing in grayscale.
6. The method according to claim 5, wherein:
- said decolorizing of said figure comprises converting said one or more non-grayscale colors of said figure to a color-invariant texture; and
- the method further comprises converting said reference to a color-invariant descriptor of the color-invariant texture.
7. The method according to claim 1, further comprising displaying a printing recommendation to a user, based on the level of importance.
8. A printing server comprising a non-transitory computer-readable storage medium having program code embodied therewith, the program code executable by at least one hardware processor of the printing server to:
- receive a digital document for printing;
- analyze text in the digital document, to identify a text segment referring to a figure of the digital document;
- map said text segment to said figure;
- identify, in said text segment, reference to one or more non-grayscale colors of said figure, to determine a level of importance of said one or more non-grayscale colors to legibility of said figure; and
- transmit said digital document to one or more printers in accordance with the level of importance.
9. The printing server according to claim 8, wherein said transmit comprises transmit a page which includes said figure to a polychromatic printer, wherein the page which includes said figure is part of the digital document.
10. The printing server according to claim 9, wherein said transmit further comprises transmit a page which does not include said figure to a monochromatic printer, wherein the page which does not include said figure is part of the digital document.
11. The printing server according to claim 10, wherein said monochromatic printer is a grayscale printer.
12. The printing server according to claim 8, wherein the program code is further executable by said at least one hardware processor of the printing server to decolorize said figure, wherein said printing is printing in grayscale.
13. The printing server according to claim 12, wherein:
- said decolorize of said figure comprises convert said one or more non-grayscale colors of said figure to a color-invariant texture; and
- the program code is further executable by said at least one hardware processor of the printing server to convert said reference to a color-invariant descriptor of the color-invariant texture.
14. The printing server according to claim 8, wherein the program code is further executable by said at least one hardware processor of the printing server to display a printing recommendation to a user, based on the level of importance.
15. A computer program product for document analysis, the computer program product comprising a non-transitory computer-readable storage medium having program code embodied therewith, the program code executable by at least one hardware processor to:
- receive a digital document for printing;
- analyze text in the digital document, to identify a text segment referring to a figure of the digital document;
- map said text segment to said figure;
- identify, in said text segment, reference to one or more non-grayscale colors of said figure, to determine a level of importance of said one or more non-grayscale colors to legibility of said figure; and
- transmit said digital document to one or more printers in accordance with the level of importance.
16. The computer program product according to claim 15, wherein said transmit comprises transmit a page which includes said figure to a polychromatic printer, wherein the page which includes said figure is part of the digital document.
17. The computer program product according to claim 16, wherein said transmit further comprises transmit a page which does not include said figure to a monochromatic printer, wherein the page which does not include said figure is part of the digital document.
18. The computer program product according to claim 17, wherein said monochromatic printer is a grayscale printer.
19. The computer program product according to claim 15, wherein the program code is further executable by said at least one hardware processor to decolorize said figure, wherein said printing is printing in grayscale.
20. The computer program product according to claim 19, wherein:
- said decolorize of said figure comprises convert said one or more non-grayscale colors of said figure to a color-invariant texture; and
- the program code is further executable by said at least one hardware processor to convert said reference to a color-invariant descriptor of the color-invariant texture.
Type: Application
Filed: Aug 27, 2013
Publication Date: Mar 5, 2015
Applicant: International Business Machines Corporation (Armonk, NY)
Inventors: Gilad Barkai (Haifa), Shai Erera (Kiryat Ata), Ariel Raviv (Haifa), Haggai Roitman (Yoknea'm Elit)
Application Number: 14/010,565
International Classification: G06K 15/02 (20060101); G06F 17/28 (20060101);