IMAGE PROCESSING OF VIDEO USING NOISE ESTIMATE

A method of processing image data representing an image of a scene to generate an estimate of noise present in the image data. The method comprises evaluating a function for different values of the estimate, the function taking as input an estimate of the noise, and determining an estimate of the noise for which the function has an optimum value.

Description
CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. Ser. No. 11/908,736, filed Sep. 14, 2007, now U.S. Pat. No. ______, which is a §371 National Stage filing of PCT/GB2006/000974, filed Mar. 16, 2006, and which claims the benefit of U.S. Ser. No. 60/665,491, filed Mar. 25, 2005. The entire contents of these prior applications are hereby incorporated by reference.

TECHNICAL FIELD

The present invention relates to image processing methods, and more particularly but not exclusively to methods for enhancing images which are at least partially affected by the effects of noise.

BACKGROUND OF THE INVENTION

There are many sources of noise which may degrade an image of a scene. For example, an image of a scene will often be degraded by optical scattering of light caused, for example, by fog or mist. This optical scattering results in additional lightness being present in some parts of the image, and has been referred to as “airlight” in the literature. It is desirable to process an image so as to remove components of pixel values which are attributable to airlight.

If the distance between a camera position and all points of a scene represented by an image generated by the camera is approximately constant, airlight can be estimated and removed by applying equation (1) to each pixel of the image:


y=m(x−c)  (1)

where:

    • x is an original pixel value;
    • c is a correction selected to represent “airlight”;
    • m is a scaling parameter; and
    • y is a modified pixel value.

Assuming that the parameter c is correctly chosen, processing each pixel of a monochrome image in accordance with equation (1) will enhance the image by removing airlight. However, determination of an appropriate value for the parameter c is often problematic.
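
By way of illustration, the per-pixel correction of equation (1) might be implemented as in the following minimal sketch (C++). The 8-bit pixel format and the clamping of the output to the display range are assumptions made for the example, not part of the method itself:

#include <algorithm>
#include <cstdint>
#include <vector>

/* Applies y = m * (x - c) to every pixel of a monochrome image.
   c is the airlight estimate and m the scaling parameter. */
void removeAirlight(std::vector<std::uint8_t>& pixels, double c, double m) {
    for (auto& x : pixels) {
        double y = m * (static_cast<double>(x) - c);
        x = static_cast<std::uint8_t>(std::clamp(y, 0.0, 255.0));
    }
}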

Various known methods exist for estimating the parameter c by using contrast measurements such as a ratio of the standard deviation of pixel values to the mean of pixel values. However, such contrast measures do not discriminate between airlight-induced contrast loss and inherently low contrast scenes. For example an image of sand dunes would often provide little contrast between the light and dark parts of the scene, even when no airlight is present. Thus ad-hoc schemes to determine the parameter c will sometimes result in severe image distortion.

The method described above with reference to equation (1) is applicable to a monochrome image. Further problems arise if a colour image is to be processed. Typically the airlight contribution to a pixel value, and hence the value of the parameter c, will depend upon the wavelength (colour) of the light. Thus, if equation (1) is to be applied to colour images, different values of the parameter c may be needed for red, green and blue channels of the image.

The methods described above assume that the camera position is equidistant from all points in a scene represented by an image. Published European Patent EP 0839361 describes a method developed by one of the present inventors, in which different values of the parameter c in equation (1) are used for different pixels, in dependence upon distance between the camera position and a position in a scene represented by that pixel. This invention arose from a realisation that backscattered light may vary in dependence upon the distance between a camera position and a position in a scene.

SUMMARY OF THE INVENTION

It is an object of embodiments of the present invention to obviate or mitigate at least some of the problems outlined above.

According to the present invention, there is provided a method of processing image data representing an image of a scene to generate an estimate of noise present in the image data. The method involves a function taking as input an estimate of the noise. The function is evaluated for a plurality of different values of the estimate, and an estimate for which the function has an optimum value is determined. The optimum value may be a stationary point of the function, for example a minimum.

Thus, the present invention provides a robust method of determining noise present in an image by simply evaluating a function and determining a value of the noise estimate for which the function has an optimum value.

The term noise as used in this document means any artefact of an image which imparts slowly varying offsets to that image. Examples of noise are described in further detail below, although it will be appreciated that examples of such noise include airlight.

The image data may comprise pixel values for each of a plurality of pixels of the image, and the function may take as input at least a subset of the pixel values.

The method may further comprise filtering the image data to generate filtered image data comprising filtered pixel values for each of the image pixels. The function may then take as input at least a subset of said filtered pixel values. This subset is preferably taken from the same pixels as the subset of pixel values which the function takes as input.

The function described above may be of the form

$$S(\lambda) = \frac{1}{K}\sum_{k=1}^{K}\left(\frac{p_k-\bar{p}_k}{\bar{p}_k-\lambda}\right)^2\cdot\exp\left(\frac{1}{K}\sum_{k=1}^{K}\ln\left(\bar{p}_k-\lambda\right)^2\right)\qquad(2)$$

where:

    • K is the number of pixels to be processed;
    • $p_k$ is the value of pixel k;
    • $\bar{p}_k$ is the value of pixel k after application of the lowpass filter described above;
    • λ is the value of the parameter; and
    • S(λ) is the function to be optimised.

The optimum value may be determined using any of the large number of numerical analysis techniques which are well known in the art. Such numerical analysis techniques preferably begin with an initial estimate of zero for the noise included in the image.

It is known that some noise present within images varies in dependence on the distance between a camera position and a position in the scene. Such variation may be taken into account using the methods set out above by generating a plurality of different estimates.

The invention also provides a method of removing noise from an image. The method comprises estimating noise included in the image using a method as set out above, and then processing each pixel of the image to remove the estimate of said noise so as to generate output pixel values.

The method of removing noise may further comprise multiplying the output pixel values by a predetermined coefficient. Multiplying output pixel values by a predetermined coefficient may provide illumination compensation using one of a number of well known illumination compensation techniques.

The noise included in the image may be at least partially attributable to one or more of atmospheric backscattered light, a dark current effect of a camera, or dirt on a camera lens.

According to a further aspect of the present invention, there is provided a carrier medium carrying computer readable instructions configured to cause a computer to carry out a method as set out above.

The invention further provides a computer apparatus for generating an estimate of noise present in image data representing an image. The apparatus comprises a program memory containing processor readable instructions, and a processor configured to read and execute instructions stored in said program memory. The processor readable instructions comprise instructions configured to cause the computer to carry out a method as described above.

According to a further aspect of the present invention there is provided a method of processing a plurality of frames of video data to generate enhanced video data. The method comprises processing a first frame of video data to generate an estimate of noise included within said frame of video data, storing data indicative of said estimate of noise, processing at least one further frame of video data to remove noise from said further frame of video data using said stored estimate of noise included in said first frame of video data, and outputting an enhanced frame of video data generated using said further frame of video data.

Thus, by using the methods set out above it is possible for processing of a first frame of video data to proceed relatively slowly while processing of the at least one further frame of video data may take place relatively quickly so as to prevent artefacts of ‘jerkiness’ being apparent in the output video data.

The first frame of video data may be processed in accordance with the methods set out above. This processing may further generate a contrast enhancement parameter.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:

FIG. 1 is a flowchart of an image processing method in accordance with the present invention;

FIG. 2 is an example image to which processing in accordance with FIG. 1 is applied;

FIG. 3 is a graph showing values of a parametric function used in the processing of FIG. 1 when applied to the image of FIG. 2;

FIG. 4 is an illustration of scene geometry; and

FIG. 5 is a schematic illustration of an implementation of an image processing method in accordance with the invention.

DETAILED DESCRIPTION OF THE ILLUSTRATED EMBODIMENTS

Pixel values of an image may include a component attributable to noise caused by atmospheric backscattered light, often referred to as airlight. Such noise can be removed from the image using equation (1) set out above, and recalled below:


y=m(x−c)  (1)

where:

    • x is an original pixel value;
    • c is a correction selected to represent “airlight”;
    • m is a scaling parameter; and
    • y is a modified pixel value.

An embodiment of the present invention provides a method for estimating a value of the parameter c. This method is illustrated in the flowchart of FIG. 1. The method operates by sampling pixel values associated with at least a subset of the pixels defining the image, inputting these values into a parametric function, and determining a value for the parameter of the parametric function for which the function has an optimum value. The value of the parameter so determined is the value generated for the parameter c in equation (1).

Referring to FIG. 1, at step S1 pixel values p associated with at least some of the pixels within the image to be processed are sampled. At step S2 a low-pass filter is applied to the image to be processed to generate filtered pixel values $\bar{p}$ for each of the image pixels. The filter is preferably a low-pass filter having a Gaussian shaped kernel and a value of sigma equal to approximately five. The generation of such a low-pass filter will be well known to one of ordinary skill in the art.
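
A separable Gaussian low-pass filter of this kind might be built as sketched below; the kernel is applied along rows and then columns to give the 2-D filter. The truncation of the kernel at three sigma is a common choice and an assumption here, not part of the described method:

#include <cmath>
#include <vector>

/* Builds a normalised 1-D Gaussian kernel for the low-pass filter.
   sigma = 5 corresponds to the described embodiment. */
std::vector<double> gaussianKernel(double sigma) {
    const int radius = static_cast<int>(std::ceil(3.0 * sigma));
    std::vector<double> k(2 * radius + 1);
    double sum = 0.0;
    for (int i = -radius; i <= radius; ++i) {
        k[i + radius] = std::exp(-0.5 * i * i / (sigma * sigma));
        sum += k[i + radius];
    }
    for (double& v : k) v /= sum;  /* normalise so mean level is preserved */
    return k;
}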

At step S3 the parameter λ of the parametric function (described below) is initialised, typically to a value of 0. The parametric function evaluated at step S4 of FIG. 1 is that of equation (2):

$$S(\lambda) = \frac{1}{K}\sum_{k=1}^{K}\left(\frac{p_k-\bar{p}_k}{\bar{p}_k-\lambda}\right)^2\cdot\exp\left(\frac{1}{K}\sum_{k=1}^{K}\ln\left(\bar{p}_k-\lambda\right)^2\right)\qquad(2)$$

where:

    • K is the number of pixels to be processed;
    • $p_k$ is the value of pixel k;
    • $\bar{p}_k$ is the value of pixel k after application of the lowpass filter described above;
    • λ is the value of the parameter; and
    • S(λ) is the function to be optimised.

As mentioned above, λ is initialised to a value of 0 at step S3 and equation (2) is then evaluated at step S4. At step S5 a check is made to determine whether the evaluated value of equation (2) is an optimum value, in this case a minimum value. When it is determined that this is the case, parameter c is set to be the determined value of λ at step S6. Until a value of λ for which S(λ) is a minimum is found, λ is updated at step S7, and processing returns to step S4.

Location of the minimum value of S(λ) (step S3 to S7) can be carried out using a number of known numerical optimisation techniques. Further details of one such technique are presented below.
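
For concreteness, a direct (if naive) evaluation and search might look as follows. This sketch assumes the raw and filtered pixel values are held in parallel arrays, and uses a simple scan in place of the more efficient optimisers discussed below; the step of 0.5 grey levels is an arbitrary choice for the example:

#include <cmath>
#include <cstddef>
#include <vector>

/* Evaluates the cost function S(lambda) of equation (2). lambda must be
   kept below the smallest filtered pixel value so that the logarithm is
   well defined. */
double costS(const std::vector<double>& p, const std::vector<double>& pbar,
             double lambda) {
    const std::size_t K = p.size();
    double ratioSum = 0.0, logSum = 0.0;
    for (std::size_t k = 0; k < K; ++k) {
        const double d = pbar[k] - lambda;
        const double r = (p[k] - pbar[k]) / d;
        ratioSum += r * r;
        logSum += std::log(d * d);
    }
    return (ratioSum / K) * std::exp(logSum / K);
}

/* Coarse 1-D search starting from lambda = 0, the suggested initial
   estimate, keeping the minimiser found. */
double estimateAirlight(const std::vector<double>& p,
                        const std::vector<double>& pbar, double lambdaMax) {
    double best = 0.0, bestS = costS(p, pbar, 0.0);
    for (double lambda = 0.5; lambda < lambdaMax; lambda += 0.5) {
        const double s = costS(p, pbar, lambda);
        if (s < bestS) { bestS = s; best = lambda; }
    }
    return best;
}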

It has been indicated that processing is carried out to determine a value of λ for which the function S(λ) is a minimum. It is now shown that such a value of λ will be a good estimate for the parameter c in equation (1).

Any image can be regarded as a collection of regions, with each region corresponding to some kind of visible surface. For example a tree may be seen as a collection of foliage, trunk and branch regions. Each region has a certain spectral reflectivity which is associated with its local pigmentation. Within each region there is a variation in brightness that is mainly due to macroscopic variations in local surface orientation. This consideration of an image corresponds to a fairly general and basic model for image texture.

A key feature of the model described above is that the fractional variation in brightness is approximately independent of the actual brightness. Any offsets due to airlight will change this feature and so are detectable. This is used as a basis for the processing described above with reference to FIG. 1 and equation (2).

In the described embodiment the local texture variation is assumed to have a Normal (Gaussian) distribution.

An image to be processed is assumed to contain M regions, and the mean brightness of a region m is $R_m$, where m=1 . . . M. No loss of generality is implied, as M could be very large and the regions very small. An explicit segmentation of the image into these M regions is not performed.

The total number of pixels in the image is denoted by K, and the fraction of those pixels in region m is denoted by $K_m$. The kth pixel in region m is denoted by $p_{km}$.

The kth pixel in region m is represented as follows:

$$p_{km} = R_m(1+N) + c,\qquad(3)$$

where:

    • N = N(0,σ) is a Normal random variable with zero mean and standard deviation σ;
    • c is a scalar constant indicative of airlight.

The low-pass filter described above smoothes out random fluctuations in the local pixel value $p_{km}$ to produce a “smoothed” value $\bar{p}_{km}$ for each pixel.

If the spatial extent of the filter is sufficiently large for the purpose of smoothing but still small with respect to the size of the image region (so that border effects are not significant) then:


$$\bar{p}_{km} \approx R_m + c.\qquad(4)$$

That is, the filtering removes the variation represented by the normal variable N.

Recalling equation (2):

$$S(\lambda) = \frac{1}{K}\sum_{k=1}^{K}\left(\frac{p_k-\bar{p}_k}{\bar{p}_k-\lambda}\right)^2\cdot\exp\left(\frac{1}{K}\sum_{k=1}^{K}\ln\left(\bar{p}_k-\lambda\right)^2\right)\qquad(2)$$

While equation (2) requires summations over all pixels of the image, it can be rewritten in terms of each region m of the M regions, recalling that $K_m$ is the fraction of pixels in region m.

Recalling also equations (3) and (4), equation (2) can be written as:

$$S(\lambda) = \sum_{m=1}^{M} K_m\left(\frac{\left((R_m(1+N)+c)-(R_m+c)\right)^2}{(R_m+c-\lambda)^2}\right)\cdot\exp\left(\sum_{m=1}^{M} K_m\ln(R_m+c-\lambda)^2\right)\qquad(5)$$

Simplifying equation (5), and recalling that N=N(0,σ) gives:

$$S(\lambda) = \sum_{m=1}^{M} K_m\left(\frac{R_m^2\sigma^2}{(R_m+c-\lambda)^2}\right)\cdot\exp\left(\sum_{m=1}^{M} K_m\ln(R_m+c-\lambda)^2\right)\qquad(6)$$

given that $\sigma^2$ is the expectation value of the square of N.

Define:


u=c−λ.  (7)

and substitute equation (7) into equation (6):

$$S(c-u) = \sigma^2\left(\sum_{m=1}^{M}\frac{K_m R_m^2}{(R_m+u)^2}\right)\cdot\exp\left(2\sum_{m=1}^{M} K_m\ln(R_m+u)\right)\qquad(8)$$

Define:

$$f = \sum_{m=1}^{M}\frac{K_m R_m^2}{(R_m+u)^2}\qquad(9)$$

$$g = \exp\left(2\sum_{m=1}^{M} K_m\ln(R_m+u)\right)\qquad(10)$$

Differentiating equation (9) with respect to u gives:

$$\frac{\partial f}{\partial u} = -2\sum_{m=1}^{M}\frac{K_m R_m^2}{(R_m+u)^3}\qquad(11)$$

Differentiating equation (10) with respect to u gives:

$$\frac{\partial g}{\partial u} = 2\exp\left(2\sum_{m=1}^{M} K_m\ln(R_m+u)\right)\cdot\sum_{m=1}^{M}\frac{K_m}{R_m+u}\qquad(12)$$

From equation (8):


$$S(\lambda) = \sigma^2 f\cdot g\qquad(13)$$

Therefore:

$$\frac{\partial S}{\partial u} = \sigma^2(f\cdot g)' = \sigma^2(f'\cdot g + f\cdot g')\qquad(14)$$

Substituting equations (9), (10), (11) and (12) into equation (14) gives:

$$\frac{\partial S}{\partial u} = -2\sigma^2\left(\sum_{m=1}^{M}\frac{K_m R_m^2}{(R_m+u)^3}\right)\exp\left(2\sum_{m=1}^{M} K_m\ln(R_m+u)\right) + 2\sigma^2\left(\sum_{m=1}^{M}\frac{K_m}{R_m+u}\right)\exp\left(2\sum_{m=1}^{M} K_m\ln(R_m+u)\right)\left(\sum_{m=1}^{M}\frac{K_m R_m^2}{(R_m+u)^2}\right)\qquad(15)$$

Differentiating equation (7) with respect to λ gives:

$$\frac{\partial u}{\partial\lambda} = -1\qquad(16)$$

It can be stated that:

$$\frac{\partial S}{\partial\lambda} = \frac{\partial S}{\partial u}\cdot\frac{\partial u}{\partial\lambda}\qquad(17)$$

That is

$$\frac{\partial S}{\partial\lambda} = 2\sigma^2\left(\sum_{m=1}^{M}\frac{K_m R_m^2}{(R_m+u)^3}\right)\exp\left(2\sum_{m=1}^{M} K_m\ln(R_m+u)\right) - 2\sigma^2\left(\sum_{m=1}^{M}\frac{K_m}{R_m+u}\right)\exp\left(2\sum_{m=1}^{M} K_m\ln(R_m+u)\right)\left(\sum_{m=1}^{M}\frac{K_m R_m^2}{(R_m+u)^2}\right)\qquad(18)$$

It has been stated above that λ = c when S is a minimum, that is, when

$$\frac{\partial S}{\partial\lambda} = 0.$$

If λ=c, then from equation (7) u=0. Substituting u=0 into equation (18) gives:

$$\frac{\partial S}{\partial\lambda} = 2\sigma^2\left(\sum_{m=1}^{M}\frac{K_m}{R_m}\right)\exp\left(2\sum_{m=1}^{M} K_m\ln R_m\right) - 2\sigma^2\left(\sum_{m=1}^{M}\frac{K_m}{R_m}\right)\exp\left(2\sum_{m=1}^{M} K_m\ln R_m\right)\left(\sum_{m=1}^{M} K_m\right)\qquad(19)$$

Given that

$$\sum_{m=1}^{M} K_m = 1,$$

it can be seen that

$$\frac{\partial S}{\partial\lambda} = 0.$$

This supports the claim that equation (2) has a stationary point when λ=c. This means that c may be determined by applying an efficient numerical search procedure to equation (2).

Having shown that a value of λ for which S(λ) is a minimum will provide a good estimate for c, a method for minimization of the function S(λ) is now described. The simplest way to minimize the function of equation (2) is to directly evaluate values of equation (2). However, this formulation of the cost function suffers from numerical inaccuracies. In general terms, for numerical optimization algorithms to work efficiently the target function must be a continuous and smooth function of its parameters. An alternative, but equivalent, formulation of the cost function is described below. This formulation is preferred because it leads to more rapid determination of the optimum value of λ. An example of a suitable optimization code is the routine E04UCC from the Numerical Algorithms Group Ltd, Oxford.

Image data is processed on a row-by-row basis for each row i. The results of this row-by-row processing are then combined in a recursive manner.

From equation (2):

$$S(\lambda) = \left\langle\left(\frac{p-\bar{p}}{\bar{p}-\lambda}\right)^2\right\rangle\cdot\exp\left(\left\langle\ln(\bar{p}-\lambda)^2\right\rangle\right)\qquad(20)$$

where $\langle\cdot\rangle$ denotes an average taken over all pixels.

Then define:

$$RS(i) = \frac{1}{M}\sum_{m=1}^{M}\frac{(p_m-\bar{p}_m)^2}{(\bar{p}_m-\lambda)^2}\qquad(21)$$

$$RL(i) = \frac{1}{M}\sum_{m=1}^{M}\ln(\bar{p}_m-\lambda)^2\qquad(22)$$

for a given row i, where M is the number of columns and m is the column currently being processed.

$$RTL(n) = \frac{1}{n}\sum_{i=1}^{n} RL(i)\qquad(23)$$

where n is the number of rows. Then it can be seen that:

$$S(n) = \frac{1}{n}\left[\sum_{i=1}^{n} RS(i)\right]\cdot\exp\left(RTL(n)\right)\qquad(24)$$

$$S(n+1) = \frac{1}{n+1}\left[\sum_{i=1}^{n+1} RS(i)\right]\cdot\exp\left(RTL(n+1)\right)\qquad(25)$$

Recalling the definition of RTL(n) from equation (23), equation (25) can be rewritten as:

$$S(n+1) = \frac{1}{n+1}\left[\sum_{i=1}^{n+1} RS(i)\right]\cdot\exp\left[\frac{1}{n+1}\sum_{i=1}^{n+1} RL(i)\right]\qquad(26)$$

Rearranging the summations of equation 26 gives:

$$S(n+1) = \frac{1}{n+1}\left[\left(\sum_{i=1}^{n} RS(i)\right) + RS(n+1)\right]\cdot\exp\left\{\frac{1}{n+1}\left[\left(\sum_{i=1}^{n} RL(i)\right) + RL(n+1)\right]\right\}\qquad(27)$$

Rearranging gives:

$$S(n+1) = \left[\left(\frac{1}{n+1}\sum_{i=1}^{n} RS(i)\right) + \frac{RS(n+1)}{n+1}\right]\cdot\exp\left[\left(\frac{1}{n+1}\sum_{i=1}^{n} RL(i)\right) + \frac{RL(n+1)}{n+1}\right]\qquad(28)$$

Given that the exponential of a sum is equal to the multiplication of the exponentials of the sum's components, equation 28 can be rewritten as:

$$S(n+1) = \left[\left(\frac{1}{n+1}\sum_{i=1}^{n} RS(i)\right) + \frac{RS(n+1)}{n+1}\right]\cdot\exp\left[\frac{1}{n+1}\sum_{i=1}^{n} RL(i)\right]\cdot\exp\left[\frac{RL(n+1)}{n+1}\right]\qquad(29)$$

Which can be rewritten as:

$$S(n+1) = \left[\left(\frac{n}{n+1}\cdot\frac{1}{n}\sum_{i=1}^{n} RS(i)\right) + \frac{RS(n+1)}{n+1}\right]\cdot\exp\left[\frac{n}{n+1}\cdot\frac{1}{n}\sum_{i=1}^{n} RL(i)\right]\cdot\exp\left[\frac{RL(n+1)}{n+1}\right]\qquad(30)$$

Which, recalling the definition of RTL(n) from equation (23) gives:

$$S(n+1) = \left[\left(\frac{n}{n+1}\cdot\frac{1}{n}\sum_{i=1}^{n} RS(i)\right) + \frac{RS(n+1)}{n+1}\right]\cdot\exp\left[\frac{n}{n+1}RTL(n)\right]\cdot\exp\left[\frac{RL(n+1)}{n+1}\right]\qquad(31)$$

Multiplying out of the expression of equation (31) gives:

$$S(n+1) = \frac{n}{n+1}\cdot\frac{1}{n}\left[\sum_{i=1}^{n} RS(i)\right]\cdot\exp\left[\frac{n}{n+1}RTL(n)\right]\cdot\exp\left[\frac{RL(n+1)}{n+1}\right] + \frac{RS(n+1)}{n+1}\cdot\exp\left[\frac{RL(n+1)}{n+1}\right]\cdot\exp\left[\frac{n}{n+1}RTL(n)\right]\qquad(32)$$

Which can be rewritten as:

$$S(n+1) = \frac{n}{n+1}\cdot\frac{1}{n}\left[\sum_{i=1}^{n} RS(i)\right]\cdot\exp\left[\left(1+\frac{-1}{n+1}\right)\cdot RTL(n)\right]\cdot\exp\left[\frac{RL(n+1)}{n+1}\right] + \frac{RS(n+1)}{n+1}\cdot\exp\left[\frac{RL(n+1)}{n+1}\right]\cdot\exp\left[\frac{n}{n+1}RTL(n)\right]\qquad(33)$$

Which can be rewritten as:

$$S(n+1) = \frac{n}{n+1}\cdot\frac{1}{n}\left[\sum_{i=1}^{n} RS(i)\right]\cdot\exp\left[RTL(n)+\frac{-1}{n+1}RTL(n)\right]\cdot\exp\left[\frac{RL(n+1)}{n+1}\right] + \frac{RS(n+1)}{n+1}\cdot\exp\left[\frac{RL(n+1)}{n+1}\right]\cdot\exp\left[\frac{n}{n+1}RTL(n)\right]\qquad(34)$$

Rewriting the second term of the multiplication gives:

$$S(n+1) = \frac{n}{n+1}\cdot\frac{1}{n}\left[\sum_{i=1}^{n} RS(i)\right]\cdot\exp\left[RTL(n)\right]\cdot\exp\left[\frac{-1}{n+1}RTL(n)\right]\cdot\exp\left[\frac{RL(n+1)}{n+1}\right] + \frac{RS(n+1)}{n+1}\cdot\exp\left[\frac{RL(n+1)}{n+1}\right]\cdot\exp\left[\frac{n}{n+1}RTL(n)\right]\qquad(35)$$

Recalling the definition of S(n) from equation (24) gives:

$$S(n+1) = \frac{n}{n+1}\cdot S(n)\cdot\exp\left[\frac{-1}{n+1}RTL(n)\right]\cdot\exp\left[\frac{RL(n+1)}{n+1}\right] + \frac{RS(n+1)}{n+1}\cdot\exp\left[\frac{RL(n+1)}{n+1}\right]\cdot\exp\left[\frac{n}{n+1}RTL(n)\right]\qquad(36)$$

Thus, it can be seen that the cost function of equation (2) can be effectively evaluated using equation (36). That is, by suitable initialization of the variables RS(i), RL(i) and RTL(n) for the first row of the image, the cost function S can be calculated on a row-by-row basis as indicated by equation (36).

In a computer program implementation of the algorithm a for loop processes each row of the image in turn. An initial value of RS(i) is calculated for the first row and stored in a variable RowProdSum. An initial value of RL(i) for the first row is also calculated and stored in a variable Row_Sum_Log. A variable S is then initialized and updated for each row, with equation (23) being used to update values of RTL as appropriate. A variable wf is set equal to 1.0/(i+1.0), which corresponds to the term $\frac{1}{n+1}$ in the above equations. Appropriate program code is shown in code fragment 1 below.

Two double precision floating point variables, A and RTB (which represents RTL), are used in the row-recursive structure shown below. The row number is denoted by i.

CODE FRAGMENT 1

/* Process each image row, starting at row 0. The row number is i. */
if (i == 0) {
    /* First row: initialise the recursion (equation (24) with n = 1). */
    RTB = row_sum_log;
    double exp_term = exp(RTB);
    A = RowProdSum * exp_term;
} else {
    double wf = 1.0 / (i + 1.0);              /* the 1/(n+1) factor */
    double exp_term = exp(wf * row_sum_log);  /* exp[RL(n+1)/(n+1)] */
    double LineProdSumProd = wf * RowProdSum * exp_term;
    double RTA1_term1 = wf * i * A * exp(-wf * RTB);
    double RTA1_term2 = exp(wf * i * RTB);
    A = RTA1_term1 * exp_term + LineProdSumProd * RTA1_term2;
    RTB = wf * (RTB * i + row_sum_log);       /* update RTL per equation (23) */
}
i = i + 1;
/* When all rows are processed the final cost function value is A. */

Application of the method illustrated in FIG. 1 to an example image shown as FIG. 2 is now described.

The image of FIG. 2 is a synthetic image that has been formed to illustrate the method of the described embodiment of the invention. The image comprises two regions with different intensity levels.

Initially an image was generated having an intensity of 30 in an area 1 and an intensity of 120 in an area 2. A texture effect was simulated by adding a random number to each pixel. The random number was generated using a Normal distribution with zero mean and a standard deviation equal to one-tenth of the pixel value. The effect of airlight was simulated by adding a constant value of 50 to each pixel. In the image of FIG. 2 the mean value in the area 1 is 80 and the mean value in the area 2 is 150.
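
The construction of such a test image may be sketched as follows; the image dimensions, random seed and left/right split of the two areas are arbitrary choices made for the example:

#include <random>
#include <vector>

/* Builds the synthetic two-region test image: base intensities 30 and 120,
   Normal texture with sigma = 0.1 * base value, plus 50 of airlight. */
std::vector<double> makeTestImage(int width, int height) {
    std::mt19937 rng(42);
    std::vector<double> img(static_cast<std::size_t>(width) * height);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            const double base = (x < width / 2) ? 30.0 : 120.0;
            std::normal_distribution<double> texture(0.0, 0.1 * base);
            img[static_cast<std::size_t>(y) * width + x] =
                base + texture(rng) + 50.0;
        }
    }
    return img;
}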

FIG. 3 shows a plot of the function S(λ) defined in equation (2) for differing values of λ. The values of S(λ) are generated using image data taken from all pixels of FIG. 2, and pixel values after application of an appropriate lowpass filter in the manner described above. It can be seen that S(λ) has a minimum value when λ=53.42. This is close to the true value of λ=50 (50 being the simulated airlight). Slight distortion along the edge between the two patches results in the estimate being slightly offset from the true value of λ=50.

When pixel values in the vicinity of the edge are excluded from evaluation of S(λ), the function has a minimum at λ=49.9, which is very close to the true value. Elimination from the calculation of pixels near a sharp edge in an image can be carried out automatically by computing an edge strength measure (for example the modulus of the gradient) and excluding strong edges from the analysis.
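
One possible form of this automatic edge exclusion is sketched below; the central-difference gradient and the threshold parameter are assumptions for the example, and border pixels are kept for simplicity:

#include <cmath>
#include <cstddef>
#include <vector>

/* Marks pixels whose gradient modulus (computed on the filtered image pbar)
   exceeds a threshold, so that they can be excluded from the evaluation
   of S(lambda). */
std::vector<bool> edgeMask(const std::vector<double>& pbar,
                           std::size_t width, std::size_t height,
                           double threshold) {
    std::vector<bool> keep(pbar.size(), true);
    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            const std::size_t i = y * width + x;
            const double gx = 0.5 * (pbar[i + 1] - pbar[i - 1]);
            const double gy = 0.5 * (pbar[i + width] - pbar[i - width]);
            keep[i] = std::hypot(gx, gy) <= threshold;
        }
    }
    return keep;
}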

The method set out above will work for any image in which there are at least two regions of different intensity. This is not restrictive in practice.

Equation (2) set out above can be modified for situations where the image contains a known level of random noise. In such a case S(λ) is defined by equation (37) below:

$$S(\lambda) = \frac{1}{K}\sum_{k=1}^{K}\left(\frac{(p_k-\bar{p}_k)^2}{(\bar{p}_k-\lambda)^2 + A_vT_v}\right)\cdot\exp\left(\frac{1}{K}\sum_{k=1}^{K}\ln\left((\bar{p}_k-\lambda)^2 + A_vT_v\right)\right)\qquad(37)$$

where:

    • K is the number of pixels to be processed;
    • $p_k$ is the value of pixel k;
    • $\bar{p}_k$ is the value of pixel k after application of the lowpass filter described above;
    • Av denotes the variance of the additive noise component;
    • Tv is a constant related to the image texture;
    • λ is the value of the parameter to be determined; and
    • S(λ) is the function to be optimised.

Again, the value of λ for which S(λ) is a minimum will be a good estimate of the airlight component present in the image.

The preceding discussion has been concerned with monochrome images. For a colour image the three different colour channels (red, green and blue) may be analysed separately (and corrected separately), using either equation (2) or equation (37) set out above.

However an alternative method is preferred, in which a composite function is used and a three-variable optimisation process carried out. One such composite function is set out as equation (38):

$$S(\lambda_r,\lambda_g,\lambda_b) = \frac{1}{K}\left\{\sum_{k=1}^{K}\left(\frac{p_k^r-\bar{p}_k^r}{\bar{p}_k^r-\lambda_r}\right)^2 + \sum_{k=1}^{K}\left(\frac{p_k^g-\bar{p}_k^g}{\bar{p}_k^g-\lambda_g}\right)^2 + \sum_{k=1}^{K}\left(\frac{p_k^b-\bar{p}_k^b}{\bar{p}_k^b-\lambda_b}\right)^2\right\}\cdot\exp\left\{\frac{1}{3K}\left(\sum_{k=1}^{K}\ln(\bar{p}_k^r-\lambda_r)^2 + \sum_{k=1}^{K}\ln(\bar{p}_k^g-\lambda_g)^2 + \sum_{k=1}^{K}\ln(\bar{p}_k^b-\lambda_b)^2\right)\right\}\qquad(38)$$

where:

    • K is the number of pixels to be processed;
    • $p_k^r$, $p_k^g$, $p_k^b$ are the values of pixel k for the red, green and blue channels respectively;
    • $\bar{p}_k^r$, $\bar{p}_k^g$, $\bar{p}_k^b$ are the values of pixel k for the red, green and blue channels respectively, after application of the lowpass filter described above;
    • $(\lambda_r, \lambda_g, \lambda_b)$ are the values of the parameters to be determined; and
    • $S(\lambda_r, \lambda_g, \lambda_b)$ is the function to be optimised.

As before, subject to certain assumptions, (that the fractional variation in brightness is approximately independent of the actual brightness, and that the fractional variation is similar in the red, green and blue colour channels) the function S(λr, λg, λb) has a stationary point (a minimum) when (λr, λg, λb) indicates the component of each channel attributable to airlight.

The evaluation technique described above can be used for colour images using equation (38) above.

Here, the parameter RS referred to above is replaced by three RS parameters, one for each channel of the image. These are defined according to equations (39), (40) and (41) set out below:

$$RS_r(i) = \frac{1}{M}\sum_{m=1}^{M}\frac{(p_{im}^r-\bar{p}_{im}^r)^2}{(\bar{p}_{im}^r-\lambda_r)^2},\qquad(39)$$

$$RS_g(i) = \frac{1}{M}\sum_{m=1}^{M}\frac{(p_{im}^g-\bar{p}_{im}^g)^2}{(\bar{p}_{im}^g-\lambda_g)^2},\qquad(40)$$

$$RS_b(i) = \frac{1}{M}\sum_{m=1}^{M}\frac{(p_{im}^b-\bar{p}_{im}^b)^2}{(\bar{p}_{im}^b-\lambda_b)^2},\qquad(41)$$

Similarly, the parameter RL referred to above is replaced by three parameters, again one for each channel of the image, as shown in equations (42), (43) and (44):

$$RL_r(i) = \frac{1}{3M}\sum_{m=1}^{M}\ln(\bar{p}_{im}^r-\lambda_r)^2,\qquad(42)$$

$$RL_g(i) = \frac{1}{3M}\sum_{m=1}^{M}\ln(\bar{p}_{im}^g-\lambda_g)^2,\qquad(43)$$

$$RL_b(i) = \frac{1}{3M}\sum_{m=1}^{M}\ln(\bar{p}_{im}^b-\lambda_b)^2,\qquad(44)$$

Similarly again, the parameter RTL is replaced by three values, one for each channel, according to equations (45), (46) and (47) as set out below.

$$RTL_r(n) = \frac{1}{n}\sum_{i=1}^{n} RL_r(i),\qquad(45)$$

$$RTL_g(n) = \frac{1}{n}\sum_{i=1}^{n} RL_g(i),\qquad(46)$$

$$RTL_b(n) = \frac{1}{n}\sum_{i=1}^{n} RL_b(i),\qquad(47)$$

Processing is carried out for a single channel at a time and accordingly a parameter S is again replaced by three parameters according to equations (48), (49) and (50):

$$S_r(n) = \frac{1}{n}\left[\sum_{i=1}^{n} RS_r(i)\right]\cdot\exp\left[RTL_r(n)\right],\qquad(48)$$

$$S_g(n) = \frac{1}{n}\left[\sum_{i=1}^{n} RS_g(i)\right]\cdot\exp\left[RTL_g(n)\right],\qquad(49)$$

$$S_b(n) = \frac{1}{n}\left[\sum_{i=1}^{n} RS_b(i)\right]\cdot\exp\left[RTL_b(n)\right].\qquad(50)$$

Accordingly, equation (38) can then be rewritten as:


$$S(\lambda_r,\lambda_g,\lambda_b) = S_r(n)\cdot\exp\left[RTL_g(n)+RTL_b(n)\right] + S_g(n)\cdot\exp\left[RTL_b(n)+RTL_r(n)\right] + S_b(n)\cdot\exp\left[RTL_r(n)+RTL_g(n)\right]\qquad(51)$$

And it will be appreciated that values of the variables S and RTL can be computed as in the single channel case described above. In a computational implementation the variable RowProdSum is replaced by a three element array and the variable Row_Sum_Log is again replaced by a three element array. The RTL variable RTB is again replaced by an appropriate three element array.
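
A sketch of that three-channel state, mirroring the scalar variables of code fragment 1, might be as follows; the struct layout is an assumption, not a structure prescribed by the description:

#include <array>

/* Per-channel recursion state for the colour cost function: indices 0, 1
   and 2 correspond to red, green and blue. These arrays replace the scalar
   A, RTB, RowProdSum and Row_Sum_Log of the single-channel implementation. */
struct ColourRecursionState {
    std::array<double, 3> A{};            /* running cost per channel */
    std::array<double, 3> RTB{};          /* running RTL per channel  */
    std::array<double, 3> RowProdSum{};   /* RS for the current row   */
    std::array<double, 3> Row_Sum_Log{};  /* RL for the current row   */
};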

The component of airlight within an image may vary in dependence upon the distance between the camera and a point in the scene represented by a pixel. FIG. 4 illustrates a camera 3 positioned to generate an image of a scene. The camera 3 is directed at an angle θ to the horizontal, and has a field of view angle FOV.

It is known that airlight (λ) will typically vary in dependence upon the distance between the camera 3 and the image plane 4, referred to as D, in accordance with equation (52):


$$\lambda\propto\left(1-\exp(-\beta_{sc}(\omega)\cdot D)\right)\qquad(52)$$

where $\beta_{sc}$ is the total volume scattering coefficient for light of wavelength ω. A normalized volume scattering coefficient $\beta_{sc}'(\omega)$ is defined, such that:


$$\beta_{sc}'(\omega) = \beta_{sc}(\omega)\cdot q,\qquad(53)$$

where q is the value of the distance D corresponding to the centre of the image. In the described embodiment of the present invention the normalized scattering coefficient $\beta_{sc}'$ is represented by three scalar variables X0, X1 and X2, corresponding to nominal wavelengths in the red, green and blue colour bands respectively.

The geometry of the scene illustrated in FIG. 4 is, in general, unknown. Two further scalar variables, X3 and X4, are therefore defined as follows:

$$X_4 = FOV\qquad(54)$$

$$X_3 = \frac{FOV}{\theta}\qquad(55)$$

where FOV is the field of view angle and θ is the angle between the horizontal and the line from the camera to the centre of the image. That is:

$$\theta = \frac{X_4}{X_3}\qquad(56)$$

All five parameters (X0, X1, X2, X3, X4) are determined by multi-variable optimisation.

The distance between the camera and the image plane varies in dependence upon position within the image plane. This position is represented by a variable ν which takes a value of 0.5 at one extreme, −0.5 at the other extreme and 0 at the centre.
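
In code, the mapping from a row index to ν might be written as below; the sign convention (which image edge corresponds to +0.5) is an assumption for the example:

/* Vertical image-plane coordinate for a row: +0.5 at the top edge,
   0 at the centre, -0.5 at the bottom edge. */
double rowToNu(int row, int numRows) {
    return 0.5 - static_cast<double>(row) / (numRows - 1);
}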

Applying the sine rule to FIG. 4:

$$\frac{D}{\sin\theta} = \frac{q}{\sin\gamma}\qquad(57)$$

It can be seen that


γ=180−(θ+φ)  (58)


Given that:


φ=−νX4  (59)


then:


γ=180−(θ−νX4)  (60)


and:


sin γ=sin(θ−νX4)  (61)


So

$$D = \frac{q\sin\theta}{\sin\gamma}\qquad(62)$$

$$D = \frac{q\sin\theta}{\sin(\theta-\nu X_4)}\qquad(63)$$

$$D = \frac{q\sin(X_4/X_3)}{\sin\left(X_4\left(\frac{1}{X_3}-\nu\right)\right)}.\qquad(64)$$

Substituting equation (64) into equation (52) gives the airlight of a pixel for a monochrome image or a single channel (e.g. the red channel) of a colour image:

$$\lambda(X_0,X_3,X_4) = C\left\{1-\exp\left[-X_0\frac{\sin(X_4/X_3)}{\sin\left(X_4\left[\frac{1}{X_3}-\nu\right]\right)}\right]\right\},\qquad(65)$$

where C is the constant of proportionality in equation (52). The term λ(X0, X3, X4) then replaces λ in equation (2) or (37). The value of constant C depends on the product of the atmospheric radiance and the camera gain. In general these parameters are not known. An estimate for C may be formed in various ways. One way is to use the highest pixel value in the image.
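
A sketch of equation (65) for a single row is given below; obtaining C from the brightest pixel, as just suggested, is one option among several:

#include <cmath>

/* Distance-dependent airlight for the row at coordinate nu (equation (65)).
   x0 is the normalised scattering coefficient, x3 and x4 the geometry
   parameters, and C the radiance/gain constant. */
double rowAirlight(double x0, double x3, double x4, double nu, double C) {
    const double theta = x4 / x3;  /* equation (56) */
    const double ratio = std::sin(theta) / std::sin(x4 * (1.0 / x3 - nu));
    return C * (1.0 - std::exp(-x0 * ratio));
}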

If the colour version of the equation (equation (38)) is used then X0 is replaced with X1 or X2 for the green and blue channels as appropriate. This term is computed separately for each row of pixels in the image.

When the field of view angle is relatively small, approximately 15 degrees or less, the following approximation may be made.

$$\lambda(X_0,X_3,X_4)\approx\lambda(X_0,X_3) = C\left\{1-\exp\left[-X_0\frac{1/X_3}{\frac{1}{X_3}-\nu}\right]\right\}.\qquad(66)$$

This expression involves only one geometrical parameter, which shows that the field of view angle (parameter X4 in equation (65)) has little effect in practice. In the preferred implementation equation (65) is used, but with a fixed value for X4. If the field of view of the camera is not known, then X4 is set to some small value, say 1 degree.

The preceding discussion has been concerned with estimating airlight in static images. It will be appreciated that the techniques described above can be applied to video data comprising a plurality of frames, by processing each frame in turn. When video data is processed in this way care must be taken to ensure that the image processing operations are sufficiently fast so as not to cause artefacts of “jerkiness” in the processed video data. Thus, it may be necessary to compromise image processing quality for the sake of speed.

In accordance with some embodiments of the present invention image processing is carried out using an arrangement illustrated in FIG. 5. Colour input video data 6 in which each pixel of each frame is represented by separate red, green and blue channels is received and passed concurrently to an analysis module 7 and a video processing module 8. The analysis module operates relatively slowly and is configured to generate values for the parameters c and m of equation (1). Determination of c may be carried out, for example, using the method outlined above. Values of the parameters c and m are stored in a coefficients buffer 9.

Not all of the image data may be required for the image analysis. For example the image may be sub-sampled by a factor of two such that analysis is based upon only half of the pixel values. This saves time in the computation. Also the analysis is usually carried out for a pre-defined part of the image in order to avoid borders or screen annotation (which is sometimes used in CCTV installations).

The video processing module 8 operates by processing each pixel of a received frame in accordance with equation (1), reading appropriate values for the parameters c and m from the coefficients buffer 9. Data processed by the video processing module 8 is output to a display screen 10.

Given that the analysis module 7 operates slowly compared to the video processing module 8, the video processing module 8 will typically apply coefficients generated using an earlier frame of video data. Given that changes in airlight tend to happen over a prolonged period of time, this is found in practice to have little effect on image quality.
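
The coefficients buffer 9 might be realised as in the following sketch; the mutex-based synchronisation is one possible implementation and is not prescribed by the arrangement of FIG. 5:

#include <mutex>

struct Coefficients { double c = 0.0; double m = 1.0; };

/* Shared buffer between the slow analysis module (writer) and the fast
   video processing module (reader). The reader always obtains the most
   recently published (c, m) pair. */
class CoefficientsBuffer {
    Coefficients coeffs;
    mutable std::mutex mtx;
public:
    void publish(const Coefficients& co) {  /* analysis module, slow path */
        std::lock_guard<std::mutex> lock(mtx);
        coeffs = co;
    }
    Coefficients latest() const {           /* processing module, fast path */
        std::lock_guard<std::mutex> lock(mtx);
        return coeffs;
    }
};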

The analysis module 7 preferably produces values of c and m for each pixel of the processed frame. That is, a method taking into account variation of airlight with distance from a camera position, such as that described above or that described in EP 0839361, is preferably used. In such embodiments of the invention a total of six tables are used, two for each colour channel. Each table has one entry for each pixel in the image. The red, green and blue channels are processed separately and then re-combined to form the final image.

The value of m can be adjusted to provide compensation for uneven illumination in addition to the airlight compensation if required. Several methods to estimate local illumination levels are available in the literature, such as that described in, for example, Jobson, J., Rahman, Z., Woodell, G. A., “A multiscale retinex for bridging the gap between colour images and the human observation of scenes”, IEEE Transactions on Image Processing, Vol. 6, Issue 7, July 1997, pp. 965-976.

Illumination estimation and compensation is now briefly described. Illumination compensation is based on an idea that illumination changes slowly, and that estimates of illumination distribution can be generated using a low pass filter. It is known that low pass filters such as the homomorphic filter can be used to estimate illumination for the purposes of image enhancement. This is described in, for example, Jobson, Rahman and Woodell referenced above.

Illumination estimation methods of the type described above are based on the Lambertian model for diffuse reflection. This is described in Anya Hurlbert, “Formal connections between lightness algorithms”, J. Opt. Soc. Am. A, Vol. 3, No. 10, pp. 1684-1693, October 1986. If the visible surfaces in an image have a Lambertian characteristic and the illumination is confined to a narrow range of wavelengths then:

$$I(x,y) = R(x,y)L(x,y)\qquad(67)$$

$$R(x,y) = \frac{I(x,y)}{L(x,y)}\qquad(68)$$

where I(x,y) is the intensity value at pixel (x,y), R(x,y) is the scalar reflectance factor and L(x,y) is a scalar irradiance value. The image processing procedure is simple; first form an estimate for L(x,y) and then scale by c/L(x,y), where the constant c is chosen to keep the pixels within the display range.
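
The compensation step itself reduces to a per-pixel scaling, as in this sketch; the floor placed on the illumination estimate to avoid division by very small values is an assumption for the example:

#include <cstddef>
#include <vector>

/* Scales each pixel by c / L, per equation (68). img holds intensities
   I(x,y) and L the illumination estimate (e.g. the output of a large-kernel
   low-pass filter); the constant c keeps the result within display range. */
void compensateIllumination(std::vector<double>& img,
                            const std::vector<double>& L, double c) {
    for (std::size_t i = 0; i < img.size(); ++i) {
        const double denom = (L[i] > 1.0) ? L[i] : 1.0;  /* guard tiny L */
        img[i] *= c / denom;
    }
}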

This model can be generalized to colour images under the assumption that narrow-band filters are used for each colour channel. Even when the spectrum is localized using conventional RGB filters, the model defined by (68) gives an approximation to the true response. The benefits of such a decomposition include the possibility of removing illumination effects of back/front lighting, and enhancing shots that include spatially varying illumination, such as images that contain shadows.

Various methods have been proposed for estimation of the illumination function L(x,y). In Akira Suzuki, Akio Shio, Hiroyuki Arai, and Sakuichi Ohtsuka, “Dynamic shadow compensation of aerial images based on color and spatial analysis”, Proceedings of the 15th International Conference on Pattern Recognition, 3-7 Sep. 2000, Vol. 1, pp. 317-320, a morphological smoothing filter is used. The homomorphic filter, described by Thomas Stockham, Jr. in “Image processing in the context of a visual model”, Proc. IEEE, Vol. 60, No. 7, pp. 828-842, 1972, is an early illumination compensation technique. The retinex algorithm, described by Jobson, Rahman, and Woodell in “Properties and performance of a centre/surround retinex”, IEEE Transactions on Image Processing, Vol. 6, No. 3, 1997, pp. 451-462, is a more recent method.

The illumination estimation and compensation techniques described above can be applied in the context of a system such as that illustrated in FIG. 5. Specifically, the analysis module 7 can estimate illumination, while the video processing module 8 can process data to perform illumination compensation.

Although preferred embodiments of the invention have been described above, it will be appreciated that various modifications can be made to those embodiments without departing from the spirit and scope of the present invention as defined by the appended claims.

The description set out above has been concerned with image processing techniques for removing the effects of airlight from an image. It will be appreciated that there are other sources of image offsets apart from the airlight effect. In particular, image offsets can be generated by the thermal response of a camera (the so-called “dark current”). Offsets can also be generated by dirt on the camera lens or protective window. Additionally, noise can be caused by so-called “pedestal” errors. Some video standards, including certain variants of the PAL standard, use a non-zero voltage level to represent black. This level is known as the “pedestal”. If the video decoder makes an inappropriate allowance for the pedestal the effect is to shift all of the brightness levels either up or down. All of these types of offsets can be detected (and mitigated) by the methods described above.

Claims

1. A method of processing a plurality of frames of video data to generate enhanced video data, comprising:

processing a first frame of video data to generate an estimate of noise included within said frame of video data, wherein said noise is at least partially attributable to atmospheric backscattered light;
storing data indicative of said estimate of noise;
processing at least one further frame of video data, which further frame indicates the same scene as said first frame of video data, to remove noise from said further frame of video data by subtracting said estimate of noise included in said first frame of video data; and
outputting an enhanced frame of video data generated using said further frame of video data.

2. A method according to claim 1, wherein said subtracting comprises subtracting said estimate from values of said pixels of said video frame.

3. A method according to claim 2, further comprising rescaling values of said pixels after subtracting said estimate from values of said pixels.

4. A method according to claim 1, wherein processing at least one further frame of video data comprises processing a plurality of further frames of video data using said stored estimate of noise; and

the method comprises outputting a plurality of enhanced frames of video data.

5. A method according to claim 1, wherein said processing said first frame of video data to generate said estimate comprises:

evaluating a function for different values of said estimate, said function taking as input only an estimate of said noise; and
determining an estimate of said noise for which said function has an optimum value.

6. A method according to claim 5, wherein said first frame of video data comprises pixel values for each of a plurality of pixels of said frame, and said function takes as input at least a subset of said pixel values.

7. A method according to claim 5, further comprising:

filtering said video data to generate filtered video data comprising filtered pixel values;
wherein said function takes as input at least a subset of said filtered pixel values.

8. A method according to claim 5, wherein said optimum value of said function is a stationary point of said function.

9. A method according to claim 8, wherein said stationary point is a minimum.

10. A method according to claim 5, wherein said function is of the form:

$$S(\lambda) = \frac{1}{K}\sum_{k=0}^{K-1}\left(\frac{p_k-\bar{p}_k}{\bar{p}_k-\lambda}\right)^2\cdot\exp\left(\frac{1}{K}\sum_{k=0}^{K-1}\ln\left(\bar{p}_k-\lambda\right)^2\right)$$

where: K is the number of pixels to be processed; $p_k$ is the value of pixel k; $\bar{p}_k$ is the value of pixel k after application of the lowpass filter described above; λ is the value of the estimate; and S(λ) is the function to be optimized.

11. A method according to claim 5, wherein said optimum value is determined using a numerical analysis technique.

12. A method according to claim 10, wherein said estimate is initially zero.

13. A method according to claim 5, comprising generating a plurality of estimates, each estimate being an estimate of noise in a respective subset of pixels of said video frame.

14. A method according to claim 1, wherein said step of processing at least one further frame comprises processing each pixel of said further video frame to remove said estimate of said noise to generate output pixel values.

15. A method according to claim 14, further comprising:

multiplying said output pixel values by a predetermined coefficient.

16. A method according to claim 1, wherein said noise is at least partially attributable to dirt on a camera lens.

17. A method according to claim 1, wherein said noise comprises an offset applied to at least a subset of pixels of said video frame, said offset being substantially equal for all pixels of said subset of pixels of said video frame.

18. A method according to claim 17, wherein said offset is applied to all pixels of said video frame.

19. A method according to claim 1, wherein said noise comprises an offset applied to at least some pixels of said video frame, said offset being determined for a particular pixel in dependence upon a distance between a camera position used to generate said video frame and a point represented by that pixel.

20. A method according to claim 1, wherein said processing said first frame of video data further generates a contrast enhancement parameter.

21. A computer apparatus for processing a plurality of frames of video data comprising:

a program memory containing processor readable instructions; and
a processor configured to read and execute instructions stored in said program memory;
wherein said processor readable instructions comprise instructions configured to cause the computer to carry out a method according to claim 1.
Patent History
Publication number: 20130136376
Type: Application
Filed: Jan 28, 2013
Publication Date: May 30, 2013
Inventor: John Peter Oakley (Manchester)
Application Number: 13/751,524
Classifications
Current U.S. Class: Intensity, Brightness, Contrast, Or Shading Correction (382/274)
International Classification: G06K 9/40 (20060101);