Three-dimensional reference image segmenting method and device, and object discrimination system

- NEC Corporation

In a three-dimensional reference image segmenting method and device, a two-dimensional image of a reference object, together with shape data of a pattern obtained by transforming the image, is stored with depth data of the reference object in a memory as a reference pattern. On the basis of local Fourier transform image data of an input image supplied from an image transform unit and the reference data of the reference pattern read out from the memory, a deformation amount estimating unit calculates the amount of deformation (a displacement vector) required to make the two images coincide with each other to the greatest possible extent. An inverse Fourier transform unit generates a deformed reference image by a local inverse Fourier transform based on the displacement vector and the local Fourier transform image data of the reference pattern. On the basis of the deformed reference image thus generated, an image segmentation unit extracts a reference object image from within the input image, and the extracted reference object image is outputted from an image output unit.
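The flow described in the abstract (Fourier transform of the input, displacement estimation against the stored reference, inverse transform to obtain a deformed reference) can be sketched compactly. The following is a minimal, illustrative 1-D version, assuming phase correlation as the displacement estimator and a single global domain rather than the overlapping local domains the patent uses:

```python
import numpy as np

def estimate_shift(ref, inp):
    """Estimate the displacement that best aligns the reference with the
    input, via the phase of the normalized cross-power spectrum
    (phase correlation)."""
    R, F = np.fft.fft(ref), np.fft.fft(inp)
    cross = F * np.conj(R)
    corr = np.fft.ifft(cross / (np.abs(cross) + 1e-12)).real
    return int(np.argmax(corr))          # circular displacement in samples

def deform_reference(ref, shift):
    """Generate a 'deformed' reference by applying the estimated
    displacement as a Fourier-domain phase ramp and inverse-transforming."""
    freqs = np.fft.fftfreq(len(ref))
    return np.fft.ifft(np.fft.fft(ref) * np.exp(-2j * np.pi * freqs * shift)).real
```

In the device, an estimate/deform step of this kind would run per small image domain before the segmentation unit compares the deformed reference with the input.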


Claims

1. A three-dimensional reference image segmenting method comprising the steps of:

dividing each of left and right images respectively corresponding to a left eye view and a right eye view, obtained by viewing a reference image to be segmented with a left eye and a right eye, respectively, into small image domains permitted to overlap each other;
executing a Fourier transform for each of said small image domains to calculate a local Fourier transform image data;
calculating and storing a power spectrum pattern from a sum, including a phase difference, of said local Fourier transform image data of said left and right images;
calculating and storing on the basis of said power spectrum pattern a parallax between said left and right images for each of said small image domains;
dividing one of left and right images respectively corresponding to a left eye view and a right eye view, obtained by viewing an input image containing said reference image with a left eye and a right eye, respectively, into small image domains permitted to overlap each other;
executing a Fourier transform for each of said small image domains of said input image to calculate a local Fourier transform image data;
estimating a difference in parallax and in power spectrum pattern between said reference image and an image to be extracted within said input image, by using as a constraint condition the parallax between said left and right images of said reference image for each local image domain, on the basis of said local Fourier transform image data of said input image, said stored power spectrum pattern of said reference image, and said stored parallax between said left and right images of said reference image for each local image domain; and
on the basis of the result of the above estimation, extracting from said input image only an image having a power spectrum pattern similar to that of said reference image.
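A minimal sketch of the binocular quantities in the method above, under two labeled assumptions: that the "sum, including a phase difference" is realized as the power spectrum of F_L + e^(i dphi) F_R, and that the parallax is recovered from the inter-image Fourier phase difference (a 1-D, single-domain illustration, not the claimed formulation itself):

```python
import numpy as np

def binocular_power(FL, FR, dphi):
    """Power spectrum of the sum of the left/right local Fourier images,
    with a phase difference dphi applied to the right image."""
    return np.abs(FL + FR * np.exp(1j * dphi)) ** 2

def parallax_from_phase(left, right):
    """Estimate the parallax between left/right patches from the phase
    difference of their Fourier transforms (peak of phase correlation)."""
    FL, FR = np.fft.fft(left), np.fft.fft(right)
    cross = FR * np.conj(FL)
    corr = np.fft.ifft(cross / (np.abs(cross) + 1e-12)).real
    # returns d such that right is (approximately) left shifted by d
    return int(np.argmax(corr))
```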

2. A three-dimensional reference image segmenting device comprising:

an image input means receiving left and right images respectively corresponding to a left eye view and a right eye view obtained by viewing a reference image to be segmented with a left eye and a right eye, and also receiving one of left and right images respectively corresponding to a left eye view and a right eye view obtained by viewing an input image containing said reference image with a left eye and a right eye, respectively;
a local Fourier transform means receiving said left and right images from said image input means, for dividing each of said left and right images into small image domains permitted to overlap each other, and executing a Fourier transform for each of said small image domains to calculate a local Fourier transform image data;
a power spectrum pattern calculating means for calculating a power spectrum pattern from a sum, including a phase difference, of said local Fourier transform image data of said left and right images of said reference image outputted from said local Fourier transform means;
a local parallax calculating means for calculating, on the basis of said power spectrum pattern outputted from said power spectrum pattern calculating means, a parallax between said left and right images of said reference image for each of said small image domains;
a memory means receiving and storing said power spectrum pattern outputted from said power spectrum pattern calculating means and said parallax for each of said small image domains outputted from said local parallax calculating means;
a same image predicting means receiving, when said input image containing said reference image is inputted to said image input means, said local Fourier transform image data of said input image, outputted from said local Fourier transform means, and said power spectrum pattern of said reference image and said parallax between said left and right images of said reference image for each of said small image domains, outputted from said memory means, said same image predicting means estimating a difference in parallax and in power spectrum pattern between said reference image and an image to be extracted within said input image, by using as a constraint condition the parallax between said left and right images of said reference image for each local image domain, and calculating a local Fourier transform image of said image to be extracted within said input image;
a local inverse Fourier transform means receiving from said same image predicting means said local Fourier transform image of said image to be extracted within said input image, for executing a local inverse Fourier transform to the received local Fourier transform image of said image to be segmented, to output data of left and right images of said image to be segmented; and
an average calculating and outputting means for calculating and outputting a geometric average between said data of left and right images outputted from said local inverse Fourier transform means and data of said input image containing said reference image outputted from said image input means.
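The final output stage of claim 2 combines the reconstructed left/right data with the input by a geometric average; a one-line sketch of that pixel-wise operation:

```python
import numpy as np

def segment_by_geometric_average(deformed_ref, input_img):
    """Pixel-wise geometric average of the deformed reference and the
    input image: regions supported by both keep their strength, while
    regions present in only one image are suppressed toward zero."""
    return np.sqrt(np.clip(deformed_ref, 0, None) * np.clip(input_img, 0, None))
```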

3. A three-dimensional reference image segmenting device claimed in claim 2 wherein said same image predicting means calculates the local Fourier transform images F_L'(k,X) and F_R'(k,X) for the image to be extracted within the input image, which minimize the energy functions E_L and E_R expressed by the following equations: ##EQU16## where F_L(k,X) and F_R(k,X) are local Fourier transform images of said left image and said right image of said input image, respectively,

P_B(k,X,δΦ) is the power spectrum pattern of said reference image,
Δ^L(X) and Δ^R(X) are parallaxes for each local image domain between said left image and said right image of said reference image,
θ is a fitting parameter for compensating a positional difference in a depth direction between said reference image and said image to be extracted within said input image, and
λ_L and λ_R are fitting parameters for compensating a difference in strength between said reference image and said image to be extracted within said input image.

4. A three-dimensional reference image segmenting device comprising: a reference data generating means receiving left and right images respectively corresponding to a left eye view and a right eye view obtained by viewing a reference object to be segmented with a left eye and a right eye, for dividing each of said left and right images into small image domains permitted to overlap each other, and calculating a depth data of said reference object on the basis of parallax between said left and right images for each of said small image domains, said reference data generating means also generating a shape data concerning a predetermined one image of said received left and right images;

a memory means receiving and storing said depth data and shape data generated by said reference data generating means;
an image input means receiving an input image to be processed;
a reference image deform means receiving said input image supplied from said image input means, and said shape data read out from said memory, for deforming the reference image to make the reference image coincide with said input image to the greatest possible extent, using as a constraint condition said depth data read out from said memory;
an image segmentation means receiving said deformed reference image outputted from said reference image deform means and said input image supplied from said image input means, for segmenting said input image with reference to said deformed reference image; and
an image output means for outputting the result of the segmentation obtained in said image segmentation means.

5. A three-dimensional reference image segmenting device claimed in claim 4 wherein said reference data generating means generates, as said shape data, data of said predetermined one image of said received left and right images, or data obtained by executing a transform processing to said predetermined one image of said received left and right images.

6. A three-dimensional reference image segmenting device claimed in claim 4 wherein said reference image deform means comprises:

an image transform means for executing a Fourier transform to said input image supplied from said image input means, for each of said small image domains, to calculate a local Fourier transform image;
a deform amount estimating means receiving said local Fourier transform image supplied from said image transform means and said shape data read out from said memory, for examining, for each of said small image domains, by what displacement and to what extent said reference image coincides with said input image, and for calculating and selecting, as a temporary displacement vector for each of said small image domains, a displacement vector which makes said reference image coincide with said input image to a maximum extent, from among displacement vectors, for each small image domain, generated when said reference object is rotated or displaced, by using as a constraint condition said depth data read out from said memory; and
an inverse transform means receiving said displacement vector for each small image domain outputted from said deform amount estimating means and said shape data read out from said memory, for executing an inverse transform to said shape data with reference to said displacement vector, so as to generate said deformed reference image.
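The per-domain deform-amount estimation of claim 6 can be approximated by a local phase correlation; this 1-D sketch omits the rotation handling and the depth-data constraint recited in the claim:

```python
import numpy as np

def blockwise_displacements(ref, inp, block, step):
    """For each small image domain, estimate the displacement that best
    aligns the reference with the input, taking the peak of the local
    phase correlation as the displacement vector for that domain."""
    shifts = []
    for start in range(0, len(ref) - block + 1, step):
        R = np.fft.fft(ref[start:start + block])
        F = np.fft.fft(inp[start:start + block])
        cross = F * np.conj(R)
        corr = np.fft.ifft(cross / (np.abs(cross) + 1e-12)).real
        d = int(np.argmax(corr))
        if d > block // 2:   # interpret the circular peak as a signed shift
            d -= block
        shifts.append(d)
    return shifts
```

Overlapping domains (step smaller than block) correspond to the claim's "small image domains permitted to overlap each other".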

7. A three-dimensional reference image segmenting device claimed in claim 6 wherein said reference data generating means comprises:

a reference data generating image input means receiving left and right images respectively corresponding to a left eye view and a right eye view obtained by viewing said reference object to be segmented with a left eye and a right eye, respectively;
a reference data generating image transform means for dividing each of said left and right images supplied from said reference data generating image input means into small image domains permitted to overlap each other, and executing a Fourier transform for each of said small image domains to calculate a local Fourier transform image data;
a local parallax calculating means for calculating, on the basis of said local Fourier transform image data for each of said left and right images outputted from said reference data generating image transform means, a parallax between said left and right images of said reference image for each of said small image domains, which maximizes a cross-correlation function between said left and right images; and
a reference data writing means for writing, as said shape data into said memory, the local Fourier transform image data of one image, selected from said local Fourier transform image data of said left and right images calculated by said reference data generating image transform means, said reference data writing means writing, as said depth data into said memory, said parallax calculated in said local parallax calculating means.

8. A three-dimensional reference image segmenting device claimed in claim 4 wherein said reference data generating means comprises:

a reference data generating image input means receiving left and right images respectively corresponding to a left eye view and a right eye view obtained by viewing said reference object to be segmented with a left eye and a right eye, respectively;
a reference data generating image transform means for dividing each of said left and right images supplied from said reference data generating image input means into small image domains permitted to overlap each other, and executing a Fourier transform for each of said small image domains to calculate a local Fourier transform image data;
a local parallax calculating means for calculating, on the basis of said local Fourier transform image data for each of said left and right images outputted from said reference data generating image transform means, a parallax between said left and right images of said reference image for each of said small image domains, which maximizes a cross-correlation function between said left and right images; and
a reference data writing means for writing, as said shape data into said memory, the local Fourier transform image data of one image, selected from said local Fourier transform image data of said left and right images calculated by said reference data generating image transform means, said reference data writing means writing, as said depth data into said memory, said parallax calculated in said local parallax calculating means.
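The local parallax calculating means of claims 7 and 8 picks the disparity that maximizes a cross-correlation between the left and right images; a direct (non-Fourier) 1-D sketch of that selection:

```python
import numpy as np

def parallax_by_correlation(left, right, max_disp):
    """Choose the parallax that maximizes the correlation between the
    left and right patches, scanning candidate disparities directly."""
    best_d, best_c = 0, -np.inf
    for d in range(-max_disp, max_disp + 1):
        c = np.dot(right, np.roll(left, d))   # correlation at disparity d
        if c > best_c:
            best_d, best_c = d, c
    return best_d
```

A practical implementation would evaluate the same correlation in the Fourier domain, reusing the local Fourier transform images already computed.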

9. An object discrimination system comprising: a reference data generating means receiving a plurality of pairs of left and right images, each pair respectively corresponding to a left eye view and a right eye view obtained by viewing each of a plurality of reference objects to be discriminated with a left eye and a right eye, for dividing each of said left and right images into small image domains permitted to overlap each other, and calculating a depth data of each of said reference objects on the basis of parallax between said left and right images for each of said small image domains, said reference data generating means also generating a shape data of each of said reference objects on the basis of a predetermined one image of each pair of said received plurality of pairs of left and right images;

a memory means receiving and storing said depth data and shape data of said plurality of reference objects, generated by said reference data generating means;
an image input means receiving an input image to be processed;
a reference image deform means receiving said input image supplied from said image input means, and said shape data of said plurality of reference objects read out from said memory, for executing a pattern matching processing between said input image and each of said shape data of said plurality of reference objects, to calculate the degree of similarity between said input image and each of said shape data of said plurality of reference objects, said reference image deform means also selecting, as candidate patterns, a predetermined number of reference objects counted from the reference object having the highest degree of similarity, in the order of the degree of similarity, or reference objects having a degree of similarity larger than a predetermined threshold, said reference image deform means further deforming each of said candidate patterns to make the candidate pattern coincide with said input image to the greatest possible extent, using as a constraint condition said depth data read out from said memory;
an image segmentation means receiving said deformed reference image of each of said candidate patterns, outputted from said reference image deform means, and said input image supplied from said image input means, for segmenting said input image with reference to said deformed reference image of each of said candidate patterns; and
a pattern discriminating means receiving the result of the segmentation outputted from said image segmentation means and said deformed reference image of each of said candidate patterns, outputted from said reference image deform means, for calculating the degree of final similarity between the extracted input image outputted from said image segmentation means and said deformed reference image of each of said candidate patterns, said pattern discriminating means outputting, when a highest one of the calculated degrees of final similarity exceeds a predetermined value, the result of discrimination indicating the pattern having said highest degree of final similarity.
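The candidate selection and final discrimination rules of claim 9 reduce to ranking, thresholding, and an acceptance test; the sketch below uses hypothetical pattern names and scores:

```python
def select_candidates(similarities, top_n=3, threshold=None):
    """Pick candidate reference patterns either as the top-N most similar
    ones or as all those above a threshold, mirroring the two selection
    rules recited in claim 9. `similarities` maps pattern name to score."""
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    if threshold is not None:
        return [p for p in ranked if similarities[p] > threshold]
    return ranked[:top_n]

def discriminate(final_similarities, accept_level):
    """Output the pattern with the highest final degree of similarity only
    if it exceeds the predetermined value; otherwise report no match."""
    best = max(final_similarities, key=final_similarities.get)
    return best if final_similarities[best] > accept_level else None
```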

10. An object discrimination system claimed in claim 9 wherein each of said reference data generating means and said image input means executes a convolution integral, based on a DoG (Difference of Gaussians) function, on the received image, so as to process the convolved image data as a received image.
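Claim 10's preprocessing is a convolution with a DoG (Difference of Gaussians) kernel, a band-pass filter that emphasizes edges and suppresses uniform regions; a 1-D sketch (the sigma values and kernel radius here are illustrative assumptions, not from the patent):

```python
import numpy as np

def dog_kernel(sigma1, sigma2, radius):
    """Difference-of-Gaussians kernel: a narrow Gaussian minus a wider
    one, sampled on [-radius, radius]. Its coefficients sum to roughly
    zero, so flat image regions map to near-zero output."""
    x = np.arange(-radius, radius + 1, dtype=float)
    g1 = np.exp(-x**2 / (2 * sigma1**2)) / (sigma1 * np.sqrt(2 * np.pi))
    g2 = np.exp(-x**2 / (2 * sigma2**2)) / (sigma2 * np.sqrt(2 * np.pi))
    return g1 - g2

def dog_filter(signal, sigma1=1.0, sigma2=2.0):
    """Convolve a 1-D signal with the DoG kernel, the 'convolution
    integral based on a DoG function' of claim 10."""
    return np.convolve(signal, dog_kernel(sigma1, sigma2, 8), mode="same")
```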

11. An object discrimination system claimed in claim 9 wherein said reference data generating means generates, as said shape data, data of said predetermined one image of said received left and right images, or data obtained by executing a transform processing to said predetermined one image of said received left and right images.

References Cited
U.S. Patent Documents
5084818 January 28, 1992 Machida
5089777 February 18, 1992 Iino et al.
5119189 June 2, 1992 Iwamoto et al.
5163099 November 10, 1992 Osaki et al.
5239595 August 24, 1993 Takemura et al.
5247583 September 21, 1993 Kato et al.
5500671 March 19, 1996 Andersson et al.
Foreign Patent Documents
2-156387 June 1990 JPX
7-287762 October 1995 JPX
Other References
  • Lades et al., "Distortion Invariant Object Recognition in the Dynamic Link Architecture", IEEE Transactions on Computers, vol. 42, No. 3, Mar. 1993.
Patent History
Patent number: 5917940
Type: Grant
Filed: Jan 23, 1997
Date of Patent: Jun 29, 1999
Assignee: NEC Corporation (Tokyo)
Inventors: Kenji Okajima (Tokyo), Masanobu Miyashita (Tokyo)
Primary Examiner: Joseph Mancuso
Assistant Examiner: Gilberto Frederick, II
Law Firm: Foley & Lardner
Application Number: 8/787,928