DETECTION CIRCUIT AND ASSOCIATED DETECTION METHOD
A detection circuit including a neural network module and a calculation circuit is disclosed. The neural network module is configured to receive an image to generate an output tensor, wherein the output tensor includes position information of a specific object and distance adjustment information. The calculation circuit is coupled to the neural network module, and is configured to calculate an initial distance between an image capture device and the specific object according to the position information of the specific object, and generate an estimated distance according to the initial distance and the distance adjustment information.
The present invention relates to a detection circuit comprising a neural network module.
2. Description of the Prior Art
Current distance detection devices mainly calculate the distance between an object and the device by detecting the time difference between a transmitted signal and its reflected signal. However, because high-precision distance detection devices are expensive to produce, it is difficult to popularize and apply them in portable electronic devices.
Recently, with the popularity of artificial intelligence, techniques that use a single camera with deep learning to predict the distance between an object and the camera have begun to develop. One technique combines object detection with a monocular depth estimation model: a depth map is obtained, and the distance between the object and the camera is derived from it. However, the depth map obtained by such models is easily affected by factors such as occlusion, lighting, and color, so the calculated distance between the object and the camera can have a large error. Another technique is multi-stage model prediction, that is, using multiple different models in sequence to process the image captured by the camera to obtain the distance between the object and the camera. However, multi-stage model prediction requires a lot of memory space and a long calculation time, so it is not suitable for electronic devices with limited performance.
SUMMARY OF THE INVENTION
It is therefore an objective of the present invention to provide a distance detection circuit and associated electronic device, which can process images captured by a single camera by using a single model to obtain the distance between an object in the image and the camera, to solve the problem described in the prior art.
According to one embodiment of the present invention, a detection circuit comprising a neural network module and a calculation circuit is disclosed. The neural network module is configured to receive an image to generate an output tensor, wherein the output tensor comprises position information of a specific object and distance adjustment information. The calculation circuit is coupled to the neural network module, and is configured to calculate an initial distance between an image capture device and the specific object according to the position information of the specific object, and generate an estimated distance according to the initial distance and the distance adjustment information.
According to one embodiment of the present invention, a detection method comprises the steps of: using a neural network module to receive an image to generate an output tensor, wherein the output tensor comprises position information of a specific object and distance adjustment information; calculating an initial distance between an image capture device and the specific object according to the position information of the specific object; and generating an estimated distance according to the initial distance and the distance adjustment information.
These and other objectives of the present invention will no doubt become obvious to those of ordinary skill in the art after reading the following detailed description of the preferred embodiment that is illustrated in the various figures and drawings.
In this embodiment, the neural network module 122 can be a convolutional neural network (CNN) module, such as the known YOLO object detection model. The module includes multiple convolutional layers and at least one fully connected layer, which process an image to generate an output tensor; the output tensor may be a matrix including a plurality of elements.
It is noted that since the position information, confidence information and class information shown in
In this embodiment, first, the calculation circuit 124 calculates the initial distance between the image capture device 110 and the specific object according to the width ‘w’ and height ‘h’ of the region 202 in the position information. Specifically, the calculation circuit 124 can use the following formula (1) to calculate the initial distance between the image capture device 110 and the specific object:
In the above formula, ‘k’ is the initial distance between the image capture device 110 and the specific object, ‘α_x’ is a focal length of the image capture device 110 in the horizontal direction, ‘α_y’ is a focal length of the image capture device 110 in the vertical direction, ‘w_real’ is a default width of the specific object, ‘h_real’ is a default height of the specific object, ‘w_img’ is the width of the region 202 (i.e., the width ‘w’ in the position information of the output tensor), and ‘h_img’ is the height of the region 202 (i.e., the height ‘h’ in the position information of the output tensor). In this embodiment, ‘α_x’ and ‘α_y’ are known parameters of the image capture device 110, and if the specific object is a person, then ‘w_real’ can be a default width of a normal person, such as 0.6 meters, and ‘h_real’ can be a default height of the normal person, such as 1.7 meters.
It is noted that the above formula (1) is only used as an example rather than a limitation of the present invention. As long as the calculation circuit 124 can calculate the initial distance between the image capture device 110 and the specific object according to the height and width of the region 202, the formula (1) can be changed appropriately.
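As one illustration of a calculation that meets this condition, the following sketch applies the pinhole-camera relation (distance = focal length × real size / image size) to the width and to the height of the region separately and then averages the two results. The averaging step, the function name, the sample values, and the assumption that the focal lengths and the region size are expressed in pixels are illustrative only; the sketch is not a reproduction of formula (1).

```python
def initial_distance(alpha_x, alpha_y, w_img, h_img, w_real=0.6, h_real=1.7):
    """Estimate an initial distance (meters) from the size of the detected region.

    alpha_x, alpha_y: focal lengths of the camera, assumed to be in pixels.
    w_img, h_img:     width and height of the detected region, assumed in pixels.
    w_real, h_real:   default width and height of the object in meters
                      (0.6 m and 1.7 m are the person defaults given in the text).
    """
    if w_img <= 0 or h_img <= 0:
        raise ValueError("region width and height must be positive")
    k_from_width = alpha_x * w_real / w_img    # pinhole estimate using the width
    k_from_height = alpha_y * h_real / h_img   # pinhole estimate using the height
    # Assumption: combine the two estimates by a simple average.
    return 0.5 * (k_from_width + k_from_height)


# Hypothetical camera with focal lengths of 1000 pixels.
standing = initial_distance(1000.0, 1000.0, w_img=120.0, h_img=340.0)
crouching = initial_distance(1000.0, 1000.0, w_img=120.0, h_img=170.0)
print(standing, crouching)
```

With these hypothetical values the two calls give different initial distances although only the height of the region changed, which is the kind of posture dependence that the distance adjustment information is meant to correct.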
Since ‘w_real’ and ‘h_real’ used in the formula (1) are the default width and default height of the normal person and are fixed values, the initial distance ‘k’ calculated by the formula (1) will vary depending on the posture and action of the person, even when the actual distance is unchanged. Therefore, the calculation circuit 124 additionally adjusts the above-mentioned initial distance ‘k’ according to the distance adjustment information ‘r’ in the output tensor to obtain the estimated distance between the image capture device 110 and the specific object, such as ‘d’ shown in
It is noted that the above formula (2) is only used as an example rather than a limitation of the present invention.
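The claims describe multiplying the distance adjustment information by the initial distance to generate the estimated distance, so one form of the adjustment that is consistent with the claims can be written as follows. The numerical values are hypothetical and only illustrate the arithmetic.

```latex
d = r \cdot k,
\qquad \text{e.g. } k = 7.5\ \text{m},\ r = 0.8
\;\Rightarrow\; d = 0.8 \times 7.5\ \text{m} = 6.0\ \text{m}
```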
As mentioned above, the detection circuit 120 calculates the estimated distance between the image capture device 110 and the specific object from the output tensor generated by the one-stage neural network module. Because only a single model is used and the initial distance is corrected by the distance adjustment information, the distance of the specific object can be detected accurately while limiting the memory and computation required of the electronic device 100.
In order for the distance adjustment information ‘r’ in the output tensor to accurately and effectively adjust the initial distance ‘k’ to obtain the estimated distance ‘d’, the training targets used by the neural network module 122 in the training phase include distance adjustment information calculated according to a real position of the person and a real distance between the person and the image capture device 110, so that the distance adjustment information ‘r’ in the output tensor reflects the posture or action of the person.
Specifically, as shown in
As mentioned above, by training the neural network module 122 with images of persons in different actions/postures, and using the distance adjustment information r1, r2, r3, etc. as training targets, the distance adjustment information ‘r’ in the output tensor generated by the neural network module 122 in actual use can reflect the posture or action of the person, so that the estimated distance between the image capture device 110 and the person can be calculated accurately.
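The following is a minimal sketch of how such a training target might be computed, assuming the multiplicative adjustment described in the claims, so that the target is the ratio of the real distance to the initial distance obtained from the real (labelled) region of the person. The averaged pinhole estimate, the function name, and the sample values are assumptions made for illustration and are not taken from the embodiment.

```python
def adjustment_target(d_real, alpha_x, alpha_y, w_box, h_box,
                      w_real=0.6, h_real=1.7):
    """Return a hypothetical training target r = d_real / k for one training image.

    d_real:       real distance between the camera and the person, in meters.
    w_box, h_box: width and height of the real (labelled) region, assumed in pixels.
    alpha_x, alpha_y: focal lengths of the camera, assumed in pixels.
    """
    if d_real <= 0 or w_box <= 0 or h_box <= 0:
        raise ValueError("distance and region size must be positive")
    # Initial distance from the labelled region (assumed averaged pinhole estimate).
    k = 0.5 * (alpha_x * w_real / w_box + alpha_y * h_real / h_box)
    # With d = r * k, the target that maps k back to the real distance is d_real / k.
    return d_real / k


# Two hypothetical training images of a person 5 m from the camera.
r1 = adjustment_target(5.0, 1000.0, 1000.0, w_box=120.0, h_box=340.0)  # standing
r2 = adjustment_target(5.0, 1000.0, 1000.0, w_box=120.0, h_box=170.0)  # crouching
print(r1, r2)
```

Under these assumptions the standing image yields a target close to 1, while the crouching image yields a smaller target, so the target value varies with the posture of the person.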
Step 400: The flow starts.
Step 402: Use a neural network module to receive an image to generate an output tensor, wherein the output tensor comprises position information of a specific object and distance adjustment information.
Step 404: Calculate an initial distance between an image capture device and the specific object according to the position information of the specific object.
Step 406: Generate an estimated distance according to the initial distance and the distance adjustment information.
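Steps 402 to 406 can be sketched end to end as follows. The neural network itself is not run here: the `Detection` class is a hypothetical, already-decoded entry of the output tensor, and its field layout, the pixel units, the averaged pinhole estimate used for Step 404, and the sample values are all assumptions for illustration. Step 406 uses the multiplication described in the claims.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical decoded entry of the output tensor for one person (Step 402)."""
    w: float  # width of the detected region, assumed in pixels
    h: float  # height of the detected region, assumed in pixels
    r: float  # distance adjustment information


def estimate_distance(det, alpha_x, alpha_y, w_real=0.6, h_real=1.7):
    """Return (initial distance k, estimated distance d) in meters."""
    if det.w <= 0 or det.h <= 0:
        raise ValueError("region width and height must be positive")
    # Step 404: initial distance from the position information
    # (assumed averaged pinhole estimate, focal lengths in pixels).
    k = 0.5 * (alpha_x * w_real / det.w + alpha_y * h_real / det.h)
    # Step 406: estimated distance from the initial distance and the adjustment.
    d = det.r * k
    return k, d


# Hypothetical detection of a crouching person and a camera with 1000-pixel focal lengths.
det = Detection(w=120.0, h=170.0, r=0.8)
k, d = estimate_distance(det, alpha_x=1000.0, alpha_y=1000.0)
print(f"initial distance: {k:.2f} m, estimated distance: {d:.2f} m")
```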
Those skilled in the art will readily observe that numerous modifications and alterations of the device and method may be made while retaining the teachings of the invention. Accordingly, the above disclosure should be construed as limited only by the metes and bounds of the appended claims.
Claims
1. A detection circuit, comprising:
- a neural network module, configured to receive an image to generate an output tensor, wherein the output tensor comprises position information of a specific object and distance adjustment information; and
- a calculation circuit, coupled to the neural network module, configured to calculate an initial distance between an image capture device and the specific object according to the position information of the specific object, and generate an estimated distance according to the initial distance and the distance adjustment information.
2. The detection circuit of claim 1, wherein the specific object is a person, the output tensor comprises the position information of the person and the distance adjustment information; and the calculation circuit calculates the initial distance between the image capture device and the specific object according to the position information, default width and default height of the person.
3. The detection circuit of claim 2, wherein the calculation circuit multiplies the distance adjustment information by the initial distance to generate the estimated distance.
4. The detection circuit of claim 2, wherein during a training phase of the neural network module, the neural network module receives a plurality of training images; and for each of the plurality of training images, the neural network module receives the training image to generate a training output tensor, and a loss function operation is performed on the training output tensor and a training parameter for calibrating parameters in the neural network module; wherein the plurality of training images respectively comprise persons with different actions/postures.
5. The detection circuit of claim 4, wherein the training parameter comprises another distance adjustment information calculated according to a real distance between the image capture device and the person and real position information of the person, wherein the another distance adjustment information serves as a training target in the training phase of the neural network module.
6. A detection method, comprising:
- using a neural network module to receive an image to generate an output tensor, wherein the output tensor comprises position information of a specific object and distance adjustment information; and
- calculating an initial distance between an image capture device and the specific object according to the position information of the specific object; and
- generating an estimated distance according to the initial distance and the distance adjustment information.
7. The detection method of claim 6, wherein the specific object is a person, the output tensor comprises the position information of the person and the distance adjustment information; and the step of calculating the initial distance between the image capture device and the specific object according to the position information of the specific object comprises:
- calculating the initial distance between the image capture device and the specific object according to the position information, default width and default height of the person.
8. The detection method of claim 7, wherein the step of generating the estimated distance according to the initial distance and the distance adjustment information comprises:
- multiplying the distance adjustment information by the initial distance to generate the estimated distance.
9. The detection method of claim 7, further comprising:
- during a training phase of the neural network module, receiving a plurality of training images, wherein the plurality of training images respectively comprise persons with different actions/postures;
- for each of the plurality of training images, using the neural network module to receive the training image to generate a training output tensor; and
- performing a loss function operation on the training output tensor and a training parameter, for calibrating parameters in the neural network module.
10. The detection method of claim 9, wherein the training parameter comprises another distance adjustment information calculated according to a real distance between the image capture device and the person and real position information of the person, wherein the another distance adjustment information serves as a training target in the training phase of the neural network module.
Type: Application
Filed: Aug 10, 2023
Publication Date: Aug 1, 2024
Applicant: Realtek Semiconductor Corp. (HsinChu)
Inventors: Chih-Yuan Koh (HsinChu), Shih-Tse Chen (HsinChu)
Application Number: 18/232,364