Voice control device and voice control method
A voice control device includes a hearing estimate section configured to estimate hearing of a user based on a sending/received sound ratio representing a ratio of the volume of a sending sound to the volume of a received sound; a compensation-quantity calculating section configured to calculate a compensation quantity for a received signal of the received sound responsive to the estimated hearing; and a compensation section configured to compensate the received signal based on the calculated compensation quantity.
This application is a continuation application of International Application PCT/JP2011/050017 filed on Jan. 4, 2011 and designated the U.S., the entire contents of which are incorporated herein by reference.
FIELD
The invention relates to a voice control device, a voice control method, a voice control program and a portable terminal device that control received sound.
BACKGROUND
Conventionally, there are portable terminal devices that execute control to make received voices easy to hear. For example, there is a technology in which multiple single-tone frequency signals are reproduced for a user, and the minimum hearing level is calculated from the user's hearing result and used for processing a voice (Patent document 1).
Also, there is a technology that utilizes the Lombard effect: if the sending sound volume is loud, the surroundings are determined to be noisy and the received sound volume is increased, whereas if the sending sound volume is soft, the received sound volume is decreased automatically (Patent document 2).
Also, there is a technology in which an equalizer is provided for emphasizing a voice signal in a specific range, and characteristics of the equalizer are adjusted based on a volume operation by a user (Patent document 3).
RELATED-ART DOCUMENTS Patent Document
- [Patent document 1] Japanese Laid-open Patent Publication No. 07-66767
- [Patent document 2] Japanese Laid-open Patent Publication No. 2004-165865
- [Patent document 3] Japanese Laid-open Patent Publication No. 2010-81523
However, there is a problem with Patent document 1 in that it is not easy to use because a user needs to execute a hearing test that forces a complicated procedure upon the user.
Also, there is a problem with Patent document 2 in that there are cases where sound quality may be bad depending on a user because the received sound volume is determined only by the sending sound volume, hence a hearing characteristic of the user is not taken into account.
Also, there is a problem with Patent document 3 in that voice control may not be carried out during a call because it requires a user to do a volume operation, which is, however, difficult to do during a call.
SUMMARY
According to one embodiment, a voice control device includes a hearing estimate section configured to estimate hearing of a user based on a sending/received sound ratio representing a ratio of the volume of a sending sound to the volume of a received sound; a compensation-quantity calculating section configured to calculate a compensation quantity for a received signal of the received sound responsive to the estimated hearing; and a compensation section configured to compensate the received signal based on the calculated compensation quantity.
The object and advantages of the embodiment will be realized and attained by means of the elements and combinations particularly pointed out in the claims. It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
First, a relationship between age and hearing will be described. Hearing is, for example, a minimum audible range.
As illustrated in
Here, the Lombard effect will be described. The Lombard effect is an effect that if it is noisy in the surroundings, or it is difficult to hear a voice of a counterpart because the counterpart is a quiet talker, one's speaking voice becomes louder. For example, it has been investigated that if background noise is 50 dBSPL (simply denoted as dB hereafter), the speaking volume becomes 4 dB greater than in a quiet situation (37 dB). As for this investigation, refer to
However, the Lombard effect affects not only volumes of the surrounding noise and the voice of a counterpart, but also the hearing of a listener. If hearing is reduced, it becomes harder to hear a voice of a counterpart, hence the speaking voice tends to become louder. As illustrated in
Therefore, embodiments will be described in which the Lombard effect is used to obtain a relationship between a received sound volume and a sending sound volume, and age is estimated from that relationship to estimate hearing, so that the received sound is controlled to make the received voice easier to hear. The embodiments will be described below with reference to the drawings.
Embodiment
<Configuration>
A configuration of a voice control device 1 will be described according to the embodiment.
The time-frequency transform section 101 applies a time-frequency transform to a received signal r(t) of a received sound to obtain a spectrum R(f) according to the following formula (1). The time-frequency transform is, for example, a fast Fourier transform (FFT).
R(f)=Re{R(f)}+j·Im{R(f)} FORMULA (1)
f: frequency (f=0, 1, 2, . . . , K−1)
K: Nyquist frequency
Re { }: real part
Im { }: imaginary part
The time-frequency transform section 101 outputs the obtained spectrum R(f) to the hearing estimate section 103, the spectrum compensation-quantity calculating section 106, and the spectrum compensation section 107.
The time-frequency transform section 102 applies a time-frequency transform to a sending signal s(t) of a sending sound to obtain a spectrum S(f) according to the following formula (2). The time-frequency transform is, for example, a fast Fourier transform (FFT).
S(f)=Re{S(f)}+j·Im{S(f)} FORMULA (2)
f: frequency (f=0, 1, 2, . . . , K−1)
K: Nyquist frequency
Re { }: real part
Im { }: imaginary part
The time-frequency transform section 102 outputs the obtained spectrum S(f) to the hearing estimate section 103 and the noise estimate section 104.
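The transform applied by the sections 101 and 102 can be sketched as follows. This is only an illustration: it uses a naive discrete Fourier transform from the Python standard library instead of an FFT, the function name `dft_spectrum` is hypothetical, and reading K as half the frame length (the bins below the Nyquist index) is an assumption.

```python
import cmath
import math


def dft_spectrum(frame):
    """Return the complex spectrum X(f) for f = 0 .. K-1, with K = len(frame) // 2.

    A naive O(N^2) DFT keeps the sketch dependency-free; a practical
    implementation would use an FFT, as the text suggests.
    """
    n = len(frame)
    k_bins = n // 2  # assumed reading of K: bins below the Nyquist index
    spectrum = []
    for f in range(k_bins):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * math.pi * f * t / n)
        spectrum.append(acc)  # acc.real is Re{X(f)}, acc.imag is Im{X(f)}
    return spectrum


# Example: a 16-sample frame holding a cosine with exactly 2 cycles.
frame = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
R = dft_spectrum(frame)
peak_bin = max(range(len(R)), key=lambda f: abs(R[f]))
```

The same function would serve for both r(t) and s(t), since formulas (1) and (2) have the same form.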
The hearing estimate section 103 estimates a user's hearing based on the received sound volume and the sending sound volume.
The sending/received-sound-ratio calculating section 131 calculates the average electric power of the spectrum R(f) of the received sound and the spectrum S(f) of the sending sound by the following formula.
R_ave: average electric power of the spectrum of the received sound
S_ave: average electric power of the spectrum of the sending sound
The sending/received-sound-ratio calculating section 131 obtains a sending/received sound ratio sp_ratio, for example, from the average power R_ave of the received sound and the average power S_ave of the sending sound by the following formula.
sp_ratio=S_ave/R_ave FORMULA (5)
sp_ratio: sending/received sound ratio
The sending/received-sound-ratio calculating section 131 sets the sending/received sound ratio to the ratio of the volume of the sending sound to the volume of the received sound. The sending/received-sound-ratio calculating section 131 outputs the obtained sending/received sound ratio to the age estimate section 132.
Having obtained the sending/received sound ratio from the sending/received-sound-ratio calculating section 131, the age estimate section 132 estimates the age of a user by referring to information, stored beforehand, that indicates a relationship between the sending/received sound ratio and age.
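A minimal sketch of the ratio calculation follows. The formulas for R_ave and S_ave are not reproduced in this text, so the sketch assumes one common definition, the mean squared magnitude over all bins; that definition and the function names are assumptions, not taken from the source. Only the final division follows formula (5).

```python
def average_power(spectrum):
    """Mean squared magnitude over all bins (ASSUMED definition of R_ave / S_ave)."""
    if not spectrum:
        raise ValueError("spectrum must not be empty")
    return sum(abs(x) ** 2 for x in spectrum) / len(spectrum)


def sending_received_ratio(sending_spectrum, received_spectrum):
    """sp_ratio = S_ave / R_ave, following formula (5)."""
    r_ave = average_power(received_spectrum)
    if r_ave == 0.0:
        raise ValueError("received sound has zero power; the ratio is undefined")
    return average_power(sending_spectrum) / r_ave


S = [2 + 0j, 0 + 2j]  # |S(f)|^2 = 4 in both bins
R = [1 + 0j, 0 + 1j]  # |R(f)|^2 = 1 in both bins
sp_ratio = sending_received_ratio(S, R)
```

The explicit check for zero received power is a design choice of the sketch; the text does not say how a silent received frame is handled at this point.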
For example, a relationship between age and the sending/received sound ratio can be estimated with the following steps:
(1) For each age (or by generation such as teenagers, twenties), measure the sending sound volume of each examinee with respect to a given received sound volume (for example, 60 dB).
(2) Obtain the average sending sound volume of all examinees of the ages measured in (1).
(3) Obtain a ratio of the average sending sound volume in (2) to the received sound volume (sending/received sound ratio).
(4) Execute the steps (1) to (3) for other values of the received sound volume (for example, 30 to 80 dB).
Thus, the information that indicates a relationship between age and the sending/received sound ratio can be obtained for the values of the received sound volume. The age estimate section 132 holds the information that indicates a relationship between age and sending/received sound ratio.
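The stored relationship could be held as a simple lookup table, as in the sketch below. The ratio boundaries and generations are HYPOTHETICAL placeholders, not measured values; real entries would come from steps (1) to (4), and a full implementation would keep one such table per received sound volume.

```python
import bisect

# HYPOTHETICAL table: upper bounds on sp_ratio and the generation assigned to
# each range. Placeholder numbers only, not values from the source.
RATIO_UPPER_BOUNDS = [1.0, 1.5, 2.0, 3.0]
GENERATIONS = [20, 30, 40, 50, 60]  # last entry covers ratios above 3.0


def estimate_generation(sp_ratio):
    """Map a sending/received sound ratio to a generation (decade of age)."""
    if sp_ratio < 0:
        raise ValueError("sp_ratio must be non-negative")
    idx = bisect.bisect_left(RATIO_UPPER_BOUNDS, sp_ratio)
    return GENERATIONS[idx]
```

For example, with these placeholder boundaries a ratio of 1.2 falls in the second range and maps to the thirties. The table increases with the ratio because, as described above, reduced hearing tends to make the speaking voice louder.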
Based on the sending/received sound ratio obtained from the sending/received-sound-ratio calculating section 131, the age estimate section 132 estimates an age from the relationship illustrated in
Based on the age obtained from the age estimate section 132, the minimum audible range estimate section 133 estimates a minimum audible range. The minimum audible range estimate section 133 holds a minimum audible range for each generation, based on the relationship illustrated in
As illustrated in
Other than the relationship between generation and a minimum audible range, a hearing reduction quantity for each generation may be used. Also, a minimum audible range or a hearing reduction quantity based on gender may be used. For a difference of hearing characteristics by gender, see p. 72-73 of “Building Environment for Aged People”, edited by Architectural Institute of Japan, 1994 Jan. 10, Shokokusha Publishing Co., Ltd.
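The per-generation minimum audible range held by the section 133 can likewise be sketched as a table keyed by generation. All levels below are HYPOTHETICAL placeholders chosen only to show the shape of the data (one level in dB per frequency band); they are not taken from the source or from any standard.

```python
# HYPOTHETICAL minimum audible levels in dB for three bands (low, mid, high).
# Placeholder numbers only.
MIN_AUDIBLE_DB = {
    20: [10.0, 5.0, 10.0],
    40: [12.0, 8.0, 20.0],
    60: [15.0, 15.0, 40.0],
}


def minimum_audible_range(age):
    """Return the per-band levels of the oldest stored generation not above `age`."""
    eligible = [g for g in MIN_AUDIBLE_DB if g <= age]
    generation = max(eligible) if eligible else min(MIN_AUDIBLE_DB)
    return list(MIN_AUDIBLE_DB[generation])  # copy, so callers cannot alter the table
```

A table of hearing reduction quantities, or separate tables by gender, would fit the same structure.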
Returning to
The noise estimate section 104 compares the average power S_ave of the sending sound and a threshold value TH. If S_ave≧TH, the noise estimate section 104 does not update the noise quantity. If S_ave<TH, the noise estimate section 104 updates the noise quantity by the following formula.
noise_level(f)=α×S(f)+(1−α)×noise_level(f) FORMULA (6)
noise_level(f): noise quantity
α: constant
Here, an initial value of noise_level(f) is arbitrary. For example, the initial value may be set to 0. Also, α is a constant between 0 and 1. α is set to, for example, 0.1.
The threshold value TH may be set from 40 to 50 dB. The threshold value TH is set smaller than a volume of a human voice because the volume of voices in people's conversation is 70 to 80 dB. The noise estimate section 104 outputs the estimated noise quantity to the hearing compensation section 105.
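The update rule of formula (6), gated by the threshold TH, can be sketched as follows. The sketch assumes that S(f) enters the formula as a real per-bin level (for example, a magnitude), since a noise quantity is a real number; that reading and the function name are assumptions.

```python
def update_noise_level(noise_level, sending_level, s_ave, threshold, alpha=0.1):
    """Apply formula (6) per bin when S_ave < TH; otherwise keep the old estimate.

    noise_level   -- current noise quantity per bin
    sending_level -- real-valued level of S(f) per bin (ASSUMED reading of S(f))
    s_ave         -- average power of the sending sound for this frame
    """
    if len(noise_level) != len(sending_level):
        raise ValueError("noise_level and sending_level must have the same length")
    if s_ave >= threshold:
        # The frame is loud enough to contain voice, so it is not used as noise.
        return list(noise_level)
    return [alpha * s + (1.0 - alpha) * n
            for s, n in zip(sending_level, noise_level)]


noise = [0.0, 0.0]  # initial value, as suggested in the text
noise = update_noise_level(noise, [10.0, 20.0], s_ave=30.0, threshold=45.0)
unchanged = update_noise_level(noise, [90.0, 90.0], s_ave=75.0, threshold=45.0)
```

With α = 0.1, each quiet frame moves the estimate one tenth of the way toward the current sending level, so the noise quantity follows slow changes in the surroundings and smooths out short bursts.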
The hearing compensation section 105 compensates hearing (for example, a minimum audible range) based on the minimum audible range obtained from the hearing estimate section 103 and the noise quantity obtained from the noise estimate section 104.
The compensation-quantity calculating section 151 calculates a compensation quantity in response to the noise quantity obtained from the noise estimate section 104. The compensation-quantity calculating section 151 outputs the calculated compensation quantity to the minimum audible range compensating section 152.
The minimum audible range compensating section 152 compensates the minimum audible range based on the minimum audible range obtained from the hearing estimate section 103 and the compensation quantity obtained from the compensation-quantity calculating section 151. The minimum audible range compensating section 152, for example, adds the obtained compensation quantity to the obtained minimum audible range.
In the following, a concrete example of compensation of the minimum audible range will be described.
Example 1
The compensation-quantity calculating section 151 holds a compensation quantity suited to a noise quantity.
The compensation-quantity calculating section 151 determines which noise level corresponds to the obtained noise quantity with a determination using a threshold value or the like, to obtain a compensation quantity in response to the determination result from the relationship illustrated in
The minimum audible range compensating section 152 adds the compensation quantity obtained from the compensation-quantity calculating section 151 to the minimum audible range obtained from the hearing estimate section 103.
The minimum audible range compensating section 152 obtains a minimum audible range after compensation (C1 illustrated in
The compensation-quantity calculating section 151 calculates a compensation quantity by multiplying the noise quantity, or noise_level(f) which has been obtained from the noise estimate section 104, by constant β. β is a constant set to, for example, 0.1. The compensation-quantity calculating section 151 outputs the calculated compensation quantity to the minimum audible range compensating section 152.
The minimum audible range compensating section 152 obtains a compensated minimum audible range by the following formula.
H′(f)=H(f)+β×noise_level(f) FORMULA (7)
H′(f): minimum audible range after compensation
H(f): minimum audible range before compensation
β: constant
noise_level(f): noise quantity
The minimum audible range compensating section 152 obtains the compensated minimum audible range (D1 illustrated in
Thus, based on the estimated noise, it is possible to compensate a minimum audible range estimated from an age of a user.
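Formula (7) amounts to a per-bin addition, as the following sketch shows. The function name is hypothetical, and both inputs are assumed to be lists of levels on the same scale and of the same length.

```python
def compensate_minimum_audible_range(h, noise_level, beta=0.1):
    """Return H'(f) = H(f) + beta * noise_level(f) for every bin, per formula (7)."""
    if len(h) != len(noise_level):
        raise ValueError("h and noise_level must have the same length")
    return [h_f + beta * n_f for h_f, n_f in zip(h, noise_level)]


h_comp = compensate_minimum_audible_range([10.0, 20.0], [30.0, 50.0])
```

With β = 0.1, a noise quantity of 30 raises the minimum audible range by 3 in that bin, so louder surroundings require a higher received level before the sound is treated as audible.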
Returning to
G(f)=H′(f)−R(f) if R(f)<H′(f),
G(f)=0 if R(f)≧H′(f)
The spectrum compensation-quantity calculating section 106 outputs the obtained spectrum compensation quantity G(f) to the spectrum compensation section 107.
The spectrum compensation section 107 obtains compensated received sound spectrum R′(f) from, for example, the spectrum R(f) of the received sound and the spectrum compensation quantity G(f) by the following formula.
R′(f)=R(f)+G(f) FORMULA (8)
The spectrum compensation-quantity calculating section 106 may compensate the received sound spectrum only within a predetermined frequency band. The predetermined frequency band is, for example, a low frequency band and/or a high frequency band where hearing tends to be reduced. This is because bands where hearing tends to be reduced are known.
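The gain rule for G(f) and formula (8) can be sketched together. The sketch assumes that R(f) and H′(f) are compared as real per-bin levels on the same scale (for example, dB); the optional `band` argument, a half-open range of bin indices, is a hypothetical way to express the restriction to a predetermined frequency band.

```python
def spectrum_compensation(r_level, h_comp, band=None):
    """Return (G, R') per bin.

    G(f) = H'(f) - R(f) if R(f) < H'(f), else 0; R'(f) = R(f) + G(f).
    band -- optional (lo, hi) bin range, lo inclusive and hi exclusive;
            bins outside it are left uncompensated (hypothetical parameter).
    """
    if len(r_level) != len(h_comp):
        raise ValueError("r_level and h_comp must have the same length")
    gains = []
    for f, (r, h) in enumerate(zip(r_level, h_comp)):
        in_band = band is None or band[0] <= f < band[1]
        gains.append(h - r if in_band and r < h else 0.0)
    compensated = [r + g for r, g in zip(r_level, gains)]
    return gains, compensated


gains, compensated = spectrum_compensation([30.0, 50.0, 20.0], [40.0, 40.0, 40.0])
high_only, _ = spectrum_compensation([30.0, 50.0, 20.0], [40.0, 40.0, 40.0], band=(2, 3))
```

Bins already at or above the compensated minimum audible range receive zero gain, so only the components the user is estimated not to hear are raised.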
Returning to
Thus, the voice control device 1 estimates hearing of a user based on a ratio of volume of sending sound and volume of received sound, and controls the voice in response to the hearing of a user to provide a voice easy to hear for the user automatically.
Also, the voice control device 1 compensates a minimum audible range estimated from the age of the user based on estimated noise to provide a voice even easier to hear for the user.
Here, the noise estimate section 104 and the hearing compensation section 105 are optional. If they are omitted, the spectrum compensation-quantity calculating section 106 may calculate the spectrum compensation quantity using the hearing (minimum audible range) estimated by the hearing estimate section 103.
<Operation>
Next, operations of the voice control device 1 will be described according to the embodiment.
At Step S101 illustrated in
At Step S102, based on the calculated sending/received sound ratio, the age estimate section 132 estimates an age from the information that indicates a relationship between the sending/received sound ratio and age.
At Step S103, based on the estimated age, the minimum audible range estimate section 133 estimates a minimum audible range from the information that indicates a relationship between age (or generation) and minimum audible range.
At Step S104, the hearing compensation section 105 compensates the estimated minimum audible range based on noise included in the sending sound. This compensation procedure will be described using
At Step S105, the spectrum compensation-quantity calculating section 106 calculates the compensation quantity of the received sound spectrum so that the received sound becomes greater than the compensated minimum audible range.
At Step S106, the spectrum compensation section 107 compensates the received signal by adding the calculated compensation quantity, or the like.
Thus, it is possible to provide a voice easy to hear for a user during a call in response to the hearing of the user.
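The order of Steps S102 to S106 can be summarized in one function. This is a sketch under stated assumptions: Step S101 is taken to have produced `sp_ratio` already, the two stored relationships are passed in as callables, all levels are real per-bin values on one scale, and the toy stand-ins at the bottom use HYPOTHETICAL numbers.

```python
def run_voice_control(received_level, sp_ratio, noise_level,
                      age_from_ratio, min_audible_for_age, beta=0.1):
    """Return the compensated received levels for one frame."""
    age = age_from_ratio(sp_ratio)                                   # Step S102
    h = min_audible_for_age(age)                                     # Step S103
    h_comp = [h_f + beta * n_f
              for h_f, n_f in zip(h, noise_level)]                   # Step S104, formula (7)
    gains = [hc - r if r < hc else 0.0
             for r, hc in zip(received_level, h_comp)]               # Step S105
    return [r + g for r, g in zip(received_level, gains)]            # Step S106, formula (8)


# Toy stand-ins for the stored relationships (HYPOTHETICAL values).
def toy_age(ratio):
    return 60 if ratio > 2.0 else 30


def toy_min_audible(age):
    return [30.0, 40.0] if age >= 60 else [10.0, 15.0]


out = run_voice_control([25.0, 50.0], 2.5, [20.0, 20.0], toy_age, toy_min_audible)
```

In this toy run the first bin is raised to the compensated minimum audible range, while the second bin, already above it, passes through unchanged.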
At Step S202, the noise estimate section 104 updates the noise quantity using the sending sound spectrum of a current frame by the formula (6).
At Step S203, the hearing compensation section 105 compensates the minimum audible range based on the estimated noise quantity (see
Thus, it is possible to make a voice easier to hear in response to the surrounding noise by compensating the minimum audible range if the noise quantity of the surroundings is loud. Here, according to the embodiment, compensation of the minimum audible range by a noise quantity is not necessarily required to obtain a sufficient effect.
As above, according to the embodiment, it is possible to execute voice control in response to a user's hearing without imposing a burden on the user. Also, because the user is not required to carry out a voice control operation, voice control suited to the user can be executed automatically during a call.
Also, the processing by the hearing estimate section 103 may be done at a predetermined timing (once a week, once a month, etc.) so that only hearing compensation by a noise quantity is executed usually. This is because it is not required to execute a hearing estimation every time if the user remains unchanged.
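One way to run the hearing estimation only at a predetermined timing is to cache its result with a timestamp, as sketched below. The class name, the default interval of one week, and the injectable `clock` are all assumptions made for illustration.

```python
import time


class CachedHearingEstimate:
    """Re-run a hearing estimator only when the cached result is too old."""

    def __init__(self, estimator, max_age_seconds=7 * 24 * 3600, clock=time.time):
        self._estimator = estimator      # callable: sp_ratio -> hearing estimate
        self._max_age = max_age_seconds  # e.g. once a week
        self._clock = clock              # injectable so the timing can be tested
        self._value = None
        self._stamp = None

    def get(self, sp_ratio):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._max_age:
            self._value = self._estimator(sp_ratio)
            self._stamp = now
        return self._value
```

Between refreshes, only the noise-based compensation would need to run on each frame, which matches the usage described above.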
Also, the sending/received-sound-ratio calculating section 131 may calculate the sending/received sound ratio only if both the sending sound and the received sound include sound (voice). The determination of whether sound is included may be done with a known technology.
For example, in Japanese Patent No. 3849116, voice or non-voice is determined for each frame of an input signal, based on a first voice characterizing quantity calculated using electric power, a zero crossing rate, a peak frequency of a power spectrum, pitch cycle, etc., and a second voice characterizing quantity calculated based on a difference of peak frequency of power spectrum only in high order components. This makes it possible to estimate the hearing of a user based on the volume of a sending sound and the volume of a received sound.
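A much simpler voice/non-voice decision than the one in the cited patent can illustrate the gating. The sketch below is not the method of Japanese Patent No. 3849116: it uses only frame power and zero crossing rate, and the threshold values are HYPOTHETICAL.

```python
import math


def frame_power(frame):
    """Mean squared amplitude of a time-domain frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def is_voice(frame, power_threshold=0.01, zcr_max=0.5):
    """Crude decision: enough power and not too many zero crossings."""
    if len(frame) < 2:
        return False
    return frame_power(frame) >= power_threshold and zero_crossing_rate(frame) <= zcr_max


def both_include_voice(sending_frame, received_frame):
    """Gate for the ratio calculation: both frames must be judged as voice."""
    return is_voice(sending_frame) and is_voice(received_frame)


voiced = [0.5 * math.sin(2 * math.pi * t / 20) for t in range(40)]
silent = [0.0] * 40
hiss = [0.5, -0.5] * 20  # strong but alternating on every sample
```

The sending/received sound ratio would then be updated only for frames where `both_include_voice` returns True.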
Modified Example
The antenna 201 sends a radio signal amplified by the sending amplifier, and receives a radio signal sent from a base station. The radio section 202 applies a D/A conversion to the sending signal spread by the baseband processing section 203, converts it to a high frequency signal by a quadrature modulation, and amplifies the signal by a power amplifier. The radio section 202 amplifies the received radio signal and applies an A/D conversion to the amplified signal to send it to the baseband processing section 203.
The baseband processing section 203 executes various baseband processing such as an addition of error correcting codes to sending data, data modulation, spread modulation, despreading of a received signal, determination of receiving environment, determination of a threshold value for each channel signal, error correction decoding, etc.
The control section 204 executes radio control such as sending/receiving of a control signal. Also, the control section 204 executes the voice control program stored in the auxiliary storage section 208 to execute voice control according to the embodiment.
The main memory section 207 is a ROM (Read-Only Memory), a RAM (Random Access Memory) or the like, which is a storage device to store or to temporarily store an OS, or the basic software executed by the control section 204, programs such as application software or the like, and data.
The auxiliary storage section 208 is an HDD (Hard Disk Drive) or the like, which is a storage device to store data related to the application software and the like. For example, the information illustrated in
The terminal interface section 209 executes adapter processing for data, and interface processing with a handset and an external data terminal.
This makes it possible to provide a voice in response to hearing of a user automatically during a call on the portable terminal device 200. Also, it is possible to implement the voice control device 1 according to the embodiments as one or multiple semiconductor integrated circuits in the portable terminal device 200.
Also, the disclosed technology can be implemented not only on the portable terminal device 200, but also on other devices. In the modified example, although the example is explained in which the voice control device is implemented on a portable terminal device, the voice control device described above or the voice control procedures described above are applicable, for example, to a TV telephone conference device and an information processing device having a telephone function, a fixed telephone, and the like.
Also, it is possible to have a computer execute voice control described in the above embodiment by recording a program implementing voice control processing according to the above embodiment in a recording medium.
Also, it is possible to implement the above voice control processing by recording the program on a recording medium and having a computer or a portable terminal device read the recording medium on which the program is recorded. Here, various types of recording media can be used, including a recording medium that records information optically, electrically, or magnetically such as a CD-ROM, a flexible disk, a magneto-optical disk and the like, and a semiconductor memory that records information electrically such as a ROM, a flash memory, and the like.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to a showing of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Claims
1. A voice control device comprising:
- a hearing estimate section configured to estimate hearing of a user based on a sending/received sound ratio representing a ratio of the volume of sending sound to the volume of received sound, the sending sound being sent by the user in a form of a sending signal, and the received sound being received by the user in a form of a received signal;
- a compensation-quantity calculating section configured to calculate a compensation quantity for the received signal of the received sound responsive to the estimated hearing; and
- a compensation section configured to compensate the received signal based on the calculated compensation quantity.
2. The voice control device as claimed in claim 1, wherein the hearing estimate section estimates an age of a user from the sending/received sound ratio for estimating a minimum audible range based on the estimated age,
- wherein the compensation-quantity calculating section obtains the compensation quantity for the received signal so that the received signal is compensated to be greater than the estimated minimum audible range.
3. The voice control device as claimed in claim 2, further comprising:
- a noise estimate section configured to estimate a noise quantity from the sending sound; and
- a hearing compensation section configured to compensate the minimum audible range based on the estimated noise quantity,
- wherein the compensation-quantity calculating section obtains the compensation quantity for the received signal so that the received signal is compensated to be greater than the estimated minimum audible range.
4. The voice control device as claimed in claim 1, wherein the hearing estimate section determines whether the received sound and the sending sound include sound, then obtains the sending/received sound ratio for the received sound and the sending sound if the received sound and the sending sound include the sound.
5. The voice control device as claimed in claim 3, wherein the noise estimate section determines whether the received sound includes no sound, then updates the noise quantity if the received sound includes no sound.
6. A method for controlling voice in a voice control device, comprising:
- estimating hearing of a user based on a sending/received sound ratio representing a ratio of the volume of sending sound to the volume of received sound, the sending sound being sent by the user in a form of a sending signal, and the received sound being received by the user in a form of a received signal;
- calculating a compensation quantity for the received signal of the received sound responsive to the estimated hearing; and
- compensating the received signal based on the calculated compensation quantity.
7. A non-transitory computer-readable recording medium having a program stored therein for causing a computer to execute a method for controlling voice, the method comprising:
- estimating hearing of a user based on a sending/received sound ratio representing a ratio of the volume of sending sound to the volume of received sound, the sending sound being sent by the user in a form of a sending signal, and the received sound being received by the user in a form of a received signal;
- calculating a compensation quantity for the received signal of the received sound responsive to the estimated hearing; and
- compensating the received signal based on the calculated compensation quantity.
5777664 | July 7, 1998 | Sakata et al. |
8560308 | October 15, 2013 | Endo et al. |
20060088154 | April 27, 2006 | Mukhtar et al. |
20070198263 | August 23, 2007 | Chen |
20110135105 | June 9, 2011 | Yano |
06-217398 | August 1994 | JP |
07-066767 | March 1995 | JP |
08-163121 | June 1996 | JP |
08-223256 | August 1996 | JP |
2000-209698 | July 2000 | JP |
2004-165865 | June 2004 | JP |
2004-235708 | August 2004 | JP |
2009-171189 | July 2009 | JP |
2010-28515 | February 2010 | JP |
2010-081523 | April 2010 | JP |
2010/035308 | April 2010 | WO |
- International Search Report, mailed in connection with PCT/JP2011/050017 and mailed Feb. 8, 2011.
- Extended European Search Report dated Nov. 18, 2015 for corresponding European Patent Application No. 11855034.2, 8 pages.
Type: Grant
Filed: Jun 21, 2013
Date of Patent: Feb 23, 2016
Patent Publication Number: 20130279709
Assignee: FUJITSU LIMITED (Kawasaki)
Inventors: Masanao Suzuki (Yokohama), Takeshi Otani (Kawasaki), Taro Togawa (Kawasaki), Chisato Ishikawa (Kawasaki)
Primary Examiner: Alexander Jamal
Application Number: 13/924,071
International Classification: H03G 3/20 (20060101); H04R 25/00 (20060101); A61B 5/12 (20060101); A61B 5/00 (20060101); G10L 21/0364 (20130101); G10L 21/057 (20130101);