Audio coding/decoding method and apparatus using excess quantization information
There is provided an audio coding device which appropriately sets the quantization bit number in each stage with a small calculation amount when coding an input audio signal by performing multi-stage normalization/quantization. A quantization information calculation section determines total quantization information idwl0, based on normalization information idsf, and allocates the total quantization information idwl0 for quantization information idwl1 and quantization information idwl2. At this time, the quantization information calculation section limits the quantization information idwl1 by a limiter lim1, and allocates the total quantization information idwl0 for the quantization information idwl1. If the quantization information idwl1 exceeds the limiter lim1, the excess is allocated for the quantization information idwl2. A first normalization section and a first quantization section normalize and quantize a frequency spectrum mdspec1 in the first stage. A second normalization section and a second quantization section normalize and quantize a differential frequency spectrum mdspec2 in the second stage.
The present application is a reissue of U.S. Pat. No. 8,521,522, which issued from U.S. Ser. No. 11/381,791 filed on May 5, 2006 and contains subject matter related to Japanese Patent Application JP 2005-137667, filed in the Japanese Patent Office on May 10, 2005, the entire content of which is incorporated herein by reference.
A continuation reissue application was also filed on Feb. 16, 2017 and assigned U.S. Ser. No. 15/434,964.
The present application claims the benefit of U.S. Pat. No. 8,521,522 and priority to JP 2005-137667.
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to an audio coding device and a method thereof by which an input audio signal is coded according to so-called transform coding and an obtained code string is transferred or recorded onto a recording medium, and also relates to an audio decoding device and a method thereof by which a code string transferred or read from a recording medium is decoded to obtain an output audio signal.
2. Description of the Related Art
There has been a known method in which spectrums obtained by performing time-frequency transform on an input audio signal are subjected to normalization/quantization and differential frequency spectrums as quantization errors are subjected again to normalization/quantization (see Patent Documents 1 and 2: Japanese Patent Publications No. 3227945 and No. 3227948). Quantization accuracy of the audio coding device can be improved by this method, and scalability can be realized to fit performance and use environment of the audio decoding device.
SUMMARY OF THE INVENTION
However, no method has yet been established for appropriately setting the quantization bit number in each of the multiple stages with a small calculation amount in a case where multistage normalization/quantization is realized according to the known technology, including the above patent publications.
The present invention has been proposed in view of the situation of known technology as described above. It is desirable to provide an audio coding device and a method thereof, which are capable of appropriately setting the quantization bit number in each stage by a small calculation amount when coding an input audio signal by performing multistage normalization/quantization, and an audio decoding device and a method thereof, which obtain an output audio signal by decoding a code string obtained by the audio coding device.
According to an embodiment of the present invention, there is provided an audio coding device including: a time-frequency transform means for performing time-frequency transform on an input audio signal to generate a frequency spectrum; quantization information calculation means for generating total quantization information indicating a quantization bit number on the basis of predetermined normalization information, and for allocating the total quantization information, to generate first quantization information and second quantization information each indicating a quantization bit number; a first normalization means for normalizing the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information, to generate a normalized frequency spectrum; a first quantization means for linearly quantizing the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum; a subtraction means for subtracting, from the frequency spectrum, a frequency spectrum obtained by inversely quantizing and inversely normalizing the quantized frequency spectrum, to generate a differential frequency spectrum; a second normalization means for normalizing the differential frequency spectrum by use of a second normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential normalized frequency spectrum; a second quantization means for linearly quantizing the differential normalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and a code string coding means for coding the normalization information, the first quantization information, the second quantization information, the quantized frequency spectrum, and the 
differential quantized frequency spectrum, to output a code string, wherein the quantization information calculation means sets a predetermined limit to the first quantization information, allocates the total quantization information for the first quantization information, and allocates an excess beyond the predetermined limit, for the second quantization information, to generate the first quantization information and the second quantization information.
According to an embodiment of the present invention, there is provided an audio coding method including: a time-frequency transform step of performing time-frequency transform on an input audio signal to generate a frequency spectrum; a quantization information calculation step of generating total quantization information indicating a quantization bit number on the basis of predetermined normalization information, and of allocating the total quantization information, to generate first quantization information and second quantization information each indicating a quantization bit number; a first normalization step of normalizing the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information, to generate a normalized frequency spectrum; a first quantization step of linearly quantizing the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum; a subtraction step of subtracting, from the frequency spectrum, a frequency spectrum obtained by inversely quantizing and inversely normalizing the quantized frequency spectrum, to generate a differential frequency spectrum; a second normalization step of normalizing the differential frequency spectrum by use of a second normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential normalized frequency spectrum; a second quantization step of linearly quantizing the differential normalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and a code string coding step of coding the normalization information, the first quantization information, the second quantization information, the quantized frequency spectrum, and the differential 
quantized frequency spectrum, to output a code string, wherein in the quantization information calculation step, a predetermined limit is set to the first quantization information, the total quantization information is allocated for the first quantization information, and an excess beyond the predetermined limit is allocated for the second quantization information, to generate the first quantization information and the second quantization information.
According to an embodiment of the present invention, there is provided an audio coding device including: a time-frequency transform means for performing time-frequency transform on an input audio signal, to generate a frequency spectrum; a quantization information calculation means for generating total quantization information indicating a quantization bit number on the basis of predetermined normalization information, and for allocating the total quantization information, to generate first quantization information and second quantization information each indicating a quantization bit number; a first normalization means for normalizing the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information, to generate a normalized frequency spectrum; a first quantization means for linearly quantizing the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum; a subtraction means for subtracting, from the frequency spectrum, a frequency spectrum obtained by inversely quantizing and inversely normalizing the quantized frequency spectrum, to generate a differential frequency spectrum; a second normalization means for normalizing the differential frequency spectrum by use of a second normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential normalized frequency spectrum; a second quantization means for linearly quantizing the differential normalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and a code string coding means for coding the normalization information, the quantized frequency spectrum, and the differential quantized frequency spectrum, to output a code string, 
wherein the quantization information calculation means sets a predetermined limit to the first quantization information, allocates the total quantization information for the first quantization information, and allocates an excess beyond the predetermined limit, for the second quantization information, to generate the first quantization information and the second quantization information.
According to an embodiment of the present invention, there is provided an audio coding method including: a time-frequency transform step of performing time-frequency transform on an input audio signal to generate a frequency spectrum; a quantization information calculation step of generating total quantization information indicating a quantization bit number on the basis of predetermined normalization information, and of allocating the total quantization information, to generate first quantization information and second quantization information each indicating a quantization bit number; a first normalization step of normalizing the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information, to generate a normalized frequency spectrum; a first quantization step of linearly quantizing the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum; a subtraction step of subtracting, from the frequency spectrum, a frequency spectrum obtained by inversely quantizing and inversely normalizing the quantized frequency spectrum, to generate a differential frequency spectrum; a second normalization step of normalizing the differential frequency spectrum by use of a second normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential normalized frequency spectrum; a second quantization step of linearly quantizing the differential normalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and a code string coding step of coding the normalization information, the quantized frequency spectrum, and the differential quantized frequency spectrum, to output a code string, wherein in the 
quantization information calculation step, a predetermined limit is set to the first quantization information, the total quantization information is allocated for the first quantization information, and an excess beyond the predetermined limit is allocated for the second quantization information, to generate the first quantization information and the second quantization information.
According to an embodiment of the present invention, there is provided an audio coding device including: a time-frequency transform means for performing time-frequency transform on an input audio signal to generate a frequency spectrum; a quantization information calculation means for generating total quantization information indicating a quantization bit number on the basis of predetermined normalization information, and for allocating the total quantization information, to generate first quantization information and second quantization information each indicating a quantization bit number; a first normalization means for normalizing the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information, to generate a normalized frequency spectrum; a first quantization means for linearly quantizing the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum; a subtraction means for subtracting, from the normalized frequency spectrum, a normalized frequency spectrum obtained by inversely quantizing the quantized frequency spectrum, to generate a differential normalized frequency spectrum; a second normalization means for normalizing the differential normalized frequency spectrum by use of a second normalization coefficient corresponding to the first quantization information, to generate a differential renormalized frequency spectrum; a second quantization means for linearly quantizing the differential renormalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and a code string coding means for coding the normalization information, the first quantization information, the second quantization information, the quantized frequency spectrum, and the differential 
quantized frequency spectrum, to output a code string, wherein the quantization information calculation means sets a predetermined limit to the first quantization information, allocates the total quantization information for the first quantization information, and allocates an excess beyond the predetermined limit, for the second quantization information, to generate the first quantization information and the second quantization information.
According to an embodiment of the present invention, there is provided an audio coding method including: a time-frequency transform step of performing time-frequency transform on an input audio signal to generate a frequency spectrum; a quantization information calculation step of generating total quantization information indicating a quantization bit number on the basis of predetermined normalization information, and of allocating the total quantization information, to generate first quantization information and second quantization information each indicating a quantization bit number; a first normalization step of normalizing the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information, to generate a normalized frequency spectrum; a first quantization step of linearly quantizing the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum; a subtraction step of subtracting, from the normalized frequency spectrum, a normalized frequency spectrum obtained by inversely quantizing the quantized frequency spectrum, to generate a differential normalized frequency spectrum; a second normalization step of normalizing the differential normalized frequency spectrum by use of a second normalization coefficient corresponding to the first quantization information, to generate a differential renormalized frequency spectrum; a second quantization step of linearly quantizing the differential renormalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and a code string coding step of coding the normalization information, the first quantization information, the second quantization information, the quantized frequency spectrum, and the differential quantized 
frequency spectrum, to output a code string, wherein in the quantization information calculation step, a predetermined limit is set to the first quantization information, the total quantization information is allocated for the first quantization information, and an excess beyond the predetermined limit is allocated for the second quantization information, to generate the first quantization information and the second quantization information.
According to an embodiment of the present invention, there is provided an audio decoding device including: a code string decoding means for decoding an input code string, to generate normalization information, a quantized frequency spectrum, and a differential quantized frequency spectrum; a quantization information calculation means for generating total quantization information indicating a quantization bit number on the basis of the normalization information, and for allocating the total quantization information, to generate first quantization information and second quantization information each indicating a quantization bit number; a first inverse quantization means for linearly inversely quantizing the quantized frequency spectrum by use of a first inverse quantization coefficient corresponding to the first quantization information, to generate a normalized frequency spectrum; a first inverse normalization means for inversely normalizing the normalized frequency spectrum by use of a first inverse normalization coefficient corresponding to the normalization information, to generate a frequency spectrum; a second inverse quantization means for linearly inversely quantizing the differential quantized frequency spectrum by use of a second inverse quantization coefficient corresponding to the second quantization information, to generate a differential normalized frequency spectrum; a second inverse normalization means for inversely normalizing the differential normalized frequency spectrum by use of a second inverse normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential frequency spectrum; an addition means for adding up the frequency spectrum and the differential frequency spectrum; and a frequency-time transform means for performing frequency-time transform on a frequency spectrum obtained by the addition means, to generate an output audio signal, wherein the quantization 
information calculation means sets a predetermined limit to the first quantization information, allocates the total quantization information for the first quantization information, and allocates an excess beyond the predetermined limit, for the second quantization information, to generate the first quantization information and the second quantization information.
According to an embodiment of the present invention, there is provided an audio decoding method including: a code string decoding step of decoding an input code string, to generate normalization information, a quantized frequency spectrum, and a differential quantized frequency spectrum; a quantization information calculation step of generating total quantization information indicating a quantization bit number on the basis of the normalization information, and of allocating the total quantization information, to generate first quantization information and second quantization information each indicating a quantization bit number; a first inverse quantization step of linearly inversely quantizing the quantized frequency spectrum by use of a first inverse quantization coefficient corresponding to the first quantization information, to generate a normalized frequency spectrum; a first inverse normalization step of inversely normalizing the normalized frequency spectrum by use of a first inverse normalization coefficient corresponding to the normalization information, to generate a frequency spectrum; a second inverse quantization step of linearly inversely quantizing the differential quantized frequency spectrum by use of a second inverse quantization coefficient corresponding to the second quantization information, to generate a differential normalized frequency spectrum; a second inverse normalization step of inversely normalizing the differential normalized frequency spectrum by use of a second inverse normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential frequency spectrum; an addition step of adding up the frequency spectrum and the differential frequency spectrum; and a frequency-time transform step of performing frequency-time transform on a frequency spectrum obtained by the addition step, to generate an output audio signal, wherein in the quantization information calculation 
step, a predetermined limit is set to the first quantization information, the total quantization information is allocated for the first quantization information, and an excess beyond the predetermined limit is allocated for the second quantization information, to generate the first quantization information and the second quantization information.
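The decoding steps recited above can be illustrated by a minimal Python sketch. The function name `decode_two_stage` and the sample values are hypothetical, and the frequency-time transform is omitted. Each inverse coefficient is assumed to be the reciprocal of the corresponding coding-side coefficient; the first embodiment states this relationship for the first stage, and its use for the second stage is an assumption of this sketch.

```python
def decode_two_stage(qspec1, qspec2, sf1, qf1, sf2, qf2):
    """Rebuild spectral values from both stages (hypothetical helper).

    sf1, qf1, sf2, qf2 are the coding-side coefficients; their reciprocals
    are used here as the inverse coefficients (an assumption for stage two).
    """
    spectrum = []
    for q1, q2 in zip(qspec1, qspec2):
        nspec1 = q1 * (1.0 / qf1)         # first inverse quantization
        mdspec1 = nspec1 * (1.0 / sf1)    # first inverse normalization
        nspec2 = q2 * (1.0 / qf2)         # second inverse quantization
        mdspec2 = nspec2 * (1.0 / sf2)    # second inverse normalization
        spectrum.append(mdspec1 + mdspec2)  # addition of the two stages
    return spectrum


# Illustrative values only; real coefficients depend on the coded information.
print(decode_two_stage([3], [2], sf1=0.25, qf1=4.0, sf2=1.0, qf2=4.0))
```

The sketch takes the coefficients as arguments, so it applies equally whether the quantization information is read from the code string or recalculated from the normalization information on the decoding side.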
In the audio coding device and the method thereof according to the embodiments of the present invention as well as the audio decoding device and the method thereof according to the embodiments of the present invention, an input audio signal is coded by performing multi-stage normalization/quantization, to generate a code string. When the code string is decoded to obtain an output audio signal, the quantization bit number in each stage can be appropriately set with a small calculation amount.
Embodiments to which the present invention is applied will now be specifically described below with reference to the drawings. In the embodiments, the present invention is applied to an audio coding device and a method thereof by which two-stage normalization/quantization is performed on frequency spectrums obtained by subjecting an input audio signal to time-frequency transform, to generate a code string. The present invention is also applied to an audio decoding device and a method thereof by which the code string is decoded to obtain an output audio signal.
[First Embodiment]
At first,
In step S1 in
Next in step S3, based on the normalization information idsf, the quantization information calculation section 12 determines quantization information idwl1 expressing a quantization bit number to quantize the frequency spectrum mdspec1 and quantization information idwl2 expressing another quantization bit number for quantization in the second stage described later. The processing to determine the quantization information idwl1 and idwl2 based on the normalization information idsf and the like in the quantization information calculation section 12 will be described in more detail later.
In subsequent step S4, the first normalization section 13 normalizes the frequency spectrum mdspec1 by use of a normalization coefficient sf1 (idsf) corresponding to normalization information idsf, as expressed by the following equation (1):
nspec1=mdspec1※sf1(idsf) (1)
The first normalization section 13 supplies a first quantization section 14 with an obtained normalized frequency spectrum nspec1. By this processing, the frequency spectrum mdspec1 is normalized to a range of ±f, where f ∈ R. The relationship between the normalization information idsf and the normalization coefficient sf1(idsf) is expressed as shown in table 1 below.
In subsequent step S5, the first quantization section 14 quantizes the normalized frequency spectrum nspec1 by use of a quantization coefficient qf1(idwl1) corresponding to quantization information idwl1. The first quantization section 14 supplies an inverse quantization section 15 and a code string coding section 20 with a quantized frequency spectrum qspec1 obtained. For example, if linear quantization is performed as shown in
qspec1=(int)(floor(nspec1※qf1(idwl1))+0.5) (2)
By this processing, the normalized frequency spectrum nspec1 is quantized to a quantized frequency spectrum qspec1 having a step number expressed by the quantization step width nstep(idwl1). The relationship between the quantization information idwl1, the quantization step width nstep(idwl1), and the quantization coefficient qf1(idwl1) is expressed as shown in table 2 below.
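As an illustration, equation (2) can be transcribed literally into a short Python sketch. The function name `quantize_linear` and the coefficient value 7.0 are hypothetical; actual values of qf1(idwl1) would be taken from table 2. Python's `int()` truncates toward zero, which is assumed here to match the `(int)` cast in the equation.

```python
import math


def quantize_linear(nspec: float, qf: float) -> int:
    """Literal transcription of equation (2): (int)(floor(nspec * qf) + 0.5)."""
    # int() truncates toward zero; assumed equivalent to the (int) cast.
    return int(math.floor(nspec * qf) + 0.5)


# Hypothetical coefficient; real values would come from table 2.
qf1 = 7.0
print(quantize_linear(0.5, qf1))
```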
In subsequent step S6, the inverse quantization section 15 inversely quantizes the quantized frequency spectrum qspec1 by use of an inverse quantization coefficient iqf1(idwl1) corresponding to the quantization information idwl1, as expressed by the following equation (3):
nspec1′=qspec1※iqf1(idwl1) (3)
The inverse quantization section 15 supplies an inverse normalization section 16 with an obtained normalized frequency spectrum nspec1′. The relationship between the quantization coefficient qf1(idwl1) and the inverse quantization coefficient iqf1(idwl1) is expressed by the following equation (4):
iqf1(idwl1)=1/qf1(idwl1) (4)
In subsequent step S7, the inverse normalization section 16 inversely normalizes the normalized frequency spectrum nspec1′ by use of an inverse normalization coefficient isf1(idsf) corresponding to the normalization information idsf, as expressed by the following equation (5):
mdspec1′=nspec1′※isf1(idsf) (5)
The inverse normalization section 16 supplies the subtraction section 17 with an obtained frequency spectrum mdspec1′. The relationship between the normalization coefficient sf1 (idsf) and the inverse normalization coefficient isf1(idsf) is expressed below by the equation (6):
isf1(idsf)=1/sf1(idsf) (6)
In subsequent step S8, the subtraction section 17 subtracts the frequency spectrum mdspec1′ from the frequency spectrum mdspec1, as expressed by the following equation (7):
mdspec2=mdspec1−mdspec1′ (7)
The subtraction section 17 supplies a second normalization section 18 with an obtained differential frequency spectrum mdspec2.
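Steps S4 to S8 can be summarized in a minimal Python sketch that follows equations (1) to (7) for one block of spectral values. The function name `first_stage_residual` and the numeric coefficients are hypothetical; in the device, sf1 and qf1 would be looked up from idsf and idwl1.

```python
import math


def first_stage_residual(mdspec1, sf1, qf1):
    """Return (qspec1, mdspec2) for a list of spectral values (hypothetical helper)."""
    qspec1, mdspec2 = [], []
    for x in mdspec1:
        nspec1 = x * sf1                          # equation (1)
        q = int(math.floor(nspec1 * qf1) + 0.5)   # equation (2), as printed
        nspec1_rec = q * (1.0 / qf1)              # equations (3) and (4)
        mdspec1_rec = nspec1_rec * (1.0 / sf1)    # equations (5) and (6)
        qspec1.append(q)
        mdspec2.append(x - mdspec1_rec)           # equation (7)
    return qspec1, mdspec2


# Illustrative coefficients only.
q, residual = first_stage_residual([1.0, 2.5, 3.9], sf1=0.25, qf1=4.0)
print(q, residual)
```

The second list is the differential frequency spectrum handed to the second stage; it is what the first-stage quantization failed to represent.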
In subsequent step S9, the second normalization section 18 normalizes the differential frequency spectrum mdspec2 by use of a normalization coefficient sf2, as expressed by the following equation (8):
nspec2=mdspec2※sf2(idsf,idwl1) (8)
The second normalization section 18 supplies a second quantization section 19 with an obtained differential normalized frequency spectrum nspec2.
The normalized frequency spectrum nspec1 is normalized to a range of ±f, where f ∈ R, by the normalization coefficient sf1(idsf) corresponding to the normalization information idsf. Therefore, in case of performing linear quantization by which the quantization step width nstep(idwl1) is uniquely determined in correspondence with the quantization information idwl1, for example as shown in
sf2(idsf,idwl1)=sf1(idsf)※nstep(idwl1)/f (9)
That is, the normalization coefficient sf2(idsf,idwl1) can be calculated based on the normalization information idsf and the quantization information idwl1.
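Equation (9) can be sketched in Python as follows. The lookup tables `SF1_TABLE` and `NSTEP_TABLE` and their entries are purely hypothetical stand-ins for tables 1 and 2, and the value f = 1.0 is likewise an assumption made only for illustration.

```python
# Hypothetical stand-ins for table 1 (idsf -> sf1) and table 2 (idwl1 -> nstep).
SF1_TABLE = {0: 1.0, 1: 0.5}
NSTEP_TABLE = {2: 3, 3: 7}


def second_norm_coeff(idsf: int, idwl1: int, f: float = 1.0) -> float:
    """Equation (9): sf2(idsf, idwl1) = sf1(idsf) * nstep(idwl1) / f."""
    return SF1_TABLE[idsf] * NSTEP_TABLE[idwl1] / f


print(second_norm_coeff(idsf=1, idwl1=3))
```

Because sf2 depends only on idsf and idwl1, the same function could be evaluated on the decoding side, which is consistent with the code string of this embodiment carrying no separate normalization information for the second stage.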
In subsequent step S10, the second quantization section 19 quantizes the differential normalized frequency spectrum nspec2 by use of the quantization coefficient qf2(idwl2) corresponding to the quantization information idwl2. The second quantization section 19 supplies the code string coding section 20 with an obtained differential quantized frequency spectrum qspec2. For example, in case of performing linear quantization as shown in
qspec2=(int)(floor(nspec2※qf2(idwl2))+0.5) (10)
The relationship between the quantization information idwl2 and the quantization coefficient qf2(idwl2) may be identical with or different from that in the table 2 described previously.
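The second-stage processing of steps S9 and S10 can be sketched in the same way. The function name `second_stage_quantize` and the coefficient values are hypothetical, and the second normalization is assumed to have the same multiplicative form as equation (1).

```python
import math


def second_stage_quantize(mdspec2, sf2, qf2):
    """Normalize and quantize the differential spectrum (hypothetical helper)."""
    qspec2 = []
    for d in mdspec2:
        nspec2 = d * sf2  # assumed multiplicative normalization, as in equation (1)
        qspec2.append(int(math.floor(nspec2 * qf2) + 0.5))  # equation (10), as printed
    return qspec2


# Illustrative values only; qf2 may come from table 2 or from a different table.
print(second_stage_quantize([0.0, 0.5, 0.9], sf2=1.0, qf2=4.0))
```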
In subsequent step S11, the code string coding section 20 codes the quantized frequency spectrum qspec1, differential quantized frequency spectrum qspec2, normalization information idsf, quantization information idwl1, and quantization information idwl2. In step S12, the code string coding section 20 outputs an obtained code string.
In subsequent step S13, whether an input audio signal has ended or not is determined. If the input audio signal has not ended, the processing procedure returns to step S1. Otherwise, if the input audio signal has ended, the coding processing is terminated.
Hereinafter, a detailed description will be made of the processing of determining the quantization information idwl1 and idwl2 on the basis of the normalization information idsf in the quantization information calculation section 12. The following description considers a case of calculating the quantization information idwl1 and idwl2 for every processing unit, with respect to frequency spectrums having spectral envelope curves a drawn by continuous lines in
At first, the total quantization information idwl0 is calculated based on the normalization information idsf or the like. For example, in case of a frequency spectrum having the spectral envelope curve as shown in
If a maximum quantization bit number of, for example, about 24 (bits) can be ensured by computer simulation or large-scale hardware, quantization can be achieved based on the total quantization information idwl0. In normal cases, however, it is difficult to allow the total quantization information idwl0 to take values without limit. For example, the quantization bit number is limited to 16 (bits) at maximum. Therefore, quantization accuracy higher than that of the maximum SNR (Signal to Noise Ratio) of 16-bit quantization is not ensured with respect to a frequency spectrum whose total quantization information idwl0 has to be 16 or higher, i.e., which has to have a quantization bit number of 16 (bits) or higher. Noise floors as drawn by broken lines b in
Therefore, quantization in the second stage is performed on the differential frequency spectrum, which is the error obtained as a result of quantization in the first stage, to improve the SNR which has locally deteriorated. However, no method of appropriately setting the quantization bit number in each stage with a small calculation amount has been established.
Hence, the quantization information calculation section 12 in the present embodiment uses predetermined limiters lim1 and lim2 to set appropriately the quantization bit number in each stage with a small calculation amount. That is, the quantization information idwl1 in the first quantization section 14 is limited by the limiter lim1. If this limit is exceeded, the excess over the limit is allocated for quantization information idwl2 in the second quantization section 19. The quantization information idwl2 in the second quantization section 19 is limited by the other limiter lim2. If this limit is exceeded, the quantization information idwl2 is set to fall within the limit.
The processing procedure of the quantization information calculation section 12 is shown in the flowchart of
Next in step S23, whether the value of the quantization information idwl1 is greater than the value of the limiter lim1 or not is determined. If the value of the quantization information idwl1 is not greater than the value of the limiter lim1, the processing procedure goes to step S25. Otherwise, if the value of the quantization information idwl1 is greater than the value of the limiter lim1, the value of the quantization information idwl1 is limited to the value of the limiter lim1, in step S24, and the processing procedure then goes to step S25.
Next in step S25, a value obtained by subtracting the value of the quantization information idwl1 from the value of the total quantization information idwl0 is set as the value of the quantization information idwl2.
In a subsequent step S26, whether the value of the quantization information idwl2 is greater than the value of the limiter lim2 or not is determined. If the value of the quantization information idwl2 is not greater than the value of the limiter lim2, the quantization information idwl1 and the quantization information idwl2 are determined, in step S28. Otherwise, if the value of the quantization information idwl2 is greater than the value of the limiter lim2, the value of the quantization information idwl2 is limited to the value of the limiter lim2, in step S27, and thereafter, the quantization information idwl1 and the quantization information idwl2 are determined, in step S28.
For example, if the total quantization information idwl0 has been calculated as shown in the upper rows in the tables 3 and 4 described above, the quantization information idwl1 and the quantization information idwl2 are determined as shown in the middle and lower rows in each of the tables 3 and 4. In these tables, the maximum quantization bit number in the first quantization section 14 is set to 16 (bits), so that the quantization information idwl1 falls within a range from 0 to 15 (nstep(idwl1)=65535(±32767)<2^16 where idwl1=15). Therefore, the value of the limiter lim1 is set to 15 with respect to the quantization information idwl1. Further, the total quantization information idwl0 limited by the limiter lim1 (=15) is set as the quantization information idwl1, and quantization information of an excess (idwl0−idwl1) is set as the quantization information idwl2.
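The limiter procedure of steps S23 to S28 can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the quantization information calculation section 12: the function name `allocate_quantization_info` and the default `lim2=15` are hypothetical, while `lim1=15` follows the 16-bit example given above.

```python
def allocate_quantization_info(idwl0, lim1=15, lim2=15):
    """Split total quantization information idwl0 into (idwl1, idwl2).

    lim1=15 follows the 16-bit example in the text; lim2=15 is an
    assumed value chosen only for this illustration.
    """
    # Steps S23/S24: idwl1 is idwl0 limited by the limiter lim1.
    idwl1 = min(idwl0, lim1)
    # Step S25: the excess over the limit is handed to the second stage.
    idwl2 = idwl0 - idwl1
    # Steps S26/S27: idwl2 is limited by the other limiter lim2.
    idwl2 = min(idwl2, lim2)
    # Step S28: both values are now determined.
    return idwl1, idwl2


if __name__ == "__main__":
    for idwl0 in (12, 15, 17, 40):
        print(idwl0, allocate_quantization_info(idwl0))
```

With these assumed limiters, a total of 17 is split into 15 for the first stage and 2 for the second stage, and anything beyond lim1 + lim2 is discarded.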
By use of the quantization information idwl1 and the quantization information idwl2 thus determined, frequency spectrums having spectral envelope curves drawn by continuous lines a in
Next, schematic structure of an audio decoding device corresponding to the audio coding device 10 is shown in
In step S31 shown in
Next in step S33, the first inverse quantization section 32 inversely quantizes the quantized frequency spectrum qspec1 by use of an inverse quantization coefficient iqf1(idwl1) corresponding to the quantization information idwl1, as expressed by the following equation (11):
nspec1′=qspec1※iqf1(idwl1) (11)
The first inverse quantization section 32 supplies a first inverse normalization section 33 with an obtained normalized frequency spectrum nspec1′. The relationship between the quantization coefficient qf1(idwl1) and the inverse quantization coefficient iqf1(idwl1) is expressed by the equation (4) described previously.
In subsequent step S34, the first inverse normalization section 33 inversely normalizes the normalized frequency spectrum nspec1′ by use of an inverse normalization coefficient isf1(idsf) corresponding to the normalization information idsf, as expressed by the following equation (12):
mdspec1′=nspec1′※isf1(idsf) (12)
The first inverse normalization section 33 supplies an addition section 36 with an obtained frequency spectrum mdspec1′. The relationship between the normalization coefficient sf1(idsf) and the inverse normalization coefficient isf1 (idsf) is expressed by the equation (6) described previously.
In subsequent step S35, the second inverse quantization section 34 inversely quantizes the differential quantized frequency spectrum qspec2 by use of an inverse quantization coefficient iqf2(idwl2) corresponding to the quantization information idwl2, as expressed by the following equation (13):
nspec2′=qspec2※iqf2(idwl2) (13)
The second inverse quantization section 34 supplies a second inverse normalization section 35 with an obtained differential normalized frequency spectrum nspec2′. The relationship between the quantization coefficient qf2(idwl2) and the inverse quantization coefficient iqf2(idwl2) is expressed by the following equation (14):
iqf2(idwl2)=1/qf2(idwl2) (14)
In subsequent step S36, a second inverse normalization section 35 inversely normalizes the differential normalized frequency spectrum nspec2′ by use of an inverse normalization coefficient isf2(idsf,idwl1) corresponding to the normalization information idsf and the quantization information idwl1, as expressed by the following equation (15):
mdspec2′=nspec2′※isf2(idsf,idwl1) (15)
The second inverse normalization section 35 supplies the addition section 36 with an obtained differential frequency spectrum mdspec2′. The relationship between the inverse normalization coefficient isf2(idsf,idwl1), normalization information idsf, and quantization information idwl1 is expressed by the following equation (16):
isf2(idsf,idwl1)=1/sf2(idsf,idwl1)=isf1(idsf)※f/nstep(idwl1) (16)
The processing of steps S35 and S36 may be executed either before or in parallel with the processing of steps S33 and S34.
In subsequent step S37, the addition section 36 adds up the frequency spectrum mdspec1′ and the differential frequency spectrum mdspec2′, as expressed by the following equation (17):
mdspec′=mdspec1′+mdspec2′ (17)
The addition section 36 supplies a frequency-time transform section 37 with an obtained frequency spectrum mdspec′.
In subsequent step S38, the frequency-time transform section 37 performs frequency-time transform on the frequency spectrum mdspec′ to generate an audio signal. In step S39, the frequency-time transform section 37 outputs this audio signal. For example, if inverse MDCT (IMDCT) is used as the frequency-time transform, MDCT coefficients of N/2 samples are transformed into an audio signal of N samples.
In subsequent step S40, whether an input code string has ended or not is determined. If not, the processing procedure returns to step S31. Otherwise, if the input code string has ended, the decoding processing is terminated.
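The reconstruction of steps S33 to S37 for a single frequency component can be sketched as follows. The sketch assumes that the inverse coefficients iqf1, iqf2 and isf1 have already been looked up by the caller, since their defining equations are given elsewhere in the description; the function names, the default `f=1.0`, and the form `nstep(idwl) = 2^(idwl+1) − 1` (chosen so that nstep(15)=65535, as stated above) are assumptions made for illustration.

```python
def nstep(idwl):
    # Assumed form of the quantization step count; yields 65535 for idwl = 15.
    return 2 ** (idwl + 1) - 1


def isf2(isf1, idwl1, f=1.0):
    # Equation (16): isf2(idsf, idwl1) = isf1(idsf) * f / nstep(idwl1)
    return isf1 * f / nstep(idwl1)


def decode_component(qspec1, qspec2, iqf1, iqf2, isf1, idwl1, f=1.0):
    """Rebuild one spectral value mdspec' from the two quantized stages."""
    nspec1 = qspec1 * iqf1                    # equation (11)
    mdspec1 = nspec1 * isf1                   # equation (12)
    nspec2 = qspec2 * iqf2                    # equation (13)
    mdspec2 = nspec2 * isf2(isf1, idwl1, f)   # equation (15)
    return mdspec1 + mdspec2                  # equation (17)


if __name__ == "__main__":
    # Hypothetical coefficient values, used only to exercise the function.
    print(decode_component(qspec1=2, qspec2=4, iqf1=0.25, iqf2=0.5,
                           isf1=8.0, idwl1=1))
```

Because the two branches only meet at the final addition, the order in which they are evaluated does not affect the result.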
[Second Embodiment]
In case of performing two-stage normalization/quantization as described above, the quantization information idwl1 in the first stage and the quantization information idwl2 in the second stage have to be coded. Therefore, the coding efficiency of frequency spectrum information lowers in accordance with the number of stages. Hence, the present embodiment will now be described with respect to a method of improving coding efficiency of frequency spectrum information by omitting the coding of the quantization information idwl1 and quantization information idwl2.
In this audio coding device 40, a quantization information calculation section 41 uniquely determines quantization information idwl1 and quantization information idwl2, based on normalization information idsf and the like. Processing of uniquely determining the quantization information idwl1 and quantization information idwl2 based on the normalization information idsf and the like in the quantization information calculation section 41 will be specifically described later. The code string coding section 20 codes a quantized frequency spectrum qspec1, differential quantized frequency spectrum qspec2, and normalization information idsf, and outputs an obtained code string.
On the other side, in the audio decoding device 50, a quantization information calculation section 51 uniquely determines quantization information idwl1 and quantization information idwl2, based on the normalization information idsf and the like. Processing of uniquely determining the quantization information idwl1 and quantization information idwl2 based on the normalization information idsf and the like in the quantization information calculation section 51 will also be specifically described later.
Hereinafter, the processing of uniquely determining the quantization information idwl1 and quantization information idwl2 based on the normalization information idsf and the like in the quantization information calculation sections 41 and 51 will now be described specifically.
The quantization information calculation sections 41 and 51 uniquely determine quantization information idwl0 from normalization information idsf and a predetermined parameter A, as shown in the table 5 below.
As can be seen from this table 5, the quantization information idwl0 decreases by one as the normalization information idsf decreases by one. This is based on the following observation. Suppose that the absolute SNR is SNRabs where the normalization information idsf is X and the quantization information is B. On this supposition, if the normalization information idsf is X-1, a quantization bit number indicated by quantization information of substantially B-1 is necessary, in order to obtain an equivalent SNRabs. Likewise, if the normalization information idsf is X-2, a quantization bit number indicated by quantization information of substantially B-2 is necessary.
The parameter A described previously means the maximum quantization information assigned to the maximum normalization information idsf. This value is included as additional information in a code string. The parameter A is first set to the maximum quantization bit number which is available from the standard. If, as a result of coding, the total number of used bits exceeds the total usable number of bits, the parameter A is decreased one by one.
In case where the value of the parameter A is 17 (bits), an example of a table representing the relationship between the normalization information idsf and the quantization information idwl0 is shown in the table 6 below. In this table 6, circled numbers each represent the total quantization information idwl0 determined for every spectrum.
As shown in the table 6, the maximum normalization information idsf is 31, and the total quantization information idwl0 for this value takes its maximum of 17. For example, if the normalization information idsf is 29, which is smaller by two than the maximum normalization information idsf, the total quantization information idwl0 is 15. If corresponding normalization information idsf is smaller by 17 or more than the maximum normalization information idsf, the quantization bit number is a minus value. In this case, a lower limit of zero (bit) is set.
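The rule described for the tables 5 and 6, together with the one-by-one reduction of the parameter A, can be sketched as follows. The function names are hypothetical, and the bit cost used in `fit_parameter_a` (the plain sum of idwl0 over all spectra) is only a crude stand-in assumed for illustration; an actual coder would count the bits of the real code string.

```python
IDSF_MAX = 31  # maximum normalization information in the table 6 example


def total_quantization_info(idsf, a, idsf_max=IDSF_MAX):
    """idwl0 drops by one for each step idsf lies below its maximum.

    Negative results are replaced by the lower limit of zero.
    """
    return max(0, a - (idsf_max - idsf))


def fit_parameter_a(idsf_list, a_max, usable_bits):
    """Decrease A one by one until the assumed bit cost fits the budget."""
    a = a_max
    while a > 0:
        # Assumed cost model: sum of idwl0 over all spectra.
        used = sum(total_quantization_info(s, a) for s in idsf_list)
        if used <= usable_bits:
            break
        a -= 1
    return a


if __name__ == "__main__":
    print(total_quantization_info(31, 17))  # 17
    print(total_quantization_info(29, 17))  # 15
    print(fit_parameter_a([31, 29, 20], a_max=17, usable_bits=30))
```

Since the decoder receives A in the code string and knows idsf, it can evaluate the same function and arrive at the same idwl0 without any quantization information being transmitted.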
The quantization information calculation sections 41 and 51 determine the quantization information idwl1 and the quantization information idwl2, based on the total quantization information idwl0 thus obtained for every spectrum. That is, the quantization information idwl1 is limited by a limiter lim1. If this limit is exceeded, the excess is allocated for the quantization information idwl2. The quantization information idwl2 is limited by the limiter lim2. If this limit is exceeded, the quantization information idwl2 is set to fall within the limit.
If the quantization information idwl1 and the quantization information idwl2 are thus uniquely determined, noise floors are substantially flat. That is, quantization is performed with equal quantization accuracy with respect to a low-frequency range which is important for human auditory sense as well as a high-frequency range which is not. Therefore, audible noise is not minimized.
Hence, in the quantization information calculation sections 41 and 51, a weighting coefficient Wn[i] (i=0 to N/2−1) may be added to the normalization information idsf for every spectrum, to generate new normalization information idsf1, as shown in the table 7 below.
In the example of the table 7, a value of 4 to 1 is added to normalization information idsf for a low-frequency range while nothing is added to normalization information idsf for a high-frequency range. By thus adding the weighting coefficient Wn[i] to the normalization information idsf, bits can be concentrated on the low-frequency range, to improve tone quality in the range which is important for human auditory sense.
If the weighting coefficient Wn[i] is added as shown in the table 7, the maximum value of the normalization information idsf is 35. Therefore, if the table 6 is extended simply in a direction in which the normalization information idsf is increased by four as the maximum added number of the normalization information idsf, for example, the table 8 below is obtained. Numbers circled by broken lines in the table 8 each represent total quantization information idwl0 for every spectrum in case where no weighting is executed. Other numbers circled by continuous lines represent total quantization information idwl0 for every spectrum in case where weighting is executed.
In the example of this table 8, quantization accuracy in the low-frequency range improves. However, the maximum quantization information increases, which increases the total number of used bits. Therefore, in practice, bit adjustment should preferably be performed such that the total number of used bits does not exceed the total number of usable bits.
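The effect of the weighting can be sketched as follows. The weights and idsf values below are hypothetical and do not reproduce the tables 7 and 8; the sketch only assumes, as described above, that idsf1 is obtained by adding Wn[i] to idsf and that the idsf-to-idwl0 relationship is extended linearly beyond the unweighted maximum of 31.

```python
def weighted_total_quantization_info(idsf, wn, a, idsf_max=31):
    """Return idwl0 per spectrum after adding the weights wn to idsf."""
    # New normalization information idsf1 = idsf + Wn[i].
    idsf1 = [s + w for s, w in zip(idsf, wn)]
    # Linear extension of the unweighted rule, with a lower limit of zero.
    return [max(0, a + (s1 - idsf_max)) for s1 in idsf1]


if __name__ == "__main__":
    idsf = [31, 30, 28, 25, 20, 12]   # hypothetical values, low to high frequency
    wn = [4, 3, 2, 1, 0, 0]           # hypothetical weights favouring low frequencies
    print(weighted_total_quantization_info(idsf, [0] * 6, a=17))
    print(weighted_total_quantization_info(idsf, wn, a=17))
```

In this example the low-frequency entries gain up to four units of quantization information while the high-frequency entries are unchanged, which is why the total number of used bits grows and a subsequent bit adjustment becomes necessary.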
A fixed coefficient may be used as the weighting coefficient Wn[i] described above both in the coding side and decoding side. Alternatively, an optimal weighting coefficient Wn[i] may be generated based on characteristics of an audio source (frequency energy, transit characteristic, gain, masking characteristic, etc.) in the coding side. In the latter case, the quantization information calculation section 41 generates the weighting coefficient Wn[i], for example, based on the frequency spectrum mdspec1. The code string coding section 20 codes the weighting coefficient Wn[i] and includes the coded result in a code string.
Thus, according to the audio coding device 40 and audio decoding device 50 in the present embodiment, the quantization information idwl1 and quantization information idwl2 are determined uniquely based on the normalization information idsf. Based on the normalization information idsf and quantization information idwl1, the normalization coefficient sf2(idsf,idwl1) is calculated. Therefore, as side information other than frequency spectrum information, the normalization information idsf has to be included in a code string, while the quantization information idwl1 and quantization information idwl2 need not be coded. Further, surplus bits generated by reducing the side information are used for coding the quantized frequency spectrum qspec1 and the differential quantized frequency spectrum qspec2. In this manner, coding efficiency of the quantized frequency spectrum qspec1 and differential quantized frequency spectrum qspec2 can be improved.
[Third Embodiment]
An audio coding device 60 shown in
In this audio coding device 60, the subtraction section 61 subtracts the normalized frequency spectrum nspec1′ from the normalized frequency spectrum nspec1, as expressed by the following equation (18):
nspec2=nspec1−nspec1′ (18)
The subtraction section 61 supplies a second normalization section 62 with an obtained differential normalized frequency spectrum nspec2.
The second normalization section 62 normalizes the differential normalized frequency spectrum nspec2 by use of a normalization coefficient sf2, as expressed by the following equation (19):
nnspec2=nspec2※sf2=(nspec1−nspec1′)※sf2 (19)
The second normalization section 62 supplies a second quantization section 63 with an obtained differential renormalized frequency spectrum nnspec2.
The normalized frequency spectrum nspec1 is normalized to a range of ±f ε R by a normalization coefficient sf1(idsf) corresponding to the normalization information idsf. Therefore, in case of performing linear quantization by which the quantization step width nstep(idwl1) is uniquely determined in correspondence with the quantization information idwl1, for example as shown in
sf2(idwl1)=nstep(idwl1)/f (20)
That is, the normalization coefficient sf2(idwl1) can be calculated based on the quantization information idwl1.
The second quantization section 63 quantizes the differential renormalized frequency spectrum nnspec2 by use of a quantization coefficient qf2(idwl2) corresponding to the quantization information idwl2. The second quantization section 63 supplies the code string coding section 20 with an obtained differential quantized frequency spectrum qspec2. For example, in case of performing linear quantization as shown in
qspec2=(int)(floor(nnspec2※qf2(idwl2))+0.5) (21)
The code string coding section 20 codes the quantized frequency spectrum qspec1, differential quantized frequency spectrum qspec2, normalization information idsf, quantization information idwl1, and quantization information idwl2. The code string coding section 20 outputs an obtained code string.
Next, schematic structure of an audio decoding device corresponding to the audio coding device 60 is shown in
In the audio decoding device 70, a second inverse quantization section 71 inversely quantizes the differential quantized frequency spectrum qspec2 by use of an inverse quantization coefficient iqf2(idwl2) corresponding to the quantization information idwl2, as expressed by the following equation (22):
nnspec2′=qspec2※iqf2(idwl2) (22)
The second inverse quantization section 71 supplies a second inverse normalization section 72 with an obtained differential renormalized frequency spectrum nnspec2′.
The second inverse normalization section 72 inversely normalizes the differential renormalized frequency spectrum nnspec2′ by use of an inverse normalization coefficient isf2 (idwl1) corresponding to the quantization information idwl1, as expressed by the following equation (23):
nspec2′=nnspec2′※isf2(idwl1) (23)
The second inverse normalization section 72 supplies an addition section 73 with an obtained differential normalized frequency spectrum nspec2′. The relationship between the inverse normalization coefficient isf2(idwl1) and the quantization information idwl1 is expressed by the following equation (24):
isf2(idwl1)=1/sf2(idwl1)=f/nstep(idwl1) (24)
The addition section 73 adds up the normalized frequency spectrum nspec1′ and the differential normalized frequency spectrum nspec2′, as expressed by the following equation (25):
nspec′=nspec1′+nspec2′ (25)
The addition section 73 supplies a first inverse normalization section 74 with an obtained normalized frequency spectrum nspec′.
The first inverse normalization section 74 inversely normalizes the normalized frequency spectrum nspec′ by use of an inverse normalization coefficient isf1(idsf) corresponding to the normalization information idsf, as expressed by the following equation (26):
mdspec′=nspec′※isf1(idsf) (26)
The first inverse normalization section 74 supplies the frequency-time transform section 37 with an obtained frequency spectrum mdspec′.
The frequency-time transform section 37 performs frequency-time transform on the frequency spectrum mdspec′ to generate an audio signal. The frequency-time transform section 37 outputs this audio signal.
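The two-stage processing of the third embodiment, equations (18) to (25), can be illustrated by a round trip on a single normalized value. Several details below are assumptions made only for this sketch: the normalization range f is taken as 1.0, the quantization coefficient is taken as qf(idwl) = nstep(idwl)/(2f) with nstep(idwl) = 2^(idwl+1) − 1, quantization rounds to the nearest level and clips to the available levels, and all function names are hypothetical.

```python
import math

F = 1.0  # assumed normalization range: normalized values lie in [-F, +F]


def nstep(idwl):
    # Assumed step count; yields 65535 for idwl = 15 as in the description.
    return 2 ** (idwl + 1) - 1


def qf(idwl):
    # Assumed quantization coefficient for linear quantization.
    return nstep(idwl) / (2.0 * F)


def quantize(x, idwl):
    # Round to the nearest level and clip to +/-(nstep - 1) / 2.
    half = (nstep(idwl) - 1) // 2
    return max(-half, min(half, math.floor(x * qf(idwl) + 0.5)))


def sf2(idwl1):
    return nstep(idwl1) / F                  # equation (20)


def encode(nspec1, idwl1, idwl2):
    qspec1 = quantize(nspec1, idwl1)
    nspec1_rec = qspec1 / qf(idwl1)          # first-stage inverse quantization
    nspec2 = nspec1 - nspec1_rec             # equation (18)
    nnspec2 = nspec2 * sf2(idwl1)            # equation (19)
    return qspec1, quantize(nnspec2, idwl2)


def decode(qspec1, qspec2, idwl1, idwl2):
    nspec1_rec = qspec1 / qf(idwl1)
    nnspec2_rec = qspec2 / qf(idwl2)         # equation (22)
    nspec2_rec = nnspec2_rec / sf2(idwl1)    # equations (23) and (24)
    return nspec1_rec + nspec2_rec           # equation (25)


if __name__ == "__main__":
    x = 0.3
    q1, q2 = encode(x, idwl1=3, idwl2=3)
    print("one stage error:", abs(x - q1 / qf(3)))
    print("two stage error:", abs(x - decode(q1, q2, 3, 3)))
```

Under these assumptions the first-stage error never exceeds f/nstep(idwl1), so multiplying it by sf2(idwl1) brings the differential spectrum back into the full range before the second quantization, and the second stage reduces the reconstruction error.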
[Fourth Embodiment]
In the first to third embodiments described above, three kinds of basic structures of audio coding devices and audio decoding devices have been described. In the present embodiment, modifications of the audio coding devices and the audio decoding devices will be described. The same structures as those of the audio coding device 10 and the audio decoding device 30 are denoted by the same reference symbols, and detailed descriptions thereof will be omitted.
At first,
Next,
The foregoing first to third embodiments have been described on the assumption that the first quantization section 14 performs linear quantization. However, non-linear quantization is equivalent to linear quantization performed after non-linear transform. Therefore, if the first preprocessing section 101 to perform non-linear transform is provided in the front stage of the first quantization section 14, these embodiments are applicable to a case of executing non-linear quantization, as shown in
Next,
This modification has been described as a modification to the audio coding device 10 and the audio decoding device 30 in the first embodiment. However, the same modification may be made to the audio coding device 40 and the audio decoding device 50 in the second embodiment as well as the audio coding device 60 and the audio decoding device 70 in the third embodiment.
Although best modes for carrying out the present invention have thus been described above, the present invention is not limited to the embodiments as described above but various changes can be made without deviating from the subject matter of the invention.
For example, the above embodiments have been described such that coding is achieved by performing two-stage normalization/quantization on a frequency spectrum obtained by subjecting an input audio signal to time-frequency transform. The present invention is not limited to these embodiments but can be extended such that coding is achieved by performing normalization/quantization through an arbitrary number of stages. In this case, quantization information idwlk in the k-th stage (k is an integer not smaller than 1) is limited by a limiter limk. If this limit is exceeded, the excess is allocated for quantization information idwl(k+1) for the (k+1)-th stage.
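The extension to an arbitrary number of stages can be sketched as a cascade in which each stage keeps at most its limiter value and passes the excess on. The function name `allocate_stages` and the limiter values used in the example are hypothetical; dropping whatever remains after the last stage is an assumption that mirrors the handling of lim2 in the two-stage case.

```python
def allocate_stages(idwl0, limits):
    """Distribute idwl0 over len(limits) stages, stage k capped by limits[k]."""
    allocation = []
    remaining = idwl0
    for lim in limits:
        idwlk = min(remaining, lim)   # stage k is limited by its limiter limk
        allocation.append(idwlk)
        remaining -= idwlk            # the excess goes to stage k+1
    # Anything still remaining after the last stage is discarded (assumption).
    return allocation


if __name__ == "__main__":
    print(allocate_stages(17, [15, 15]))
    print(allocate_stages(40, [15, 15, 15]))
```

With two stages this reduces to the allocation of idwl1 and idwl2 described in the first embodiment.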
Although the above embodiments each have been described as hardware structure, the present invention is not limited to hardware structure. Arbitrary processing can be realized by letting a CPU (Central Processing Unit) execute a computer program. In this case, the computer program may be provided, recorded on a recording medium or transferred by a transfer medium such as the Internet, etc.
It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.
Claims
1. An audio coding device including processing circuitry and programmed to execute a program via the processing circuitry, the program comprising:
- a time frequency transformation unit configured to perform time-frequency transform on an input audio signal to generate a frequency spectrum;
- a quantization unit configured to (a) generate total quantization information indicating a quantization bit number on the basis of predetermined normalization information, (b) allocate the total quantization information, by setting a predetermined limit to a first quantization information, allocating, up to the predetermined limit, the total quantization information to the first quantization information, and allocating an excess beyond the predetermined limit to the second quantization information, and (c) for a plurality of stages, (i) generate the first quantization information and the second quantization information, each indicating a respective quantization bit number, and (ii) normalize the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information to generate a normalized frequency spectrum, each stage having a predetermined limit to quantization information, and if quantization information allocated for a k-th stage, ‘k’ being an integer greater than zero, exceeds a limit in the k-th stage, an excess for quantization information is allocated to a (k+1)-th stage, the limit being based on a predetermined allowed quantization bit number for each of the respective plurality of stages;
- a first quantization unit configured to linearly quantize the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum;
- a subtraction unit configured to subtract from the frequency spectrum, a frequency spectrum obtained by inversely quantizing and inversely normalizing the quantized frequency spectrum, to generate a differential frequency spectrum;
- a normalization unit configured to normalize the differential frequency spectrum by use of a second normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential normalized frequency spectrum;
- a second normalization unit configured to linearly quantize the differential normalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and
- a code unit configured to code the normalization information, the first quantization information, the second quantization information, the quantized frequency spectrum, and the differential quantized frequency spectrum, to output a code string.
2. The audio coding device of claim 1, wherein the program further comprises a non-linear transformation unit configured to:
- perform non-linear transform on the frequency spectrum or the normalized frequency spectrum; and
- perform non-linear inverse transform on a normalized frequency spectrum obtained by inversely quantizing the quantized frequency spectrum, or a frequency spectrum obtained by inversely normalizing the normalized frequency spectrum.
3. A method executed by an audio coding device comprising the steps of:
- a time-frequency transform step of performing time-frequency transform on an input audio signal to generate a frequency spectrum;
- a quantization information calculation step including the steps of (a) generating total quantization information indicating a quantization bit number on the basis of predetermined normalization information, (b) allocating the total quantization information by setting a predetermined limit to a first quantization information, (c) allocating, up to the predetermined limit, the total quantization information to the first quantization information, (d) allocating an excess beyond the predetermined limit to the second quantization information, and, (e) for a plurality of stages, generating the first quantization information and the second quantization information, each indicating a respective quantization bit number;
- a first normalization step of normalizing the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information, to generate a normalized frequency spectrum, wherein, a predetermined limit to quantization information is set in each stage, and if quantization information allocated for a k-th stage, ‘k’ being an integer greater than zero, exceeds a limit in the k-th stage, an excess for quantization information is allocated for a (k+1)-th stage, the limit being based on a predetermined allowed quantization bit number for each of the respective plurality of stages;
- a first quantization step of linearly quantizing the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum;
- a subtraction step of subtracting, from the frequency spectrum, a frequency spectrum obtained by inversely quantizing and inversely normalizing the quantized frequency spectrum, to generate a differential frequency spectrum;
- a second normalization step of normalizing the differential frequency spectrum by use of a second normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential normalized frequency spectrum;
- a second quantization step of linearly quantizing the differential normalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and
- a code string coding step of coding the normalization information, the first quantization information, the second quantization information, the quantized frequency spectrum, and the differential quantized frequency spectrum, to output a code string.
4. An audio coding device including processing circuitry and programmed to execute a program via the processing circuitry, the program comprising:
- a time frequency transformation unit configured to perform time-frequency transform on an input audio signal, to generate a frequency spectrum;
- a quantization unit configured to (a) generate total quantization information indicating a quantization bit number on the basis of predetermined normalization information, (b) allocate the total quantization information, by setting a predetermined limit to a first quantization information, allocating, up to the predetermined limit, the total quantization information to the first quantization information, and allocating an excess beyond the predetermined limit to the second quantization information, and (c) for a plurality of stages, (i) generate the first quantization information and the second quantization information, each indicating a respective quantization bit number, and (ii) normalize the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information to generate a normalized frequency spectrum, each stage having a predetermined limit to quantization information, and if quantization information allocated for a k-th stage, ‘k’ being an integer greater than zero, exceeds a limit in the k-th stage, an excess for quantization information is allocated to a (k+1)-th stage, the limit being based on a predetermined allowed quantization bit number for each of the respective plurality of stages;
- a first quantization unit configured to linearly quantize the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum;
- a subtraction unit configured to subtract from the frequency spectrum, a frequency spectrum obtained by inversely quantizing and inversely normalizing the quantized frequency spectrum, to generate a differential frequency spectrum;
- a normalization unit configured to normalize the differential frequency spectrum by use of a second normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential normalized frequency spectrum;
- a second quantization unit configured to linearly quantize the differential normalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and
- a code unit configured to code the normalization information, the quantized frequency spectrum, and the differential quantized frequency spectrum, to output a code string.
5. The device according to claim 4, wherein:
- a maximum quantization error, corresponding to the first quantization information, is uniquely determined and
- the second normalization coefficient is determined by the product of the first normalization coefficient and the reciprocal of the maximum quantization error.
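As a hedged illustration of the relation in claim 5, the sketch below assumes a symmetric mid-tread linear quantizer acting on values normalized into [-1, 1], so that the maximum quantization error is half a step. That quantizer model and both function names are assumptions made for this example, not definitions taken from the claims.

```python
def max_quantization_error(bits1):
    """Largest error of an assumed symmetric mid-tread quantizer (bits1 >= 2)."""
    levels = 2 ** (bits1 - 1) - 1      # positive steps of the assumed quantizer
    return 0.5 / levels                # half of one step in the normalized domain


def second_normalization_coefficient(first_coeff, bits1):
    """Product of the first coefficient and the reciprocal of the maximum error."""
    return first_coeff * (1.0 / max_quantization_error(bits1))
```

Under these assumptions the first-stage residual, multiplied by the second normalization coefficient, again spans approximately [-1, 1], which is what lets the second stage reuse a linear quantizer.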
6. The device according to claim 4, wherein the quantization bit number indicated by the total quantization information increases or decreases one by one as the normalization information is increased or decreased one by one.
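One simple mapping with the property of claim 6 is a fixed offset added to the normalization information. The sketch below is only an example of such a mapping; `offset` and `max_bits` are hypothetical parameters, and the one-by-one tracking holds only inside the unclamped range.

```python
def total_quantization_bits(idsf, offset=2, max_bits=16):
    """Hypothetical mapping from normalization information to total bit number."""
    # Inside the clamp range the result moves by exactly one per unit step of idsf.
    return min(max(idsf + offset, 0), max_bits)
```

Because the mapping depends on the normalization information alone, a decoder that receives the normalization information can recompute the same bit number without it being transmitted.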
7. The device according to claim 4, wherein the audio coding device is further configured to:
- perform non-linear transform on the frequency spectrum or the normalized frequency spectrum; and
- perform non-linear inverse transform on a normalized frequency spectrum obtained by inversely quantizing the quantized frequency spectrum, or a frequency spectrum obtained by inversely normalizing the normalized frequency spectrum.
8. A method executed by an audio coding device comprising the steps of:
- a time-frequency transform step of performing time-frequency transform on an input audio signal to generate a frequency spectrum;
- a quantization information calculation step including the steps of (a) generating total quantization information indicating a quantization bit number on the basis of predetermined normalization information, (b) allocating the total quantization information by setting a predetermined limit to a first quantization information, (c) allocating, up to the predetermined limit, the total quantization information to the first quantization information, and (d) for a plurality of stages, allocating an excess beyond the predetermined limit to the second quantization information to generate the first quantization information and the second quantization information, each indicating a respective quantization bit number;
- a first normalization step of normalizing the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information, to generate a normalized frequency spectrum, wherein, a predetermined limit to quantization information is set in each stage, and if quantization information allocated for a k-th stage, ‘k’ being an integer greater than zero, exceeds a limit in the k-th stage, an excess for quantization information is allocated for a (k+1)-th stage, the limit being based on a predetermined allowed quantization bit number for each of the respective plurality of stages;
- a first quantization step of linearly quantizing the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum;
- a subtraction step of subtracting, from the frequency spectrum, a frequency spectrum obtained by inversely quantizing and inversely normalizing the quantized frequency spectrum, to generate a differential frequency spectrum;
- a second normalization step of normalizing the differential frequency spectrum by use of a second normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential normalized frequency spectrum;
- a second quantization step of linearly quantizing the differential normalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and
- a code string coding step of coding the normalization information, the quantized frequency spectrum, and the differential quantized frequency spectrum, to output a code string.
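The normalization, quantization, and subtraction steps of claim 8 can be sketched for a single spectral value as follows. The sketch rests on assumptions that are not stated in the claims: the first normalization coefficient is taken as 2^(-idsf), the linear quantizer is a symmetric mid-tread quantizer, and the names `levels` and `encode_component` are hypothetical. The time-frequency transform and the code string coding step are omitted.

```python
def levels(bits):
    """Positive step count of an assumed symmetric mid-tread quantizer."""
    return 2 ** (bits - 1) - 1


def encode_component(x, idsf, bits1, bits2):
    """Two-stage normalize/quantize of one spectral value (requires bits1 >= 2)."""
    nc1 = 2.0 ** (-idsf)             # assumed first normalization coefficient
    l1 = levels(bits1)
    q1 = round(x * nc1 * l1)         # first normalization and linear quantization
    x1 = (q1 / l1) / nc1             # inverse quantization and inverse normalization
    diff = x - x1                    # differential frequency spectrum
    nc2 = nc1 * (2 * l1)             # nc1 times the reciprocal of the max error 1/(2*l1)
    if bits2 < 2:
        return q1, 0                 # no usable bits left for the second stage
    q2 = round(diff * nc2 * levels(bits2))  # second normalization and quantization
    return q1, q2


print(encode_component(0.3, 0, 4, 4))  # (2, 1)
```

In this example the first stage alone reconstructs 2/7, while adding the second stage reconstructs 2/7 + 1/98, which lies closer to the input 0.3.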
9. An apparatus including an audio coding device with processing circuitry and programmed to execute a program via the processing circuitry, the program comprising:
- a time frequency transformation unit configured to perform time-frequency transform on an input audio signal to generate a frequency spectrum;
- a quantization unit configured to (a) generate total quantization information indicating a quantization bit number on the basis of predetermined normalization information, (b) allocate the total quantization information, by setting a predetermined limit to a first quantization information, allocating, up to the predetermined limit, the total quantization information to the first quantization information, and allocating an excess beyond the predetermined limit to the second quantization information, and (c) for a plurality of stages, (i) generate the first quantization information and the second quantization information, each indicating a respective quantization bit number, and (ii) normalize the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information, to generate a normalized frequency spectrum, each stage having a predetermined limit to quantization information, and (iii) if quantization information allocated for a k-th stage, ‘k’ being an integer greater than zero, exceeds a limit in the k-th stage, an excess for quantization information is allocated to a (k+1)-th stage, the limit being based on a predetermined allowed quantization bit number for each of the respective plurality of stages;
- a first quantization unit configured to linearly quantize the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum;
- a subtraction unit configured to subtract from the normalized frequency spectrum, a normalized frequency spectrum obtained by inversely quantizing the quantized frequency spectrum, to generate a differential normalized frequency spectrum;
- a normalization unit configured to normalize the differential normalized frequency spectrum by use of a second normalization coefficient corresponding to the first quantization information, to generate a differential renormalized frequency spectrum;
- a second quantization unit configured to linearly quantize the differential renormalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and
- a code unit configured to code the normalization information, the first quantization information, the second quantization information, the quantized frequency spectrum, and the differential quantized frequency spectrum, to output a code string.
10. The apparatus according to claim 9, wherein the audio coding device is further configured to:
- perform non-linear transform on the frequency spectrum or the normalized frequency spectrum; and
- perform non-linear inverse transform on a normalized frequency spectrum obtained by inversely quantizing the quantized frequency spectrum, or a frequency spectrum obtained by inversely normalizing the normalized frequency spectrum.
11. A method executed by an audio coding device comprising the steps of:
- a time-frequency transform step of performing time-frequency transform on an input audio signal to generate a frequency spectrum;
- a quantization information calculation step including the steps of (a) generating total quantization information indicating a quantization bit number on the basis of predetermined normalization information, (b) allocating the total quantization information by setting a predetermined limit to a first quantization information, (c) allocating, up to the predetermined limit, the total quantization information to the first quantization information, and (d) for a plurality of stages, allocating an excess beyond the predetermined limit to the second quantization information, and generating the first quantization information and the second quantization information, each indicating a respective quantization bit number;
- a first normalization step of normalizing the frequency spectrum for every frequency component by use of a first normalization coefficient corresponding to the normalization information, to generate a normalized frequency spectrum, wherein, a predetermined limit to quantization information is set in each stage, and if quantization information allocated for a k-th stage, ‘k’ being an integer greater than zero, exceeds a limit in the k-th stage, an excess for quantization information is allocated for a (k+1)-th stage, the limit being based on a predetermined allowed quantization bit number for each of the respective plurality of stages;
- a first quantization step of linearly quantizing the normalized frequency spectrum by use of a first quantization coefficient corresponding to the first quantization information, to generate a quantized frequency spectrum;
- a subtraction step of subtracting, from the normalized frequency spectrum, a normalized frequency spectrum obtained by inversely quantizing the quantized frequency spectrum, to generate a differential normalized frequency spectrum;
- a second normalization step of normalizing the differential normalized frequency spectrum by use of a second normalization coefficient corresponding to the first quantization information, to generate a differential renormalized frequency spectrum;
- a second quantization step of linearly quantizing the differential renormalized frequency spectrum by use of a second quantization coefficient corresponding to the second quantization information, to generate a differential quantized frequency spectrum; and
- a code string coding step of coding the normalization information, the first quantization information, the second quantization information, the quantized frequency spectrum, and the differential quantized frequency spectrum, to output a code string.
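In claim 11 the subtraction is performed on the normalized frequency spectrum, so the second normalization coefficient corresponds to the first quantization information only. A minimal sketch of that variant for one spectral value follows; the symmetric mid-tread quantizer and the name `encode_normalized_domain` are assumptions made for illustration, and both bit numbers are assumed to be at least 2.

```python
def encode_normalized_domain(x, nc1, bits1, bits2):
    """Two-stage quantization with the residual taken in the normalized domain."""
    l1 = 2 ** (bits1 - 1) - 1        # positive steps of the assumed first quantizer
    l2 = 2 ** (bits2 - 1) - 1        # positive steps of the assumed second quantizer
    xn = x * nc1                     # first normalization
    q1 = round(xn * l1)              # first linear quantization
    diff_n = xn - q1 / l1            # differential normalized frequency spectrum
    nc2 = 2 * l1                     # reciprocal of the max error; depends on bits1 only
    q2 = round(diff_n * nc2 * l2)    # renormalization and second quantization
    return q1, q2
```

Unlike the variant of claim 8, the residual never leaves the normalized domain, so no inverse normalization is needed before the second stage.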
12. An apparatus comprising an audio decoding device including processing circuitry and programmed to execute a program via the processing circuitry, the program comprising:
- a time frequency transformation unit configured to decode an input code string, to generate normalization information, a quantized frequency spectrum, and a differential quantized frequency spectrum;
- a quantization unit configured to (a) generate total quantization information indicating a quantization bit number on the basis of the normalization information, (b) allocate the total quantization information, by setting a predetermined limit to a first quantization information, allocating, up to the predetermined limit, the total quantization information to the first quantization information, and allocating an excess beyond the predetermined limit to the second quantization information, and (c) for a plurality of stages, (i) generate the first quantization information and the second quantization information, each indicating a respective quantization bit number, and linearly inversely quantize the quantized frequency spectrum by use of a first inverse quantization coefficient corresponding to the first quantization information, and (ii) generate a normalized frequency spectrum, each stage having a predetermined limit to quantization information, and if quantization information allocated for a k-th stage, ‘k’ being an integer greater than zero, exceeds a limit in the k-th stage, an excess for quantization information is allocated to a (k+1)-th stage, the limit being based on a predetermined allowed quantization bit number for each of the respective plurality of stages;
- a first normalization unit configured to inversely normalize the normalized frequency spectrum by use of a first inverse normalization coefficient corresponding to the normalization information, to generate a frequency spectrum;
- a subtraction unit configured to linearly inversely quantize the differential quantized frequency spectrum by use of a second inverse quantization coefficient corresponding to the second quantization information, to generate a differential normalized frequency spectrum;
- a second normalization unit configured to inversely normalize the differential normalized frequency spectrum by use of a second inverse normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential frequency spectrum;
- an addition unit configured to add the frequency spectrum and the differential frequency spectrum; and
- a second time transformation unit configured to perform frequency-time transform on a frequency spectrum obtained by the addition unit, to generate an output audio signal.
13. A method performed by an audio decoding device comprising the steps of:
- a code string decoding step of decoding an input code string, to generate normalization information, a quantized frequency spectrum, and a differential quantized frequency spectrum;
- a quantization information calculation step including the steps of (a) generating total quantization information indicating a quantization bit number on the basis of the normalization information, (b) allocating the total quantization information, by setting a predetermined limit to a first quantization information, (c) allocating, up to the predetermined limit, the total quantization information to the first quantization information, and allocating an excess beyond the predetermined limit to the second quantization information, and (d) for a plurality of stages, generating the first quantization information and second quantization information, each indicating a quantization bit number;
- a first inverse quantization step of linearly inversely quantizing the quantized frequency spectrum by use of a first inverse quantization coefficient corresponding to the first quantization information, to generate a normalized frequency spectrum, wherein, a predetermined limit to quantization information is set in each stage, and if quantization information allocated for a k-th stage, ‘k’ being an integer greater than zero, exceeds a limit in the k-th stage, an excess for quantization information is allocated for a (k+1)-th stage, the limit being based on a predetermined allowed quantization bit number for each of the respective plurality of stages;
- a first inverse normalization step of inversely normalizing the normalized frequency spectrum by use of a first inverse normalization coefficient corresponding to the normalization information, to generate a frequency spectrum;
- a second inverse quantization step of linearly inversely quantizing the differential quantized frequency spectrum by use of a second inverse quantization coefficient corresponding to the second quantization information, to generate a differential normalized frequency spectrum;
- a second inverse normalization step of inversely normalizing the differential normalized frequency spectrum by use of a second inverse normalization coefficient corresponding to the normalization information and the first quantization information, to generate a differential frequency spectrum;
- an addition step of adding the frequency spectrum and the differential frequency spectrum; and
- a frequency-time transform step of performing frequency-time transform on a frequency spectrum obtained by the addition step, to generate an output audio signal.
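The decoding steps of claim 13 can be sketched for one spectral value as below. The decoder recomputes both quantization bit numbers from the normalization information, so only the normalization information and the two quantized values are read from the code string. The mapping `idsf + offset`, the coefficient 2^(-idsf), the mid-tread quantizer model, and the names `levels` and `decode_component` are all assumptions made for this example; the frequency-time transform is omitted.

```python
def levels(bits):
    """Positive step count of an assumed symmetric mid-tread quantizer (0 if unusable)."""
    return 2 ** (bits - 1) - 1 if bits >= 2 else 0


def decode_component(q1, q2, idsf, offset=8, limit1=4):
    """Rebuild one spectral value from its two quantized stages."""
    total = max(0, idsf + offset)    # hypothetical total quantization information
    bits1 = min(total, limit1)       # first stage is capped at its limit
    bits2 = total - bits1            # excess goes to the second stage
    nc1 = 2.0 ** (-idsf)             # assumed first normalization coefficient
    l1, l2 = levels(bits1), levels(bits2)
    if l1 == 0:
        return 0.0                   # nothing was coded for this component
    x1 = (q1 / l1) / nc1             # first inverse quantization and inverse normalization
    if l2 == 0:
        return x1                    # no second stage was coded
    nc2 = nc1 * (2 * l1)             # same second coefficient the encoder is assumed to use
    diff = (q2 / l2) / nc2           # second inverse quantization and inverse normalization
    return x1 + diff                 # addition step
```

For instance, `decode_component(2, 1, idsf=0)` allocates 4 bits to each stage under these defaults and returns 2/7 + 1/98, roughly 0.2959.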
Type: Grant
Filed: Aug 25, 2015
Date of Patent: May 2, 2017
Assignee: SONY CORPORATION (Tokyo)
Inventors: Yuuki Matsumura (Saitama), Shiro Suzuki (Kanagawa), Keisuke Toyama (Kanagawa), Mitsuyuki Hatanaka (Kanagawa), Yuhki Mitsufuji (Tokyo)
Primary Examiner: Mary Steelman
Application Number: 14/835,121
International Classification: G10L 19/032 (20130101);