Short Term Spectral Analysis, Synthesis, and Modification by Discrete Fourier Transform
IEEE TRANSACTIONS ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL. ASSP-25, NO. 3, JUNE 1977, p. 235
Abstract: A theory of short term spectral analysis, synthesis, and modification is presented with an attempt at pointing out certain practical and theoretical questions. The methods discussed here are useful in designing filter banks when the filter bank outputs are to be used for synthesis after multiplicative modifications are made to the spectrum.

IN THIS paper, some practical and theoretical questions are considered concerning the analysis of and synthesis from a signal's short term spectrum. The short term spectrum, or time spectrum, are those signals which result from analyzing a single input signal with a set of filters which are selective over a range of frequencies [1]-[3]. In the case of analysis by a spectrum analyzer, the filters are either spaced contiguously or one filter is heterodyned over the frequency range of interest. For many applications, this is quite adequate. However, when one is interested in both analysis and synthesis, a more rigorous approach is in order.

In this paper, we shall restrict ourselves to the case of uniformly spaced, symmetric bandpass filters. We are also not concerned with bandwidth reduction. In general, the channel capacity in the short term spectral domain will be greater than that of the original signal. We will be interested, however, in being able to modify the short term spectrum in either its phase or amplitude content without introducing undesired distortion in the synthesized signal.

Previous filter bank analysis-synthesis techniques have been given by Flanagan and Golden [1], Schafer and Rabiner [2], and Portnoff [3]. Our approach differs in several important ways. Previous approaches have used contiguous filter banks in the analysis process. We shall show that this results in an undersampled spectrum and, as a result, synthesis becomes very sensitive to phase or delay modifications. We will then show that by using a properly sampled overlapping filter set, we may avoid this sensitivity. By recognizing the need for a greater number of filters, both the analysis and synthesis procedures are simplified. However, the number of samples of data in the short term spectral domain that results per sample of input data is greater than one; thus, the relevance of the present method to bandwidth reduction remains unclear.

A second important difference between our approach and that of others is during synthesis. All previous authors [1]-[3] have summed the filter outputs for the synthesis; we synthesize in a way that is similar to the overlap add method [4]. As a result, our method does not require an interpolating filter prior to the add, and thus, there is a savings in the amount of additional filtering. In our method, only a small number of adds (four for a Hamming window) per sample are required.

The major advantage of the present scheme is that it allows arbitrary modifications of the short term spectrum. These modifications may be directly interpreted in the time domain as a filter whose impulse response is given by the Fourier transform of the modification. The price paid for allowing modifications of the spectrum will be seen to be an increase in the number of frequency channels required. A modification made prior to the usual method of synthesis, namely, of adding the filter outputs of a contiguous filter set, does not satisfy the convolution rule.

ANALYSIS OF SHORT TERM SPECTRA

We have defined the short term spectra as an output derived from a bank of filters. At each filter frequency, we require two filters which have the same magnitude response but differ in phase by 90°.

It is well known [2], [3] that a filter bank of this form may be realized by weighting the input signal x(t) by a sliding low-pass filter impulse response w(t) and Fourier transforming the result. Thus, we have

$$X(f, t) = \int_{-\infty}^{\infty} w(t - \tau)\, x(\tau)\, e^{j 2 \pi f \tau}\, d\tau \tag{1}$$

where X(f, t) is the short term frequency spectrum, t is the time variable, w(t - τ) is the shifted window, x(τ) is the input signal, and exp(j2πfτ) is the complex exponential.

We define W(f) as the Fourier transform of w(t):

$$W(f) = \int_{-\infty}^{\infty} w(\tau)\, e^{j 2 \pi f \tau}\, d\tau. \tag{2}$$

W(f) is assumed to be small for frequencies above some critical frequency. The short term spectrum X(f, t) is equivalent to frequency shifting the frequency band of x(t) centered at f down to zero frequency with the complex exponential exp(j2πft) and low-pass filtering the result with the low-pass filter w(t). The resulting X(f, t) is a complex function of time (see Fig. 1 and [1]-[3]). In the applications considered here, x(t) is to be a sampled data signal x(k) and X(f, t) will be found by replacing the Fourier transform by a discrete Fourier transform (DFT).

An important question is relevant at this point, namely, how many frequency and time samples are required to fully represent the data X(f, t) in a sampled data system. This question is answered by applying the Nyquist theorem twice. w(t) has two characteristic "lengths," one in the time domain and one

Manuscript received September 29, 1976; revised January 25, 1977.
The author is with Bell Laboratories, Murray Hill, NJ 07974.
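As a rough illustration of the sampled-data form of (1), the following Python sketch weights x(k) by a sliding window and takes a DFT of each frame. It is only a sketch under stated assumptions, not the author's implementation: the function names (`periodic_hamming`, `forward_dft`, `short_term_spectrum`), the choice of a Hamming window, the hop of one quarter of the window length, and measuring the time index from the start of each frame are all illustrative choices. The DFT uses the positive exponent that appears in (1).

```python
import cmath
import math


def periodic_hamming(T):
    """Hamming window of length T, sampled so that it repeats with period T."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / T) for k in range(T)]


def forward_dft(a):
    """DFT with the positive exponent exp(+j 2 pi k m / T), as in (1)."""
    T = len(a)
    return [sum(a[k] * cmath.exp(2j * math.pi * k * m / T) for k in range(T))
            for m in range(T)]


def short_term_spectrum(x, window, hop):
    """Return frames X[n][m]: the DFT of x(k) weighted by the window shifted by n * hop."""
    T = len(window)
    # Pad with zeros so the first and last samples are covered by full windows.
    padded = [0.0] * T + list(x) + [0.0] * T
    frames = []
    for start in range(0, len(padded) - T + 1, hop):
        # Assumption: k is counted from the start of the frame, which differs
        # from an absolute time origin only by a linear phase in m.
        section = [padded[start + k] * window[k] for k in range(T)]
        frames.append(forward_dft(section))
    return frames


if __name__ == "__main__":
    T, hop, m0 = 64, 16, 8          # hypothetical sizes; hop = T / 4 is an assumption
    x = [math.cos(2 * math.pi * m0 * k / T) for k in range(256)]
    X = short_term_spectrum(x, periodic_hamming(T), hop)
    # For a tone centered on channel m0, an interior frame should peak at m0.
    peak = max(range(T // 2), key=lambda m: abs(X[10][m]))
    print(len(X), len(X[0]), peak)
```

Each row of the result plays the role of one time sample of the short term spectrum, and each column that of one frequency channel. With a hop smaller than the window length, more than one spectral value is produced per input sample, in line with the remark that the short term spectral domain carries more data than the original signal.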
[Figure: block diagram; recoverable labels: LOW PASS FILTER, FRAME OUTPUT]

$$x(k)\, w(nD - k) = F^{-1}\{X_{nm}\} \tag{10}$$

where

$$F^{-1}\{X_{nm}\} = \frac{1}{T} \sum_{m=0}^{T-1} X_{nm}\, e^{-j 2 \pi k m / T}. \tag{11}$$

$$x(k) = \sum_{n=-\infty}^{\infty} F^{-1}\{X_{nm}\} \tag{16}$$

where

$$F^{-1}\{X_{nm}\} = \frac{1}{T} \sum_{m=0}^{T-1} X_{nm}\, e^{-j 2 \pi k m / T}. \tag{17}$$

Equation (16) states that x(k) is a sum over inverse DFT's. It is similar to the overlap add rule, as discussed by Stockham [4], which may be used to do continuous convolution using FFT's; it differs in that the sections are taken as overlapping and are not rectangular windows.

$$\ldots = p(k) * x(k) \tag{20}$$

where

$$p(k) = F^{-1}\{P_m\} \tag{21}$$

and "*" denotes convolution. Thus, a fixed modification to the short term spectrum is equivalent to convolution with p(k).

Fixed modifications may not be made in the schemes presented by other authors [2], [3]. For example, the synthesis techniques of both authors are particularly sensitive to a 180° phase modification in one channel. Such a modification will produce zeros in the transition regions between bands, and will therefore not result in the desired all-pass modification.

The next obvious step is to allow the modification to become a function of time, giving rise to the final synthesis rule:

$$y(k) = \sum_{n=-\infty}^{\infty} F^{-1}\{P_{nm} X_{nm}\}. \tag{22}$$

The effect of a time-varying spectral modification is beyond the scope of the present paper.

Finally, we would like to point out some further differences between the present and previous [2], [3] results. It is true, in general, that the short term spectra may be subsampled in either frequency or time and x(k) can still be determined from X_nm. When this happens, however, the synthesis is no longer robust to modifications. That this is true may be seen from two examples.

Suppose we generate the time subsampled, short term spectra using a Hamming window which has been shifted by its full period

$$X_{nm} = F\{w(nT_0 - k)\, x(k)\}$$

where w(k) is the Hamming window with length T_0. x(k)

to properly represent the short term spectra. The analysis is performed in frames by a sliding low-pass filter window and a DFT, a frame being defined by the Nyquist period of the bandlimited window. The synthesis is reminiscent of the overlap add process as discussed by Stockham [4], and consists of an inverse DFT and a vector add each frame. Spectral modifications may be included if zeros are appended to the window function prior to the analysis, the number of zeros being equal to the time characteristic length of the modification.

Advantages of the new technique are that modifications may be included and no interpolation is necessary during synthesis. A possible disadvantage is the increased amount of bandwidth required to transmit the short term spectrum as compared to that required to transmit the original signal.

APPENDIX

We wish to show that given any function w(k) which is bandlimited to a frequency of 1/(2D) and normalized as given by (15), then (14) is true, namely, that the sum of any set of samples of w(k) taken with a period D is one.

This is easily proved using the Poisson summation formula [5, eq. (3-56), p. 47]. If W(f) is the Fourier transform of w(k), then

$$\sum_{n=-\infty}^{\infty} w(nD - k) = \frac{1}{D} \sum_{m=-\infty}^{\infty} e^{-j 2 \pi m k / D}\, W(m/D). \tag{A1}$$
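The appendix claim that samples of a suitably normalized window taken with period D sum to one, and the statement that a fixed spectral modification followed by the inverse DFT and add of (16) amounts to convolution with p(k), can both be checked numerically. The Python sketch below is illustrative only and rests on several assumptions that are not taken from the paper: a periodically sampled Hamming window, a frame period of one quarter of the window length, a normalization in which the window samples sum to D, and hypothetical helper names (`dft`, `normalized_hamming`, `overlap_sum`, `modify_and_synthesize`). Zeros are appended to each windowed section before the DFT, following the remark that the number of zeros should match the time length of the modification.

```python
import cmath
import math


def dft(a, T, sign):
    """Length-T DFT of a (implicitly zero-padded to T) with exponent sign * j 2 pi k m / T."""
    return [sum(a[k] * cmath.exp(sign * 2j * math.pi * k * m / T)
                for k in range(len(a)))
            for m in range(T)]


def normalized_hamming(T, hop):
    """Periodic Hamming window scaled so its samples sum to hop (assumed normalization)."""
    w = [0.54 - 0.46 * math.cos(2 * math.pi * k / T) for k in range(T)]
    scale = hop / sum(w)
    return [scale * v for v in w]


def overlap_sum(window, hop, length):
    """Evaluate the sum over n of w(k - n * hop) for k = 0 .. length - 1."""
    T = len(window)
    total = [0.0] * (length + 2 * T)
    for start in range(0, length + T + 1, hop):
        for k in range(T):
            total[start + k] += window[k]
    return total[T:T + length]


def modify_and_synthesize(x, window, hop, p):
    """Multiply every frame spectrum by P_m, inverse-DFT, and add the sections.

    Under the assumptions above the result should match the convolution p * x.
    """
    Tw, Lp = len(window), len(p)
    T = Tw + Lp - 1                      # DFT length after appending Lp - 1 zeros
    P = dft(p, T, +1)                    # fixed modification P_m
    padded = [0.0] * Tw + list(x) + [0.0] * Tw
    out = [0.0] * (len(padded) + Lp - 1)
    for start in range(0, len(x) + Tw + 1, hop):
        section = [padded[start + k] * window[k] for k in range(Tw)]
        X = dft(section, T, +1)          # frame spectrum, exponent sign as in (1)
        Y = [P[m] * X[m] for m in range(T)]
        y = dft(Y, T, -1)                # inverse DFT: negative exponent, then 1/T
        for k in range(T):
            out[start + k] += y[k].real / T   # vector add of overlapping sections
    return out[Tw:Tw + len(x) + Lp - 1]


if __name__ == "__main__":
    x = [math.sin(0.3 * k) + 0.1 * k for k in range(100)]
    w = normalized_hamming(32, 8)        # hypothetical sizes, hop = T / 4
    print(max(abs(v - 1.0) for v in overlap_sum(w, 8, len(x))))
    y = modify_and_synthesize(x, w, 8, [0.5, 0.3, 0.2])
    print(len(y))
```

With p = [1.0] the same routine reduces to plain analysis followed by synthesis and should return x(k) unchanged, which is the content of (16). If the appended zeros are omitted, the product of spectra corresponds to a circular rather than a linear convolution within each section, so the agreement with p(k) * x(k) is not expected to hold.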