where $U_1 = A\cos\phi$ and $U_2 = -A\sin\phi$, and $U_1$ and $U_2$ are taken to be normally distributed random variables. The amplitude is $A = \sqrt{U_1^2 + U_2^2}$ and the phase is $\phi = \arctan(-U_2/U_1)$. If we assume that $U_1$ and $U_2$ are uncorrelated random variables with mean 0 and variance $\sigma^2$, then we can also show that
This quantity depends only on the time lag $t-s$, so the process is stationary. We can generalize this signal to allow mixtures of periodic signals with multiple frequencies and amplitudes:
Now let’s say we wanted to estimate the component frequencies of a signal like this, where we didn’t know what the underlying components were. One way to do this is using the periodogram.
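The periodogram starts from the fact that any sample $x_1, \dots, x_n$ can be written exactly as a sum of sinusoids at the frequencies $j/n$:

$$x_t = a_0 + \sum_{j=1}^{\lfloor n/2 \rfloor} \left[ a_j \cos(2\pi t j/n) + b_j \sin(2\pi t j/n) \right]$$

for $t = 1, \dots, n$ and coefficients $a_j$ and $b_j$. Here $\lfloor \cdot \rfloor$ is the greatest integer function (also called the "floor"), which rounds numbers down to the nearest integer. If $n$ is even, the last term has $a_{n/2}\cos(2\pi t \tfrac{1}{2}) = a_{n/2}(-1)^t$, and because $\sin(\pi t) = 0$ for every integer $t$ we set $b_{n/2} = 0$.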
Here the values of $j$ are frequency indices. Each $j$ represents a different frequency component in the decomposition, and $j/n$ is its frequency in cycles per sample. As $j$ goes from 1 to $\lfloor n/2 \rfloor$, we sweep through all distinguishable frequencies, from the slowest oscillation up to the Nyquist frequency. For example, let's say we have $n = 100$ data points:
j=1 means the wave completes exactly 1 full cycle over the n samples, which is the slowest possible oscillation that fits in your data window.
j=2 completes exactly 2 full cycles, j=3 completes 3, and so on.
j=50(=n/2) completes 50 cycles, the fastest oscillation you can resolve, alternating up-down-up-down every sample.
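To make the frequency indices concrete, here is a small NumPy sketch for the n = 100 example. The variable names (`freqs`, `periods`, `nyquist_cos`, `nyquist_sin`) are my own and purely illustrative. It lists the frequency and period for each j and checks that, at j = n/2, the cosine alternates as (−1)^t while the sine vanishes at every sample.

```python
import numpy as np

n = 100                         # number of samples, as in the example above
t = np.arange(1, n + 1)         # time index t = 1, ..., n
j = np.arange(1, n // 2 + 1)    # frequency indices 1, ..., floor(n/2)

freqs = j / n                   # frequency in cycles per sample
periods = n / j                 # samples needed to complete one cycle

print("slowest:", freqs[0], "cycles/sample, period", periods[0], "samples")
print("Nyquist:", freqs[-1], "cycles/sample, period", periods[-1], "samples")

# At j = n/2 the angle is pi * t, so the cosine flips sign every sample
# and the sine is zero (up to floating-point error) at every sample.
nyquist_cos = np.cos(2 * np.pi * t * (n // 2) / n)
nyquist_sin = np.sin(2 * np.pi * t * (n // 2) / n)

print(np.allclose(nyquist_cos, (-1.0) ** t))
print(np.allclose(nyquist_sin, 0.0, atol=1e-10))
```

A period of 2 samples at the Nyquist index is why nothing faster can be distinguished: one sample up, one sample down.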
We can now use regression to get the coefficients:
$$a_j = \frac{2}{n}\sum_{t=1}^{n} x_t \cos(2\pi t j/n) \quad \text{and} \quad b_j = \frac{2}{n}\sum_{t=1}^{n} x_t \sin(2\pi t j/n)$$
Here $a_j$ and $b_j$ represent how much of a particular frequency is present in our signal, with $a_j$ and $b_j$ together controlling the amplitude and phase at frequency $j$. They are free parameters that are set independently, but together they determine $A$ and $\phi$.
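The coefficient formulas can be sketched in a few lines of NumPy. The helper name `fourier_coefficients` and the test values (n = 100, j = 5, A = 2, ϕ = 0.7) are hypothetical choices for illustration. The sketch builds a single sinusoid with known amplitude and phase, computes a_j and b_j from the sums, recovers A and ϕ from them, and cross-checks the result against an ordinary least-squares fit.

```python
import numpy as np

def fourier_coefficients(x, j):
    """Return (a_j, b_j) from the (2/n)-scaled sums; assumes 0 < j < n/2."""
    n = len(x)
    t = np.arange(1, n + 1)
    angle = 2 * np.pi * t * j / n
    a_j = (2 / n) * np.sum(x * np.cos(angle))
    b_j = (2 / n) * np.sum(x * np.sin(angle))
    return a_j, b_j

# Hypothetical test signal: one sinusoid at frequency index j.
n, j = 100, 5
A, phi = 2.0, 0.7
t = np.arange(1, n + 1)
x = A * np.cos(2 * np.pi * t * j / n + phi)

a_j, b_j = fourier_coefficients(x, j)

# a_j plays the role of A*cos(phi) and b_j the role of -A*sin(phi).
A_hat = np.hypot(a_j, b_j)
phi_hat = np.arctan2(-b_j, a_j)
print(A_hat, phi_hat)

# Cross-check: regress x on the cosine and sine at this frequency.
Z = np.column_stack([np.cos(2 * np.pi * t * j / n),
                     np.sin(2 * np.pi * t * j / n)])
coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
print(np.allclose(coef, [a_j, b_j]))
```

Using `arctan2` rather than a plain arctangent keeps the recovered phase in the correct quadrant when a_j is negative.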
From this, we can then define the scaled periodogram:

$$P(j/n) = a_j^2 + b_j^2$$

for $j/n \neq 0, 1/2$. The scaled periodogram is the sample variance at each frequency component and is an estimate of $\sigma_j^2$ corresponding to a sinusoid at frequency $f_j = j/n$. These frequencies are called the Fourier frequencies. Large values of $P(j/n)$ indicate which frequencies dominate the series, while small values may represent noise.
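As a rough illustration of how the scaled periodogram picks out dominant frequencies, the sketch below assumes the usual definition P(j/n) = a_j² + b_j² and evaluates it at every Fourier frequency strictly between 0 and 1/2. The function name `scaled_periodogram`, the two component frequencies (6/100 and 10/100), their coefficients, and the unit-variance noise are all made-up values for the demonstration.

```python
import numpy as np

def scaled_periodogram(x):
    """Return (frequencies j/n, a_j^2 + b_j^2) for 0 < j/n < 1/2."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    t = np.arange(1, n + 1)
    js = np.arange(1, (n - 1) // 2 + 1)       # skips j/n = 0 and j/n = 1/2
    angles = 2 * np.pi * np.outer(js, t) / n  # one row per frequency index
    a = (2 / n) * (np.cos(angles) @ x)
    b = (2 / n) * (np.sin(angles) @ x)
    return js / n, a**2 + b**2

# Hypothetical mixture: two sinusoids plus Gaussian noise.
rng = np.random.default_rng(0)
n = 100
t = np.arange(1, n + 1)
signal = (2 * np.cos(2 * np.pi * t * 6 / n) + 3 * np.sin(2 * np.pi * t * 6 / n)
          + 4 * np.cos(2 * np.pi * t * 10 / n) + 5 * np.sin(2 * np.pi * t * 10 / n))
x = signal + rng.normal(0.0, 1.0, size=n)

freqs, P = scaled_periodogram(x)

# The two largest values should sit at the two component frequencies.
top = np.argsort(P)[::-1][:2]
print("dominant frequencies:", freqs[top])
print("periodogram values:  ", P[top])
```

Without noise, the values at the two component frequencies would be exactly 2² + 3² = 13 and 4² + 5² = 41, and every other frequency would be zero up to rounding. With noise, the peaks shift slightly and the remaining frequencies pick up small positive values.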
Next time we will relate this to the Discrete Fourier Transform (DFT) of a signal.