
Here you will practice nonlinear sinusoidal regression on some fake data that we will generate. Go through each of the cells and fill in the parts that say `#FILLIN`.

import numpy as np
import math
from matplotlib import pyplot as plt
import statsmodels.api as sm

Creating a sinusoid

We will first make a sinusoid as we discussed this week in class. This is defined as

$$y = \beta_0 + R \cos(2 \pi f t + \phi) + \epsilon$$

Where $\epsilon \overset{iid}{\sim} N(0, \sigma^2)$. We will also specify a sampling rate `fs`, a duration of our signal in seconds, and the true parameters $(\beta_0, f, R, \phi, \sigma)$.

Recall that:

  • $\beta_0$ is the intercept, which shifts the sinusoid away from mean 0.

  • $f$ is the frequency of the oscillation (how many peaks per second)

  • $R$ is the amplitude

  • $\phi$ is the phase shift

  • $\sigma^2$ is the variance of our white noise

fs = 500 # sampling rate 
duration = 2
t = np.arange(0,duration,step=1/fs)
B0 = 2
phi = 0
f = 3.2
R =  2.5
var_eps = 0.2 # Variance of white noise

# Our true sinusoid
y = B0 + R*np.cos(2*math.pi*f*t + phi) + np.sqrt(var_eps)*np.random.randn(len(t))

# Let's plot it and our x and y origin lines
plt.plot(t, y)
plt.axhline(0, color='k', linewidth=0.5)
plt.axvline(0, color='k', linewidth=0.5)
plt.xlabel('Time')
plt.ylabel('value')

Finding the frequency, $f$

Let’s assume we have received these data and have no idea what the true $f$ is. We can find it with a grid search across a number of candidate $\hat{f}$ values, doing OLS at each one using the relationship we discussed, which is that we can rewrite $y$ as:

$$y = \beta_0 + \beta_1 \cos(2\pi \hat{f} t) + \beta_2 \sin(2\pi \hat{f} t) + \epsilon_t$$

Where $\beta_1 = R\cos\phi$ and $\beta_2 = -R\sin\phi$. We will still solve for these $\beta$ values with OLS, since we also don’t know what $\phi$ or $R$ are.
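The identity behind this rewrite is $R\cos(\theta + \phi) = R\cos\phi\cos\theta - R\sin\phi\sin\theta$. A quick numerical sanity check of the $\beta_1, \beta_2$ relationships, using arbitrary example values for $R$, $\phi$, and $f$ (not the ones used below):

```python
import numpy as np

R, phi, f = 2.5, 0.7, 3.2        # arbitrary example values
t = np.linspace(0, 1, 100)

# Original parameterization
lhs = R * np.cos(2 * np.pi * f * t + phi)

# Linear-in-beta parameterization
b1 = R * np.cos(phi)             # beta_1 = R cos(phi)
b2 = -R * np.sin(phi)            # beta_2 = -R sin(phi)
rhs = b1 * np.cos(2 * np.pi * f * t) + b2 * np.sin(2 * np.pi * f * t)

print(np.allclose(lhs, rhs))     # True
```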

We will start with a linearly spaced vector of candidate values for $f$. Looking at the graph above, we can count the peaks and narrow the search down to around 3 peaks per second, so we don’t actually have to test everything, though if we wanted to be exhaustive, we would test frequencies in $[0, \text{fs}/2]$ (up to the Nyquist frequency).

We then fit the regression model using our $X$ matrix, defined as:

$$X_f = \begin{pmatrix} 1 & \cos(2\pi \hat{f} t_0) & \sin(2\pi \hat{f} t_0) \\ \vdots & \vdots & \vdots \\ 1 & \cos(2\pi \hat{f} t_n) & \sin(2\pi \hat{f} t_n) \end{pmatrix}$$

and repeat this for every value of $\hat{f}$, then plot the residual sum of squares (RSS). We are looking for the minimum of this curve. Testing more values of $f$ in the grid makes the estimate more accurate, so try varying the number of grid points `nf`.

nf = 100  # Test this number of frequencies 
f_grid = np.linspace(0,fs/2,nf)

RSS = np.zeros(len(f_grid))
for idx, f_hat in enumerate(f_grid):
    cos_col = np.cos(2 * np.pi * f_hat * t )
    sin_col = np.sin(2 * np.pi * f_hat * t )
    
    # create our X matrix
    X = np.column_stack([np.ones(len(t)), cos_col, sin_col])

    model = sm.OLS(y, X).fit()
    betas = model.params
    RSS[idx] = np.sum(model.resid **2)

plt.plot(f_grid, RSS,'-')
best_f = f_grid[RSS.argmin()] # Find the best f_hat
plt.axvline(best_f, color='r', linewidth=0.5, linestyle='--')
plt.text(best_f + 1, np.min(RSS), f'f={best_f:.3f}')
plt.xlabel('Frequency')
plt.ylabel('RSS')

Use our best f to calculate the other coefficients

From this we can choose the best $f$, fit OLS again, and look at the parameters and performance of the model.

Again:

$$X_f = \begin{pmatrix} 1 & \cos(2\pi \hat{f} t_0) & \sin(2\pi \hat{f} t_0) \\ \vdots & \vdots & \vdots \\ 1 & \cos(2\pi \hat{f} t_n) & \sin(2\pi \hat{f} t_n) \end{pmatrix}$$

but with only the best $\hat{f}$:

cos_col = np.cos(2 * np.pi * best_f * t )
sin_col = np.sin(2 * np.pi * best_f * t )
X = np.column_stack([np.ones(len(t)), cos_col, sin_col])

model = sm.OLS(y, X).fit()
betas = model.params
y_hat = model.fittedvalues

plt.figure()
plt.plot(y, label='y')
plt.plot(y_hat, label='yhat')
plt.legend()

plt.figure()
plt.plot(y, y_hat,'.')
plt.xlabel('y')
plt.ylabel('yhat')

Estimating the original parameters of the sinusoid

Recall we can get back the original parameters through their relationship to the $\beta$ values:

$$R = \sqrt{\beta_1^2 + \beta_2^2}, \quad \phi = \arctan\left(-\frac{\beta_2}{\beta_1}\right)$$

Let’s see what we get.

B0_hat = betas[0]
R_hat = np.sqrt(betas[1]**2 + betas[2]**2)
phi_hat = np.arctan(-betas[2]/betas[1])

print(f'B0={B0}\t B0_hat={B0_hat}')
print(f'R={R}\t R_hat={R_hat}')
print(f'phi={phi}\t phi_hat={phi_hat}')
print(f'f={f}\t f_hat={best_f}')
B0=2	 B0_hat=2.029404412553876
R=2.5	 R_hat=0.4737475535789633
phi=0	 phi_hat=1.0137021097636343
f=3.2	 f_hat=2.525252525252525
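One reason the estimate is off is simply grid resolution: with `nf = 100` points spanning $[0, \text{fs}/2] = [0, 250]$, neighboring candidate frequencies are about 2.5 Hz apart, so the grid cannot get any closer to the true $f = 3.2$ than the $\hat{f} \approx 2.525$ found above. A quick check of the spacing:

```python
import numpy as np

fs, nf = 500, 100                  # the values used in the grid search above
f_grid = np.linspace(0, fs / 2, nf)

spacing = f_grid[1] - f_grid[0]    # distance between neighboring candidates
nearest = f_grid[np.argmin(np.abs(f_grid - 3.2))]  # closest grid point to f = 3.2
print(spacing, nearest)            # ~2.525 spacing; nearest grid point is 2.525...
```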

How does this look? It’s okay, but not quite right. Try redoing the grid search below in a way that you think will make the estimate more accurate.

nf = 10000  # Test this number of frequencies 
f_grid = np.linspace(0,fs/2,nf)
# or
nf = 1000
f_grid = np.linspace(2,4,nf)

RSS = np.zeros(len(f_grid))
for idx, f_hat in enumerate(f_grid):
    cos_col = np.cos(2 * np.pi * f_hat * t )
    sin_col = np.sin(2 * np.pi * f_hat * t )
    
    # create our X matrix
    X = np.column_stack([np.ones(len(t)), cos_col, sin_col])

    model = sm.OLS(y, X).fit()
    betas = model.params
    RSS[idx] = np.sum(model.resid **2)

plt.plot(f_grid, RSS,'-')
best_f = f_grid[RSS.argmin()] # Find the best f_hat
plt.axvline(best_f, color='r', linewidth=0.5, linestyle='--')
plt.text(best_f + 1, np.min(RSS), f'f={best_f:.3f}')
plt.xlabel('Frequency')
plt.ylabel('RSS')

cos_col = np.cos(2 * np.pi * best_f * t )
sin_col = np.sin(2 * np.pi * best_f * t )
X = np.column_stack([np.ones(len(t)), cos_col, sin_col])

model = sm.OLS(y, X).fit()
betas = model.params
y_hat = model.fittedvalues

plt.figure()
plt.plot(y, label='y')
plt.plot(y_hat, label='yhat')
plt.legend()

Compare $y$ and $\hat{y}$ directly

We can also just plot our actual data against the estimate. If our linear model were perfect, we would see the unity line $\hat{y} = y$.

plt.figure()
plt.plot(y, y_hat,'.')
plt.plot(y,y,'-')
plt.xlabel('y')
plt.ylabel('yhat');

Estimating the original parameters of the sinusoid

Let’s get back the original parameters again with this (hopefully) improved regression:

$$R = \sqrt{\beta_1^2 + \beta_2^2}, \quad \phi = \arctan\left(-\frac{\beta_2}{\beta_1}\right)$$

Let’s see what we get.

B0_hat = betas[0]
R_hat = np.sqrt(betas[1]**2 + betas[2]**2)
phi_hat = np.arctan(-betas[2]/betas[1])

print(f'B0={B0}\t B0_hat={B0_hat}')
print(f'R={R}\t R_hat={R_hat}')
print(f'phi={phi}\t phi_hat={phi_hat}')
print(f'f={f}\t f_hat={best_f}')
B0=2	 B0_hat=1.9930655070550236
R=2.5	 R_hat=2.4680140137480113
phi=0	 phi_hat=-0.006247204538511174
f=3.2	 f_hat=3.201201201201201

How does this look now?
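A caveat on the phase formula: `np.arctan` only returns angles in $(-\pi/2, \pi/2)$, so when $\beta_1 < 0$ (i.e. the true $\phi$ lies outside that range) the recovered phase will be off by $\pi$. With our $\phi = 0$ this doesn’t bite, but a more robust alternative under the same $\beta_1 = R\cos\phi$, $\beta_2 = -R\sin\phi$ relationships is `np.arctan2`, which uses the signs of both arguments. A small sketch with hypothetical $\beta$ values:

```python
import numpy as np

# Hypothetical betas for a sinusoid whose phase is outside (-pi/2, pi/2)
R_true, phi_true = 2.5, 2.0
b1 = R_true * np.cos(phi_true)     # negative here, since phi_true > pi/2
b2 = -R_true * np.sin(phi_true)

phi_arctan = np.arctan(-b2 / b1)   # lands in the wrong quadrant (off by pi)
phi_arctan2 = np.arctan2(-b2, b1)  # recovers phi_true
print(phi_arctan, phi_arctan2)
```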

What about new data?

We have fit our data using the sinusoid pretty successfully here. What if we tried to use this as a predictive model to forecast new data? Let’s generate some new data in the future from the same underlying sinusoid, then use the coefficients we fit above on our “training set” to predict the new “test set” data.

# Generate some new data at new time points in the future, t2
duration2 = 1
t2 = np.arange(t[-1],t[-1]+duration2,step=1/fs)

y_test = B0 + R*np.cos(2*math.pi*f*t2 + phi) + np.sqrt(var_eps)*np.random.randn(len(t2))

plt.plot(t, y, label='original training')
plt.plot(t2, y_test, label='new test data')
plt.legend()
plt.xlabel('t')
plt.ylabel('value');
# Make a new X matrix with the original best_f we estimated from the training data, and 
# use t2 as our new time observations, then calculate our new estimate of y
cos_col = np.cos(2 * np.pi * best_f * t2 )
sin_col = np.sin(2 * np.pi * best_f * t2 )
X = np.column_stack([np.ones(len(t2)), cos_col, sin_col])

y_hat_test_orig = np.dot(X, model.params) 

plt.plot(t, y)
plt.plot(t, model.fittedvalues)

plt.plot(t2, y_test)
plt.plot(t2, y_hat_test_orig)

train_RMSE = np.sqrt(np.mean((model.resid)**2))
test_RMSE = np.sqrt(np.mean((y_test - y_hat_test_orig)**2))
print(f'root mean squared error on training data: {train_RMSE}')
print(f'root mean squared error on test data: {test_RMSE}')

plt.figure()
plt.plot(t2, y_test - y_hat_test_orig)
plt.xlabel('time')
plt.ylabel('Residual')
plt.title('Residuals')
root mean squared error on training data: 0.44177693508270116
root mean squared error on test data: 0.4524990060606751

How does this look?

More noise

Now try re-running the analysis above using a higher noise level (i.e. make `var_eps`, the error variance, much larger). What do you notice? Fill in your edited code below, and first try a grid search over all frequencies from 0 to the Nyquist limit.

When do things break down?

fs = 500 # sampling rate 
duration = 2
t = np.arange(0,duration,step=1/fs)
B0 = 2
phi = 0
f = 3.2
R =  2.5
var_eps = 1000 # Variance of white noise

# Our true sinusoid
y = B0 + R*np.cos(2*math.pi*f*t + phi) + np.sqrt(var_eps)*np.random.randn(len(t))

# Let's plot it and our x and y origin lines
plt.plot(t, y)
plt.axhline(0, color='k', linewidth=0.5)
plt.axvline(0, color='k', linewidth=0.5)
plt.xlabel('Time')
plt.ylabel('value')
nf = 1000  # Test this number of frequencies 
f_grid = np.linspace(0,fs/2,nf)
# or
#nf = 1000
#f_grid = np.linspace(2,4,nf)

RSS = np.zeros(len(f_grid))
for idx, f_hat in enumerate(f_grid):
    cos_col = np.cos(2 * np.pi * f_hat * t )
    sin_col = np.sin(2 * np.pi * f_hat * t )
    
    # create our X matrix
    X = np.column_stack([np.ones(len(t)), cos_col, sin_col])

    model = sm.OLS(y, X).fit()
    betas = model.params
    RSS[idx] = np.sum(model.resid **2)

plt.plot(f_grid, RSS,'-')
best_f = f_grid[RSS.argmin()] # Find the best f_hat
plt.axvline(best_f, color='r', linewidth=0.5, linestyle='--')
plt.text(best_f + 1, np.min(RSS), f'f={best_f:.3f}')
plt.xlabel('Frequency')
plt.ylabel('RSS')

cos_col = np.cos(2 * np.pi * best_f * t )
sin_col = np.sin(2 * np.pi * best_f * t )
X = np.column_stack([np.ones(len(t)), cos_col, sin_col])

model = sm.OLS(y, X).fit()
betas = model.params
y_hat = model.fittedvalues

plt.figure()
plt.plot(y, label='y')
plt.plot(y_hat, label='yhat')
plt.legend()
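One way to probe where things break down is to re-run the whole pipeline over several noise levels and watch when the estimated frequency stops landing near the true $f = 3.2$. A rough sketch, using `np.linalg.lstsq` as a lightweight stand-in for `sm.OLS`, an arbitrary set of `var_eps` values, and a fixed seed so runs are comparable:

```python
import numpy as np

fs, duration = 500, 2
t = np.arange(0, duration, step=1 / fs)
B0, phi, f, R = 2, 0, 3.2, 2.5
f_grid = np.linspace(0, fs / 2, 1000)

rng = np.random.default_rng(0)     # fixed seed so runs are comparable
best_fs = []
for var_eps in [0.2, 10, 100, 1000]:   # arbitrary noise levels to try
    y = (B0 + R * np.cos(2 * np.pi * f * t + phi)
         + np.sqrt(var_eps) * rng.standard_normal(len(t)))
    RSS = np.zeros(len(f_grid))
    for idx, f_hat in enumerate(f_grid):
        X = np.column_stack([np.ones(len(t)),
                             np.cos(2 * np.pi * f_hat * t),
                             np.sin(2 * np.pi * f_hat * t)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        RSS[idx] = np.sum((y - X @ beta) ** 2)   # residual sum of squares
    best_fs.append(f_grid[RSS.argmin()])
    print(f'var_eps={var_eps}\t best_f={best_fs[-1]:.2f}')
```

At low noise the RSS minimum should sit near 3.2; as `var_eps` grows, the RSS curve flattens and the minimum can jump to an unrelated frequency.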