Parametric model \(\mathcal{F} = \{f(x;\theta); \theta\in\Theta\}\)
An independent sample \(X_1,...,X_n\)
Likelihood: the probability (or density) of observing the sample, viewed as a function of the parameter
\[\mathcal{L}(\theta) = \prod_{i=1}^n f(X_i;\theta)\]
mu = 1
sigma = 2
n = 100
x = rnorm(n, mu, sigma) # sample
hist(x, freq=FALSE) # show histogram
for(m in -1:3){
# compare with Gaussians with different means
curve(dnorm(x, m, sigma), add=TRUE, col=m+3)
}
m = seq(-1, 3, 0.1) # grid for mu
s = seq(1, 3, 0.1) # grid for sigma
L = matrix(0, length(m), length(s)) # likelihood on the grid
for(i in 1:length(s)){
for(j in 1:length(m)){
L[j,i] = prod(dnorm(x, m[j], s[i]))
}
}
persp(m, s, L, ticktype="detailed", theta=30)
\[\ell(\theta) = \log \mathcal{L}(\theta) = \sum_{i=1}^n \log f(X_i;\theta)\]
# 3D plot
persp(m, s, log(L), ticktype="detailed", theta=30, phi=30)
The log-likelihood surface varies far less steeply than the likelihood surface (which spans many orders of magnitude), which makes it more convenient for parameter fitting.
# contour plot
contour(m, s, log(L), xlab="m", ylab="s")
The maximum likelihood estimator (MLE) \(\hat{\theta}\) is the parameter value that maximizes \(\mathcal{L}(\theta)\), or equivalently \(\ell(\theta)\).
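Before deriving MLEs analytically, we can locate the maximum of the likelihood surface computed above on the grid; a minimal sketch, assuming m, s, and L from the previous chunk are still in the workspace:
ij = which(L == max(L), arr.ind=TRUE) # row/column index of the largest likelihood
c(mhat=m[ij[1]], shat=s[ij[2]]) # should be near the true mu=1, sigma=2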
Example: Bernoulli distribution \[f(x;p) = p^x(1-p)^{1-x}, \quad x \in \{0,1\}\]
For a sample \(x_1,...,x_n\), let \(S = \sum_i x_i\), i.e. the number of 1s; then \(\ell(p) = S\log p + (n-S)\log(1-p)\)
Setting \(\frac{d\ell(p)}{dp} = \frac{S}{p} - \frac{n-S}{1-p} = 0\)
gives the MLE \(\hat{p} = \frac{S}{n}\) … sample mean
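A quick numerical check of this result (a sketch; the sample size and the value p=0.3 below are illustrative):
n = 100
x = rbinom(n, 1, 0.3) # Bernoulli sample with p=0.3
loglik = function(q) sum(dbinom(x, 1, q, log=TRUE))
optimize(loglik, c(0, 1), maximum=TRUE)$maximum # numerical MLE
mean(x) # closed form S/n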
Example: normal distribution \[f(x;\mu,\sigma^2)=\frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\]
let \(M=\frac{1}{n}\sum_i x_i\) and \(S=\frac{1}{n}\sum_i x_i^2\)
Likelihood:
\[\mathcal{L}(\mu,\sigma^2) = \prod_i \frac{1}{\sqrt{2\pi}\sigma} e^{-\frac{(x_i-\mu)^2}{2\sigma^2}} = (2\pi)^{-\frac{n}{2}}\sigma^{-n} e^{-\sum_i\frac{(x_i-\mu)^2}{2\sigma^2}}\\ = (2\pi)^{-\frac{n}{2}}\sigma^{-n} e^{\frac{n(-S+2M\mu-\mu^2)}{2\sigma^2}}\]
\[\ell(\mu,\sigma^2) = -\frac{n}{2}\log(2\pi) - n\log\sigma + n\frac{-S+2M\mu-\mu^2}{2\sigma^2}\]
Setting \(\frac{\partial\ell}{\partial\mu} = n\frac{2M-2\mu}{2\sigma^2} = 0\),
i.e. \(\hat{\mu}=M\) … sample mean
Setting \(\frac{\partial\ell}{\partial\sigma} = -\frac{n}{\sigma} + n\frac{S-M^2}{\sigma^3} = 0\) (after substituting \(\mu = \hat{\mu} = M\)),
i.e. \(\hat{\sigma}^2 = S-M^2 = \frac{1}{n}\sum_i (x_i-M)^2\) … sample variance
x = rnorm(10, 1, 1) # sample from N(1, 1)
M = mean(x)
S = mean(x^2)
c(M, S-M^2) # MLEs of the mean and the variance
[1] 1.1234711 0.8463601
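Note that the MLE of the variance divides by \(n\), while R's var() divides by \(n-1\); the two agree up to the factor \((n-1)/n\):
c(S - M^2, var(x)*(length(x)-1)/length(x)) # equal up to rounding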
Pairs of data \((X_1,Y_1),...,(X_n,Y_n)\) sampled from \((X,Y)\)
* \(X\): independent var., predictor var., covariate, regressor
* \(Y\): dependent var., outcome, response var.
Generative model: \[Y = r(X) + \epsilon \quad\mbox{ where } E(\epsilon)=0\]
Linear model: \[Y = \beta_0 + \beta_1X + \epsilon\]
\(E(\epsilon)=0\) and \(V(\epsilon)=\sigma^2\)
Least squares minimizes the residual sum of squares: \[\mbox{RSS} = \sum_i\epsilon_i^2 = \sum_i\{y_i - (\beta_0 + \beta_1x_i)\}^2\]
Let \(\bar{X}=\frac{1}{n}\sum_ix_i\) and \(\bar{Y}=\frac{1}{n}\sum_iy_i\)
Let \(\frac{\partial\mbox{RSS}}{\partial \beta_0}=0\)
then \(\frac{1}{n}\sum_i (y_i-\beta_0-\beta_1x_i) = \bar{Y}-\beta_0-\beta_1\bar{X}=0\)
thus \(\beta_0=\bar{Y}-\beta_1\bar{X}\)
Let \(\frac{\partial\mbox{RSS}}{\partial \beta_1}=0\)
then \(\sum_i \{(y_i-\bar{Y})-\beta_1(x_i-\bar{X})\}(x_i-\bar{X}) = 0\), i.e. \(\sum_i(x_i-\bar{X})(y_i-\bar{Y}) - \beta_1\sum_i(x_i-\bar{X})^2 = 0\)
\[\hat{\beta}_1 = \frac{\sum_i(x_i-\bar{X})(y_i-\bar{Y})}{\sum_i(x_i-\bar{X})^2} = \frac{Cov(X,Y)}{Var(X)}\]
\[\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{X}\]
n = 1000
b0 = 5 # intercept
b1 = 2 # slope
eps = 1 # noise s.d.
x = runif(n, 0, 10)
y = b0 + b1*x + rnorm(n, 0, eps)
plot(x, y) # plot sample points
xbar = mean(x)
ybar = mean(y)
b1hat = sum((x-xbar)*(y-ybar))/sum((x-xbar)^2) # slope estimate
b0hat = ybar - b1hat*xbar # intercept estimate
c(b0hat, b1hat)
[1] 4.979394 1.994324
curve(b0hat+b1hat*x, add=TRUE)
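As a cross-check, R's built-in lm() (introduced in detail below) gives the same least-squares estimates:
coefficients(lm(y ~ x)) # should match c(b0hat, b1hat)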
Assumption: \(\epsilon \sim \mathcal{N}(0,\sigma^2)\)
Likelihood:
\[\mathcal{L}(\beta_0,\beta_1,\sigma) = \prod_i f(y_i;\beta_0+\beta_1x_i,\sigma^2)\\ = (2\pi)^{-\frac{n}{2}}\sigma^{-n} e^{-\sum_i\frac{(y_i - (\beta_0 + \beta_1x_i))^2}{2\sigma^2}}\]
\[\ell(\beta_0,\beta_1,\sigma) = -\frac{n}{2}\log(2\pi) - n\log\sigma -\frac{1}{2\sigma^2}\sum_i\{y_i-(\beta_0+\beta_1x_i)\}^2\]
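Maximizing over \(\beta_0,\beta_1\) is again equivalent to minimizing the RSS, and setting \(\partial\ell/\partial\sigma = 0\) gives \(\hat{\sigma}^2 = \mbox{RSS}/n\). A quick check with the estimates from the chunk above:
sigma2hat = mean((y - (b0hat + b1hat*x))^2) # MLE of the noise variance
sigma2hat # should be close to eps^2 = 1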
The fit is summarized by the decomposition \(\mbox{TSS} = \mbox{ESS} + \mbox{RSS}\) and the coefficient of determination \(R^2 = \mbox{ESS}/\mbox{TSS}\):
tss = sum((y-ybar)^2) # total sum of squares
ess = sum(((b0hat+b1hat*x)-ybar)^2) # explained sum of squares
rss = sum((y-(b0hat+b1hat*x))^2) # residual sum of squares
r2 = ess/tss
c(tss, ess, rss, r2)
[1] 3.362383e+04 3.254664e+04 1.077196e+03 9.679633e-01
A data frame is a list of vectors or matrices that have the same length or number of rows; it is implemented as the data.frame class.
df = data.frame(id=1:3, value=diag(1, 3), note=c("low","hi","low"))
df
id value.1 value.2 value.3 note
1 1 1 0 0 low
2 2 0 1 0 hi
3 3 0 0 1 low
Components can be accessed by [ ] or $:
df$id
[1] 1 2 3
df$value.1
[1] 1 0 0
df$note
[1] low hi low
Levels: hi low
R comes with many datasets convenient for testing.
data()
summary(ChickWeight)
weight Time Chick Diet
Min. : 35.0 Min. : 0.00 13 : 12 1:220
1st Qu.: 63.0 1st Qu.: 4.00 9 : 12 2:120
Median :103.0 Median :10.00 20 : 12 3:120
Mean :121.8 Mean :10.72 10 : 12 4:118
3rd Qu.:163.8 3rd Qu.:16.00 17 : 12
Max. :373.0 Max. :21.00 19 : 12
(Other):506
When you attach() a data frame, its components are copied to your workspace, unless a variable of the same name has already been defined.
attach(ChickWeight)
plot(Time, weight, col=Diet)
plot(Time[Diet==3], weight[Diet==3])
Try linear regression of weight by time.
x = Time
y = weight
plot(x, y) # plot sample points
xbar = mean(x)
ybar = mean(y)
b1hat = sum((x-xbar)*(y-ybar))/sum((x-xbar)^2)
b0hat = ybar - b1hat*xbar
curve(b0hat+b1hat*x, add=TRUE)
title(sprintf("beta0=%5.2f, beta1=%5.2f", b0hat, b1hat))
tss = sum((y-ybar)^2)
ess = sum(((b0hat+b1hat*x)-ybar)^2)
rss = sum((y-(b0hat+b1hat*x))^2)
r2 = ess/tss
paste("tss=", tss, "ess=", ess, "rss=", rss)
[1] "tss= 2914555.92560554 ess= 2042343.74905088 rss= 872212.176554658"
paste("R^2=", r2, "Cor^2=", cor(x,y)^2)
[1] "R^2= 0.700739255372688 Cor^2= 0.700739255372688"
\(k\)-dimensional input \(X = (X_1,...,X_k)\)
\[Y = \beta_0 + \beta_1X_1 + ... + \beta_kX_k + \epsilon\]
In matrix form, \[Y = X \beta + \epsilon\]
where the design matrix contains a leading column of ones for the intercept:
\[X = \left(\begin{array}{cccc} 1 & X_{11} & \cdots & X_{1k} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n1} & \cdots & X_{nk} \end{array}\right), \quad \beta = \left(\begin{array}{c} \beta_0 \\ \vdots \\ \beta_k \end{array}\right)\]
\[Y =\left(\begin{array}{c} Y_{1} \\ \vdots \\ Y_{n} \end{array}\right)\]
\[\mbox{RSS} = \sum_i\epsilon_i^2 = \sum_i (Y_i - X_i \beta)^2 = ||Y - X\beta||^2\]
RSS is minimized by
\[\hat{\beta} = (X^TX)^{-1} X^TY\]
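A minimal sketch of this formula on simulated data (the names and the true coefficients below are illustrative); solve() applied to the normal equations recovers the coefficients:
n = 200
X = cbind(1, runif(n), runif(n)) # design matrix with an intercept column
beta = c(5, 2, -3) # true coefficients
Y = X %*% beta + rnorm(n, 0, 0.5)
betahat = solve(t(X) %*% X, t(X) %*% Y) # (X^T X)^{-1} X^T Y via the normal equations
t(betahat) # should be close to (5, 2, -3)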
pairs(state.x77)
attach(as.data.frame(state.x77))
plot(Area, Population)
R offers lm() for linear regression.
See how Life Expectancy varies with Income.
summary() gives a variety of information about the fitted model.
LifeIncome = lm(`Life Exp`~Income)
summary(LifeIncome)
Call:
lm(formula = `Life Exp` ~ Income)
Residuals:
Min 1Q Median 3Q Max
-2.96547 -0.76381 -0.03428 0.92876 2.32951
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.758e+01 1.328e+00 50.906 <2e-16 ***
Income 7.433e-04 2.965e-04 2.507 0.0156 *
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.275 on 48 degrees of freedom
Multiple R-squared: 0.1158, Adjusted R-squared: 0.09735
F-statistic: 6.285 on 1 and 48 DF, p-value: 0.01562
Store the coefficients in a vector.
beta = coefficients(LifeIncome)
beta
(Intercept) Income
6.758132e+01 7.433343e-04
We can predict values for new inputs with predict():
income = seq(0, 10000, 500)
lifetime = predict(LifeIncome, data.frame(Income=income))
plot(income, lifetime)
abline() adds the fitted line to a plot.
plot(Income, `Life Exp`)
abline(LifeIncome)
Try linear regression between other variables…
LifeMurder = lm(`Life Exp`~Murder)
summary(LifeMurder)
Call:
lm(formula = `Life Exp` ~ Murder)
Residuals:
Min 1Q Median 3Q Max
-1.81690 -0.48139 0.09591 0.39769 2.38691
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 72.97356 0.26997 270.30 < 2e-16 ***
Murder -0.28395 0.03279 -8.66 2.26e-11 ***
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8473 on 48 degrees of freedom
Multiple R-squared: 0.6097, Adjusted R-squared: 0.6016
F-statistic: 74.99 on 1 and 48 DF, p-value: 2.26e-11
lm() can also be used for multiple regression.
LifeIM = lm(`Life Exp` ~ Income + Murder)
summary(LifeIM)
Call:
lm(formula = `Life Exp` ~ Income + Murder)
Residuals:
Min 1Q Median 3Q Max
-1.48301 -0.62099 -0.01714 0.47768 2.20831
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 71.2255815 0.9673952 73.626 < 2e-16 ***
Income 0.0003705 0.0001973 1.878 0.0666 .
Murder -0.2697594 0.0328408 -8.214 1.22e-10 ***
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.8259 on 47 degrees of freedom
Multiple R-squared: 0.637, Adjusted R-squared: 0.6215
F-statistic: 41.23 on 2 and 47 DF, p-value: 4.561e-11
beta2 = coefficients(LifeIM)
Please try regression with other variables or other data sets.
Let us use the “rgl” package for interactive viewing.
Use the Tools > Install Packages… menu to install it. You may also need the X Window System installed (XQuartz, etc.).
library(rgl)
plot3d(Income, Murder, `Life Exp`)
You can draw a surface with persp3d().
plot3d(Income, Murder, `Life Exp`)
xr = range(Income)
xc = cbind(xr,xr) # values at 4 corners
yr = range(Murder)
yc = rbind(yr,yr) # values at 4 corners
z = beta2[1] + beta2[2]*xc + beta2[3]*yc # regression surface
persp3d(xr, yr, z, col='green', add=TRUE)
X = seq(-5,5,0.1)
X2 = X**2
X3 = X**3
X4 = X**4
X5 = X**5
plot(X, X, type="l")
lines(X, X2, type="l")
lines(X, X3, type="l")
lines(X, X4, type="l")
lines(X, X5, type="l")
Y = sin(X)
plot(X, Y, type="l")
n = 50
x = runif(n, -5, 5)
sigma = 0.1 # noise s.d.
y = sin(x) + rnorm(n, 0, sigma)
plot(x, y, type="p", col="green")
curve(sin(x), add=TRUE)
x2 = x**2
x3 = x**3
x4 = x**4
x5 = x**5
poly3 = lm(y~x+x2+x3)
poly5 = lm(y~x+x2+x3+x4+x5)
coefficients(poly3)
(Intercept) x x2 x3
0.20649506 0.44711899 -0.02415746 -0.03556397
coefficients(poly5)
(Intercept) x x2 x3
0.1042882031 0.8719270771 -0.0207464605 -0.1154905502
x4 x5
0.0008189321 0.0030131912
Predict the outputs on the grid X:
Y3 = predict(poly3,data.frame(x=X,x2=X2,x3=X3))
Y5 = predict(poly5,data.frame(x=X,x2=X2,x3=X3,x4=X4,x5=X5))
plot(x, y, col="green")
lines(X, Y, type="l",col="blue")
lines(X, Y3,col="orange")
lines(X, Y5,col="red")
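We can quantify how far each fit deviates from the true function on the grid (Y = sin(X) from above):
c(rmse3 = sqrt(mean((Y3 - Y)^2)), rmse5 = sqrt(mean((Y5 - Y)^2)))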
As we increase the number of regressors, even a noisy outcome can be fit precisely.
This overfitting can make the regression function deviate from the underlying function and increase the error for new data not used in fitting.
ns = 20 # sample points
xs = runif(ns, -4, 4)
sigma = 0.2 # sampling noise
target = sin # target function, e.g. sin()
ys = target(xs) + rnorm(ns, 0, sigma)
plot(xs, ys, type="p", col="green")
curve(target(x), add=TRUE, lwd=2, col="red")
# polynomial regression
np = 50
xp = seq(-4, 4, length.out=np)
for(k in 3:11){
x = xs # samples for fitting
pm = lm(ys ~ poly(x, k, raw=TRUE))
print( summary(pm))
x = xp # prediction
yp = predict(pm, newdata=data.frame(poly(x, k, raw=TRUE)))
#yp = predict(pm, data.frame(x=xp))
lines(x, yp, type="l",col=gray(k/20))
}
Call:
lm(formula = ys ~ poly(x, k, raw = TRUE))
Residuals:
Min 1Q Median 3Q Max
-0.29131 -0.17449 -0.01542 0.10599 0.65599
Coefficients:
Estimate Std. Error t value
(Intercept) -0.181688 0.091932 -1.976
poly(x, k, raw = TRUE)1 0.701614 0.064937 10.804
poly(x, k, raw = TRUE)2 0.014968 0.011724 1.277
poly(x, k, raw = TRUE)3 -0.065701 0.005964 -11.017
Pr(>|t|)
(Intercept) 0.0656 .
poly(x, k, raw = TRUE)1 9.26e-09 ***
poly(x, k, raw = TRUE)2 0.2199
poly(x, k, raw = TRUE)3 7.02e-09 ***
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.254 on 16 degrees of freedom
Multiple R-squared: 0.8973, Adjusted R-squared: 0.878
F-statistic: 46.59 on 3 and 16 DF, p-value: 3.945e-08
Call:
lm(formula = ys ~ poly(x, k, raw = TRUE))
Residuals:
Min 1Q Median 3Q Max
-0.29654 -0.16850 -0.03514 0.12371 0.63332
Coefficients:
Estimate Std. Error t value
(Intercept) -0.2015699 0.1250754 -1.612
poly(x, k, raw = TRUE)1 0.6984913 0.0681521 10.249
poly(x, k, raw = TRUE)2 0.0265492 0.0490652 0.541
poly(x, k, raw = TRUE)3 -0.0655553 0.0061761 -10.614
poly(x, k, raw = TRUE)4 -0.0008186 0.0033614 -0.244
Pr(>|t|)
(Intercept) 0.128
poly(x, k, raw = TRUE)1 3.61e-08 ***
poly(x, k, raw = TRUE)2 0.596
poly(x, k, raw = TRUE)3 2.27e-08 ***
poly(x, k, raw = TRUE)4 0.811
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2618 on 15 degrees of freedom
Multiple R-squared: 0.8977, Adjusted R-squared: 0.8704
F-statistic: 32.9 on 4 and 15 DF, p-value: 2.903e-07
Call:
lm(formula = ys ~ poly(x, k, raw = TRUE))
Residuals:
Min 1Q Median 3Q Max
-0.2368 -0.1508 0.0139 0.1205 0.3242
Coefficients:
Estimate Std. Error t value
(Intercept) -0.173092 0.097029 -1.784
poly(x, k, raw = TRUE)1 0.958092 0.093992 10.193
poly(x, k, raw = TRUE)2 0.042573 0.038219 1.114
poly(x, k, raw = TRUE)3 -0.143754 0.023932 -6.007
poly(x, k, raw = TRUE)4 -0.002053 0.002624 -0.783
poly(x, k, raw = TRUE)5 0.004475 0.001342 3.335
Pr(>|t|)
(Intercept) 0.09612 .
poly(x, k, raw = TRUE)1 7.36e-08 ***
poly(x, k, raw = TRUE)2 0.28406
poly(x, k, raw = TRUE)3 3.22e-05 ***
poly(x, k, raw = TRUE)4 0.44689
poly(x, k, raw = TRUE)5 0.00491 **
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2023 on 14 degrees of freedom
Multiple R-squared: 0.943, Adjusted R-squared: 0.9226
F-statistic: 46.3 on 5 and 14 DF, p-value: 3.233e-08
Call:
lm(formula = ys ~ poly(x, k, raw = TRUE))
Residuals:
Min 1Q Median 3Q Max
-0.26657 -0.12466 0.02248 0.09091 0.35059
Coefficients:
Estimate Std. Error t value
(Intercept) -0.0572615 0.1085473 -0.528
poly(x, k, raw = TRUE)1 0.9689429 0.0867066 11.175
poly(x, k, raw = TRUE)2 -0.0986055 0.0830182 -1.188
poly(x, k, raw = TRUE)3 -0.1493887 0.0222311 -6.720
poly(x, k, raw = TRUE)4 0.0249399 0.0145790 1.711
poly(x, k, raw = TRUE)5 0.0048394 0.0012504 3.870
poly(x, k, raw = TRUE)6 -0.0012723 0.0006776 -1.877
Pr(>|t|)
(Intercept) 0.60672
poly(x, k, raw = TRUE)1 4.89e-08 ***
poly(x, k, raw = TRUE)2 0.25618
poly(x, k, raw = TRUE)3 1.43e-05 ***
poly(x, k, raw = TRUE)4 0.11088
poly(x, k, raw = TRUE)5 0.00193 **
poly(x, k, raw = TRUE)6 0.08307 .
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1862 on 13 degrees of freedom
Multiple R-squared: 0.9551, Adjusted R-squared: 0.9344
F-statistic: 46.13 on 6 and 13 DF, p-value: 5.084e-08
Call:
lm(formula = ys ~ poly(x, k, raw = TRUE))
Residuals:
Min 1Q Median 3Q Max
-0.26978 -0.12330 0.02373 0.11564 0.33937
Coefficients:
Estimate Std. Error t value
(Intercept) -0.0536803 0.1124704 -0.477
poly(x, k, raw = TRUE)1 0.9255077 0.1364300 6.784
poly(x, k, raw = TRUE)2 -0.1101458 0.0900251 -1.224
poly(x, k, raw = TRUE)3 -0.1234700 0.0655560 -1.883
poly(x, k, raw = TRUE)4 0.0273701 0.0161256 1.697
poly(x, k, raw = TRUE)5 0.0010497 0.0090703 0.116
poly(x, k, raw = TRUE)6 -0.0013921 0.0007555 -1.843
poly(x, k, raw = TRUE)7 0.0001553 0.0003679 0.422
Pr(>|t|)
(Intercept) 0.6417
poly(x, k, raw = TRUE)1 1.95e-05 ***
poly(x, k, raw = TRUE)2 0.2446
poly(x, k, raw = TRUE)3 0.0841 .
poly(x, k, raw = TRUE)4 0.1154
poly(x, k, raw = TRUE)5 0.9098
poly(x, k, raw = TRUE)6 0.0902 .
poly(x, k, raw = TRUE)7 0.6804
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1924 on 12 degrees of freedom
Multiple R-squared: 0.9558, Adjusted R-squared: 0.93
F-statistic: 37.07 on 7 and 12 DF, p-value: 3.378e-07
Call:
lm(formula = ys ~ poly(x, k, raw = TRUE))
Residuals:
Min 1Q Median 3Q Max
-0.23466 -0.14559 0.00039 0.10458 0.31720
Coefficients:
Estimate Std. Error t value
(Intercept) -0.1190007 0.1386773 -0.858
poly(x, k, raw = TRUE)1 0.9445997 0.1401779 6.739
poly(x, k, raw = TRUE)2 0.0276791 0.1900057 0.146
poly(x, k, raw = TRUE)3 -0.1237675 0.0664381 -1.863
poly(x, k, raw = TRUE)4 -0.0207701 0.0604652 -0.344
poly(x, k, raw = TRUE)5 0.0004303 0.0092226 0.047
poly(x, k, raw = TRUE)6 0.0040351 0.0066076 0.611
poly(x, k, raw = TRUE)7 0.0001975 0.0003763 0.525
poly(x, k, raw = TRUE)8 -0.0001908 0.0002307 -0.827
Pr(>|t|)
(Intercept) 0.4091
poly(x, k, raw = TRUE)1 3.21e-05 ***
poly(x, k, raw = TRUE)2 0.8868
poly(x, k, raw = TRUE)3 0.0894 .
poly(x, k, raw = TRUE)4 0.7377
poly(x, k, raw = TRUE)5 0.9636
poly(x, k, raw = TRUE)6 0.5538
poly(x, k, raw = TRUE)7 0.6102
poly(x, k, raw = TRUE)8 0.4258
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.195 on 11 degrees of freedom
Multiple R-squared: 0.9584, Adjusted R-squared: 0.9281
F-statistic: 31.66 on 8 and 11 DF, p-value: 1.58e-06
Call:
lm(formula = ys ~ poly(x, k, raw = TRUE))
Residuals:
Min 1Q Median 3Q Max
-0.22217 -0.09489 0.02552 0.07952 0.30819
Coefficients:
Estimate Std. Error t value
(Intercept) -0.1077859 0.1324799 -0.814
poly(x, k, raw = TRUE)1 0.7698886 0.1801226 4.274
poly(x, k, raw = TRUE)2 0.0093877 0.1816439 0.052
poly(x, k, raw = TRUE)3 0.0644109 0.1446390 0.445
poly(x, k, raw = TRUE)4 -0.0133295 0.0578929 -0.230
poly(x, k, raw = TRUE)5 -0.0509017 0.0365424 -1.393
poly(x, k, raw = TRUE)6 0.0030614 0.0063373 0.483
poly(x, k, raw = TRUE)7 0.0052422 0.0035041 1.496
poly(x, k, raw = TRUE)8 -0.0001507 0.0002218 -0.680
poly(x, k, raw = TRUE)9 -0.0001625 0.0001122 -1.447
Pr(>|t|)
(Intercept) 0.43482
poly(x, k, raw = TRUE)1 0.00163 **
poly(x, k, raw = TRUE)2 0.95980
poly(x, k, raw = TRUE)3 0.66557
poly(x, k, raw = TRUE)4 0.82254
poly(x, k, raw = TRUE)5 0.19382
poly(x, k, raw = TRUE)6 0.63944
poly(x, k, raw = TRUE)7 0.16552
poly(x, k, raw = TRUE)8 0.51224
poly(x, k, raw = TRUE)9 0.17843
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1859 on 10 degrees of freedom
Multiple R-squared: 0.9656, Adjusted R-squared: 0.9346
F-statistic: 31.18 on 9 and 10 DF, p-value: 3.724e-06
Call:
lm(formula = ys ~ poly(x, k, raw = TRUE))
Residuals:
Min 1Q Median 3Q Max
-0.194427 -0.111667 0.008215 0.101650 0.280957
Coefficients:
Estimate Std. Error t value
(Intercept) -1.840e-01 1.490e-01 -1.235
poly(x, k, raw = TRUE)1 8.169e-01 1.838e-01 4.445
poly(x, k, raw = TRUE)2 2.300e-01 2.719e-01 0.846
poly(x, k, raw = TRUE)3 3.898e-02 1.453e-01 0.268
poly(x, k, raw = TRUE)4 -1.348e-01 1.260e-01 -1.070
poly(x, k, raw = TRUE)5 -4.561e-02 3.656e-02 -1.248
poly(x, k, raw = TRUE)6 2.725e-02 2.320e-02 1.174
poly(x, k, raw = TRUE)7 4.786e-03 3.500e-03 1.367
poly(x, k, raw = TRUE)8 -2.129e-03 1.840e-03 -1.157
poly(x, k, raw = TRUE)9 -1.492e-04 1.120e-04 -1.333
poly(x, k, raw = TRUE)10 5.650e-05 5.217e-05 1.083
Pr(>|t|)
(Intercept) 0.24820
poly(x, k, raw = TRUE)1 0.00161 **
poly(x, k, raw = TRUE)2 0.41951
poly(x, k, raw = TRUE)3 0.79456
poly(x, k, raw = TRUE)4 0.31253
poly(x, k, raw = TRUE)5 0.24362
poly(x, k, raw = TRUE)6 0.27035
poly(x, k, raw = TRUE)7 0.20466
poly(x, k, raw = TRUE)8 0.27701
poly(x, k, raw = TRUE)9 0.21527
poly(x, k, raw = TRUE)10 0.30698
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1843 on 9 degrees of freedom
Multiple R-squared: 0.9696, Adjusted R-squared: 0.9357
F-statistic: 28.66 on 10 and 9 DF, p-value: 1.287e-05
Call:
lm(formula = ys ~ poly(x, k, raw = TRUE))
Residuals:
Min 1Q Median 3Q Max
-0.194733 -0.097289 -0.001349 0.103516 0.276502
Coefficients:
Estimate Std. Error t value
(Intercept) -1.857e-01 1.551e-01 -1.198
poly(x, k, raw = TRUE)1 9.504e-01 3.055e-01 3.111
poly(x, k, raw = TRUE)2 2.081e-01 2.856e-01 0.729
poly(x, k, raw = TRUE)3 -1.601e-01 3.863e-01 -0.415
poly(x, k, raw = TRUE)4 -1.213e-01 1.333e-01 -0.910
poly(x, k, raw = TRUE)5 3.255e-02 1.446e-01 0.225
poly(x, k, raw = TRUE)6 2.416e-02 2.476e-02 0.976
poly(x, k, raw = TRUE)7 -7.646e-03 2.249e-02 -0.340
poly(x, k, raw = TRUE)8 -1.843e-03 1.981e-03 -0.930
poly(x, k, raw = TRUE)9 7.157e-04 1.549e-03 0.462
poly(x, k, raw = TRUE)10 4.756e-05 5.658e-05 0.841
poly(x, k, raw = TRUE)11 -2.188e-05 3.907e-05 -0.560
Pr(>|t|)
(Intercept) 0.2654
poly(x, k, raw = TRUE)1 0.0144 *
poly(x, k, raw = TRUE)2 0.4870
poly(x, k, raw = TRUE)3 0.6894
poly(x, k, raw = TRUE)4 0.3894
poly(x, k, raw = TRUE)5 0.8276
poly(x, k, raw = TRUE)6 0.3577
poly(x, k, raw = TRUE)7 0.7427
poly(x, k, raw = TRUE)8 0.3795
poly(x, k, raw = TRUE)9 0.6563
poly(x, k, raw = TRUE)10 0.4250
poly(x, k, raw = TRUE)11 0.5907
---
Signif. codes:
0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1918 on 8 degrees of freedom
Multiple R-squared: 0.9707, Adjusted R-squared: 0.9304
F-statistic: 24.1 on 11 and 8 DF, p-value: 6.286e-05
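The summaries above show the training fit only. To make the overfitting visible, we can compare training and test errors across orders; a minimal sketch, where the test set xt, yt is an illustrative addition regenerated with the same target and sigma:
xt = runif(1000, -4, 4) # test inputs
yt = target(xt) + rnorm(1000, 0, sigma) # test outputs
for(k in 1:11){
  pm = lm(ys ~ poly(xs, k, raw=TRUE)) # fit on the training sample
  yhat = predict(pm, newdata=data.frame(xs=xt)) # predict on the test set
  cat("k =", k, "train RMSE:", round(sqrt(mean(residuals(pm)^2)), 3),
      "test RMSE:", round(sqrt(mean((yhat - yt)^2)), 3), "\n")
}
The training error keeps shrinking with \(k\), while the test error eventually grows.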
For a sample \(x_1,...,x_n\) from Poisson(\(\lambda\)), derive the MLE of \(\lambda\).
Sample \(n\) (e.g. 100) data points from a normal distribution and compute the MLEs of the mean \(\hat{\mu}\) and the variance \(\hat{\sigma}^2\).
Repeat this \(m\) (e.g. 1000) times and examine the distributions of the MLEs \(P(\hat{\mu})\) and \(P(\hat{\sigma}^2)\), e.g. by their histograms, means, and variances.
See how the distributions change with the sample size \(n\).
Take a dataset with three or more components.
Make different multiple regression models and compare \(R^2\) to see which model describes the relationship better.
Repeat the polynomial regression above for different orders \(k\), sample sizes \(ns\), and noise levels \(\sigma\).