To do this homework, you must:
Fork the Biostat778_HW2 repository on GitHub.
Once you have forked the repository on GitHub, you can clone it to your local computer to actually do the work.
As you are working, make sure to commit changes at logical points
via git add and git commit.
Once you have completed the assignment, you can push your changes
back to your GitHub repository via git push.
Once you have pushed your changes, make a pull request on GitHub so that I can see that you're ready to submit your homework.
Your finished code should be submitted in the form of an R package. The R package should be named Homework2.
Your R package should pass R CMD check without any warnings,
errors, or notes.
I will put unit tests in the tests directory of the R package
for the master branch. Please DO NOT make any changes in the tests
directory.
You can use these tests to check your code, since the expected
results will also be in the tests directory.
Consider data \(y_1,y_2,\dots,y_n\) which are iid from a mixture of 2 Normal distributions,
\[ y_i \sim \lambda\mathcal{N}(\mu_1,\sigma_1^2)+(1-\lambda)\mathcal{N}(\mu_2,\sigma_2^2) \]
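Under this model each observation has density \(\lambda\,\varphi(y;\mu_1,\sigma_1^2)+(1-\lambda)\,\varphi(y;\mu_2,\sigma_2^2)\), where \(\varphi(y;\mu,\sigma^2)\) denotes the \(\mathcal{N}(\mu,\sigma^2)\) density (the symbols \(\varphi\) and \(\ell\) are notation introduced here for illustration). The log-likelihood that either fitting method would maximize can then be written as
\[ \ell(\lambda,\mu_1,\mu_2,\sigma_1^2,\sigma_2^2) = \sum_{i=1}^n \log\left\{ \lambda\,\varphi(y_i;\mu_1,\sigma_1^2)+(1-\lambda)\,\varphi(y_i;\mu_2,\sigma_2^2) \right\} \]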
Write a function that estimates the unknown parameters \(\lambda\), \(\mu_1\), \(\mu_2\), \(\sigma_1^2\), and \(\sigma_2^2\) using either Newton's method or the EM algorithm.
Do not use the optim, nlm, nlminb, or optimize functions in
your code.
There should be a method argument that takes options “newton” or
“EM” to allow the user to choose which fitting method is used.
For the “newton” method you may be interested in using the deriv
or deriv3 functions.
Your function should return a list with elements mle containing
the vector of maximum likelihood estimates and stderr containing
the vector of corresponding asymptotic standard errors for the
MLEs. The elements of both the mle and stderr vectors should be
named with the following names: lambda, mu1, mu2,
sigma1, sigma2.
There should be a param0 argument that allows users to specify the
starting value for either the Newton or the EM algorithm. The
default value for param0 should be NULL, in which case your
function should choose the starting value.
Your function should check to see that the value specified for
method is valid. The easiest way to do this is with the
match.arg() function.
There should be a maxit argument specifying the maximum number of
iterations for each method. It defaults to NULL, in which case
maxit should be 100 for Newton's method and 500 for the EM
algorithm.
There should be a tol argument that controls the tolerance for
convergence and it should default to 1e-8.
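The argument handling described above could be sketched roughly as follows. This is only an illustration: resolve_args is a hypothetical helper name, not something the assignment asks for, and the choices are passed to match.arg() explicitly on the assumption that method has no default value.

```r
# Hypothetical helper (not part of the assignment): resolves `method` and
# `maxit` in the way the requirements describe.
resolve_args <- function(method, maxit = NULL, tol = 1e-8) {
  # match.arg() signals an error unless `method` matches one of the choices.
  method <- match.arg(method, choices = c("newton", "EM"))
  if (is.null(maxit)) {
    # Method-specific default iteration limits.
    maxit <- if (method == "newton") 100L else 500L
  }
  list(method = method, maxit = maxit, tol = tol)
}

resolve_args("EM")$maxit
resolve_args("newton")$maxit
```

Note that match.arg() does partial matching, so an abbreviation such as "new" would also be accepted.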
Place your function in an R package with appropriate documentation.
You can test your function with the data provided in the Git repository.
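In addition to the repository data, simulated data with known parameters can serve as a sanity check. The sketch below draws from a two-component Normal mixture; the seed, sample size, and "true" parameter values are arbitrary choices made for illustration and are unrelated to the provided data.

```r
set.seed(778)  # arbitrary seed, for reproducibility only
n <- 1000

# Hypothetical true values, chosen only for illustration.
lambda <- 0.3
mu1 <- -2; sigma1 <- 1
mu2 <- 3;  sigma2 <- 2

# z[i] == 1 means observation i comes from the first component.
z <- rbinom(n, size = 1, prob = lambda)

# rnorm() is vectorized over mean and sd; note that it takes the standard
# deviation, not the variance.
y <- rnorm(n,
           mean = ifelse(z == 1, mu1, mu2),
           sd   = ifelse(z == 1, sigma1, sigma2))
```

With well-separated components like these, estimates from a correct implementation would be expected to land near the values used to generate y.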
Your function should have the following prototype:
mixture <- function(y, method, maxit = NULL, tol = 1e-08, param0 = NULL) {
  ## Your code goes here
  ## Return a list with elements `mle' for the maximum likelihood estimates and
  ## `stderr' for their standard errors.
}
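For the “newton” option, the following sketch shows what deriv3 returns. It deliberately uses the log-density of a single Normal distribution rather than the mixture, so it illustrates only the mechanics; loglik_expr and d are hypothetical names.

```r
# Log-density of one N(mu, sigma^2) observation, as an unevaluated expression.
loglik_expr <- expression(
  -0.5 * log(2 * pi) - log(sigma) - (y - mu)^2 / (2 * sigma^2)
)

# deriv3() differentiates with respect to the variables named in the second
# argument; supplying function.arg makes it return an R function.
d <- deriv3(loglik_expr, c("mu", "sigma"),
            function.arg = c("y", "mu", "sigma"))

out <- d(y = 1.5, mu = 1, sigma = 2)
attr(out, "gradient")  # first derivatives, one column per parameter
attr(out, "hessian")   # second derivatives, indexed [observation, param, param]
```

When y is a vector, the gradient has one row per observation, so summing over rows gives the derivatives of the total log-likelihood.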