Estimation of multivariate t distribution in R

May 15, 2011

I would like to know if there is a function in R that allows me to estimate the df of a multivariate t distribution.

The problem is easy: I have a matrix of 5 variables (columns) and 75 observations (rows). I want to estimate the df of a multivariate t on that sample.

Thanks,

Juan.

***Edit: after Fabian's suggestions I implemented the dmvt() formula***

    # "residuals" is the matrix of residuals of my model. I want to estimate the df of
    # that sample, assuming it is multivariate t.
    sigma <- cor(residuals, use = "pairwise.complete.obs", method = "pearson")
    my_means <- vector(length = 8)
    for (i in 1:8) {
      my_means[i] <- mean(my_matrix[, i])
    }
    residuals.scaled <- scale(residuals)
    df.1 <- dmvt(residuals.scaled, my_means, sigma, log = FALSE, type = "shifted", df = 1)

I have doubts regarding:
1) Scaling: I'm centering the data. I don't know if this is correct.
2) Using log = FALSE: I don't know why the densities should be given as log(d) in this case.
3) Here I should estimate the likelihood of the sample data for each df. Thus, more code lines for df.2, df.3, etc. should be added, and the likelihood of each calculated. Then I choose the highest. Is this correct?

The package mvtnorm supplies the density of the (shifted) multivariate t-distribution in the function dmvt. Enter your (scaled) data and the sample correlation, and compute the likelihood of your data for different values of df. Pick the value of df that maximizes the likelihood of your data.
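To make "the likelihood of your data" concrete, here is a minimal base-R sketch of the log-density of a centred multivariate t with scale matrix sigma, and of why the log-likelihood is summed rather than the densities multiplied. The helper `log_dmvt_sketch` is a hypothetical name introduced for illustration (it is not part of mvtnorm); the simulated data, the identity scale matrix and df = 3 are assumptions, not the asker's data.

```r
# Hypothetical helper: log-density of a centred multivariate t distribution
# with scale matrix `sigma` and `df` degrees of freedom, evaluated row-wise.
log_dmvt_sketch <- function(x, sigma, df) {
  p <- ncol(x)
  # squared Mahalanobis distance of every row from the origin
  d2 <- mahalanobis(x, center = rep(0, p), cov = sigma)
  log_det <- as.numeric(determinant(sigma, logarithm = TRUE)$modulus)
  lgamma((df + p) / 2) - lgamma(df / 2) - (p / 2) * log(df * pi) -
    0.5 * log_det - ((df + p) / 2) * log1p(d2 / df)
}

set.seed(1)
n <- 300
p <- 8
df_true <- 3

# t-distributed rows: standard normal draws divided by sqrt(chi-square / df);
# the length-n divisor is recycled down the columns, i.e. one value per row
z <- matrix(rnorm(n * p), n, p)
x <- z / sqrt(rchisq(n, df_true) / df_true)

log_dens <- log_dmvt_sketch(x, sigma = diag(p), df = df_true)

sum(log_dens)        # log-likelihood: a finite (large negative) number
prod(exp(log_dens))  # plain likelihood: underflows to 0 in double precision
```

With a few hundred observations the product of densities is far smaller than the smallest representable double, so comparing candidate df values is only possible on the log scale.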

Edit:

    library(mvtnorm)
    set.seed(12121212)

    ################################################################################
    ## simulate n vectors of p-dim. t-distributed data in matrix x:
    n <- 300
    p <- 8

    # draw random column means
    means <- 10 * rnorm(p)

    # correlation matrix: AR(1) correlation with rho = 0.8
    rho <- 0.8
    sigma <- rho ^ abs(outer(1:p, 1:p, "-"))

    # column s.d.s are sqrt(1:8)
    df <- 3
    x <- t(t(rmvt(n, sigma=sigma, delta=means, df=df)) * sqrt(1:8))

    ################################################################################
    # evaluate the t-likelihood for the scaled x:

    x_scale <- scale(x)
    sigma_est <- cor(x_scale)

    df_candidates <- seq(1, 20, by=2)
    loglik <- numeric(length(df_candidates))
    names(loglik) <- df_candidates
    for(df in df_candidates){
        # no need for delta since we're working on scaled & centered data.
        # use sum(log(likelihood)), not prod(likelihood), to avoid numeric over/underflow
        loglik[as.character(df)] <- sum(dmvt(x=x_scale, sigma=sigma_est,
                                             df=df, log=TRUE))
    }
    loglik
    #        1         3         5         7         9        11        13
    #-1788.219 -1756.301 -1768.885 -1783.724 -1797.386 -1809.556 -1820.382
    #       15        17        19
    #-1830.066 -1838.788 -1846.698

    ## --> maximal for df=3, as used in the simulation.

    ## verify that the mean shift can be incorporated in pre-processing as above:
    dmvt(x[1,], delta=means) == dmvt(x[1,] - means)
    #[1] TRUE
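The grid above only tries odd integer values of df. As a possible refinement, the same log-likelihood could be handed to a one-dimensional optimiser so that df is treated as continuous. The following is a sketch under stated assumptions, not part of the original answer: the simulated data, the search interval c(1, 50) and the helper name `loglik_df` are illustrative choices, and it keeps the answer's approach of standardising the data and using the sample correlation as sigma.

```r
library(mvtnorm)

set.seed(42)
n <- 300
p <- 4

# simulated example data (assumption): AR(1)-type correlation, true df = 3
sigma_true <- 0.5 ^ abs(outer(1:p, 1:p, "-"))
x <- rmvt(n, sigma = sigma_true, df = 3)

# same pre-processing as in the answer: centre and scale, then use the
# sample correlation as the scale matrix
x_scale <- scale(x)
sigma_est <- cor(x_scale)

# log-likelihood of the scaled data as a function of df only
loglik_df <- function(df) {
  sum(dmvt(x_scale, sigma = sigma_est, df = df, log = TRUE))
}

# one-dimensional search; the interval is an arbitrary choice
fit <- optimize(loglik_df, interval = c(1, 50), maximum = TRUE)
fit$maximum    # df value with the highest log-likelihood found in the interval
fit$objective  # log-likelihood at that value
```

If the returned value sits at the upper end of the interval, the data are probably close to Gaussian and the interval bound, not the data, is determining the estimate.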
