Simulation from the CLF (Conditional Linear Family) and other routines (a brief description) ---------------------------------------------------------------- Conventions used here: Cluster size = n Mean vector = u Covariance matrix = v Correlation matrix = r Pairwise odds ratio vector = psi Correlation (scalar) = rho All vectors will be column vectors. Pairwise odds ratios will be stored in a column vector in lexicographic order, e.g. for n=4, they will be: psi_12, psi_13, psi_14, psi_23, psi_24, psi_34. ---------------------------------------------------------------- To compute E[Y_i*Y_j] from E[Y_i], E[Y_j], O.R.(Y_i, Y_j) mu_ij = solv2 (ui, uj, psi_ij); ui = E[Y_i] (scalar) uj = E[Y_j] (scalar) psi_ij = O.R.(Y_i, Y_j) = odds ratio between Y_i and Y_j (scalar) mu_ij <- E[Y_i*Y_j] = pr(Y_i = Y_j = 1) (scalar) ---------------------------------------------------------------- To convert covariances to correlations r = var2cor (v); v = variance matrix r <- correlation matrix ---------------------------------------------------------------- To convert correlations to covariances v = cor2var (r, u); u = mean of a binary random vector r = correlation matrix of a binary random vector v <- variance matrix ---------------------------------------------------------------- To convert correlations to odds ratios psi = cor2psi (r, u); r = corr matrix u = mean vector (r, u) assumed compatible psi <- odds ratios, column vector, in lexicographic order, e.g. for n=4, they will be: psi_12, psi_13, psi_14, psi_23, psi_24, psi_34. ---------------------------------------------------------------- To convert odds ratios to correlations r = psi2cor (psi, u); psi = odds ratios, column vector, in lexicographic order u = mean vector r <- corr matrix ---------------------------------------------------------------- To convert odds ratios to covariances v = psi2var(psi, u); psi = pairwise odds ratios column vector (n*(n-1)/2 by 1), in lexicographic order u = mean of a binary random vector v <- variance matrix ---------------------------------------------------------------- To convert covariances to odds ratios psi = var2psi(v, u); v = variance matrix u = mean of a binary random vector psi <-pairwise odds ratios column vector (n*(n-1)/2 by 1), in lexicographic order ---------------------------------------------------------------- To simulate a Bernoulli vector with mean u and varaince V, an intermediate matrix B is needed. It is obtained by b = allreg(v); v = variance matrix b <- intermediate matrix (same dim as v) to be used by other routines ---------------------------------------------------------------- Multivariate Bernoulli simulation from the CLF y = mbsclf1 (u, b); b = matrix computed by b = allreg(v), v=var matrix y <- one simulated column vector ---------------------------------------------------------------- Multivariate Bernoulli simulation from the CLF y = mbsclf (m, u, B); m = # vectors to simulate u = mean vector b = matrix computed by b = allreg(v), v=var matrix y <- m by n matrix, m independent vectors (stored in rows). Each row y[,1:n] will be a binary random vector with mean u[1:n] and variance v[1:n,1:n]. ---------------------------------------------------------------- To simulate bivariate Bernoulli's y = ranbin2 (m, u, psi); m = # vectors to simulate u = mean vector, 2 by 1 psi = odds ratio, scalar y <- m by 2 matrix, m independent vectors (stored in rows). Each row will be a binary random vector y[,1:2] with mean u[1:2] and odds ratio psi. ---------------------------------------------------------------- To check that the pair (u, v) is compatible with the CLF err = blrchk (u, v); /* err <- 1 if there is an error (cond. means out of range) */ or err = blrchk1 (u, b); /* err <- 1 if there is an error (cond. means out of range) */ u = mean vector b = matrix computed by b = allreg(v), v=var matrix v = variance matrix err=1 indicates that (u, v) is not compatible with the CLF, i.e. The CLF can't be used to simulate Bernoulli vectors with mean u and variance v. If matrix b has already been computed, blrchk1() is more efficient than blrchk(). ---------------------------------------------------------------- To compute the joint probability of a given y vector under the CLF p = clf_prob (nu, y, u, b); y[1:n] = 0/1 vector (column) u[1:n] = mean vector (column) b = matrix computed by b = allreg(v), v=var matrix p <- prob(Y=y; u, b), i.e. prob (vector Y=y) (scalar) nu <- conditional means, nu[i] <- E[Y[i] | Y[1:i-1]] (vector) ---------------------------------------------------------------- To compute the probabilities of a all possible y vectors under the CLF (i.e. 2^n possible values of y) p = allprob (u, b, pflag); u[1:n] = mean vector (column) b = matrix computed by b = allreg(v), v=var matrix pflag = flag that controls printing, print all possible y vectors and their conditional probabilities if pflag != 0 p[1:2^n] <- probabilities for all 2^n possible y vectors. The ordering of p[] is such that p[1] is the probability of a zero vector, then y_1 changes fastest. Example, n=3, p will be 8x1, p[1] <- prob(Y_1 = 0, Y_2 = 0, Y_3 = 0) p[2] <- prob(Y_1 = 1, Y_2 = 0, Y_3 = 0) p[3] <- prob(Y_1 = 0, Y_2 = 1, Y_3 = 0) p[4] <- prob(Y_1 = 1, Y_2 = 1, Y_3 = 0) p[5] <- prob(Y_1 = 0, Y_2 = 0, Y_3 = 1) p[6] <- prob(Y_1 = 1, Y_2 = 0, Y_3 = 1) p[7] <- prob(Y_1 = 0, Y_2 = 1, Y_3 = 1) p[8] <- prob(Y_1 = 1, Y_2 = 1, Y_3 = 1) ---------------------------------------------------------------- For the special case of exchangeable correlation, can use y = ranxch (u, rho); u[1:n] = marginal mean vector rho = correlation. Returns -1 (scalar) if not CLF compatible. y <- column vector multivariate binary, with mean u[1:n] and exchangeable correlation parameter rho ---------------------------------------------------------------- To find all pairwise range violations. err = chkbinc (r, mu, i, j); Input: r, mu Output: i, j, return value r[1:n, 1:n] = the correlation matrix (only upper half is checked) mu[1:n] = the mean vector To check that the pairwise correlations are compatible with the means: err <- number of range violation detected (0 means no violations). (i,j) <- (row,column) locations of all errors found. If no violations are found, i and j are set to "." (missing). i=row, j=column, i