BASIC MS WRITTEN EXAMINATION IN BIOSTATISTICS
PART I
March 3, 1995: 8:30 am - 12:30 pm

INSTRUCTIONS:

a) This is a closed book examination.
b) Answer any three questions during the three-hour time period.
c) Put the answers to different questions on separate sheets of paper.
d) Put your code letter, not your name, on each page.
e) Return the examination with a signed statement of the honor pledge on a page separate from your answers.
f) You are required to answer only what is asked in the questions, not to tell all you know about the topics.

Question 1.

Let $Y_1, Y_2, \ldots, Y_n$ constitute a random sample of size $n$ from the population

    $p_Y(y; \pi) = \pi^{y} (1-\pi)^{1-y}, \quad y = 0, 1; \quad 0 < \pi < 1.$

Let $S = \sum_{i=1}^{n} Y_i$, and further assume that $n$ is large.

(5 pts.) a) Develop an appropriate large-sample $100(1-\alpha)\%$ confidence interval for $\pi$. If $n = 100$ and the observed value of $S$ is $s = 60$, compute an appropriate 95% confidence interval for $\pi$.

(8 pts.) b) Directly use the confidence interval formulation developed in part (a) to derive an appropriate large-sample $100(1-\alpha)\%$ confidence interval for the parameter $\theta = \pi / (1-\pi)$. Use your derived result and the data given in part (a) to compute an appropriate 95% confidence interval for $\theta$.

(12 pts.) c) If $\theta = 2.0$ and $\alpha = 0.05$, what is the minimum sample size required so that the probability is at least 0.90 that the lower limit of the confidence interval for $\theta$ derived in part (b) exceeds one in value?

Question 2.

In a certain company, the number $N$ of reported on-the-job accidents per week is assumed to have the distribution

    $p_N(n) = \theta (1-\theta)^{n}, \quad n = 0, 1, \ldots, +\infty, \quad 0 < \theta < 1.$
Given that $n$ reported accidents occur in any given week, the conditional distribution of $X$, the number of those reported accidents that require hospitalization, is assumed to be

    $p_X(x \mid N = n) = C_x^n \, \pi^{x} (1-\pi)^{n-x}, \quad x = 0, 1, \ldots, n, \quad 0 < \pi < 1.$

(5 pts.) a) Find explicit expressions for $E(X)$ and $V(X)$.

(7 pts.) b) Find an explicit expression for $\mathrm{corr}(X, N)$.

(9 pts.) c) Find an explicit expression for $p_X(x)$, the (unconditional) probability distribution of $X$.

(4 pts.) d) If $\theta = 0.10$ and $\pi = 0.01$, what is the probability that there will be at least two reported accidents requiring hospitalization in any particular week?

Question 3.

Suppose that stratified SRS (Simple Random Sampling) is used to choose a sample of size $n = fN$ from a population of size $N$; i.e., an SRS of $n_h$ out of $N_h$ stratum members is separately and independently chosen in the $h$-th stratum ($h = 1, 2, \ldots, H$), such that $n = \sum_{h=1}^{H} n_h$. Suppose further that the sample of size $n$ is disproportionately allocated, so that the stratum sampling rate, $f_h = n_h / N_h$, differs among strata.

(8 pts.) a) Using the notation given above, determine how many possible unique samples (i.e., different sets of selected population members) this stratified SRS design would yield. Would each unique sample be equally likely to be chosen? Briefly explain.

(9 pts.) b) To estimate the population mean per member ($\bar{Y}$), the estimator one would use is

    $\bar{y}_{wo} = \sum_{h=1}^{H} W_h \bar{y}_{ho}, \quad \text{where } W_h = N_h / N \text{ and } \bar{y}_{ho} = \sum_{j=1}^{n_h} y_{hj} / n_h.$

Show that $\bar{y}_{wo}$ is equivalent to the weighted ratio-type estimator

    $r_w = \dfrac{\sum_{h=1}^{H} \sum_{j=1}^{n_h} \omega_{hj} y_{hj}}{\sum_{h=1}^{H} \sum_{j=1}^{n_h} \omega_{hj}},$

where $\omega_{hj} = 1/\pi_{hj}$ is the sample weight and $\pi_{hj}$ is the selection probability for the $hj$-th sample member.
(8 pts.) c) Suppose that the following formula was (incorrectly) used to estimate the variance of $\bar{y}_{wo}$:

    $\mathrm{var}^{*}(\bar{y}_{wo}) = \left( \dfrac{1-f}{n} \right) \dfrac{\sum_{h=1}^{H} \sum_{j=1}^{n_h} (y_{hj} - \bar{y}_{wo})^{2}}{n-1}$

Assuming that the criteria used to define the strata are strongly correlated with the measure of interest (i.e., the y-variable), in what direction (i.e., too high or too low) is this estimator likely to deviate from the actual variance of $\bar{y}_{wo}$? Briefly explain.

Question 4.

Consider a finite population $(Y_i, x_i, a_i)$, $i = 1, \ldots, N$, where $E[Y_i] = \beta x_i$, $\mathrm{Var}(Y_i) = \sigma^{2} x_i$, and $\mathrm{Cov}(Y_i, Y_j) = 0$ if $i \neq j$. Here $N$, $\sigma^{2}$, and $x_i$ are known positive numbers, and the $a_i$ are known and only assume the values 0 or 1, such that $\sum_{i \le N} a_i = n$, the sample size (known): $n \le N$. Let $s = \{i : a_i = 1\}$, the sample, and $\bar{s} = \{i : a_i = 0\}$, the complementary part. Observe $Y_i$ only if $i \in s$.

(5 pts.) a) Based on the sample $s$, derive the weighted least squares estimator $\hat{\beta}$ of $\beta$.

(4 pts.) b) Find the variance of $\hat{\beta}$.

(5 pts.) c) Let

    $T = \sum_{i \le N} Y_i = T_s + T_{\bar{s}},$

where

    $T_s = \sum_{i \in s} Y_i \quad \text{and} \quad T_{\bar{s}} = \sum_{i \in \bar{s}} Y_i.$

Define a predictor $\hat{T}$ of $T$ as

    $\hat{T} = T_s + \hat{T}_{\bar{s}}; \quad \hat{T}_{\bar{s}} = \sum_{i \in \bar{s}} \hat{\beta} x_i.$

Simplify the expression for $\hat{T}$.

Under what conditions is $\hat{T} = (N/n) T_s$?

(4 pts.) d) Show that $E[\hat{T} - T] = 0$.

(5 pts.) e) Derive the prediction MSE (mean square error) $E[(\hat{T} - T)^{2}]$.

(2 pts.) f) For fixed $N$, $n$, $\sigma^{2}$, $x_1, \ldots, x_N$, how can the MSE be minimized?
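
For illustration only (this is not part of the examination), the setup of Question 4 can be explored numerically. The sketch below simulates a small population, fits a no-intercept weighted least squares slope with weights $1/x_i$ (the usual choice when $\mathrm{Var}(Y_i)$ is proportional to $x_i$), and forms the predictor $\hat{T} = T_s + \hat{\beta} \sum_{i \in \bar{s}} x_i$. The population size, the values of $\beta$ and $\sigma$, the uniform $x_i$, the normal errors, and the function names `wls_slope` and `predict_total` are all assumptions made for the example.

```python
import math
import random


def wls_slope(x, y, w):
    """Weighted least squares slope for a no-intercept line y = beta * x."""
    num = sum(wi * xi * yi for xi, yi, wi in zip(x, y, w))
    den = sum(wi * xi * xi for xi, wi in zip(x, w))
    return num / den


def predict_total(x, y, a):
    """Return (beta_hat, T_hat) using only the y-values with a_i == 1."""
    xs = [xi for xi, ai in zip(x, a) if ai == 1]
    ys = [yi for yi, ai in zip(y, a) if ai == 1]
    weights = [1.0 / xi for xi in xs]  # inverse-variance weights, up to sigma^2
    beta_hat = wls_slope(xs, ys, weights)
    t_s = sum(ys)  # observed part of the total
    x_rest = sum(xi for xi, ai in zip(x, a) if ai == 0)
    return beta_hat, t_s + beta_hat * x_rest  # add predicted unobserved part


# Assumed values for the simulation (hypothetical, not from the exam).
rng = random.Random(1995)
N, n = 20, 8
beta, sigma = 2.0, 0.5
x = [rng.uniform(1.0, 10.0) for _ in range(N)]
y = [beta * xi + sigma * math.sqrt(xi) * rng.gauss(0.0, 1.0) for xi in x]
chosen = set(rng.sample(range(N), n))
a = [1 if i in chosen else 0 for i in range(N)]

beta_hat, t_hat = predict_total(x, y, a)
print("beta_hat:", beta_hat)
print("T_hat:", t_hat, " true T:", sum(y))
```

Rerunning the script with different choices of `chosen` gives a rough feel for how the gap between `t_hat` and the true total depends on which $x_i$ end up in the sample.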