MS WRITTEN EXAMINATION IN BIOSTATISTICS
PART I
March 12, 1991: 9:30-12:30 PM

INSTRUCTIONS:
a) This is a closed book examination.
b) Answer any three questions during the three-hour time period.
c) Put the answers to different questions on separate sets of paper.
d) Put your code letter, not your name, on each page.
e) Return the examination with a signed statement of the honor pledge on a page separate from your answers.
f) You are required to answer only what is asked in the questions and not all you know about the topics.

Q1. Let $X_1, X_2, \ldots, X_{n_1}$ be a random sample of size $n_1$ from

$$f_X(x; \theta_1) = \frac{x}{\theta_1}\, e^{-x^2/2\theta_1}, \quad x > 0,\ \theta_1 > 0 .$$

And, let $Y_1, Y_2, \ldots, Y_{n_2}$ be a random sample of size $n_2$ from

$$f_Y(y; \theta_2) = \frac{y}{\theta_2}\, e^{-y^2/2\theta_2}, \quad y > 0,\ \theta_2 > 0 .$$

5 pts.  a) For $r$ $(> 0)$, show that

$$E(X^r) = (2\theta_1)^{r/2}\, \Gamma\!\left(\frac{r+2}{2}\right) \quad \text{and} \quad E(Y^r) = (2\theta_2)^{r/2}\, \Gamma\!\left(\frac{r+2}{2}\right) .$$

3 pts.  b) Find two statistics (say, $U_1$ and $U_2$) which are jointly sufficient for $\theta_1$ and $\theta_2$.

3 pts.  c) Find functions of $U_1$ and $U_2$, say

$$\hat{\theta}_1 = g_1(U_1) \quad \text{and} \quad \hat{\theta}_2 = g_2(U_2) ,$$

such that $E(\hat{\theta}_1) = \theta_1$ and $E(\hat{\theta}_2) = \theta_2$.

7 pts.  d) Assuming that $n_1$ and $n_2$ are large, find an approximate $100(1-\alpha)\%$ confidence interval for $(\theta_1 - \theta_2)$ which is a function of $\hat{\theta}_1$ and $\hat{\theta}_2$. What is your interval estimate of $(\theta_1 - \theta_2)$ when $n_1 = n_2 = 100$, $\hat{\theta}_1 = 4$, $\hat{\theta}_2 = 3$, and $\alpha = 0.05$?
7 pts.  e) Show that the generalized likelihood ratio test of $H_0\colon \theta_1 = \theta_2$ ($= \theta$, say) versus $H_A\colon \theta_1 \neq \theta_2$ can be based on the statistic

$$\frac{\sum_{i=1}^{n_1} X_i^2}{\sum_{i=1}^{n_1} X_i^2 + \sum_{i=1}^{n_2} Y_i^2} .$$

Q2. For various reasons, individuals in a survey sample may prefer not to confide to an interviewer the correct answers to certain sensitive or stigmatizing questions about their personal lives (e.g., about whether or not they use drugs, about whether or not they have ever stolen anything, etc.). To combat this problem, Warner [Journal of the American Statistical Association, March 1965, Vol. 60 (309), pp. 63-69] introduced a technique for estimating the proportion $\pi$ of a human population having a sensitive or stigmatizing attribute A. The method, which he called "randomized response", is designed to eliminate untruthful responses which would result in a biased estimate of $\pi$.

This "randomized response" procedure works as follows. A random sample of $n$ people is selected from the population of interest. Before a particular sensitive issue is discussed (e.g., whether or not an individual has a sensitive or stigmatizing attribute A), the interviewer gives each interviewee a spinner with a face marked so that the spinner points to the letter A with probability $\theta$ and not to the letter A (i.e., to the complementary outcome $\bar{A}$) with probability $(1-\theta)$, $0 < \theta < 1$; here, $\theta$ has a KNOWN value. Each of the $n$ interviewees in the sample spins the spinner (while unobserved by the interviewer) and reports ONLY whether or not the spinner points to the letter representing the group (either A or $\bar{A}$) to which the interviewee truly belongs. That is, the interviewee is required ONLY to say "yes" or "no" according to whether or not the spinner points to the correct group; he or she does NOT report the actual letter (or, equivalently, the group) to which the spinner points.
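The response mechanism described above can be sketched as a short simulation. This is only an illustrative sketch, not part of the examination: the function name `simulate_responses`, the seed, and the values passed for `pi` and `theta` are assumptions chosen for the example.

```python
import random


def simulate_responses(n, pi, theta, seed=0):
    """Simulate n randomized-response interviews (hypothetical helper).

    pi    : assumed true proportion with attribute A (unknown in practice)
    theta : known probability that the spinner points to the letter A
    Returns a list of 1 ("yes") / 0 ("no") answers.
    """
    rng = random.Random(seed)
    responses = []
    for _ in range(n):
        has_a = rng.random() < pi            # interviewee truly belongs to group A
        points_to_a = rng.random() < theta   # spinner points to the letter A
        # The answer is "yes" only when the spinner names the true group;
        # the letter itself is never reported.
        responses.append(1 if has_a == points_to_a else 0)
    return responses


if __name__ == "__main__":
    answers = simulate_responses(n=100, pi=0.10, theta=0.20)
    print(sum(answers), "yes responses out of", len(answers))
```

Because only agreement between spinner and true group is recorded, a single "yes" or "no" does not reveal whether the interviewee has attribute A.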
Given the above discussion, let us define the following quantities:

$\pi$ = true (but unknown) proportion of people in the population possessing the sensitive or stigmatizing attribute A;
$\theta$ = known probability that the spinner points to the letter A;
for $i = 1, 2, \ldots, n$, let

$$X_i = \begin{cases} 1 & \text{if the } i\text{-th person in the sample responds "yes"}, \\ 0 & \text{if the } i\text{-th person in the sample responds "no"}. \end{cases}$$

Again, recall that a "yes" or "no" response means only that the spinner has identified group membership (A or $\bar{A}$) correctly; the actual letter to which the spinner points is not divulged to the interviewer.

3 pts.  a) Prove that

$$\mathrm{pr}(X_i = 1) = \pi(2\theta - 1) + (1 - \theta) .$$

5 pts.  b) Prove that the maximum likelihood estimator (MLE) of $\pi$ is

$$\hat{\pi} = \frac{(\theta - 1)}{(2\theta - 1)} + \frac{S}{n(2\theta - 1)}, \quad \theta \neq \frac{1}{2} ,$$

where $S = \sum_{i=1}^{n} X_i$.

3 pts.  c) Show that

$$E(\hat{\pi}) = \pi ,$$

so that the "randomized response" estimator $\hat{\pi}$ is, indeed, an unbiased estimator of the true proportion $\pi$ of people in the population possessing the sensitive attribute A.

7 pts.  d) Find the exact variance of $\hat{\pi}$, and use it to construct an appropriate 95% confidence interval for $\pi$ when $n = 100$, $\theta = 0.20$, and $\hat{\pi} = 0.10$.

7 pts.  e) If $\theta = 0.20$, develop an expression for the smallest sample size $n^*$ required so that

$$\mathrm{pr}\{|\hat{\pi} - \pi| < \delta\} \geq 0.95 ,$$

where $\delta > 0$ is a known positive quantity. Note that your answer will be a function of the unknown value of $\pi$; for what value of $\pi$ is $n^*$ the largest?

Q3. Let $X_1, \ldots, X_n$ be a random sample from the exponential distribution:

$$F(x) = 1 - \exp(-x/\theta), \quad x > 0,\ \theta > 0 .$$

Let $\xi$ denote the median of $F$. That is, $F(\xi) = 1/2$.

4 pts.  a) Show that $\xi = \theta \log 2$.

5 pts.  b) Verify that the maximum likelihood estimator of $\xi$ is given by

$$\hat{\xi} = \bar{X} \log 2, \quad \text{where } \bar{X} = (X_1 + \cdots + X_n)/n .$$

4 pts.  c) Is $\hat{\xi}$ unbiased?

4 pts.  d) Compute $\mathrm{var}(\hat{\xi})$.
4 pts.  e) Show that $\sqrt{n}\,(\bar{X} - \theta)/\theta$ has approximately a $N(0, 1)$ distribution. From this derive a 95% confidence interval for $\xi$.

4 pts.  f) What is the distribution of $\bar{X}$? Discuss briefly how this can be used to construct an exact 95% confidence interval for $\xi$.

Q4. A population of size $N$ is divided into $H$ sampling strata. With stratified simple random sampling as the sampling design, $\bar{y}_{wo} = \sum_{h=1}^{H} W_h \bar{y}_h$ is used to estimate the population mean per member, $\bar{Y} = \sum_{h=1}^{H} W_h \bar{Y}_h$, where (for the $h$-th stratum) $\bar{Y}_h$ is the mean per member among all $N_h$ stratum members, $\bar{y}_h$ is the mean per member for the SRS of $n_h$ stratum members, and $W_h = N_h/N$ is the proportion of the population in the stratum.

Keeping in mind that you are allowed to cite any known statistical properties of (unstratified) simple random sampling, show each of the following:

7 pts.  a) $\bar{y}_{wo}$ is an unbiased estimator of $\bar{Y}$.

8 pts.  b) The variance of the sampling distribution of $\bar{y}_{wo}$ is

$$\mathrm{Var}(\bar{y}_{wo}) = \sum_{h=1}^{H} W_h^2\, \frac{1 - f_h}{n_h}\, S_h^2 ,$$

where $S_h^2$ is the element variance for members of the $h$-th stratum and $f_h = n_h/N_h$.

5 pts.  c) Under proportionate stratified simple random sampling, $\bar{y}_{wo} = \bar{y}$, where $\bar{y}$ is the simple mean among all $n = \sum_{h=1}^{H} n_h$ members of the stratified sample.

5 pts.  d) Also under proportionate stratified simple random sampling, the variance of $\bar{y}_{wo}$ can be expressed as

$$\mathrm{Var}(\bar{y}_{wo}) = \frac{1 - f}{n} \sum_{h=1}^{H} W_h S_h^2 ,$$

where $f = n/N$.
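As a numerical sanity check on the two variance expressions in Q4 (not a substitute for the requested proofs), the sketch below evaluates both for one proportionate allocation. The stratum sizes, element variances, sample size, and function names are hypothetical values and helpers made up for this illustration.

```python
def var_stratified(N_h, n_h, S2_h):
    """General form: sum over strata of W_h^2 * (1 - f_h) / n_h * S_h^2."""
    N = sum(N_h)
    return sum(
        (Nh / N) ** 2 * (1 - nh / Nh) / nh * s2
        for Nh, nh, s2 in zip(N_h, n_h, S2_h)
    )


def var_proportionate(N_h, n, S2_h):
    """Proportionate form: (1 - f) / n * sum over strata of W_h * S_h^2."""
    N = sum(N_h)
    f = n / N
    return (1 - f) / n * sum((Nh / N) * s2 for Nh, s2 in zip(N_h, S2_h))


if __name__ == "__main__":
    # Hypothetical population: three strata with assumed element variances.
    N_h = [500, 300, 200]
    S2_h = [4.0, 9.0, 16.0]
    n = 100
    # Proportionate allocation: n_h = n * W_h, so f_h = f in every stratum.
    n_h = [n * Nh // sum(N_h) for Nh in N_h]
    print(var_stratified(N_h, n_h, S2_h))
    print(var_proportionate(N_h, n, S2_h))
```

With these assumed inputs the two functions agree up to floating-point rounding, as the identity in part d) requires when every stratum shares the same sampling fraction.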