MS WRITTEN EXAMINATION IN BIOSTATISTICS
PART II
March 13, 1991: 9:30 AM to 1:30 PM

INSTRUCTIONS:
a) This is an open book examination.
b) Answer any three questions.
c) Put the answers to different questions on separate sets of papers.
d) Put your CODE LETTER (not your name) on each page.
e) Return the examination with a signed statement of the honor pledge on a page separate from your answers.
f) You are required to answer only what is asked in the question, not all you know about the topics.

Q1. Consider the data in the attached table. Each row is a subject's data, with A indicating treatment group, Y the response, and X a nuisance variable (measured before treatment was applied). Note that $\sum Y = 219$, $\sum X = 252$, $\sum Y^2 = 1547$, $\sum X^2 = 1916$. The General Linear Univariate Model (GLUM) may be stated as $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{e}$. A series of PROC GLM runs was conducted with model statements and results as follows.

      Model Statement                    SS Model   SS Error   SS Total
I.    Y=INT2 INT3 INT4 X X2 X3 X4;         177.08      37.67     214.75
II.   Y=INT2 INT3 INT4 X;                  175.67      39.07     214.75
III.  Y=X;                                 156.02      58.72     214.75
IV.   Y=INT2 INT3 INT4;                     53.86     160.89     214.75

Use $\alpha = .05$ in the following.

Indicate which one of the four models above allows unequal slopes for different levels of A.

3 pts.  i) Specify clearly the dimensions, elements, and statistical properties of all four matrices in the GLUM corresponding to that model.
3 pts.  ii) A null hypothesis of interest is that of coincidence, which corresponds to the regression line for Y on X being the same for all levels of A (in all groups). This is an example of a General Linear Hypothesis (GLH). Express this GLH in matrix notation, based on your formulation of the model. Clearly specify the dimensions and elements of all matrices.
2 pts.  iii) Perform the test of coincidence.
3 pts.  iv) Another null hypothesis of interest is that of equality of slopes. Express this GLH in matrix notation, based on your formulation of the model. Clearly specify the dimensions and elements of all matrices.
2 pts.  v) Perform the test of equality of slopes.
3 pts.  vi) Indicate which one of the four models represents the traditional Analysis of Covariance (ANCOVA). Specify clearly the dimensions, elements, and statistical properties of all four matrices in the GLUM corresponding to that model.
3 pts.  vii) Another null hypothesis of interest is that of equality of intercepts. Express this GLH in matrix notation, based on your formulation of the ANCOVA model. Clearly specify the dimensions and elements of all matrices.
2 pts.  viii) Perform the test of equality of intercepts.
        ix) Briefly, but clearly, describe just the conceptual differences among a) the full model in every cell, b) ANCOVA, and c) ANOVA on difference scores.

Q2. The contingency table shown below is from a two-center randomized clinical trial to compare two treatments for relief of tension headache.

Treatment   None   Slight   Moderate   Good   Excellent   Total
Test           1        8         12      9          24      54
Placebo       12        3         17      9          16      57

3 pts.  a) For each treatment, provide the percentage of patients with good or excellent response and its standard error.
3 pts.  b) Apply a statistical test to compare the two treatments with respect to the percentage of patients with good or excellent response.
3 pts.  c) Provide a 95% confidence interval for the odds ratio expressing a greater rate of good or excellent response for test treatment than for placebo.
4 pts.  d) Summarize how to apply a statistical test for evaluating whether the two treatments are equivalent relative to the alternative of better relief for one of them. Provide an expression for the test statistic with definitions (and values) of the quantities in it. Computation of the test statistic is not necessary, although specification of a SAS procedure and options, as well as the relevant part of the output, would be desirable.
3 pts.  e) For the two centers in the study, the numbers of patients with good or excellent relief versus no, slight, or moderate relief were

Center   Treatment   (None, Slight, Moderate)   (Good, Excellent)
1        Active                            15                  12
1        Placebo                           20                   9
2        Active                             6                  21
2        Placebo                           12                  16

Under minimal assumptions, apply a statistical test to compare the two treatments for all patients in this study.
3 pts.  f) Apply a statistical test to evaluate whether the association between treatment and good or excellent response is homogeneous across the two centers.
2 pts.  g) The use of a logistic regression model for good or excellent response provided the following results.

Parameter                                           $\hat{\beta}$   s.e.
Intercept                                               -2.38       1.12
Test Treatment                                           0.99       0.47
Center 2                                                 1.08       0.48
Male sex                                                -0.48       0.60
Baseline severity (1-5 scale for severe to mild)        -0.90       0.25
Age/10                                                  -0.35       0.18

Summarize the relevant assumptions for the application of logistic regression to this study. Show the mathematical structure of the model and provide definitions of quantities in it.
4 pts.  h) Provide a 95% confidence interval for the odds ratio expressing a greater rate of good or excellent response for test treatment than for placebo, and assess its significance. Discuss how the results of this evaluation compare to those in (b), (c) and (e) with respect to whether the rates of good or excellent response for the two treatments are similar or not.

Q3. A sample survey of 800 adults is conducted in a large city in order to determine citizen attitudes toward national health insurance. The sampling frame for the survey is a list of adult taxpayers from which four strata are formed corresponding to each of four major sections of the city. The design calls for selecting a simple random sample within each stratum.
General data for the survey sample as well as results for an opinion question on national health insurance are as follows:

$h$      1              2               3                        4                    Total
         (Inner City)   (Blue Collar)   (Middle Class Suburbs)   (Affluent Suburbs)
$N_h$    20,000         50,000          20,000                   10,000               100,000
$n_h$    200            200             200                      200                  800
$X_h$    190            180             40                       30                   440

$N_h$ = total number of adults
$n_h$ = number of adults interviewed in the sample
$X_h$ = number of interviewed sample adults who are in favor of national health insurance.

3 pts.  a) Calculate a stratified estimate of the proportion (P) of adults in the city who favor national health insurance.
5 pts.  b) Calculate an estimate of the variance of the estimate produced in Part (a).
5 pts.  c) A colleague argues that since a "random sample" of adults has been chosen and since simple random sampling was used, the estimate and its variance can be calculated as if a simple random sample of $n = 800$ adults had been selected. Calculate the estimate and its variance as the colleague suggests.
5 pts.  d) Comment briefly on the difference between your analysis and your colleague's analysis.
7 pts.  e) If you decide to use Neyman allocation for a similar survey on national health insurance in the future, how would you then allocate a sample of $n = 800$ adults to the same four strata?

Q4. Consider the study described in the attached article by Cartwright, Lindahl, and Bawden from the Journal of Dentistry for Children (1968).

3 pts.  a) The investigators say "study" or "investigation" but never "experiment". Was this an experiment, technically speaking? Explain why or why not.
5 pts.  b) Identify the dependent and independent variables for this study. How many levels, factors, and treatments were there, according to the statistical definitions of these terms?
6 pts.  c) Three of the covariables which the investigators considered as potential confounders (although they do not use the word "confound") are age, amount of fluoride in the water supply, and living conditions.
How did they ensure that these covariables would not in fact be confounders? What evidence indicates that their efforts were successful?
4 pts.  d) Explain how the investigators used blocking in their study. Or, if they didn't, how they might have done so.
3 pts.  e) Explain why the double blind approach was "not entirely successful".
4 pts.  f) What do the investigators say about the external validity of their study? (Again, they do not use the statistician's term.)

NOTE: "DMFT" is the number of "decayed, missing, or filled teeth".
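
As a study aid for Q3, parts (a) to (c) rest on two standard estimators: the stratified proportion with its variance, and the pooled proportion treated as one simple random sample. The Python sketch below is one possible way to set up the arithmetic, not an official solution. It assumes stratum weights $W_h = N_h/N$, a finite population correction $(1 - n_h/N_h)$, and an $(n_h - 1)$ divisor; other textbooks use slightly different conventions. The function names `stratified_proportion` and `srs_proportion` are hypothetical and do not come from the examination.

```python
from math import sqrt

# Stratum data transcribed from the Q3 table (strata 1 to 4).
N = [20_000, 50_000, 20_000, 10_000]   # adults in each stratum (N_h)
n = [200, 200, 200, 200]               # adults interviewed (n_h)
x = [190, 180, 40, 30]                 # interviewed adults in favor (X_h)


def stratified_proportion(N, n, x):
    """Return (estimate, variance estimate) for a stratified proportion.

    Assumed convention: weights W_h = N_h / N, finite population
    correction (1 - n_h / N_h), and an (n_h - 1) divisor.
    """
    total = sum(N)
    est = 0.0
    var = 0.0
    for N_h, n_h, x_h in zip(N, n, x):
        w_h = N_h / total          # stratum weight
        p_h = x_h / n_h            # within-stratum sample proportion
        est += w_h * p_h
        var += w_h ** 2 * (1 - n_h / N_h) * p_h * (1 - p_h) / (n_h - 1)
    return est, var


def srs_proportion(N_total, n_total, x_total):
    """Return (estimate, variance estimate) treating the sample as one SRS."""
    p = x_total / n_total
    var = (1 - n_total / N_total) * p * (1 - p) / (n_total - 1)
    return p, var


p_st, v_st = stratified_proportion(N, n, x)
p_srs, v_srs = srs_proportion(sum(N), sum(n), sum(x))

print(f"stratified: p = {p_st:.4f}, se = {sqrt(v_st):.4f}")
print(f"pooled SRS: p = {p_srs:.4f}, se = {sqrt(v_srs):.4f}")
```

Because every stratum contributes 200 interviews while the strata differ widely in size, the pooled calculation gives each stratum equal weight. Comparing the two printed estimates is a natural starting point for part (d).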