Question 4.

This problem is based on the HUG-KIDS study, a multisite, uncontrolled, open-label Phase II safety study of the use of hydroxyurea ("HU"), a drug that may be useful for the amelioration of some of the symptoms of sickle cell disease. Here is a brief summary of relevant aspects of the HUG-KIDS protocol.

Patients at 7 clinical sites were screened to determine eligibility for the study. The eligibility criteria were specified to ensure that all subjects were among the 10% of patients most seriously afflicted with sickle cell disease symptoms.

The first part of the study was a dose-escalation phase intended to find each patient's "maximum tolerated dose" (MTD). Each patient was initially administered a dose of 15 mg of HU per kg of body weight ("15 mg/kg"). The patient then returned to the clinic every two weeks for evaluation. Some patient characteristics were recorded only at "4 week visits" and not at all "2 week visits"; others were recorded at all visits. After 8 weeks at a specific dose during which the patient experienced no "hematologic toxicity" (the definition of which is too complicated to present here), a patient's dose would be increased by 5 mg/kg. When a patient experienced a hematologic toxicity, the dose was set to 0 for a period of two weeks, following which the patient was administered a dose 2.5 mg/kg lower than the dose previously received. A patient's dose ultimately "converges" either to the maximum allowable dose, 30 mg/kg, or to some lower dose; the "converged" dose is defined as that patient's MTD. (Some details of the "convergence criteria" are omitted here.)

It should be noted, however, that this protocol was not correctly followed for all patients. The data provided for this problem are artificial, but very similar to the data collected in the actual study.
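As a reading aid only, the dose-escalation rule summarized above can be sketched in Python. This is a minimal sketch under stated assumptions, not the study's algorithm: the function name `dose_schedule` and its input format are hypothetical; the count of toxicity-free periods is assumed to restart after a toxicity; any evaluation made at the end of a 0-dose period is assumed to be ignored; and the omitted "convergence criteria" are not modeled. Each returned value is the dose given during one two-week period, which matches how DOSE is defined relative to a visit.

```python
START_DOSE = 15.0      # mg/kg, initial dose for every patient
MAX_DOSE = 30.0        # mg/kg, maximum allowable dose
STEP_UP = 5.0          # mg/kg increase after 8 toxicity-free weeks
STEP_DOWN = 2.5        # mg/kg decrease after a toxicity
PERIODS_PER_STEP = 4   # 8 weeks = four 2-week periods


def dose_schedule(toxicity_flags):
    """Return the dose (mg/kg) given in each successive 2-week period.

    toxicity_flags[i] is True if a hematologic toxicity is found at the
    visit that ends period i (hypothetical input format).
    """
    doses = []
    dose = START_DOSE
    resume_dose = None  # dose to restart at once a 0-dose period ends
    clean = 0           # consecutive toxicity-free periods at this dose
    for toxic in toxicity_flags:
        doses.append(dose)
        if resume_dose is not None:
            # A 0-dose period just ended. ASSUMPTION: the flag for this
            # visit is ignored and the patient restarts at the lower dose.
            dose, resume_dose, clean = resume_dose, None, 0
        elif toxic:
            # Hold the drug for two weeks, then resume 2.5 mg/kg lower.
            resume_dose = max(dose - STEP_DOWN, 0.0)
            dose, clean = 0.0, 0
        else:
            clean += 1
            if clean == PERIODS_PER_STEP and dose < MAX_DOSE:
                dose, clean = min(dose + STEP_UP, MAX_DOSE), 0
    return doses


# Toxicity at the third visit: hold for one period, then resume at 12.5.
print(dose_schedule([False, False, True] + [False] * 5))
# [15.0, 15.0, 15.0, 0.0, 12.5, 12.5, 12.5, 12.5]
```

Because the protocol was not followed correctly for every patient, observed DOSE sequences in the data need not match a schedule generated this way; comparing the two is one possible way to look for anomalies.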
The SAS datasets should contain one observation per patient for each visit from baseline (WEEK 0) up to at most 32 weeks (even though some patients had not yet converged to MTD by that time), except that the protocol did not require recording data from the first visit after a 0-dose period following a hematologic toxicity. Unfortunately, some other data are also missing, but such missing data may be assumed ignorably missing for the purposes of this problem.

The variables are as follows:

  AGE:     Patient's age, in years, at time of visit, computed to the nearest day
  DOSE:    Dose administered to the patient during the two weeks preceding this visit. Possible values range from 0 to 30 in increments of 2.5
  GENDER:  "F" or "M" (the only non-numeric variable)
  HT:      Patient's height, in cm, at time of visit
  ID:      Patient's ID number
  MCH:     Patient's mean cell hemoglobin
  SITE:    ID number of the site where the patient was treated
  VISDATE: Date of the visit, stored as a SAS date value (format DATE7.)
  WEEK:    Study week, i.e., number of weeks the patient had been on study at the time of visit
  WT:      Patient's weight at time of visit

The objectives of this set of analyses are:

  - To evaluate the dose-response relationship between mean cell hemoglobin (dependent variable) and HU dose (expressed as mg/kg) during the first 32 weeks on drug, in the context of the above protocol, separately for females and males. [Does mean MCH vary with dose? If so, is the relationship linear (straight line)? Curvilinear?]
  - To compare the dose-response relationship for females to that for males.
  - To evaluate the covariance structure and covariance parameters of the repeated measurements from a patient.

In general terms, the assignment is to analyze the data to address the objectives of the study. However, to facilitate grading, please organize your work as indicated below.

a) Introduction, Data Anomalies, and Descriptive Statistics. Write a brief introduction.
(Feel free to copy text from the description above, which is available as an ASCII file, HUGKIDS1.TXT). Describe any data anomalies you find and explain your resolution of the anomalies. Present appropriate descriptive statistics.

b) Definitions of the Model, Parameters, and Hypotheses. Present descriptions and definitions of your final model (after all fine tuning), all the model's parameters (expected value and variance-covariance), and all secondary parameters and hypotheses (a priori and post hoc). Any matrices (e.g., essence X matrix, C matrices) should be presented in well-labeled tables; where practical, combine several matrices into one table. Comment on, and specifically identify, any post hoc parameters and/or hypotheses. Describe procedures (if any) you have used to cope with multiple comparisons or post hoc parameters/hypotheses.

c) Results: Estimates and Inferential Statistics. Present your estimates and inferential statistics (test statistics, confidence regions) in a small number of well-conceived, well-labeled tables and/or graphs. Do not present lengthy verbal descriptions, but include enough text to guide the reader through the tables and/or figures.

d) Conclusions. Briefly describe the conclusions indicated by your analysis.

The data are available both in HUGKIDS1.SSD (SAS version 6.04 dataset) and in HUGKIDS1.SD2 (SAS version 6.11 dataset). In addition, the datasets ORPOL.SD2, ORPOL.SSD, and ORPOL.ASC (ASCII file) contain the variable DOSE [with values 0(2.5)30] and the matrix P, represented by variables P0, P1, P2, P3, P4, P5, P6, which contain the values of the orthogonal polynomials of degree 0 through 6 generated by these values of DOSE, as computed by ORPOL(DOSE,P,6) in PROC IML. The transpose of P is contained in the datasets ORPOLT.SD2, ORPOLT.SSD, and ORPOLT.ASC.

Suggestion 1: For this problem do not consider polynomials of degree higher than 6 in any component of the model.
Suggestion 2: If you wish to use random polynomial effects, use the values of P0, P1, P2, etc. in Z rather than the values of INT, DOSE, DOSE2, etc.

Scoring: a) 5, b) 10, c) 5, d) 5.
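The matrix P is supplied in the ORPOL datasets, so nothing further is needed to obtain it. For readers who want to see what such a matrix looks like, the following Python sketch builds orthonormal polynomial columns of degree 0 through 6 on the 13 DOSE values 0(2.5)30 by Gram-Schmidt orthogonalization. It rests on assumptions: the function name `orthonormal_polynomials` is hypothetical, equal weights are used, and the columns are scaled to unit length. The signs and scaling of the supplied P0 through P6 may differ, so this is an illustration and not a replacement for the provided files.

```python
import math

# The 13 possible dose levels, 0 to 30 mg/kg in steps of 2.5.
DOSES = [2.5 * i for i in range(13)]


def orthonormal_polynomials(x, max_degree):
    """Return rows [P0, P1, ..., P_max_degree], one row per value of x.

    Column k holds a degree-k polynomial evaluated at x. Columns are
    mutually orthogonal and scaled to unit length (ASSUMED convention).
    Requires max_degree < number of distinct values in x.
    """
    n = len(x)
    columns = [[1.0 / math.sqrt(n)] * n]  # degree 0: a constant column
    for _ in range(max_degree):
        # Raise the degree by one: multiply the last column by x ...
        v = [xi * pi for xi, pi in zip(x, columns[-1])]
        # ... then remove the components along every earlier column.
        for q in columns:
            proj = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - proj * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        columns.append([vi / norm for vi in v])
    # Transpose so that row i corresponds to x[i].
    return [[col[i] for col in columns] for i in range(n)]


P = orthonormal_polynomials(DOSES, 6)
for dose, row in zip(DOSES, P):
    print(f"{dose:5.1f}", " ".join(f"{value:8.4f}" for value in row))
```

Unlike the raw powers INT, DOSE, DOSE2, etc., these columns are uncorrelated with one another and are all on the same scale, which is the usual motivation for placing them in Z when fitting random polynomial effects.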