Question 4.

This problem is based on the HUG-KIDS study, a multisite, uncontrolled,
open-label Phase II safety study of the use of hydroxyurea ("HU"), a drug that
may be useful for ameliorating some of the symptoms of sickle cell disease.
Here is a brief summary of relevant aspects of the HUG-KIDS protocol.  Patients
at 7 clinical sites were screened to determine eligibility for the study.  The
eligibility criteria were specified to ensure that all subjects were among the
10% of patients most seriously afflicted with sickle cell disease symptoms.
The first part of the study was a dose-escalation phase intended to find each
patient's "maximum tolerated dose" (MTD).  Each patient was initially
administered a dose of 15 mg of HU per kg of body weight ("15 mg/kg").  The
patient then returned to the clinic every two weeks for evaluation.  Some
patient characteristics were recorded only at "4 week visits" rather than at
every "2 week visit"; others were recorded at all visits.  After 8 weeks at a
specific dose during which the patient experienced no "hematologic toxicity"
(the definition of which is too complicated to present here), a patient's dose
would be increased by 5 mg/kg.  When a patient experienced a hematologic
toxicity, the
dose was set to 0 for a period of two weeks, following which the patient was
administered a dose 2.5 mg/kg lower than the dose previously received.  A
patient's dose ultimately "converges" either to the maximum allowable dose, 30
mg/kg, or to some lower dose; the "converged" dose is defined as that patient's
MTD.  (Some details of the "convergence criteria" are omitted here.)  It should 
be noted, however, that this protocol was not correctly followed for all
patients.
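
The dose-adjustment rules summarized above can be sketched in code.  The
following Python fragment is only an illustration under stated assumptions: the
function name simulate_doses and its constants are hypothetical, the omitted
"convergence criteria" are not implemented, toxicity flags during a 0-dose
period are ignored, and the resumed dose is floored at 0.  None of these
details come from the actual protocol.

```python
START_DOSE = 15.0        # mg/kg, initial dose
MAX_DOSE = 30.0          # mg/kg, maximum allowable dose
STEP_UP = 5.0            # mg/kg increase after 8 toxicity-free weeks at one dose
STEP_DOWN = 2.5          # mg/kg decrease applied after a toxicity
WEEKS_PER_VISIT = 2
WEEKS_TO_ESCALATE = 8


def simulate_doses(toxicity_at_visit):
    """Return the dose (mg/kg) given in each successive two-week period.

    toxicity_at_visit[i] is True if a hematologic toxicity is found at the
    visit that ends period i.  Flags for 0-dose periods are ignored
    (an assumption; the summary does not say how these are handled).
    """
    doses = []
    dose = START_DOSE
    dose_before_rest = None   # set only while the patient is in a 0-dose period
    weeks_at_dose = 0
    for toxicity in toxicity_at_visit:
        doses.append(dose)
        if dose_before_rest is not None:
            # End of the two-week 0-dose period: resume 2.5 mg/kg lower
            # (floored at 0, which is an assumption).
            dose = max(dose_before_rest - STEP_DOWN, 0.0)
            dose_before_rest = None
            weeks_at_dose = 0
            continue
        weeks_at_dose += WEEKS_PER_VISIT
        if toxicity:
            dose_before_rest = dose
            dose = 0.0
            weeks_at_dose = 0
        elif weeks_at_dose >= WEEKS_TO_ESCALATE and dose < MAX_DOSE:
            dose = min(dose + STEP_UP, MAX_DOSE)
            weeks_at_dose = 0
    return doses
```

For example, simulate_doses([False, True, False, False]) describes a patient
with a toxicity found at the second visit: two periods at 15 mg/kg, one period
at 0, then 12.5 mg/kg.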

The data provided for this problem are artificial, but very similar to the data
collected in the actual study.  The SAS datasets should contain one observation
per patient for each visit from baseline (WEEK 0) up to at most 32 weeks (even
though some patients had not yet converged to MTD by that time), except that the
protocol did not require recording data from the first visit after a 0-dose
period following a hematologic toxicity.  Unfortunately, some other data are
also missing, but these may be assumed to be ignorably missing for the
purposes of this problem.  The variables are as follows:

AGE:     Patient's age, in years, at time of visit, computed to the nearest day
DOSE:    Dose administered to the patient during the two weeks preceding this
         visit. Possible values range from 0 to 30 in increments of 2.5
GENDER:  "F" or "M" (the only non-numeric variable)
HT:      Patient's height, in cm, at time of visit
ID:      Patient's ID number
MCH:     Patient's mean cell hemoglobin
SITE:    ID number of the site where the patient was treated
VISDATE: Date of the visit, stored as a SAS date value (format DATE7.)
WEEK:    Study week, i.e., number of weeks the patient had been on study at
         the time of visit
WT:      Patient's weight at time of visit
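
The documented ranges above suggest some simple checks when screening for data
anomalies.  The Python sketch below is one hedged illustration, not a required
method: the function name find_anomalies and the record layout (a list of
dicts keyed by variable name, with None for a missing value) are assumptions
made for the example.

```python
ALLOWED_DOSES = {2.5 * k for k in range(13)}   # 0, 2.5, ..., 30 mg/kg


def find_anomalies(records):
    """Return (row_index, message) pairs for values outside documented ranges.

    Each record is a dict keyed by the variable names listed above; None
    marks a missing value and is not reported here.
    """
    problems = []
    for i, rec in enumerate(records):
        dose = rec.get("DOSE")
        if dose is not None and dose not in ALLOWED_DOSES:
            problems.append((i, "DOSE not in 0(2.5)30"))
        week = rec.get("WEEK")
        if week is not None and not 0 <= week <= 32:
            problems.append((i, "WEEK outside 0-32"))
        gender = rec.get("GENDER")
        if gender is not None and gender not in ("F", "M"):
            problems.append((i, "GENDER not F or M"))
    return problems
```

Checks of this kind only flag candidates; whether a flagged value is a true
anomaly, and how to resolve it, remains a judgment for part a).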

The objectives of this set of analyses are:

 To evaluate the dose-response relationship between mean cell hemoglobin
(dependent variable) and HU dose (expressed as mg/kg) during the first 32 weeks
on drug, in the context of the above protocol, separately for females and males.
[Does mean MCH vary with dose?  If so, is the relationship linear (straight
line)?  Curvilinear?]

 To compare the dose-response relationship for females to that for males.

 To evaluate the covariance structure and covariance parameters of the
repeated measurements from a patient.

In general terms, the assignment is to analyze the data to address the
objectives of the study.  However, to facilitate grading, please organize your
work as indicated below.

a) Introduction, Data Anomalies, and Descriptive Statistics.  Write a
brief introduction.  (Feel free to copy text from the description above, which
is available as an ASCII file, HUGKIDS1.TXT).  Describe any data anomalies you
find and explain your resolution of the anomalies.  Present appropriate
descriptive statistics.

b) Definitions of the Model, Parameters, and Hypotheses.  Present
descriptions and definitions of your final model (after all fine tuning), all
the model's parameters (expected value and variance-covariance), and all
secondary parameters and hypotheses (a priori and post hoc).  Any matrices
(e.g., essence X matrix, C matrices) should be presented in well-labeled
tables; where practical, combine several matrices into one table.  Comment on,
and specifically identify, any post hoc parameters and/or hypotheses.  Describe
procedures (if any) you have used to cope with multiple comparisons or post hoc
parameters/hypotheses.
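
As a notational aid only, one common way to write a linear mixed model with
its expected-value parameters, variance-covariance parameters, and secondary
parameters is sketched below.  The symbols are assumptions introduced for
illustration (they do not appear in the problem statement), and the form of
your final model may differ.

```latex
\begin{aligned}
\mathbf{y}_i &= \mathbf{X}_i \boldsymbol{\beta} + \mathbf{Z}_i \mathbf{d}_i + \mathbf{e}_i, \\
\mathbf{d}_i &\sim N(\mathbf{0}, \boldsymbol{\Delta}), \qquad
\mathbf{e}_i \sim N(\mathbf{0}, \sigma^2 \mathbf{I}), \\
\operatorname{Var}(\mathbf{y}_i) &= \mathbf{Z}_i \boldsymbol{\Delta} \mathbf{Z}_i^{\top} + \sigma^2 \mathbf{I}, \\
\boldsymbol{\theta} &= \mathbf{C} \boldsymbol{\beta}, \qquad
H_0 : \boldsymbol{\theta} = \boldsymbol{\theta}_0 .
\end{aligned}
```

Here y_i holds the repeated MCH measurements for patient i, beta the
expected-value parameters, Delta and sigma^2 the variance-covariance
parameters, and each C matrix defines a set of secondary parameters theta.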

c) Results:  Estimates and Inferential Statistics.  Present your
estimates and inferential statistics (test statistics, confidence regions) in a
small number of well-conceived, well-labeled tables and/or graphs.  Do not
present lengthy verbal descriptions, but include enough text to guide the
reader through the tables and/or figures.

d) Conclusions.  Briefly describe the conclusions indicated by your analysis.


The data are available both in HUGKIDS1.SSD (SAS version 6.04 dataset) and in
HUGKIDS1.SD2 (SAS version 6.11 dataset). In addition, the datasets ORPOL.SD2,
ORPOL.SSD, and ORPOL.ASC (ASCII file) contain the variable DOSE [with values
0(2.5)30] and the matrix P, represented by variables P0, P1, P2, P3, P4, P5, P6,
which contain the values of the orthogonal polynomials of degree 0 through 6
generated by these values of DOSE, as computed by ORPOL(DOSE,P,6) in PROC IML.
The transpose of P is contained in the datasets ORPOLT.SD2, ORPOLT.SSD, and
ORPOLT.ASC.  Suggestion 1:  For this problem do not consider polynomials of
degree higher than 6 in any component of the model.  Suggestion 2:  If you wish
to use random polynomial effects, use the values of P0, P1, P2, etc. in Z rather
than the values of INT, DOSE, DOSE2, etc.
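
For readers who want to see how columns such as P0 through P6 can arise, the
Python sketch below builds orthonormal polynomial columns on the dose grid
0(2.5)30 by Gram-Schmidt orthogonalization.  It is an illustration only: the
function name orthonormal_polynomials is hypothetical, and the result is
assumed, not verified, to agree with ORPOL up to sign and scaling.  Use the
supplied ORPOL datasets for the actual analysis.

```python
import math


def orthonormal_polynomials(x, max_degree):
    """Return columns p[0..max_degree], each a list of len(x) values.

    Column k is a polynomial of degree k in x; the columns are mutually
    orthogonal with unit length.  Requires more distinct x values than
    max_degree.
    """
    # Center and scale x to [-1, 1] first.  This does not change the space
    # spanned by the polynomials but keeps the arithmetic well conditioned.
    center = (max(x) + min(x)) / 2.0
    half_range = (max(x) - min(x)) / 2.0
    t = [(xi - center) / half_range for xi in x]
    columns = []
    for degree in range(max_degree + 1):
        v = [ti ** degree for ti in t]
        # Modified Gram-Schmidt: remove the lower-degree components.
        for q in columns:
            coef = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - coef * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        columns.append([vi / norm for vi in v])
    return columns


doses = [2.5 * k for k in range(13)]        # 0, 2.5, ..., 30
P = orthonormal_polynomials(doses, 6)       # P[0] ... P[6]
```

Because the columns are orthogonal, the polynomial terms are uncorrelated over
the dose grid, which is one reason such columns are often better behaved than
raw powers of DOSE when used as random-effect regressors.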


Scoring:  a) 5,  b) 10,  c) 5,  d) 5.

