#! /bin/sh
# This is a shell archive, meaning:
# 1. Remove everything above the #! /bin/sh line.
# 2. Save the resulting text in a file.
# 3. Execute the file with /bin/sh (for example, type sh<hazcov.shar at the
#shell prompt) to create:
#
#  README hazcov.s, hazfit.s,
#  coplot.hazcov.d,   persp.hazcov.d,    predict.hazcov.d,  print.hazfit.d, 
#  hazcov.d,          plot.hazcov.d,     predict.hazfit.d,  wcp.d,
#  hazfit.d,          plot.hazfit.d,     print.hazcov.d,    hazcov.f
#
# This archive created: Saturday June 4, 1994
if test -f 'README'
then
	echo shar: "will not over-write existing file 'README'"
else
cat << \SHAR_EOF > 'README'
*****************************************************************************
*The software contained in this shar file was written by Robert Gray at the
*Dana-Farber Cancer Institute.  The methodology is described in 
*
*Gray, R. J. (1996), "Hazard Rate Regression Using Ordinary Nonparametric
*Regression Smoothers,"  JCGS, 5:190-207.
*
* and
*
*Gray, R. J. (1994), "Hazard Estimation with Covariates: Algorithms for Direct
*Estimation, Local Scoring and Backfitting," Technical Report No. 784Z,
*Division of Biostatistics, Dana-Farber Cancer Institute.
*
*Compressed postscript files containing the technical report can be
*obtained from the link in this directory, of via anonymous ftp from 
* occams.dfci.harvard.edu, as 
*
*          pub/gray/784Z.ps
*
*
*Alternately, for a printed copy send an e-mail request to
* gray@jimmy.harvard.edu. 
*
*You are free to use this software for noncomercial purposes only, provided
*this notice is not removed, and publications using the software reference the
*appropriate papers (given above).   Modification are allowed (even
*encouraged), but please indicate in the code what was original and what was
*changed. 
*
*Send questions and comments to gray@jimmy.harvard.edu.  (Responses will depend
*on the available time and interest of the author.)
*****************************************************************************

Installation instructions:

0.  Running 

  % sh < hazcov.shar

should have created files README (ie this file), hazcov.s, hazfit.s,
coplot.hazcov.d,   persp.hazcov.d,    predict.hazcov.d,  print.hazfit.d, 
hazcov.d,          plot.hazcov.d,     predict.hazfit.d,  wcp.d,
hazfit.d,          plot.hazfit.d,     print.hazcov.d, hazcov.f

The file hazcov.s contains the main fitting routine hazcov() and related
methods for printing, plotting, etc.  The file hazfit.s contains a routine
(hazfit()) for fitting structured submodels (see the tech reports for
details) with related methods for printing, plotting etc.   The *.d files are
the help files.  hazcov.f is a Fortran subroutine for calculating # events
and followup times in the cells.  To keep things a little simpler, this is
not currently used by hazcov(), but the code to call it is still included.
Typical problems run anywhere from 5% to over 33% faster using the Fortran
routine. 


1.  Create subdirectories .Data and .Data/.Help

2.  Load the S functions, by sourcing the files hazcov.s and hazfit.s from
within S, or (for example) by running

  % cat *.s | S

from the shell prompt.

3.  copy the *.d files to the .Data/.Help directory (with the .d extension
removed). 

4.  If you want to use the fortran routine, change the variable "fast" in the
functions hazcov() and reweight.hazcov() to T, compile hazcov.f (eg using 
f77 -c hazcov.f), and make appropriate arrangements for loading the object
file in S.

Notes:

1.  Help files should be reasonably self explanatory.  Note in particular the
way formulas are used in most of the functions.  The censored survival
outcome data needs to be specified using something that evaluates to a matrix
with failure in the first column and the failure indicator in the second,
such as the function cbind() or the function Surv().  Technical reports
describing the methodology are available via anonymous ftp, as
described above, or hardcopy by sending e-mail to
gray@jimmy.harvard.edu.  Also send questions/comments to this address
(no guarantees about responses). 

2.  In hazfit, only model terms specified through hs() will work.  I still
view hazfit() as being experimental.

3.  You should have Splus 3.2 or later.  Bugs in loess() in earlier versions
will give incorrect standard errors, and may cause other problems.  

4.  Even in version 3.2, there appear to be some problems with the
interpolation algorithm in loess().  Occasionally it will refuse to predict
values at points that are clearly on the interior of the data, claiming that
it would have to extrapolate.  Also, it sometimes gives predicted surfaces
with sharp jumps.  If either of these happen and cannot be worked around in
obvious ways, the problems can be avoided by specifying 

  surface="direct"

in the calls to hazcov().  Unfortunately, for large data sets this can be
much slower.  

5.  Note that the degree in the call to loess() in hazcov defaults to 1, not
2. 

6.  Here is a simple example to test the function hazcov()

set.seed(10)
x1 <- rnorm(500)
x2 <- rnorm(500)
y <- rexp(500)/exp(.2*x1-.1*x2^2)
ce <- 5*runif(500)
fi <- ifelse(y<= ce,1,0)
y <- pmin(y,ce)
y.hc <- hazcov(cbind(y,fi)~x1+x2,ncg=15,ntg=15,span=.5)
newd <- expand.grid(x1=c(-1,0,1),x2=c(-1,0,1),Time=c(.5,1,1.5))

Then print and predict should produce
> print(y.hc)
hazcov(formula = cbind(y, fi) ~ x1 + x2, ncg = 15, ntg = 15, span = 0.5)
 sample.size number.censored 
         500             101

number time groups, covariate groups: 15 15 15
number bins with follow-up time: 1079
Residuals:
   Min. 1st Qu.  Median   Mean 3rd Qu.  Max. 
 -1.333 -0.5679 -0.3003 0.0248   0.477 3.173

Numerator Smooth:
Call:
loess(formula = nf ~ Time + x1 + x2, data = HD.TMP, weights = n, span = 0.5, 
        degree = 1)
                                        
 Number of Observations:          1079 
 Equivalent Number of Parameters: 4.7  
 Residual Standard Error:         1.123
 Multiple R-squared:              0.06 
 Residuals: 
     min   1st Q  median 3rd Q   max 
 -0.9835 -0.4916 -0.2986 0.426 3.019

> predict(y.hc,newd,se=T)
         haz         se 
 1 0.7517829 0.08830824
 2 0.8684588 0.09846460
 3 1.0803444 0.11701853
 4 0.7256412 0.08571252
 5 0.7828474 0.06632391
 6 1.1313303 0.10836297
 7 0.6001788 0.07573328
 8 0.6923642 0.06922476
 9 0.9751597 0.10848967
10 0.7869937 0.08221408
11 0.8902716 0.08552974
12 1.0661201 0.11027550
13 0.7301598 0.07733138
14 0.8406564 0.08351013
15 1.0797096 0.10254034
16 0.6160743 0.07207458
17 0.7249726 0.07433775
18 1.0149742 0.10890044
19 0.7606114 0.09904267
20 0.8955233 0.11130383
21 1.0894887 0.14807454
22 0.6786626 0.09099392
23 0.7889721 0.10428021
24 1.0640813 0.14726826
25 0.5885752 0.07645694
26 0.7102310 0.09451733
27 1.1001024 0.15167781
SHAR_EOF

fi
if test -f 'hazcov.s'
then
	echo shar: "will not over-write existing file 'hazcov.s'"
else
cat << \SHAR_EOF > 'hazcov.s'
hazcov  <- function(formula,data,subset,na.action=na.omit,wt,ncg=15,ntg=15,
    failcode=1,minpr=.05,rtime,degree=1,ls=F,eps=.001,max.iter=5,...)
{ 
### Arguments:
###   formula:  a model formula of the form cbind(time,status)~cov1+cov2 (etc)
###     where time is the failure/follow-up time and status is the failure
###     indicator, and cov1 cov2 etc are the covariates to be used in the
###     smoothing.  Although other forms for the covariates can be used,
###     essentially they are always treated as if the full interaction
###     time*cov1*cov2*etc were specified (time should not be specified as a
###     covariate, though, since it is added in as another covariate before
###     the call to loess).  Other functions than cbind() can be used for the
###     response; anything which evaluates to a matrix with time
###     in the first column and status in the second would be ok, as would the
###     name of a matrix with time in the first column and status in the 2nd.
###     In particular, the Surv() function in Terry Therneau's survival 
###     routines could be used.
###   data: optional data frame where formula should be evaluated
###   subset: usual subset argument to modeling functions
###   na.action: usual na.action for modeling functions (see lm), except here
###     the default is set to na.omit.
###   wt: if given, must be ntg by length(time).  (j,i) element gives a prior 
###     relative risk (like an offset) for case i in time interval j.  Default
###     is all 1s.
###   ncg: number of bins to use on the covariate axes.  If one element, the
###     same value is used for all covariates.  If a vector of length
###     ncols(cov), then ntg[i] is the number of bins to use for the ith col of
###     cov. 
###   ntg: number of time intervals
###   failcode: if status==failcode then failtime is a failure time,
###     otherwise failtime is censored
###   minpr: smoothed estimates in the denominator (smoothed follow-up times)
###     are truncated at this value to make sure they dont get too close to 0.
###   rtime: the range on the time axis to be included in the estimation.
###   degree: Degree of the polynomials used in the local regression in loess
###     (1 or 2)
###   ls:  if T, uses iterative local scoring to estimate the log hazard
###     otherwise uses the direct algorithm
###   eps: if ls=T, then the stops when the average relative change in hazard
###     estimates is < eps
###   max.iter:  max number of iterations in the local scoring iteration
###   ...: additional arguments to loess, in particular span, although other
###     arguments (parametric, drop.square, surface, etc) could also be
###     specified. 
###
###  set-up and error checking:
  call <- match.call()
  cov <- match.call(expand = F)
  cov$wt <- cov$ncg <- cov$ntg <- cov$failcode <- cov$mlr <- NULL
  cov$minpr <- cov$rtime <- cov$lrr <- cov$fast <- cov$degree <- NULL
  cov$exact.trace <- cov$ls <- cov$eps <- cov$max.iter <- cov$... <- NULL
  cov$na.action <- na.action
  cov[[1]] <- as.name("model.frame.default")
  cov <- eval(cov, sys.parent())
  time <- model.extract(cov,"response")
  status <- time[,2] == failcode
  time <- time[,1]
  if (missing(rtime)) rtime <- c(0,max(time))
###  Not sure if this is foolproof:
  cov <- cov[,as.character(attr(delete.response(terms(cov)),"variables")),
             drop=F]
  ncov <- ncol(cov)
  n <- nrow(cov)
  ncen <- sum(status==0)
  if (missing(wt)) wt <- matrix(1,nrow=ntg,ncol=n)
  else if (nrow(wt) != ntg | ncol(wt) != n) stop("invalid dim(wt)")
  if (length(ncg)==1) ncg <- rep(ncg,ncov)
  else if (length(ncg) != ncov) stop("length(ncg) wrong")
### reduction to grouped data:
### first set up intervals
###    breaks are the interval boundaries, midp the interval midpoints (or the 
###      factor levels), 
###       cov2[i]=k if the jth variable for case i is in
###      interval k.  At end, cell[i] indicates which cell in the grouped
###      covariate space (tensor product of the grid for each covariate) the
###      ith case is in.
  breaks <- NULL
  midp <- NULL
  cell <- rep(0,n)
  for (i in ncov:1) {  
###         catagories for covs which are factors are not altered
###            and these are treated as factor variables by loess
    if (is.factor(cov[,i])) { 
      w <- levels(cov[,i])[table(cov[,i])>0]
      ncg[i] <- length(w)
      cov2 <- as.numeric(factor(cov[,i],levels=w))
      midp <- c(list(factor(w,w)),midp)
    } else {
      w <- sort(unique(cov[,i]))
###  if the number of unique values is <= the intended number of intervals,
###     make each unique value a group.  For truly continuous 
###     covariates the breaks are given by quantiles.
      if (length(w) <= ncg[i]) {
	ncg[i] <- length(w)
	breaks <- c(w[1]-.000001,(w[-ncg[i]]+w[-1])/2,w[ncg[i]],breaks)
	midp <- c(list(w),midp)
	cov2 <- as.numeric(factor(cov[,i],levels=w)) 
      } else {
	w <- sort(unique(c(min(cov[,i])-.000001,quantile(cov[,i],
		       1:(ncg[i]-1)/ncg[i]),max(cov[,i])+.000001)))
	ncg[i] <- length(w)-1
	cov2 <- as.numeric(cut(cov[,i],breaks=w))
	breaks <- c(w,breaks)
	w <- (w[-1]+w[-(ncg[i]+1)])/2
	midp <- c(list(w),midp)
      }
    }
    cell <- ncg[i]*cell+cov2-1
  }
  cell <- cell+1
  celli <- factor(cell)
  cellii <- as.numeric(levels(celli))
  nc <- length(cellii)
  tbrk <- seq(rtime[1],rtime[2],length=ntg+1)
  dt <- diff(tbrk)
  midp <- c(list(tbrk[-1]-dt/2),midp)
  names(midp) <- c("Time",make.names(names(cov),T))
  nl <- nc*ntg   # total number cells
  wt <- ifelse(is.na(wt),0,wt)
  swt <- (wt %*% rep(1,n)) / ((wt>0) %*% rep(1,n))
  fast <- F
  if (fast) {
    z <- .Fortran("hazgrp",as.double(time),as.integer(status),
            as.double(wt),as.integer(celli),as.integer(n),as.integer(ntg),
            as.double(tbrk),as.integer(nl),zft=double(nl),zfs=integer(nl),
            nnn=integer(nc))[9:11]
### On return from call to "hazgrp" components of z are
###   zft: sum of the weighted follow-up times in each cell (cov*time)
###   zfs: # failures in each cell
###   nnn: is the number cases in each covariate cell
    zft <- z$zft; zfs <- z$zfs; nnn <- z$nnn
  } else {
    o <- order(cell)
    sub <- c(diff(cell[o])>0,T)
    nnn <- rep(0,nc)
    nnn <- table(cell)
    zft <- matrix(0,nrow=ntg,ncol=nc)
    zfs <- zft
    for (i in 1:ntg) {
      ft <- pmin(pmax(time-tbrk[i],0),tbrk[i+1]-tbrk[i])*wt[i,]
      fs <- ifelse(tbrk[i] < time & time <= tbrk[i+1],status,0)
#      zft[i,] <- tapply(ft,cell,sum)
#      zfs[i,] <- tapply(fs,cell,sum)
      ft <- cumsum(ft[o])[sub]
      ft  <- c(ft[1],diff(ft))
      zft[i,] <- ft
      fs <- cumsum(fs[o])[sub]
      fs  <- c(fs[1],diff(fs))
      zfs[i,] <- fs
    }
    zft <- c(zft)
    zfs <- c(zfs)
  }
  dt <- rep(dt,nc)
  nnn <- rep(nnn,rep(ntg,nc))

### set things up and smooth to get fail probs and expected follow-up 
### HD.TMP is the matrix of covariates actually used in the loess smoothing
###   initially it gives all the unique combinations of cov intervals
###   in the same order as components in z (note, expand.grid varies the first
###   argument most rapidly, last argument least rapidly).
  HD.TMP <- expand.grid(midp[-1])[rep(cellii,rep(ntg,nc)),,drop=F]
  HD.TMP <- cbind(Time=rep(midp[[1]],nc),HD.TMP)
  exclude <- zft <= 0
  HD.TMP <- HD.TMP[!exclude,]
### HD.TMP is the data frame for the smoothing, [!exclude] removes cells with
###  no follow-up time, since in the ordinary right censorship case these are
###  beyond the range of the data.
###
  rminpr <- minpr*(rep(swt,nc)[!exclude])
  if (ls) {
    zft <- zft[!exclude]
    zfs <- zfs[!exclude]
    div <- (nnn*dt)[!exclude]
    HD.TMP <- data.frame(mft=zft/div,nf=(zfs-zft)/div,
			 n=nnn[!exclude],HD.TMP,wtt=zft,zfs=zfs)
    form <- paste("loess(mft~",paste(names(midp),collapse="+"),
             ",data=HD.TMP,weights=n,degree=",format(degree),
             ",trace.hat=\"approximate\",...)",sep="")
    form2 <- paste("loess(nf~",paste(names(midp),collapse="+"),
             ",data=HD.TMP,weights=wtt,degree=",format(degree),
             ",...)",sep="")
    a.new <- rep(log(sum(status)/sum(time)),length(zft))
    rss <- 1
    i <- 0
    while(rss > eps & i < max.iter) {
      i <- i+1
      a.old <- a.new
      zftn <- zft*exp(a.old)
      HD.TMP$mft <- zftn/div
      z <- eval(parse(text=form))
      zp1 <- pmax(predict(z),rminpr*zftn/zft)*div
      HD.TMP$wtt <- zp1
      HD.TMP$nf <- a.old+(zfs-zftn)/zp1
      z2 <- eval(parse(text=form2))
      a.new <- predict(z2)
      rss <- sum((a.old-a.new)^2)/sum(a.new^2)
    }
    zp <- exp(a.new)
      aic <- -2*sum(a.new*zfs-zft*zp) +2*z2$inference$trace.hat
      aic <- aic/sum(HD.TMP$n)
  } else {
    HD.TMP <- data.frame(mft=(zft/(nnn*dt))[!exclude],n=nnn[!exclude],
                     nf=(zfs/(nnn*dt))[!exclude],HD.TMP)
### smooth numerator (form2) and denominator (form), and get predicted values
###  for every cell, and estimate hazards by taking the ratio
    form <- paste("loess(mft~",paste(names(midp),collapse="+"),
           ",data=HD.TMP,weights=n,degree=",format(degree),
           ",trace.hat=\"approximate\",...)",sep="")
    form2 <- paste("loess(nf~",paste(names(midp),collapse="+"),
           ",data=HD.TMP,weights=n,degree=",format(degree),
           ",...)",sep="")
    z <- eval(parse(text=form))
    z2 <- eval(parse(text=form2))
    zp1 <- predict(z)
    zp1 <- ifelse(zp1 < rminpr, NA, zp1)
    zp2 <- pmax(predict(z2),0)
    zp <- zp2/zp1
    aic <- NULL
    rss <- NULL
  }
### OUTPUT:
###   call: call to hazcov
###   bin: data frame for smoothing (HD.TMP above) with the estimated hazards 
###      (haz) 
###   fitd: loess output from denominator smooth
###   swt:  n/sum of weights on each time interval
###   fitn: loess output from numerator smooth
###   timeb: boundaries of the time intervals (vector)
###   covb: vector giving boundaries of the covariate intervals 
###   dim: number of intervals for time ([1]) and each cov ([2] to [ncov+1])
###   n: sample size and number censored in the original data (after subset
###      and na.action())
###   midp: list giving the midpoints of the intervals
###   minpr, ls:  input values of these parameters
###   cell: the cell # of the ith case in the binned covariates.  Does not
###      reflect the further splitting of bins for time intervals
###   exclude: bins in the cell x time group tensor product excluded from the
###      rows of bin because of no follow-up time
###   aic: if local scoring used, the value of the aic statistic
###   css: average squared change in estimates at last local scoring iteration
  structure(list(call=call,
      bin=data.frame(haz=zp,HD.TMP),
      fitd=z,swt=swt,fitn=z2,timeb=tbrk,covb=breaks,
      dim=c(ntg,ncg),n=c(sample.size=n,number.censored=ncen),midp=midp,
      minpr=minpr,cell=celli,exclude=exclude,
      ls=ls,aic=aic,css=rss),class="hazcov")
}

print.hazcov <- function(z) {
### z: output from hazcov
  print(z$call)
  print(z$n)
  cat("\nnumber time groups, covariate groups:",format(z$dim))
  cat("\nnumber bins with follow-up time:", format(nrow(z$bin)))
  cat("\nResiduals:\n")
  print(summary(z$bin$nf-z$bin$haz*z$bin$mft))
  cat("\nNumerator Smooth:\n")
  print(z$fitn)
  invisible()
}

predict.hazcov <- function(z,newdata,se.fit=F,lrr=F,baseline,mlr=-log(4),
    		     minpr=z$minpr) { 
###  z: hazcov object
###  newdata: required (z$bin$haz contains hazards at points used in the
###    estimation).  Data frame giving values of time and covariate
###    combinations where estimates should be calculated.  Names in newdata
###    need to correspond to names(z$midp).
###  se.fit: if T, calculate standard errors of the hazard estimates.  If
###    z$bin is large this will often fail due to excessive memory
###    requirements. 
###  lrr: if T, log relative risks are calculated (relative to the covariates
###    specified in baseline)
###  baseline:  list or data.frame containing baseline values for the
###    covariates (attempts to default to a value where estimates are most
###    likely to be defined on all intervals)
###  mlr: if lrr=T, the log relative risks are truncated below at mlr and
###    above by -mlr.  Defaults to value given in z.
###  minpr: estimates only calculated where denominator estimate >= minpr
###    [times a scaling factor if wt used in call to hazcov] (defaults to
###    value given in z). 
###
  assign("HD.TMP",z$bin,1)
#  attr(newdata,"out.attrs")  <- NULL
  zp1 <- predict(z$fitn,newdata=newdata,se.fit=se.fit)
  if (se.fit) {
    zp1s <- c(zp1$se.fit/zp1$residual.scale)
    zp1 <- c(zp1$fit)
  } else {
    zp1 <- c(zp1)
  }
  if (z$ls) {
    zp1 <- exp(zp1) 
  } else {
    zp2 <- c(predict(z$fitd,newdata=newdata))
    zp1 <- pmax(zp1,0)
    ind <- as.numeric(cut(newdata$Time,breaks=z$timeb))
    zp2 <- ifelse(zp2 < minpr*z$swt[ind], NA, zp2)
    zp1 <- zp1/zp2
  }
  zp <- zp1
  if (se.fit) { 
    if (z$ls) { 
      zp1s <- zp1*zp1s # hazard
###      zp1s <- zp1s # log hazard
    } else {
###      zp1s <- ifelse(zp1 > 0, zp1s/sqrt(zp1*zp2), NA) # for log hazard
      zp1s <- sqrt(zp1/zp2)*zp1s   # se of hazard
      ddel <- sqrt(diff(z$timeb)[ind])
      zp1s <- zp1s/ddel
    }
    se <- zp1s
  }     
  if (lrr) {
    if (missing(baseline)) {
      mtx <- max(HD.TMP$Time)
      sub <- HD.TMP$mft == max(HD.TMP$mft[HD.TMP$Time==mtx]) & 
           HD.TMP$Time == mtx
      ind <- match(T,sub)
      baseline <- HD.TMP[ind,names(z$midp)[-1],drop=F]
    } else {
      baseline <- as.data.frame(baseline)
    }
    t1 <- sort(unique(newdata$Time))
    bv <- baseline[rep(1,length(t1)),,drop=F]
    bv <- cbind(Time=t1,bv)
    bv <- bv[,names(z$midp),drop=F]
    zb1 <- c(predict(z$fitn,newdata=bv))
    if (z$ls) {
      zb <- exp(zb1) 
    } else {
      zb2 <- c(predict(z$fitd,newdata=bv))
      zb1 <- pmax(0, zb1)
      ind <- as.numeric(cut(bv$Time,breaks=z$timeb))
      zb2 <- ifelse(zb2 < minpr*z$swt[ind], NA, zb2)
      zb <- zb1/zb2
    }
    index <- match(newdata$Time,t1)
    zb <- zb[index]
    zz <- zp/zb
    zm <- exp(mlr)
    zz <- pmax(zm,zz)
    zz <- log(pmin(1/zm,zz))
    zb <- zz
    if (se.fit) {data.frame(haz=c(zp),logratio=c(zb),se=se)}
    else data.frame(haz=c(zp),logratio=c(zb))
  } else {
    if (se.fit) {data.frame(haz=c(zp),se=se)}
    else c(zp)
  }
}

wcp <- function(z,zb,minpr=zb$minpr,mt=max(z$bin$Time)) {
### z: hazcov output where wcp statistic to be computed
### zb: at risk probabilities estimated from zb for weighted RSS (allows
###   using the same weights with several different hazcov objects)
### minpr: estimates only calculated where at risk probability estimates >
###   minpr (defaults to zb$minpr)
### mt: if z used direct estimation, can restrict the range of times which
###   are included in calculating wcp by giving the maximum as mt
  if (z$ls) {
    z$aic
  } else {
    if (nrow(zb$bin) != nrow(z$bin)) print("data dimension mismatch")
    if (z$fitn$control$trace.hat != "exact") stop(
  "need diagonal of smoother matrix--try refitting with trace.hat=\"exact\"")
    pi <- predict(zb$fitd)
    ind <- as.numeric(cut(zb$bin$Time,breaks=zb$timeb))
    dc <- zb$swt[ind]
    pi <- ifelse(pi < minpr*dc, NA, pi)
    ff <- predict(z$fitn)
    ff <- ifelse(ff>= 0, ff, 0)
    rss <- sum((z$bin$n*((z$bin$nf-ff)/pi)^2)[z$bin$Time <= mt],na.rm=T)
    del <- diff(z$timeb)[ind]
    cc <- sum((z$fitn$inference$diagonal*ff/(pi^2*del))[z$bin$Time <= mt],
            na.rm=T)
    c(rss+2*cc)/sum(z$bin$n[!is.na(pi) & z$bin$Time <= mt])
  }
}

coplot.hazcov <- function(formula,object,given.values,wt,
     cl=F,xlim,ylim,...) {
###  formula must be a valid formula for coplot, eg at most 2 given variables
###  if response starts with an "l" then logratios will be plotted,
###    otherwise hazards will be.
###  three options for given.values:  nothing, include just the given
###    covariates in formula, or include all covariates except the independent
###    variable in formula.  If variables not appearing in formula are not
###    specified in given.values, they are set somewhere near the middle of
###    the corresponding entry in midp.
###  wt (length=ntg) optional vector of prior relative risks.  hazards on time
###    interval i are multiplied by wt[i] (this is intended for use with
###    hazfit models, to adjust estimates for other terms in the model.)
###  cl:  If T, lines giving +/- 2 standard errors on the log scale added to
###    plots at several points
  if (length(formula)>2) vars <- all.vars(formula)
  else vars <- c("hazard rate",all.vars(formula))
  ncond <- min(length(vars)-2,2)
  if (ncond==1) ngval <- 9 else ngval <- 5
  ind <- match(vars[-1],names(object$midp))
  if (any(is.na(ind))) stop("invalid variable names")
  if (is.factor(object$midp[[ind[1]]])) {
    stop("x must be numeric")
  } else {
    t1 <- list(seq(min(object$midp[[ind[1]]]),max(object$midp[[ind[1]]]),
	           length=40))
  }
  if (missing(given.values)) {
    given.values <- NULL
    for (i in 2:(ncond+1)) { 
      if (is.factor(object$midp[[ind[i]]])) {
	given.values <- c(given.values,object$midp[ind[i]])
      } else {
	given.values <- c(given.values,list(quantile(object$midp[[ind[i]]],
	   probs=(1:ngval)/(ngval+1))))
      }
    }
    names(given.values) <- vars[3:(3+ncond-1)]
  }
  t2 <- NULL
  if (length(given.values) < length(object$midp)-1) {
    t2 <- object$midp[-ind]
    for (i in 1:length(t2)) {
      if (is.factor(t2[[i]])) {
	t2[[i]] <- t2[[i]][1]
      } else {
	t2[[i]] <- median(t2[[i]])
      }
    }
  }
  t1 <- c(t1,given.values,t2)
  names(t1)[1] <- vars[2]
  t2 <- expand.grid(t1)
  if (match(substring(vars[1],1,1),c("l","L"),0)>0) {
    u <- predict(object,t2,lrr=T)[,2]
  } else {
    u <- predict(object,t2,lrr=F)
    if (cl & missing(wt)) {
      t3 <- t1
      t3[[1]] <- t3[[1]][c(4,15,26,37)]
      t3 <- expand.grid(t3)
      uu <- predict(object,t3,lrr=F,se.fit=T)
      ul <- ifelse(uu[,1]>0,exp(log(uu[,1])+2*uu[,2]/uu[,1]),0)
      ll <- ifelse(uu[,1]>0,exp(log(uu[,1])-2*uu[,2]/uu[,1]),0)
      uu <- cbind(c(ul,ll),rbind(t3,t3))
      names(uu)[1] <- vars[1]
    }
    if (!missing(wt)) {
      wtt <- as.numeric(cut(t2$Time,object$timeb))
      u <- u*wt[wtt]
    }
  }
  t2 <- cbind(u,t2[,vars[-1]])
  names(t2)[1] <- vars[1]
  if (missing(xlim)) xlim <- range(t1[[1]],na.rm=T)
  if (!cl | (match(substring(vars[1],1,1),c("l","L"),0)>0) | !missing(wt)){
    if (missing(ylim)) ylim <- range(u,na.rm=T)
    coplot(formula,t2,given.values[vars[-(1:2)]],lines,xlim=xlim,ylim=ylim,...)
  } else {
    if (missing(ylim)) ylim <- range(uu[,1],na.rm=T)
    coplot(formula,t2,given.values[vars[-(1:2)]],lines,xlim=xlim,ylim=ylim,...)
    clf <- function(x,y){
      n2 <- length(y)/2
      segments(x[1:n2],y[1:n2],x[1:n2],y[(n2+1):length(y)])
    }
    coplot(formula,uu,given.values[vars[-(1:2)]],clf,add=T,xlim=xlim,ylim=ylim,...)
  }
  t1[-1]
}

plot.hazcov <- function(object,formula,...) {
### object: output from hazcov
### formula: (optional) formula for call to coplot.hazcov
### ...: other arguments passed to coplot.hazcov
  if (!missing(formula)) {
    out <- coplot.hazcov(formula,object,...)
  } else if (class(object)=="hazcov") {
    lab <- names(object$midp)[-1]
    out <- vector("list",length(lab))
    names(out) <- lab
    for (i in lab) out[[i]] <- coplot.hazcov(as.formula(
		paste("hazard.rate~Time|",i,sep="")),object,...)
  } else {
    stop("first argument must be a hazcov object")
  }
  out
}

persp.hazcov <- function(formula,object,given.values,wt,xlab,ylab,zlab,...) {
###  formula: formula of form resp~x*y.  if resp begins with "l" or "L" then
###    log ratios will be plotted, otherwise hazard rates will be.  x and y
###    are 2 of the variables used in the smoothing (including times)
###  object: a hazcov object
###  given.values: specify values for other covariates (other than x and y in
###    the formula).  If not specified, factors are set to the first level,
###    and continuous variables to somewhere near the middle of the
###    corresponding entries in midp.
###  wt assumed to be hazard ratios (relative to the baseline values)
###    for a given covariate combination at the times in z$midp[[1]]
###    (actually on each of the time intervals)
###  xlab,ylab,zlab: axis labels in call to persp
###  ...: additional arguments passed to persp()
###
  if (length(formula)>2) vars <- all.vars(formula)
  else vars <- c("hazard rate",all.vars(formula))
  ind <- match(vars[-1],names(object$midp))
  if (any(is.na(ind))) stop("invalid variable names")
  if (is.factor(object$midp[[ind[1]]]) | is.factor(object$midp[[ind[2]]]))
    stop ("covariates must be continuous")
  r <- range(object$midp[[ind[1]]],na.rm=T)
  x <- seq(r[1],r[2],length=40)
  r <- range(object$midp[[ind[2]]],na.rm=T)
  y <- seq(r[1],r[2],length=40)
  tmpv <- names(object$midp)[-ind[1:2]]
  if (missing(given.values)) given.values <- NULL
  if (length(tmpv)>0) {
    if (!is.null(given.values)) {
      ind2 <- match(names(given.values),tmpv,0)
      if (any(ind2 == 0)) stop("invalid names in given.values")
      tmpv <- tmpv[-ind2]
    }
    if (length(tmpv)>0) {
      uu <- NULL
      for (i in tmpv) { 
	if (is.factor(object$midp[[i]])) {
	  uu <- c(uu,list(object$midp[[i]][1]))
	} else {
	  uu <- c(uu,list(median(object$midp[[i]])))
	}
      }
      names(uu) <- tmpv
      given.values <- c(given.values,uu)
    }
  }
  uu <- c(list(x=x,y=y),given.values)
  names(uu)[1:2] <- vars[2:3]
  xx <- expand.grid(uu)
  if (match(substring(vars[1],1,1),c("l","L"),0)>0) {
    u <- predict(object,xx,lrr=T)[,2]
    if (missing(zlab)) zlab <- "Log Hazard Ratio"
  } else {
    u <- predict(object,xx,lrr=F)
    if (missing(zlab)) zlab <- "Hazard Rate"
    if (!missing(wt)) {
      wtt <- as.numeric(cut(xx$Time,object$timeb))
      u <- u*wt[wtt]
    }
  }
  if (missing(xlab)) xlab <- names(object$midp)[ind[1]]
  if (missing(ylab)) ylab <- names(object$midp)[ind[2]]
  persp(x,y,matrix(u,nrow=length(x)),xlab=xlab,ylab=ylab,zlab=zlab,...)
  given.values
}


SHAR_EOF
fi

if test -f 'hazfit.s'
then
	echo shar: "will not over-write existing file 'hazfit.s'"
else
cat << \SHAR_EOF > 'hazfit.s'
plot.hazfit <- function(z,formula,given.values,baseline,mlr=-log(20),...) {
### z: hazfit output
### formula:  formula for coplot--var names correspond to names in midp
###   components. 
### given.values: actually, can include any or all of the covariates + Time,
###   and give the points where log hazard ratio estimates will be plotted.
###   Components not included are set to defaults.
### baseline: baseline covariate values.  If given needs to include values
###   for all covariates, but not time.
### additional arguments passed to coplot (or plot.hazcov if formula not given)
  if (missing(formula)) {
    for (i in 1:(length(z)-1)) plot(z[[i]],...)
    invisible()
  } else {
    if (length(formula)>2) {
      vars <- all.vars(formula)
    } else {
      stop("formula must have x and y")
    }
    ncond <- min(length(vars)-2,2)
    if (ncond==1) ngval <- 9 else ngval <- 5
    nval <- 40
    u <- NULL
    j <- NULL
    for (i in 1:(length(z)-1)) {
      u <- c(u,paste(names(z[[i]]$midp)))
      j <- c(j,rep(i,length(z[[i]]$midp)))
    }
    sub <- !duplicated(u)
    u <- u[sub]
    j <- j[sub]
    if (missing(given.values)) {
      ind <- match(vars[2],u)
      t1 <- z[[j[ind]]]$midp[[vars[2]]]
      if (!is.numeric(t1)) {
	stop("x must be numeric") 
      } else {
	given.values <- list(seq(min(t1),max(t1),length=nval))
        names(given.values) <- vars[2]
      }
    } else {
      if (!is.null(given.values[[vars[2]]])) {
	if (!is.numeric(given.values[[vars[2]]])) stop("x must be numeric") 
      } else {
	ind <- match(vars[2],u)
	t1 <- z[[j[ind]]]$midp[[vars[2]]]
	if (!is.numeric(t1)) {
	  stop("x must be numeric") 
	} else {
	  given.values[[vars[2]]] <- seq(min(t1),max(t1),length=nval)
	}
      }
    }
    for (i in 1:ncond) {
      if (is.null(given.values[[vars[i+2]]])) {
	ind <- match(vars[i+2],u)
	t1 <- z[[j[ind]]]$midp[[vars[i+2]]]
        if (is.factor(t1)) {
	  given.values[[vars[i+2]]] <- t1
	} else {
	  given.values[[vars[i+2]]] <- seq(min(t1),max(t1),length=ngval)
	}
      }
    }
    ind <- match(vars[2:(2+ncond)],u)
    u <- u[-ind]
    j <- j[-ind]
    if (length(u)>0) {
      for (i in 1:length(u)) {
	if (is.null(given.values[[u[i]]])) {
	  t1 <- z[[j[i]]]$midp[[u[i]]]
	  if (is.factor(t1)) {
	    given.values[[u[i]]] <- t1[1]
	  } else {
	    given.values[[u[i]]] <- median(t1)
	  }
	} else {
	  given.values[[u[i]]] <- given.values[[u[i]]][1]
	}
      }
    }
    given.values <- given.values[c(vars[2:(2+ncond)],u)]
    data <- expand.grid(given.values)
    if (missing(baseline)) 
      baseline <- data[1,-match("Time",names(data)),drop=F]
    lhr <- predict(z,data,baseline,mlr)
    lhr <- lhr %*% rep(1,ncol(lhr))
    data <- cbind(c(lhr),data[,1:(1+ncond)])
    names(data)[1] <- vars[1]
    coplot(formula,data,given.values[2:(1+ncond)],panel=lines,...)
    list(given.values=given.values,baseline=baseline)
  }
}

print.hazfit <- function(z) {
### z: hazfit output
  print(z$fit$call)
  cat("Termination SS: ",format(signif(z$fit$swsq,3)),"\n")
  for (i in 1:(length(z)-1)) {
    cat("\nTerm # ",format(i),"\n")
    print(z[[i]])
  }
  invisible()
}

predict.hazfit <- function(z,newdata,baseline,mlr=-log(20)) {
### z: hazfit output
### newdata: data frame containing a column of Time values + columns for each
###   variable in each term of z, with names as given in names(z[[i]]$midp)
### baseline: (optional) if given, should be a list or data frame containing
###   same variables as newdata, except without time, with one entry per
###   component giving the baseline value for the hazard ratio estimate.
### returns a matrix giving the log hazard ratio (rel to baseline in each
###   term) for each term in z, for each row of newdata.
  nc <- length(z)-1
  out <- matrix(NA,nrow=nrow(newdata),ncol=nc)
  if (missing(baseline)) {
    for (i in 1:nc)
      out[,i] <- predict(z[[i]],newdata,lrr=T,mlr=mlr)[,2]
  } else {
    for (i in 1:nc)
      out[,i] <- predict(z[[i]],newdata,lrr=T,baseline=baseline,mlr=mlr)[,2]
  }
  out
}

hazfit <- function(formula,data,subset,na.action=na.omit,failcode=1,
    ntg=15,mlr=-log(4),max.iter=7,eps=.001,ls=F,...) {
### formula:  model formula with a response evaluating to a matrix with first
###   column giving survival times and second component the status (eg
###   cbind(time,status)), and the terms giving calls to the function hs().
###   The first argument in the call to hs should be the covariate terms part
###   of a formula appropriate for calling hazcov.  Other arguments to hazcov
###   can be given as arguments to hs (which then only apply to that term) or
###   as part of ... (which are included in every term).  Arguments with
###   values T or F should not be included in hs(), and specifying data,
###   subset, na.action, ls, ntg, rtime, wt, failcode, eps, max.iter in hs()
###   will have no effect.
### data: data frame where everything is taken from (optional)
### subset: specify subset of data to be included in the analysis (if data is
###   given any variable names used in subset must be in data).
### na.action: usual term to specify how to handle NAs models.
### failcode: value of status variable for failures (all other values
###   censored)
### ntg: number of time intervals to use
### mlr: minimum value for log hazard ratios--log ratios truncated at +/- mlr
### max.iter: max number of iterations in the backfitting algorithm
### eps: covergence threshhold (smaller values require tighter convergence)
### ls: if T, use local scoring instead of direct estimation
### ...: additional arguments passed to hazcov.  In addition to the arguments
###   explicitly listed in hazfit, arguments wt and lrr have no effect. 
  call <- match.call()
  varnames <- all.vars(formula)
### Note, arguments with T or F values in hazcov should be passed through
###  the call to hazfit, not as args to hs, as all.vars will interpret them as
###  variable names. 
  if (!missing(data)) {
    if (!missing(subset)) subset <- eval(substitute(subset),data)
    data <- data[,varnames]
  } else {
    if (is.matrix(eval(as.name(varnames[1]),sys.parent(1)))) {
      data <- paste("data.frame(I(",varnames[1],"),",paste(varnames[-1],
         collapse=","),")",sep="")
    } else {
      data <- paste("data.frame(",paste(varnames,collapse=","),")",sep="")
    }
    data <- eval(parse(text=data),sys.parent())
    names(data) <- varnames
  }
  if (!missing(subset)) data <- data[subset,]
  data <- na.action(data)
  timev <- data[,varnames[1]]
  if (is.matrix(timev)) timev <- timev[,1]
  trms <- terms(formula)
### embedded function #######################################
  hs <- function(form,...) {
    ## Just returns the arguments as a named list.  form should be the
    ## covariate part of the formula for a call to hazcov.
    c(list(formula=substitute(form)),list(...))
  }
#############################################################
  rbv <- zo <- ccll <- vector("list",length=length(trms))
  ll <- length(ccll)
  for (i in 1:ll) {
    if (substring(as.character(trms[i]),1,2) != "hs") 
      stop("model terms must be calls to function hs()")
    u <- eval(trms[i])
    form <- formula
    form[[3]] <- u[[1]]
    u[[1]] <- form
    u$wt <- u$ntg <- u$rtime <- u$ls <- u$eps <- u$max.iter <- NULL
    u$subset <- NULL
    u <- c(u,list(...))
    u$ls <- ls; u$ntg <- ntg; u$data <- as.name("data"); 
    u$wt <- as.name("wt"); u$failcode <- failcode; u$max.iter <- 1;
    ccll[[i]] <- u
  }
  nn <- nrow(data)
  wt <- matrix(1,nrow=ntg,ncol=nn)
  wto <- wt
  swsq <- 10000
############## embedded functions haz.ratio and reweight.hazcov ############
#################(no idea whether only defining these locally within
#################hazfit is sensible or not)#################################
  haz.ratio <- function(z,bv,mlr=-log(4)) {
### calculate log relative risks (zz)
### relative risks are relative to a baseline value of the covariates
### the following chooses the baseline value to be the values for the
### cell with the largest mft in the last time interval (by default, unless
### bv is specified in the call)
### this is calculated using the truncated estimator in the denominator,
### because, hazfit() needs estimates calculated wherever possible, even if
### they are not very good.
### mlr: log ratio truncated at +/- mlr
    if (missing(bv)) {
      mtx <- max(z$bin$Time)
      sub <- z$bin$mft == max(z$bin$mft[z$bin$Time==mtx]) & 
         z$bin$Time == mtx
      ind <- match(T,sub)
      bv <- z$bin[ind,names(z$midp)[-1],drop=F]
    }
    ntg <- z$dim[1]
    ncov <- length(z$dim)-1
    if (z$ls) { 
      zp <- z$bin$haz
    } else {
      ind <- match(z$bin$Time,z$midp[[1]])
      rminpr <- z$minpr*z$swt[ind]
      zp1 <- pmax(0,predict(z$fitn))
      zp2 <- pmax(rminpr,predict(z$fitd))
      zp <- zp1/zp2
    }
    sub <- rep(T,nrow(z$bin))
    for (i in 1:ncov) {
      sub <- sub & z$bin[,names(z$midp)[i+1]] == bv[,i]
    }
    if (length(sub[sub]) < ntg) {
      ind <- match(z$bin$Time[sub],z$midp[[1]])
      mid <- rep(NA,ntg)
      mid[ind] <- zp[sub]
    } else {
      mid <- zp[sub]
    }
    cell <- as.numeric(z$cell)
    zz <- rep(NA,max(cell)*ntg)
    zz[!z$exclude] <- zp
    zz <- zz/mid
    zm <- exp(mlr)
    zz <- pmax(zm,zz)
    zz <- pmin(1/zm,zz)
    zz <- matrix(zz,nrow=ntg)
    list(ratio=zz[,cell],bv=bv)
  }
#####################################################################
reweight.hazcov <- function(zn,wt,time)
{
### updates output from hazcov with a new set of weights
### Arguments:
###   zn: output from hazcov
###   wt: must be ntg by length(time). (i,j) element gives a prior 
###     relative risk (like an offset) for case i in time interval j.
###   fast:  if true, uses fortran routine, instead of S commands, for
###     calculating weighted follow-up times in cells
###
### set-up
  ncov <- length(zn$midp)-1
  n <- length(time)
  ntg <- zn$dim[1]
  ncg <- zn$dim[-1]
  minpr <- zn$minpr
  if (ncol(wt) != n | nrow(wt) != ntg) stop("invalid dim(wt)")
  tbrk <- zn$timeb
  dt <- diff(tbrk)
  celli <- zn$cell
  nc <- length(levels(celli))
  nl <- nc*ntg   # total number cells
  exclude <- zn$exclude
  wt <- ifelse(is.na(wt),0,wt)
  swt <- (wt %*% rep(1,n)) / ((wt>0) %*% rep(1,n))
  fast <- F
  if (fast) {
    zft <- .Fortran("hazgrp",as.double(time),as.integer(rep(0,n)),
       as.double(wt),as.integer(celli),as.integer(n),as.integer(ntg),
       as.double(tbrk),as.integer(nl),zft=double(nl),zfs=integer(nl),
       nnn=integer(nc))$zft[!exclude]
### On return from call to "hazgrp" components of z are
###   zft: sum of the weighted follow-up times in each cell (cov*time)
  } else {
    cell <- as.numeric(celli)
    o <- order(cell)
    sub <- c(diff(cell[o])>0,T)
    zft <- matrix(0,nrow=ntg,ncol=nc)
    for (i in 1:ntg) {
      ft <- pmin(pmax(time-tbrk[i],0),tbrk[i+1]-tbrk[i])*wt[i,]
      ft <- cumsum(ft[o])[sub]
      ft  <- c(ft[1],diff(ft))
      zft[i,] <- ft
    }
    zft <- c(zft)[!exclude]
  }
### set things up and smooth to get fail probs and expected follow-up 
  if (any(zft<=0)) warning("using components of zft with values <=0")
  dt <- rep(dt,nc)[!exclude]
  rminpr <- minpr*(rep(swt,nc)[!exclude])
  div <- zn$bin$n*dt
  HD.TMP <- zn$bin[,-1]
  if (zn$ls) {
    a.new <- log(zn$bin$haz)
    a.old <- a.new
    zftn <- zft*exp(a.old)
    HD.TMP$mft <- zftn/div
### if have problems with zft<= 0 (warning above), possibly use subset =
### mft>0 in update, then predict at HD.TMP.
    z <- update(zn$fitd)
    zp1 <- pmax(predict(z),rminpr*zftn/zft)*div
    HD.TMP$wtt <- zp1
    HD.TMP$nf <- a.old+(zn$bin$zfs-zftn)/zp1
    z2 <- update(zn$fitn)
    a.new <- predict(z2)
    zp <- exp(a.new)
    aic <- -2*sum(a.new*zn$bin$zfs-zft*zp) +2*z2$inference$trace.hat
    aic <- aic/sum(HD.TMP$n)
    zn$aic <- aic
    zn$rss <- NULL
    zn$fitn <- z2
  } else {
    HD.TMP$mft <- zft/div
### if have problems with zft<= 0 (warning above), possibly use subset =
### mft>0 in update, then predict at HD.TMP.
    z <- update(zn$fitd)
    zp1 <- predict(z)
    zp1 <- ifelse(zp1 < rminpr, NA, zp1)
    zp2 <- pmax(0,predict(zn$fitn))
    zp <- zp2/zp1
  }
### OUTPUT:
###   bin: data frame for smoothing (HD.TMP above) with the est hazards (haz)
###      and log haz ratios (logratio) for each cell 
###   fitd: loess output from denominator smooth
###   swt:  n/sum of weights on each time interval
###   other components not changed from zn. (In local scoring aic and fitn
###   also updated above)
    structure(c(zn[1],list(bin=data.frame(haz=zp,HD.TMP),
      fitd=z,swt=swt),zn[-(1:4)]),class="hazcov")
}
#####################################################################
  for (i in 1:ll) {
    zo[[i]] <- do.call("hazcov",ccll[[i]])
    rbv[[i]] <- haz.ratio(zo[[i]],mlr=mlr)
    wt <- wt*rbv[[i]]$ratio
  }
  swsq <- sum((wt-wto)^2,na.rm=T)/sum(wt^2,na.rm=T)
  wto <- wt
  it <- 1
  if (ll > 1 | ls) {
    print(c(it=it,swsq=swsq))
    while (swsq> eps & it < max.iter) {
###  wt=exp(g1+g2+...+gll), where the gj are smooth functions of covariates
###   and time, and ww[[j]]=exp(gj).  wt is initialized to 1, and then
###   backfitting is used to restimate each gj by calling hazcov with weights
###   wt/ww[[j]].
      it <- it+1
      for (i in 1:ll) {
        wt <- wt/rbv[[i]]$ratio
        zo[[i]] <- reweight.hazcov(zo[[i]],wt,timev)
        rbv[[i]] <- haz.ratio(zo[[i]],bv=rbv[[i]]$bv,mlr=mlr)
        wt <- wt*rbv[[i]]$ratio
      }
      swsq <- sum((wt-wto)^2,na.rm=T)/sum(wt^2,na.rm=T)
      wto <- wt
### note: terminates when relative change in sum of squared diffs in fitted 
###   values (wts) is small. 
      print(c(it=it,swsq=swsq))
    }
  }
  structure(c(zo,list(fit=list(call=call,wt=wt,swsq=swsq))),class="hazfit")
}

SHAR_EOF
fi

if test -f 'hazcov.f'
then
	echo shar: "will not over-write existing file 'hazcov.f'"
else
cat << \SHAR_EOF > 'hazcov.f'
      subroutine hazgrp(time,ist,wt,icell,n,ntg,tbrk,nl,haz,
     $     iwk,nn)
c Input:
c   time: failure times
c   ist: status, 1=failure, 0=censored
c   wt: matrix ntg x n giving for each case on each interval the prior
c        relative risk weights
c   icell: cov comb # for each case.
c   n: number of cases
c   ncov: number of covariates
c   ncg: vector length ncov giving the number of intervals for each covariate
c   ntg: number of time intervals
c   tbrk: boundaries of time intervals
c   nl: total number of cells (unique combinations of covariate intervals and
c        time intervals)
c   
c Output:
c   haz: sum of weighted follow-up time contributions for each cell (times 
c   varying fastest, within levels of icell)
c   iwk: number of failures observed in each cell
c   nn: number of cases for each value of cell.
c
      double precision time(n),wt(ntg,n),tbrk(0:ntg),haz(nl)
      integer ist(n),iwk(nl),nn(nl/ntg),icell(n)
      nc=nl/ntg
      do 10 i=1,nl
         haz(i)=0
         iwk(i)=0
 10   continue
      do 11 i=1,nc
         nn(i)=0
 11   continue
      do 20 i=1,n
         ll=(icell(i)-1)*ntg
         nn(icell(i))=nn(icell(i))+1
         do 22 j=1,ntg
            if (time(i).gt.tbrk(j)) then
c               if (wt(j,i).le.0) call intpr("i",1,i,1)
               haz(ll+j)=haz(ll+j)+wt(j,i)*(tbrk(j)-tbrk(j-1))
            else if (time(i).gt.tbrk(j-1)) then
c               if (wt(j,i).le.0) call intpr("i",1,i,1)
               haz(ll+j)=haz(ll+j)+wt(j,i)*(time(i)-tbrk(j-1))
               iwk(ll+j)=iwk(ll+j)+ist(i)
            endif
 22      continue
 20   continue
      return
      end

SHAR_EOF
fi

if test -f 'coplot.hazcov.d'
then
	echo shar: "will not over-write existing file 'coplot.hazcov.d'"
else
cat << \SHAR_EOF > 'coplot.hazcov.d'
.BG
.FN coplot.hazcov
.TL
Creates coplots from hazcov output.
.DN
Creates coplots from hazcov output.
.CS
coplot.hazcov(formula, object, given.values, wt, cl=F, xlim,
              ylim, ...)
.RA
.AG formula
Must be a valid formula for coplot, eg one x covariate and one or two given
covariates.  The response variable can be most anything--if it starts with an
"l" or "L" then logratios will be plotted, otherwise hazards will be.  Other
than the response variable, the names in formula must correspond to names in
object$midp. 
.AG object
An object of class hazcov (output from function hazcov())
.OA
.AG given.values
A list giving the values of the "given covariates" in formula where plots
should be made.  Optionally, this can also include values for covariates (and
Time) not included in formula.  Defaults to reasonable values.
.AG wt
(length=ntg) optional vector of prior relative risks.  hazards on time
interval i are multiplied by wt[i].
.AG cl
If T, vertical line segments indicating +/- 2 standard errors will be added
at several points.  Only available for hazard plots without prior relative
risks (wt).
.AG xlim,ylim
Usual xlim and ylim arguments to plotting functions.  Default to the range of
the data to be plotted.
.AG ...
Additional arguments passed to coplot.
.RT
Returns a list giving the values used for the "given covariates" in formula,
and also the values used for any covariates (or Time) in the model which are
not in formula.
.SE
Creates a coplot on the current device.  
.DT
If given.values are not specified for the "given covariates", then for
numeric variables 9 values
are chosen for 1 given cov or 5 values if 2, using quantiles of the
components of object$midp, and for factors all the factor levels are used.
For variables in model but not in formula, the defaults for computing
estimates are to use the median of the components of object$midp for numeric
variables, and to use the first component of the factor levels in the
component of z$midp for factors.  If given.values are specified for variables
not in formula, only one value each should be given.

Names in given.values and in formula (except for the response) must
correspond to names in object$midp.
.SA
hazcov plot.hazcov persp.hazcov
.EX
z <- hazcov(cbind(TTR,Fail.Ind)~ER+Log.Nodes+Tumor.Size,data=jd,span=.20)
coplot.hazcov(Hazard.Rate~Time|Log.Nodes*ER,z,given.values=
  list(Log.Nodes=log(c(1,3,6,10)),ER=as.factor(c("Pos","Neg")),
       Tumor.Size=3))

.KW hazcov
.WR
SHAR_EOF
fi
if test -f 'hazcov.d'
then
	echo shar: "will not over-write existing file 'hazcov.d'"
else
cat << \SHAR_EOF > 'hazcov.d'
.BG
.FN hazcov
.TL
Hazard Smoothing with Multiple Covariates
.DN
Estimates the hazard function for censored survival data with covariates,
using the loess() function.
.CS
hazcov(formula, data, subset, na.action=na.omit, wt, ncg=15, ntg=15, 
       failcode=1, minpr=0.05, rtime, degree=1, ls=F, eps=0.001, 
       max.iter=5, ...)
.RA
.AG formula
A model formula.  The response should be a matrix with event time in the
first column and status (censored or failure) in the second, or a function
which evaluates to such a matrix (eg cbind(survival,status) or
Surv(survival,status) ).
Covariates can be specified in the form cov1+cov2+..., although in the
smoothing the notation cov1*cov2*... would be more accurate, since a full
interaction model is fit (the terms in the formula are just used to define
which covariates to include).  Forms such as log(cov1) can be included,
although this can create some confusion about names to use in subsequent
calls to predict() and plot().
.OA
.AG data
data frame containing the data
.AG subset
standard subset argument for modeling functions
.AG na.action
standard na.action, except the default in hazcov is set to na.omit rather
than na.fail.
.AG wt
if given, must be ntg by length(time).  (j,i) element gives a prior 
relative risk (like an offset) for case i in time interval j.  Default
is all 1's.
.AG ncg
number of covariate groups.  Covariates are grouped by using quantiles of the
marginal distribution of each covariate, and taking the tensor product of
these groups.  ncg[i] specifies the number of intervals to use for the ith
covariate.  If a single value the same value is used for all.
.AG ntg
The number of time intervals to use (number of events and follow-up times are
computed for each interval for each covariate group--this is the data
actually used as the response in the calls to loess()).
.AG failcode
The code in the second column of the response which actually denotes failure.
All other values treated as being censored.
.AG minpr
hazard estimates will only be given where the smoothed estimates in the
denominator (roughly of the probability of still being at risk) are >= this
value (if ls=F). 
.AG rtime
The range of follow-up times to include in the estimation.
.AG degree
degree of the polynomial for the loess smoothing (included as a specific
argument to set the default to 1 [for no particularly good reason]).
.AG ls
If T, an iterative local scoring algorithm is used.
.AG eps
termination criteria for local scoring--smaller values should require more
iterations. 
.AG max.iter
maximum number of iterations in local scoring.
.AG ...
Additional arguments to loess.  In particular, span would often be specified.
.RT

Returns a list of class "hazcov", with components "call" giving the call to
hazcov, "bin" giving the data frame of binned data actually used in the
smoothing operations (and the hazard estimates at each point),  "fitd" giving
the loess output from the denominator smooth, "swt" 
giving n/(sum of prior weights wt) on each time interval, "fitn" giving the
loess output for the numerator smooth, "timeb" giving the boundaries of the
time intervals, "covb" (vector) giving the boundaries of the covariate
groups, "dim" giving the number of time intervals [1]
and covariate intervals [2] to [ncov+1], "n" giving sample size and number
censored in the original data (after subset and na.action()), "midp" a list
giving the the actual 
covariate values used in the binned data with names corresponding to names
used in the formulas in the calls to loess, "minpr"
and "ls" giving the input values of these parameters, "cell" giving the cell
number of the ith case in the grouped covariate cells, "exclude" a logical
vector with T for those cells in the full tensor product of 
time x covariate groups which do not have any follow-up time (these cells are
excluded from the smoothing), "aic" giving the value of the aic statistic
(only if local scoring used), and "css" giving the weighted average squared
change in estimates at the last local scoring interation (only if local
scoring used).

.DT
Performs either the direct estimation (ls=F) or local scoring algorithm for
hazard estimation described in the references below.  First the data are
grouped on the covariate values, using the quantiles of the marginal
distributions of the covariates (for numeric covariates) or the factor levels
for factors.  Time intervals are formed and the number of events and total
follow-up time computed for each interval for each covariate combination.
In direct estimation, these number of event and follow-up time contributions
are then separately smoothed on the midpoints of the time intervals and
covariate groups using the loess() function, and the hazard estimate formed
by taking the ratio.  For local scoring, an approximate likelihood is formed
taking the hazard to be constant on each of the time interval/covariate group
combinations.  The log hazards are then estimated using a standard local
scoring algorithm.

Any special terms in the formula, such as log(cov) etc, are evaluated prior
to forming the covariate groups.  This means that in the calls to loess,
these are replaced by modified versions of the names (eg log.cov.).  These
modified versions must be used in subsequent calls to predict or plot.
.SH REFERENCES

Gray RJ (1996) Hazard rate regression using ordinary nonparametric regression
smoothers, JCGS 5:190-207.

Gray RJ (1994) Hazard estimation with covariates: algorithms for direct
estimation, local scoring and backfitting, Technical Report 784Z, Div of
Biostatistics, Dana-Farber Cancer Institute. 

.SA
print.hazcov plot.hazcov predict.hazcov coplot.hazcov persp.hazcov wcp hazfit
.EX
z <- hazcov(cbind(TTR,Fail.Ind)~ER+Log.Nodes+Tumor.Size,data=jd,span=.20)
.KW survival 
.KW smooth
.WR
SHAR_EOF
fi
if test -f 'hazfit.d'
then
	echo shar: "will not over-write existing file 'hazfit.d'"
else
cat << \SHAR_EOF > 'hazfit.d'
.BG
.FN hazfit
.TL
Hazard estimation with covariates using structured (limited interaction)
models. 
.DN
Estimates hazards/hazard ratios using hazcov(), but by fitting a model with
several additive terms using backfitting.
.CS
hazfit(formula, data, subset, na.action=na.omit, failcode=1, ntg=15, 
       mlr= - log(4), max.iter=7, eps=0.001, ls=F, ...)
.RA
.AG formula
model formula with response a matrix (or function evaluating to a matrix)
with first column giving survival (or other type event) times and second
column the status (eg 
cbind(time,status)), and the terms giving calls to the function hs().
The first argument in the call to hs should be the covariate terms part
of a formula appropriate for calling hazcov().  Other arguments to hazcov
can be given as arguments to hs (which then only apply to that term) or
as part of ... (which are included in every term).  Arguments with
values T or F should not be included in hs(), and specifying data,
subset, ls, ntg, rtime, wt, failcode, eps, or max.iter in hs() will
have no effect.
.OA
.AG data
data frame containing the data to use
.AG subset
usual subset argument to modeling functions
.AG na.action
usual na.action function to modeling functions, except the default here is
na.omit(). 
.AG failcode
The code in the second column of the response which actually denotes failure.
All other values treated as being censored.
.AG ntg
The number of time intervals to use (number events and follow-up time are
computed for each interval for each covariate group--this is the data
actually used as the response in the calls to loess()).
.AG mlr
log relative risks will be truncated at +/- mlr.  Note that mlr should be
negative, since it is the smallest (ie largest negative) value allowed for
the log ratio.
.AG max.iter
maximum number of iterations of the backfitting algorithm
.AG eps
Termination criteria for backfitting.  Smaller values should require more
iterations.
.AG ls
If T, local scoring is used.  (T tends to work better, and maybe should be
the default)
.AG ...
additional arguments to hazcov().  Note that many of these can also be
specified separately for each term by including them in the hs() terms.
.RT
Returns a list of class "hazfit", giving a hazcov object for each hs() term
in the model, and a component $fit, which is also a list with components
$call giving the call to hazfit(), $wt giving the overall estimated hazard
ratio for each case on each interval (the product of the ratios from each
term), and $swsq giving the average weighted sum of squares from the final
iteration (the sum of squares used to determine convergence).

.DT
hazcov() essentially fits a full interaction model, while hazfit() uses
backfitting to fit partially additive models (see Examples).  The details of
the algorithms are given in the references.  Hazard ratios for all terms are
allowed to vary with time. 

.SH REFERENCES

Gray RJ (1996) Hazard rate regression using ordinary nonparametric regression
smoothers, JCGS 5:190-207.

Gray RJ (1994) Hazard estimation with covariates: algorithms for direct
estimation, local scoring and backfitting, Technical Report 784Z, Div of
Biostatistics, Dana-Farber Cancer Institute. 

.SA
hazcov plot.hazfit predict.hazfit print.hazfit
.EX
z <- hazfit(cbind(TTR,Fail.Ind)~hs(ER,span=.5)+
      hs(Log.Nodes+Tumor.Size,ncg=20,span=.4),data=jd,ntg=20)
# Allows the hazard ratio for ER to change with time, and an interaction
# between Log.Nodes and Tumor.Size (which can also change with time), but
# assumes no interaction between ER and the joint effect of Log.Nodes and
# Tumor.Size.

.KW survival
.KW smooth
.WR
SHAR_EOF
fi

if test -f 'persp.hazcov.d'
then
	echo shar: "will not over-write existing file 'persp.hazcov.d'"
else
cat << \SHAR_EOF > 'persp.hazcov.d'
.BG
.FN persp.hazcov
.TL
Perspective plots of hazard estimates
.DN
Function to create perspective plots using output from hazcov()
.CS
persp.hazcov(formula, object, given.values, wt, xlab, ylab, zlab, ...)
.RA
.AG formula
Model formula of the form resp~cov1*cov2, where if resp begins with an "l" or
"L" log ratios will be plotted, otherwise hazards will be, and cov1 and cov2
specify the x and y covariates in the plot.  These covariates must be names
corresponding to terms in object$midp (including Time), and must be numeric
variables. 
.AG object
An object of class hazcov (output from function hazcov())
.OA
.AG given.values
A list giving values for covariates (including Time) in the model but not in
formula.  If specified, estimates will be calculated at these values.  Names
of the components in the list must match the names in object$midp.
.AG wt
(length=ntg) optional vector of prior relative risks.  hazards on time
interval i are multiplied by wt[i].
.AG xlab,ylab,zlab
axis labels in the plot.
.AG ...
additional arguments passed to persp()
.RT
returns a list giving the values used for covariates not in formula.
.SE
Creates a perspective plot on the current device.  
.DT
The values used for covariates not in formula are the median of the
components of object$midp for numeric 
variables, and the first component of the factor levels in the
component of z$midp for factors.  If given.values are specified, only one
value each should be given. 
.SA
hazcov plot.hazcov coplot.hazcov
.EX
z <- hazcov(cbind(TTR,Fail.Ind)~ER+Log.Nodes+Tumor.Size,data=jd,span=.20)
persp.hazcov(Hazard.Rate~Time*Log.Nodes,z,given.values=
  list(ER=as.factor(c("Pos")),Tumor.Size=3))
persp.hazcov(Hazard.Rate~Time*Log.Nodes,z,given.values=
  list(ER=as.factor(c("Neg")),Tumor.Size=3))

.KW hazcov
.WR
SHAR_EOF
fi
if test -f 'plot.hazcov.d'
then
	echo shar: "will not over-write existing file 'plot.hazcov.d'"
else
cat << \SHAR_EOF > 'plot.hazcov.d'
.BG
.FN plot.hazcov
.TL
Coplots of Estimated Hazards
.DN
plot method for hazcov objects
.CS
plot.hazcov(object, formula, ...)
.RA
.AG object
Object of class hazcov (output from function hazcov())
.OA
.AG formula
A formula suitable for coplot.hazcov, essentially a valid formula for a call
to coplot.
.AG ...
Additional arguments for coplot.hazcov.
.RT
A list giving values of covariates (and Time) used in constructing the plots.
.SE
One or more plots created on the current device.
.DT
If formula is specified, the coplot corresponding to this formula is
constructed.  Otherwise the function loops over the covariates giving the
coplot with Time as the x variable and each covariate in turn as the given
variable.  See coplot.hazcov for additional details.
.SA
hazcov coplot.hazcov persp.hazcov
.EX
z <- hazcov(cbind(TTR,Fail.Ind)~ER+Log.Nodes+Tumor.Size,data=jd,span=.20)
plot(z,Hazard.Rate~Time|Log.Nodes,given.values=
     list(Log.Nodes=log(c(1,3,6,10))))

.KW hazcov
.WR
SHAR_EOF
fi
if test -f 'plot.hazfit.d'
then
	echo shar: "will not over-write existing file 'plot.hazfit.d'"
else
cat << \SHAR_EOF > 'plot.hazfit.d'
.BG
.FN plot.hazfit
.TL
log hazard ratio plots from hazfit objects
.DN
plotting method for objects of class "hazfit"
.CS
plot.hazfit(z, formula, given.values, baseline, mlr=-log(20), ...)
.RA
.AG z
object of class "hazfit" (output from function hazfit())
.OA
.AG formula
A formula appropriate for a call to coplot.  Names used in the formula must
correspond to names of the z[[i]]$midp components, except for the response.
The response is not actually used for much, but is required(it does provide a
default ylab). 
.AG given.values
list or data frame.  Can include any or all of the covariates + Time. 
Estimating the log ratios requires specifying values for all covariates in
the model, even the ones not in formula.  This provides a means to specify
both the given values for the given variables in formula, and values for
other covariates not included in the formula. Components not included are set
to defaults. 

.AG baseline
list (each component of length 1) or a data frame with one row, giving the
baseline value to use in the hazard ratios.  If given, must include all the
covariates in the model (but not Time), and the names of the components in
the list/data frame must match the names used in the z[[i]$midp components.
.AG mlr
log hazard ratios for each term are truncated at +/- this value (must be < 0).
.AG ...
other arguments passed to coplot.
.RT
returns the list of given.values actually used in the plots
.SE
creates one or more plots on the current graphics device.
.SA
hazfit plot.hazcov coplot.hazcov
.EX
z <- hazfit(cbind(TTR,Fail.Ind)~hs(ER,span=.5)+
      hs(Log.Nodes+Tumor.Size,ncg=20,span=.4),data=jd,ntg=20)
plot(z,Log.Ratio~Log.Nodes|Tumor.Size*Time,given.values=list(Tumor.Size=
  c(1,3,5,7),Time=1:5,ER=as.factor("Pos")),baseline=list(Log.Nodes=0,
  Tumor.Size=5,ER=as.factor("Neg")))
.KW hazfit
.WR
SHAR_EOF
fi
if test -f 'predict.hazcov.d'
then
	echo shar: "will not over-write existing file 'predict.hazcov.d'"
else
cat << \SHAR_EOF > 'predict.hazcov.d'
.BG
.FN predict.hazcov
.TL
Hazard (and log hazard ratio) estimates from hazcov objects
.DN
Predict method for hazcov objects
.CS
predict.hazcov(z, newdata, se.fit=F, lrr=F, baseline, mlr=-log(4), 
               minpr=z$minpr)
.RA
.AG z
Object of class hazcov (output from function hazcov())
.AG newdata
Data frame containing new covariate and time values where estimates are to be
computed.  The names in newdata must correspond to the names in z$midp.
.OA
.AG se.fit
If T, calculates standard errors for the hazard estimates.
.AG lrr
If T, calculates log hazard ratios in addition to the hazard estimates.
.AG baseline
baseline value of covariates for computation of hazard ratios.  Default tries
to choose a value where hazard estimate is defined on all intervals.
.AG mlr
The minimum value for log hazard ratios.  The log ratios are truncated at +/-
this value.  (Must be < 0.)
.AG minpr
hazard estimates will only be given where the smoothed estimates in the
denominator (roughly of the probability of still being at risk) are >= this
value (if ls=F). 
.RT
If se.fit==lrr==F, a vector giving the hazard estimates at the points in
newdata.  Otherwise a data frame with columns "haz" giving the hazard
estimates, "logratio" giving the log hazard ratios (if lrr=T), and "se"
giving the standard errors of the hazard estimates (if se.fit=T).
.DT
The names of newdata must correspond to the names in z$midp, which
might not be exactly the same as the names in the formula in the original
call to hazcov.  For example, log(cov) in the original call would be 
log.cov. in z$midp, and values on the log scale would need to be given in
newdata. 

Standard errors are computed as discussed in the references below.
Calculation of standard errors tends to run out of memory if nrow(z$bin) is
very large.
.SH REFERENCES

Gray RJ (1996) Hazard rate regression using ordinary nonparametric regression
smoothers, JCGS 5:190-207.

Gray RJ (1994) Hazard estimation with covariates: algorithms for direct
estimation, local scoring and backfitting, Technical Report 784Z, Div of
Biostatistics, Dana-Farber Cancer Institute. 

.SA
hazcov
.EX
z <- hazcov(cbind(TTR,Fail.Ind)~ER+Log.Nodes+Tumor.Size,data=jd,span=.20)
x <- expand.grid(Time=seq(.5,10,length=30),ER=as.factor("Pos"),
       Log.Nodes=log(c(1,3,6,10)),Tumor.Size=c(2,4,7))
zp <- predict(z,x)
.KW hazcov
.WR
SHAR_EOF
fi

if test -f 'predict.hazfit.d'
then
	echo shar: "will not over-write existing file 'predict.hazfit.d'"
else
cat << \SHAR_EOF > 'predict.hazfit.d'
.BG
.FN predict.hazfit
.TL
Estimate hazard ratios
.DN
predict method for objects of class "hazfit"
.CS
predict.hazfit(z, newdata, baseline, mlr= - log(20))
.RA
.AG z
object of class "hazfit"
.AG newdata
Data frame containing new covariate and time values where estimates are to be
computed.  Needs to have values for all covariates in all terms, with names
that match the names of the corresponding components of z[[i]]$midp.
.OA
.AG baseline
baseline value of covariates.  If given, should be a list or data frame
containing same variables as newdata, except without time, with one entry per 
component (if a list), or one row (if a data frame).  Components must be
named, with names matching corresponding components of the z[[i]]$midp
.AG mlr
log hazard ratios for each term are truncated at +/- this value (must be < 0).
.RT
returns a matrix with rows corresponding to rows of newdata, and one column
for each hs() term in the hazfit model, giving for each term the contribution
for that term to the estimated log hazard ratio at the points in newdata.
The sum of the columns gives the overall log hazard ratio for the points in
newdata to the baseline.
.SA
hazfit predict.hazcov
.EX
z <- hazfit(cbind(TTR,Fail.Ind)~hs(ER,span=.5)+
      hs(Log.Nodes+Tumor.Size,ncg=20,span=.4),data=jd,ntg=20)
newd <- expand.grid(Time=1:5,ER=as.factor(c("Pos","Neg")),Log.Nodes=
                    log(c(1,4,8)),Tumor.Size=5)
bv <- list(ER="Pos",Log.Nodes=0,Tumor.Size=2)
zp <- predict(z,newd,bv)
## returns a 30 x 2 matrix giving in the first column the overall 
## hazard ratio for the ER term which is only a function of ER and 
## time, and the second column giving  the Log.Nodes*Tumor.Size term
## which is a function of time, Log.Nodes and Tumor.Size
.KW hazfit
.WR
SHAR_EOF
fi
if test -f 'print.hazcov.d'
then
	echo shar: "will not over-write existing file 'print.hazcov.d'"
else
cat << \SHAR_EOF > 'print.hazcov.d'
.BG
.FN print.hazcov
.TL
Prints a summary of a hazcov object
.DN
Print method for hazcov objects
.CS
print.hazcov(z)
.RA
.AG z
An object of class hazcov (output from function hazcov())
.SE
Prints the call, sample size and number of censored observations, number of
time and covariate groups in the binned data, total number of bins with
follow-up time contributions, a summary of residuals, where the residual are
defined by 
  (average number of events)-(average follow-up time)*(hazard estimate)

for each bin, and also prints the print.loess() output for the numerator
smooth. 
.SA
hazcov
.KW hazcov
.WR
SHAR_EOF
fi

if test -f 'print.hazfit.d'
then
	echo shar: "will not over-write existing file 'print.hazfit.d'"
else
cat << \SHAR_EOF > 'print.hazfit.d'
.BG
.FN print.hazfit
.TL
Print hazfit objects
.DN
print method for objects of class hazfit
.CS
print.hazfit(z)
.RA
.AG z
object of class hazfit
.RT
none
.SE
prints the call, average SS at final iteration, and calls print.hazcov() for
each of the terms in the hazfit model.
.SA
hazfit
.KW hazfit
.WR
SHAR_EOF
fi

if test -f 'wcp.d'
then
	echo shar: "will not over-write existing file 'wcp.d'"
else
cat << \SHAR_EOF > 'wcp.d'
.BG
.FN wcp
.TL
Calc weighted cp statistic for hazcov objects
.DN
Calc weighted cp statistic for hazcov objects
.CS
wcp(z, zb, minpr=zb$minpr, mt=max(z$bin$Time))
.RA
.AG z
An object of class hazcov (output from hazcov()).
.AG zb
Another hazcov object (should be either the same as z or output from the same
model but using a different span).  The estimates of the at risk
probabilities for the wcp statistic are taken from zb.
.OA
.AG minpr
only points with smoothed estimates in the denominator (roughly of the
probability of still being at risk) >= this value (if z$ls=F) are included in
the calculation. 
.AG mt
if z$ls=F, can restrict the range of times used in calculating wcp to times
<= mt.
.RT
The wcp statistic, defined in the reference below.
.DT
The reason for having two objects as arguments is so the same estimates of
the at risk probabilities can be used in calculating the wcp statististics at
a number of different spans.  (Since contributions are weighted by the
inverse of these, wcp's for different spans can be sensitive to changes in
these.)  

If ls=T this just returns the aic component of z.

If ls=F, the hazcov models need to have the exact trace calculated.  If loess
does not default to this, then specify trace.hat="exact" in the call to
hazcov.  Note though that this can be very time consuming on large data sets.
.SH REFERENCES

Gray RJ (1996) Hazard rate regression using ordinary nonparametric regression
smoothers, JCGS 5:190-207.

.SA
hazcov
.EX
z20 <- hazcov(cbind(TTR,Fail.Ind)~ER+Log.Nodes+Tumor.Size,data=jd,span=.20,
              trace.hat="exact")
z30 <- hazcov(cbind(TTR,Fail.Ind)~ER+Log.Nodes+Tumor.Size,data=jd,span=.30)
              trace.hat="exact")
z40 <- hazcov(cbind(TTR,Fail.Ind)~ER+Log.Nodes+Tumor.Size,data=jd,span=.40)
              trace.hat="exact")
wcp(z20,z30,mt=10)
wcp(z30,z30,mt=10)
wcp(z40,z30,mt=10)

.KW hazcov
.WR
SHAR_EOF
fi

exit 0
#	End of shell archive
