Survival Analysis

Survival analysis is a branch of statistics for analyzing the expected duration of time until one or more events happen, such as death in biological organisms and failure in mechanical systems. This topic is called reliability theory or reliability analysis in engineering, duration analysis or duration modelling in economics, and event history analysis in sociology. Survival analysis attempts to answer questions such as: what is the proportion of a population which will survive past a certain time? Of those that survive, at what rate will they die or fail? Can multiple causes of death or failure be taken into account? How do particular circumstances or characteristics increase or decrease the probability of survival?

Survival analysis involves the modelling of time-to-event data. In this literature, death or failure is called an "event"; traditionally only a single event occurs for each subject, after which the organism or mechanism is dead or broken.
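The first two questions posed above are usually formalized with two functions. As a brief sketch in standard notation (the symbols T, S, h and f are introduced here and are not part of the original text), let T be the nonnegative random time until the event, with density f:

```latex
S(t) = \Pr(T > t),
\qquad
h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)}
```

The survival function S(t) is the proportion of the population surviving past time t, and the hazard function h(t) is the instantaneous rate at which those still alive at t die or fail.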

Let's look at the malignant melanoma survival data from here. The data set contains time (survival time in days), status (1: died from melanoma, 2: alive, 3: dead from other causes), sex (1: male, 0: female), ulcer (1: present, 0: absent) and event (1 if status = 1, 0 otherwise, which is the coding Surv() expects: 1 for an observed death from melanoma, 0 for a censored observation). Download the data from here.
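If the downloaded file is not at hand, a data frame with the same columns can probably be reconstructed. The following is a minimal sketch, assuming that the melanoma data set shipped with the boot package is the same data (it has the time, status, sex and ulcer columns but no event column) and assuming the coding that Surv() expects, 1 for an observed death from melanoma and 0 for a censored observation:

```r
# Assumption: boot::melanoma stands in for the downloaded file.
data(melanoma, package = "boot")

# Event indicator: 1 = died from melanoma (status 1),
# 0 = censored (alive, or dead from other causes).
melanoma$event <- as.integer(melanoma$status == 1)

# Cross-tabulate status against event to check the recoding.
table(status = melanoma$status, event = melanoma$event)
```

Every row with status 1 should land in the event = 1 column, and rows with status 2 or 3 in the event = 0 column.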

We will use the survival and survminer packages for our analysis.

> library(survival)
> library(survminer)

First we will create the survival object, which pairs each follow-up time with its event indicator:

> melsurv <- Surv(time=melanoma$time, event = melanoma$event)
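To see what a Surv object holds, here is a small sketch on hypothetical toy data (three made-up subjects, not taken from the melanoma file):

```r
library(survival)

# Hypothetical toy data: three subjects, the second one censored at time 8.
toy <- Surv(time = c(5, 8, 12), event = c(1, 0, 1))

print(toy)    # censored times are displayed with a trailing "+"
is.Surv(toy)  # TRUE
```

Running head(melsurv) on the real data shows the same layout, with a "+" after each censored survival time.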

Now we will fit Kaplan-Meier survival curves with the survfit function:

> fit1 <- survfit(melsurv~ulcer+sex, data=melanoma)
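For a right-censored Surv response, survfit computes the Kaplan-Meier estimate by default. As a reminder of what is being estimated within each group (the notation is introduced here: t_i are the distinct observed event times, d_i the number of events at t_i, and n_i the number of subjects still at risk just before t_i):

```latex
\hat{S}(t) = \prod_{i \,:\, t_i \le t} \left( 1 - \frac{d_i}{n_i} \right)
```

Censored subjects contribute to n_i up to their censoring time but never to d_i, which is how partial follow-up is used without treating it as a death.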

Note the covariates 'ulcer' and 'sex' in the formula; survfit estimates a separate curve for each of their four combinations. To see the survival plots we use ggsurvplot from the survminer package:

> ggsurvplot(fit1, data = melanoma, pval = TRUE)
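With pval = TRUE, ggsurvplot annotates the plot with a p-value, which by default comes from a log-rank test comparing the groups. The same test can be run directly with survdiff from the survival package. This is a sketch under the assumption that boot::melanoma stands in for the downloaded file and that event is 1 for a death from melanoma:

```r
library(survival)

# Assumption: boot::melanoma stands in for the downloaded file.
data(melanoma, package = "boot")
melanoma$event <- as.integer(melanoma$status == 1)  # 1 = died from melanoma

# Log-rank test across the four ulcer-by-sex groups.
lr <- survdiff(Surv(time, event) ~ ulcer + sex, data = melanoma)
print(lr)

# p-value from the chi-square statistic, with (number of groups - 1) df.
p_value <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)
p_value
```

A small p-value indicates that at least one of the four survival curves differs from the others; it does not say which one.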

We can also create an 'at-risk table' with ggsurvtable, selecting the table type with the survtable argument:

> ggsurvtable(fit1, data = melanoma, survtable = "risk.table")
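The numbers behind an at-risk table can also be read off the fitted object without plotting. A minimal sketch, again assuming boot::melanoma as a stand-in for the downloaded file and using a hypothetical time grid in days:

```r
library(survival)

# Assumption: boot::melanoma stands in for the downloaded file.
data(melanoma, package = "boot")
melanoma$event <- as.integer(melanoma$status == 1)  # 1 = died from melanoma

fit <- survfit(Surv(time, event) ~ ulcer + sex, data = melanoma)

# Hypothetical time grid (days); extend = TRUE reports every requested
# time for every group, even past that group's last observation.
s <- summary(fit, times = c(0, 1000, 2000, 3000), extend = TRUE)

risk <- data.frame(strata = s$strata, time = s$time,
                   n.risk = s$n.risk, survival = s$surv)
print(risk)
```

Each row gives, for one ulcer-by-sex group, the number of subjects still at risk at that time and the estimated survival probability.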