The central challenge in predictive modeling for survival analysis in medical prognostics is the management of censored observations. In such problems the true target times of most instances are unknown; what is known is a censored time, some indeterminate time before the true target time, which therefore serves only as a lower bound on it. Patients who experience the endpoint of interest (cancer recurrence, death, etc.) during an often multi-year study are considered non-censored, or events, and may represent as little as 9% of the available samples. Most patients either do not experience the endpoint during the study or are lost to follow-up for various reasons (patient moved, died of other causes, etc.).
These censored samples often make up most of the available instances, so modeling techniques that correctly account for censored observations are crucial. Censored samples can be viewed as semi-supervised targets; however, most efforts in semi-supervised regression treat samples as either fully labeled or unlabeled and do not take the partially labeled nature of such samples into account. This work presents a novel transduction approach for semi-supervised survival analysis, in which the true target times are approximated from the censored times through transduction to improve predictive performance. The framework can be employed to adapt traditional regression methods for use in survival analysis, or to enhance existing survival analysis algorithms. The proposed approach represents one of the first applications of semi-supervised regression to survival analysis and yields significant improvements in predictive performance for multiple applications in prostate and breast cancer prognostics.
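To make the idea of approximating true target times from censored times concrete, the following is a minimal illustrative sketch, not the method proposed in this work. It assumes right-censored data given as a feature matrix `X`, observed times `time`, and a boolean `event` indicator, and it uses ordinary least squares as a stand-in regressor. The function names (`fit_linear`, `predict_linear`, `transduce_targets`) and the update rule (replace each censored target by the larger of its censored time and the current prediction, then refit) are assumptions chosen for illustration only.

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept term (stand-in regressor)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_linear(coef, X):
    """Predict targets from the coefficients returned by fit_linear."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def transduce_targets(X, time, event, n_iter=10):
    """Iteratively estimate target times for censored samples.

    Assumed rule (hypothetical, for illustration): a censored sample's true
    time is at least its censored time, so its working target is
    max(censored time, current model prediction).
    """
    X = np.asarray(X, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)

    target = time.copy()                         # events keep their observed times
    coef = fit_linear(X[event], time[event])     # initial fit on events only
    for _ in range(n_iter):
        pred = predict_linear(coef, X)
        # Update censored targets only, never moving below the censored time.
        target[~event] = np.maximum(time[~event], pred[~event])
        coef = fit_linear(X, target)             # refit on all samples
    return target, coef
```

In this sketch the observed event times are never altered, and each estimated target for a censored sample respects the lower bound given by its censored time; any regressor could be substituted for the least-squares fit.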