An Online Updating Approach for Testing the Proportional Hazards Assumption with Streams of Big Survival Data

5 Sep 2018  ·  Yishu Xue, HaiYing Wang, Jun Yan, Elizabeth D. Schifano

The Cox model, which remains the first choice for analyzing time-to-event data even for large datasets, relies on the proportional hazards assumption. When the data size exceeds the computer memory, the standard statistics for testing the proportional hazards assumption can no longer be easily calculated. We propose an online updating approach with minimal storage requirements that updates the standard test statistic as each new block of data becomes available. Under the null hypothesis of proportional hazards, the proposed statistic is shown to have the same asymptotic distribution as the standard version would have if it could be computed with a supercomputer. In simulation studies, the test and its variant based on the most recent data blocks maintain their sizes when the proportional hazards assumption holds and have substantial power to detect different violations of the proportional hazards assumption. The approach is illustrated with a survival analysis of patients with lymphoma from the Surveillance, Epidemiology, and End Results Program. The proposed test promptly identified a deviation from the proportional hazards assumption that was not captured by the test based on the entire data.
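
The abstract does not give the form of the statistic, so the following is only a minimal sketch of the general block-wise updating idea, not the authors' method. It assumes, hypothetically, that each data block k can be reduced to a summary vector u_k and a matrix V_k such that, under the null hypothesis, u_k is approximately normal with mean zero and covariance V_k, independently across blocks. Under that assumption the quadratic form T = U' V^{-1} U, with U and V the running sums, is approximately chi-square with dim degrees of freedom. The class name `OnlineQuadraticTest`, the helper `solve`, and the `window` parameter are illustrative names, not part of any library.

```python
from collections import deque


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


class OnlineQuadraticTest:
    """Hypothetical accumulator for block summaries (u_k, V_k).

    window=None keeps cumulative sums over all blocks seen so far;
    window=w keeps sums over only the w most recent blocks.
    """

    def __init__(self, dim, window=None):
        self.dim = dim
        self.window = window
        self.u_sum = [0.0] * dim
        self.v_sum = [[0.0] * dim for _ in range(dim)]
        self.recent = deque()  # used only when a window is set

    def _add(self, u, v, sign):
        # Add (sign=+1) or remove (sign=-1) one block's summaries.
        for i in range(self.dim):
            self.u_sum[i] += sign * u[i]
            for j in range(self.dim):
                self.v_sum[i][j] += sign * v[i][j]

    def update(self, u, v):
        """Fold in one new block and return the updated statistic."""
        self._add(u, v, 1.0)
        if self.window is not None:
            self.recent.append((u, v))
            if len(self.recent) > self.window:
                old_u, old_v = self.recent.popleft()
                self._add(old_u, old_v, -1.0)
        return self.statistic()

    def statistic(self):
        """Return T = U' V^{-1} U for the current sums."""
        x = solve(self.v_sum, self.u_sum)
        return sum(ui * xi for ui, xi in zip(self.u_sum, x))


# Toy usage with made-up block summaries (not real survival data).
identity = [[1.0, 0.0], [0.0, 1.0]]
cumulative = OnlineQuadraticTest(dim=2)
cumulative.update([1.0, 0.0], identity)
t_all = cumulative.update([1.0, 2.0], identity)
```

In this sketch the cumulative version stores only one vector and one matrix regardless of how many blocks have arrived, while the windowed version additionally stores the summaries of the last w blocks so that the oldest one can be subtracted out.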




