“Statistics and Data (Analytics): You cannot have one without the other.”

Inaugural Lecture to be given by
Professor Jeanine Houwing-Duistermaat

Many fields of human interest are producing vast amounts of data in order to answer relevant questions: commerce, social science, biology, agriculture, healthcare, urban planning, transport, communications and many more. For example, in health research the datasets should provide insight into the biological processes underlying health and disease, and will be used to identify homogeneous subgroups of patients so that treatment and screening programmes can be tailored to them. Reaching these goals requires collaboration between experts in data acquisition, biology and methodology. The complexity of the data necessitates the involvement of computer science, modelling, physics and statistics.

Owing to the availability of many different datasets in health (e.g. electronic health records, imaging, genomics, proteomics), research in statistical methods is currently very exciting. The challenge is to integrate these datasets for joint analysis while addressing measurement error, heterogeneity, missing values and sampling design. Outcome-dependent sampling is typically employed to reduce costs without losing too much statistical efficiency. However, such a design adds another layer of complexity to the statistical methodology, since most data analyses need to account for the design to ensure appropriate interpretation of the results.

During the lecture I will share the current excitement in data analytics and illustrate statistical methods for integrating multiple datasets through the analysis of the multi-case family design. And for the future: which role should statisticians claim in order to elevate the impact of data in health research and health care?