Mathematics and Statistics Colloquium
Lucia Petito, Harvard School of Public Health
An Exploration into Misclassified Group Tested Current Status Data
Monday, March 12, starting at 4:00 pm, Davis 301
Refreshments at 3:30 pm, outside of Davis 216
Group testing, first introduced by a military doctor in 1943, has been used as a method to reduce costs when estimating the prevalence of a binary characteristic based on a screening test of m groups that include n independent individuals in total. If the unknown prevalence in question is low, and the screening test suffers from misclassification, more precise prevalence estimates can be obtained from group testing than from testing all n samples separately. In some applications, the individual binary response corresponds to whether an underlying “time to incidence” variable T is less than an observed screening time C. This data structure at the individual level is known as current status data. Given sufficient variation in the observed screening times C, it is possible to estimate the distribution function F of T nonparametrically using the pool-adjacent-violators algorithm. Here, we develop a nonparametric estimator of F based on group-tested current status data for groups of size k = n/m, where a group tests “positive” if and only if at least one individual's unobserved T is less than its corresponding observed C. We will investigate the performance of the group-based estimator as compared to the individual-test nonparametric maximum likelihood estimator, and show that the former can be more precise in the presence of misclassification for low values of F(t). We then apply this estimator to the age-at-incidence curve for hepatitis C infection in a sample of U.S. women who gave birth to a child in 2014, where group assignment is done at random and based on maternal age.
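As a rough illustration of the ingredients named in the abstract, the following Python sketch shows a weighted pool-adjacent-violators fit and its use for individual-level current status data. It is not the speaker's estimator. The function names (pava, current_status_npmle, group_to_individual) are hypothetical, and the last function rests on two simplifying assumptions that the abstract does not make: all k members of a group share the same screening time, and the test is free of misclassification.

```python
from typing import List, Sequence, Tuple


def pava(y: Sequence[float], w: Sequence[float]) -> List[float]:
    """Weighted non-decreasing least-squares fit via pool-adjacent-violators."""
    # Each block is (weighted mean, total weight, number of pooled points).
    blocks: List[Tuple[float, float, int]] = []
    for yi, wi in zip(y, w):
        blocks.append((float(yi), float(wi), 1))
        # Pool backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append(((m1 * w1 + m2 * w2) / total, total, n1 + n2))
    fit: List[float] = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


def current_status_npmle(
    c: Sequence[float], delta: Sequence[int]
) -> Tuple[List[float], List[float]]:
    """Estimate F at the distinct screening times; delta[i] = 1 if T_i < C_i."""
    # Aggregate tied screening times so the fit is a function of time.
    totals = {}
    for ci, di in zip(c, delta):
        s, n = totals.get(ci, (0, 0))
        totals[ci] = (s + di, n + 1)
    times = sorted(totals)
    means = [totals[t][0] / totals[t][1] for t in times]
    weights = [float(totals[t][1]) for t in times]
    return times, pava(means, weights)


def group_to_individual(g: float, k: int) -> float:
    """Invert G(c) = 1 - (1 - F(c))**k.

    Assumes a common screening time within the group and a perfect test;
    both are illustrative assumptions, not claims about the talk.
    """
    return 1.0 - (1.0 - g) ** (1.0 / k)


times, fit = current_status_npmle([1, 2, 3, 4], [0, 1, 0, 1])
# times == [1, 2, 3, 4], fit == [0.0, 0.5, 0.5, 1.0]
```

Under the same assumptions, applying current_status_npmle to group-level results and then mapping each fitted value through group_to_individual would give a monotone estimate of F, since the inversion is increasing in its first argument. Handling misclassification would require further modeling not sketched here.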