Determining the Optimal Number of Interview Waves in a Panel Survey with Application to the National Crime Victimization Survey

Panel surveys need to balance the benefits of repeated measurements (e.g., bounded interview, reduced cost, increased response rates) with the drawbacks that may eventually occur (e.g., respondent fatigue, mode effect). The optimal number of interview waves for a panel survey needs to maximize the advantages while minimizing the potential for bias due to incorporating sampling units for too many interview waves. In this paper, we develop cost models for two potential constraints: (1) keeping the number of interviews constant across designs, and (2) keeping the cost constant across designs. These models are applied to the National Crime Victimization Survey (NCVS). The NCVS currently uses a seven-wave or time-in-sample (TIS) design. In an effort to maintain or reduce costs and improve data quality, the Bureau of Justice Statistics commissioned a Panel Design Study to evaluate the effects of changing the NCVS from a 7-TIS design to a 5-TIS, 4-TIS, 3-TIS, or 1-TIS design. The study used a set of simulations to mimic different panel designs. The simulation assumptions were constructed using NCVS data from 1999 to 2011, and included assumptions about sample sizes, costs, response rates, household replacement, type of interview, demographics, and victimization propensities. Samples were simulated with different panel designs and summary victimization propensities, and standard errors were computed for key estimates. Simulations considered both keeping the cost constant and keeping the number of interviews constant across the different panel design options. In this paper, we show the impact of changing the number of panel TISs on property and violent victimization rates in terms of point estimates, variability, sample sizes, and costs, by several population characteristics. Simulation results found that a 4-TIS design is optimal for the NCVS.


Background
The National Crime Victimization Survey (NCVS) is the nation's leading measure of reported and unreported crime victimization rates in the United States. Sponsored by the Department of Justice's Bureau of Justice Statistics (BJS) and conducted by the U.S. Census Bureau, the NCVS is a nationally representative, probability-based household survey that interviews all people 12 years and older in a selected household. Interviews are conducted in approximately 90,000 households, and 160,000 individuals are interviewed in the NCVS each year [1,2].
Similar to other national benchmark surveys (see, for example, the Current Population Survey [3] and the Consumer Expenditure Survey [4]), the NCVS utilizes a rotating panel design, where equally sized sets of sampling units (i.e., rotation groups) are brought in and out of the sample in a specified pattern [5,6]. In the case of the NCVS, samples of 50,000 households are released to the field every 6 months, allocated across seven rotation groups. The households remain in the sample for 3 and a half years and are interviewed seven times (at 6-month intervals) during that period [1]. In other words, the time-in-sample (TIS) for a household in the NCVS is seven.
Rotating panel designs offer three key benefits to a survey design [6,7]:
Bounded interviews. For studies where there is concern that the outcome of interest may be highly susceptible to recall bias (e.g., telescoping), bounding an interview to the sampling unit's previous interview (i.e., tying the previous interview to a specific point in time) better ensures that events of interest that occurred before the bounding period will not be reported in the current period [8,9].
Cost. A panel design can reduce survey costs in two ways. First, panel designs often have higher response rates because after the initial contact, participating households are more motivated to remain in the study [5]. A higher response rate allows fewer sampling units to be initially selected to achieve the desired number of interviews, thereby reducing data collection costs. Second, a panel design allows the study to alter the interview mode to a lower-cost mode after the initial contact is made. For example, in the NCVS, the initial interview with a sampling unit may be in-person (i.e., the interviewer comes to the sampled address in-person to interview all eligible household members) to better recruit the household into the study and explain the study and its purpose. Follow-up interviews may be conducted by telephone (i.e., the same interviewer calls the household and conducts interviews via the telephone with each eligible household member) to reduce survey costs.
Longitudinal design. By interviewing a sampling unit multiple times, a rotating panel design allows for longitudinal data analysis in addition to serial cross-sectional analysis. This benefit allows analysts to better consider the correlation between a sampling unit and the outcome of interest over time [10].
However, there are some logistical considerations that may reduce the impact of these benefits [6,11,12]. For instance, including the initial, unbounded interview, as the NCVS does, may introduce measurement error in the form of recall bias. Similarly, mobility in the sample may reduce the benefits of bounding and the longitudinal nature of the data [13]. Moreover, if there is a large amount of household turnover requiring replacement households (i.e., new families that have moved into a selected address), then the cost benefits of changing interview mode may not be realized because the first interview with a new replacement household will be in-person and will negate the potential cost savings from switching modes. Also, with the inclusion of the initial in-person interview and replacement households, there is the potential for a mode effect between in-person and telephone interviews. However, in the NCVS, the apparent mode effect is a function of the telescoping effect and respondent fatigue rather than a function of mode [14,15]. In addition, if there is a large amount of panel attrition during the data collection period, then the ability to do longitudinal analysis may be reduced because of an increase in bias and a reduction in precision. Furthermore, respondents who remain in the panel may suffer from rotation group bias or panel conditioning [16-18]. Although the exact impact of panel conditioning is not consistent in all surveys, it does appear to affect a respondent's behavior over time [19]. Also, because BJS conducts most analyses in a cross-sectional or serial cross-sectional manner (e.g., Hardison-Walters et al. and Planty et al. [20,21]), having a larger number of interview waves (referred to here as TISs) may not be helpful analytically. This paper addresses the issue of cost and the longitudinal design. The effect of including unbounded interviews and other non-sampling error sources is beyond the scope of this paper (see, for example, Berzofsky and Biemer [22]).

Purpose of Study
Given the tension between the benefits and limitations of a rotating panel design, it is necessary to assess the current NCVS design to see if the number of panel waves (also known as time-in-sample, or TIS) for which a household is in the sample is optimal. Therefore, the purpose of our study is to determine the optimal number of TISs for the NCVS while ensuring that study estimates (i.e., crime victimization rates), precision levels, and study costs are not dramatically altered.
To understand how changing the number of TISs for sampled households in the NCVS will impact the cost of data collection, four alternative designs were considered in addition to the current 7-TIS design: a 5-TIS, a 4-TIS, a 3-TIS, and a 1-TIS design.

Methods
Study questions were answered through a three-step process. First, cost models were developed to determine the change in survey costs or the number of interviews that could be afforded under the current and alternative designs. Second, key characteristics related to the probability of reporting a crime were determined. Third, a Monte Carlo simulation was conducted using the cost and key characteristics to assess the change in the NCVS estimates and precision levels caused by modifying the number of TISs.

Cost Models
To assess the cost of modifying the number of TISs, two types of cost models were developed: (1) keeping the number of interviews constant (KNIC) and (2) keeping the cost constant (KCC). In the KNIC model, the number of interviews is fixed based on the average number of interviews in the 7-TIS (current) design. The model adjusts the cost of each design based on this fixed number of interviews. In the KCC model, the cost of each design is fixed based on the estimated cost of the 7-TIS (current) design. The model adjusts the number of interviews in each design based on the fixed cost. Both of these models depend on knowledge about the current 7-TIS design. Therefore, the first step was to determine, within these model frameworks, the number of interviews and the cost of the survey under the current design.

Cost Model Assumptions
Each of the cost models is based on assumptions grounded in how the NCVS is currently conducted. Two main assumptions were needed for the cost models: (1) the probability of a sampled person participating in a particular TIS, and (2) the cost of conducting an interview.
Because the field procedures and analysis of the NCVS have changed over time, the data used to determine the probability of participating were restricted to a period that best reflects current practices. Characteristics that needed to be based on current practices included interview type distribution (i.e., mode of interview), household status in previous TIS, † response rate, and cost per interview type.
Using the data that met our cost assumption study criteria, ‡ the following response and participation distributions were determined.
Household response rate and household status by TIS and the person's previous TIS status (Table 1).
For households responding in the current TIS, the distribution of person-level response status by TIS for each possible pattern of response in the previous TISs, based on (1) the household's response status (i.e., whether the same household is responding or if it is a replacement household), (2) the person's previous participation status for a household (i.e., whether or not at least one person in the household participated in the survey during the previous TIS), and (3) mode of interview (i.e., in person or telephone) (Table 2).
Given these two pieces of information, a person's probability of participating in the NCVS for a particular TIS by interview mode was determined.
For the cost per interview, this study assumed $250 for an in-person interview and $120 for a telephone interview. These cost assumptions were provided by BJS. They were based on actual total costs provided by the Census Bureau and the approximate distribution of in-person and telephone interviews.
† Either "First TIS," "Same HH [household] interviewed the previous TIS," "Replacement HH since the previous TIS," or "Noninterview in the previous TIS."
‡ For the cost portion of the analysis, data were restricted to the years 2007-2011 and only included sample and rotation groups for which all seven TISs were publicly available. Additionally, reinstated cases were excluded from the analysis.

Cost Estimates for Current Design
To make a fair comparison with the simulated samples (for 5, 4, 3, or 1 TISs), the actual sample (of 7 TISs) is not used to calculate the cost of the current design. Instead, a simulated sample of 7 TISs similar to the current design is generated. Using the simulated sample removes any noise from the actual sample caused by cases that were excluded from our analysis. This allows for an equal comparison between the current design and the alternative designs.
Approximately 50,000 households per 6 months are selected, distributed among seven rotation groups (across two samples). This means that a sample of 7,143 households per rotation group by sample number is selected. Table 3 shows a typical rotation pattern for households, for the simulation of the current 7-TIS design, with two sample numbers and 18 semesters (9 years). When the maximum 7 rotation groups are active (see Periods 7-12 in Table 3), there are 50,000 households for which interviewers are attempting to conduct the survey.
Given this rotation pattern; the number (n) of sampled households in a rotation group and sample; the response propensity (r) for a person given the TIS, household status, household response status in the previous TIS, and interview type; and the cost (c) of the person interview given the interview type, the total cost (TC) model can be written as
$$TC = \sum_{m} \sum_{l} \sum_{h=1}^{n_{lm}} \sum_{p=1}^{P} r_{ijk \mid lm} \, c_{k}$$
where h = household, i = household status (1 = First TIS, 2 = household in previous TIS, 3 = replacement household since last TIS, 4 = noninterview in the previous TIS), j = household response status the previous TIS (1 = responded, 2 = nonrespondent), k = interview type (1 = in-person or equivalent, 2 = telephone) for person p, l = rotation group, m = sample number, p = person in the given household, P = number of people in the given household, and $n_{lm}$ = 7,143. Based on the rotation chart for the 7-TIS design, the cost model, and the distribution of the number of people 12 years and older living in a household (based on the NCVS sample; the average is 2.04 persons per household), the survey cost for a 6-month period is approximately $12,200,000, with about 67,200 interviews conducted.
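The figures quoted for the current design can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the implied in-person share is a back-of-the-envelope consequence of the rounded totals and the two per-interview costs, not a value reported by the study.

```python
# Figures quoted in the text for the current 7-TIS design.
HOUSEHOLDS_PER_PERIOD = 50_000   # households fielded per 6-month period
ROTATION_GROUPS = 7
COST_IN_PERSON = 250.0           # dollars per in-person interview
COST_TELEPHONE = 120.0           # dollars per telephone interview
TOTAL_COST = 12_200_000.0        # approximate cost per 6-month period
INTERVIEWS = 67_200              # approximate interviews per 6-month period

# Households per rotation group and sample.
households_per_group = round(HOUSEHOLDS_PER_PERIOD / ROTATION_GROUPS)

# Average cost per completed interview.
avg_cost = TOTAL_COST / INTERVIEWS

# Solve 250*s + 120*(1 - s) = avg_cost for the in-person share s.
# This share is an implication of the rounded totals, not a reported figure.
in_person_share = (avg_cost - COST_TELEPHONE) / (COST_IN_PERSON - COST_TELEPHONE)

print(households_per_group)
print(round(avg_cost, 2))
print(round(in_person_share, 3))
```

Under these rounded inputs, a little under half of the interviews would have to be in-person for the two per-interview costs to reproduce the stated total.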

Cost Model for Keeping the Number of Interviews Constant
Based on the model for the current 7-TIS design, designs that kept the number of interviews constant fixed the number of interviews for a 6-month period at 67,200 and let the cost vary based on the mixture of in-person and telephone interviews. For each alternative design, a rotation chart, similar to the one in Table 3, was developed to determine the number of households that would need to be selected per rotation group. The average number of households needed per sample (m) and rotation group (l) is
$$n_{lm} = \bar{n}$$
Given the number of households sampled, the total cost for the alternative models for a 6-month period is
$$TC = \sum_{m} \sum_{l} \sum_{h=1}^{\bar{n}} \sum_{p=1}^{P} r_{ijk \mid lm} \, c_{k}$$
where the remaining symbols are as defined for the current design.

Cost Model for Keeping the Cost Constant
Based on the model for the current 7-TIS design, designs that kept the cost constant fixed the total cost for a 6-month period at $12,200,000 and let the number of interviews vary on the basis of the mixture of in-person and telephone interviews. For each alternative, when KCC, the number of interviews per sample (m) and rotation group (l) can be written as
$$n_{lm} = \sum_{i} \sum_{j} \sum_{k} n_{ijk \mid lm}$$
where $n_{ijk \mid lm}$ is the number of interviews with household status i, previous household response status j, and interview type k in rotation group l and sample m. However, this formula will lead to a sample size that will vary across rotation groups. Therefore, for ease of implementation, the average number of interviews can be written as
$$\bar{n} = \frac{1}{ML} \sum_{m=1}^{M} \sum_{l=1}^{L} n_{lm}$$
where M is the number of samples selected during the 6-month period and L is the number of rotation groups per sample.
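The two constraints can be illustrated with a small sketch. It is a simplification under stated assumptions, not the study's cost model: it assumes a steady state with one rotation group at each TIS, collapses the response propensities into one pair of per-mode probabilities per TIS, and uses made-up probabilities (`TisProfile`, `make_design`, and their values are hypothetical). Only the per-interview costs, the 2.04 persons per household, the 67,200 interviews, and the $12,200,000 budget come from the text.

```python
from dataclasses import dataclass

COST_IN_PERSON = 250.0        # dollars per in-person interview (from the text)
COST_TELEPHONE = 120.0        # dollars per telephone interview (from the text)
PERSONS_PER_HOUSEHOLD = 2.04  # average persons aged 12+ per household (from the text)


@dataclass(frozen=True)
class TisProfile:
    """HYPOTHETICAL probability that a person is interviewed in one TIS, by mode."""
    in_person: float
    telephone: float


def per_household_totals(profiles):
    """Expected interviews and cost per period with one household per rotation group.

    In steady state a T-TIS design has T active rotation groups, one at each TIS,
    so summing over the T profiles gives the per-period totals for n = 1.
    """
    interviews = sum(PERSONS_PER_HOUSEHOLD * (p.in_person + p.telephone) for p in profiles)
    cost = sum(
        PERSONS_PER_HOUSEHOLD * (p.in_person * COST_IN_PERSON + p.telephone * COST_TELEPHONE)
        for p in profiles
    )
    return interviews, cost


def knic(profiles, target_interviews):
    """Fix the interview count; return (households per rotation group, total cost)."""
    interviews, cost = per_household_totals(profiles)
    n_bar = target_interviews / interviews
    return n_bar, n_bar * cost


def kcc(profiles, budget):
    """Fix the budget; return (households per rotation group, total interviews)."""
    interviews, cost = per_household_totals(profiles)
    n_bar = budget / cost
    return n_bar, n_bar * interviews


def make_design(n_tis):
    """Illustrative design: in-person-heavy first TIS, telephone-heavy follow-ups."""
    first = TisProfile(in_person=0.70, telephone=0.05)      # made-up values
    follow_up = TisProfile(in_person=0.25, telephone=0.45)  # made-up values
    return [first] + [follow_up] * (n_tis - 1)


if __name__ == "__main__":
    for t in (7, 4, 1):
        _, cost = knic(make_design(t), target_interviews=67_200)
        _, interviews = kcc(make_design(t), budget=12_200_000)
        print(f"{t}-TIS: KNIC cost ${cost:,.0f}; KCC interviews {interviews:,.0f}")
```

With any profile in which the first TIS is more in-person-heavy than the follow-ups, shortening the panel raises the average cost per interview, so the KNIC cost rises and the KCC interview count falls, which is the qualitative pattern the two models are designed to quantify.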

Population Parameter Assumptions for Victimization Rates
The simulations that were carried out to assess the impact of reducing the number of NCVS interview TISs required a set of population distribution assumptions. Because the sample population is fixed before any data collection, the population distribution for the simulations needs to be based on attributes about the population that are knowable before data collection. The assumptions about the population distribution will be applied to all of the samples used throughout the simulation study.
It is not practical, or feasible, to use all of the available variables when creating the population for the simulations. Therefore, it is necessary to restrict the variables for simulation to the ones that best predict the outcome of interest: reporting a victimization. Characteristics associated with the propensity to experience property victimization and violent victimization are likely to be different; in addition, the subjects to which the two types of victimization apply are also different (households vs. persons). The population distribution assumptions will, therefore, be created separately for property and violent victimizations.
Once the characteristics that are most strongly associated with experiencing victimization have been identified, they will be used to determine the distribution of characteristics for the simulated samples. The victimization outcome will then be generated based on those characteristics and the victimization propensity for the group to which the household or person belongs.

Data for Determining Population Assumptions
Because crime victimization is quite rare, especially within groups of interest (e.g., Hispanic males, age 18-29), the data used for the cost models were deemed inadequate (too sparse) to determine the most significant population parameters. The cost models dealt with estimating nonresponse patterns and sample sizes across TIS designs; therefore, it was necessary to use a dataset for which the TIS variable could be calculated without error. For the population parameters, however, the most important requirement is that the propensity estimates be as accurate as possible within propensity group; therefore, a larger dataset was desirable. Thus, to estimate the victimization propensity and to find the most important variables affecting those propensities, the analyses used all of the data containing TIS 1 through TIS 7 responses for survey years 1999 through 2011.

Population Parameters
For property and violent victimizations, the first step is to determine how many distinct victimization propensities exist in the population and the variables that are most strongly associated with experiencing victimization. The subjects (either households or persons) in the NCVS sample will then be split according to differential propensity groups, defined by the most important variables. All the subjects within a group will have the same victimization propensity, but the propensity will differ across groups.
To identify the characteristics most strongly associated with the propensity to experience a property or violent victimization, respectively, the outcome of interest was defined as whether or not the household was the victim of at least one property crime in the reference period or the person was the victim of at least one violent crime in the reference period. The set of characteristic variables evaluated included all population characteristics collected in the NCVS. These variables are potentially associated with (1) whether a household experienced property victimization or (2) whether a person experienced a violent victimization. For some variables, categories were collapsed beforehand because of either small cell counts or preliminary exploratory analyses that revealed that some categories did not differ with respect to the outcome of interest: victimization.
It was then necessary to fix the number of groups that will be used for estimating the likelihood of experiencing a violent or property victimization. As mentioned above, the propensity will vary among groups, but will be constant within groups. One way to decide how many groups to use is by evaluating the reduction in deviance that increasing the number of groups produces. The deviance is a statistical measure of the error associated with a model. In this case, for example, the model might specify that the data can be divided into a number of groups, say 10, within which all the subjects have the same victimization propensity, and across which the propensity to experience victimization differs. Another model might specify that the data are divided into 11 groups (rather than 10), and so on.
Once the reduction in deviance was equal to at least 80% of the total possible reduction, it was clear that the largest reduction in deviance occurred when there were 25 to 27 groups for property victimizations and 12 to 14 groups for violent victimizations. Therefore, the property victimization model will include at least 27 groups, and the violent crime victimizations model will include at least 14 groups. For each victimization type, a recursive partitioning tree was used to determine the best set of groups. The partitioning tree for property crime included 12 different household characteristics that were identified as correlated with reporting a property victimization, and the partitioning tree for violent crime included nine person and household characteristics identified as correlated with reporting a violent victimization.
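The deviance criterion can be sketched as follows. This is a minimal illustration, assuming a binary victimization outcome and a model that assigns every subject in a group the group's observed rate; the function names and the toy data are hypothetical, and "total possible reduction" is interpreted here as the drop from a single-group model to the finest partition considered, which is an assumption about the study's definition.

```python
import math
from collections import defaultdict


def binomial_deviance(outcomes, groups):
    """Deviance (-2 log-likelihood) when each group has one common propensity."""
    totals = defaultdict(lambda: [0, 0])  # group -> [subjects, victims]
    for y, g in zip(outcomes, groups):
        totals[g][0] += 1
        totals[g][1] += y
    deviance = 0.0
    for n, k in totals.values():
        p = k / n
        # Groups with p equal to 0 or 1 fit perfectly and contribute nothing.
        if 0.0 < p < 1.0:
            deviance -= 2.0 * (k * math.log(p) + (n - k) * math.log(1.0 - p))
    return deviance


def share_of_possible_reduction(outcomes, groups, finest_groups):
    """Fraction of the null-to-finest deviance drop achieved by `groups`."""
    null = binomial_deviance(outcomes, [0] * len(outcomes))
    current = binomial_deviance(outcomes, groups)
    floor = binomial_deviance(outcomes, finest_groups)
    return (null - current) / (null - floor)


# Toy data: 12 subjects, a 4-group partition nested inside a 2-group partition.
outcomes = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0]
finest = ["a"] * 3 + ["b"] * 3 + ["c"] * 3 + ["d"] * 3
coarse = ["x"] * 6 + ["y"] * 6

share = share_of_possible_reduction(outcomes, coarse, finest)
print(f"share of possible reduction with 2 groups: {share:.2f}")
print("meets 80% threshold:", share >= 0.80)
```

In practice the candidate partitions would come from the recursive partitioning tree, and the number of groups would be increased until the share reaches the chosen threshold.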
In addition to the variables used for predicting the probability of reporting a victimization, other key demographic characteristics that BJS uses for analysis (e.g., sex, age category) were randomly assigned to the sample population based on their marginal distributions in the population (i.e., the probability of being between 18 and 29 years old was not conditioned on any other characteristic, such as sex or race). These variables were used for subpopulation analysis.

Monte Carlo Simulation
Once the propensities to respond and the population parameters were determined, a Monte Carlo simulation was conducted to produce victimization estimates by type of crime (TOC). The simulation produced estimates for each detailed TOC. Namely, for property crime, estimates were produced for household burglary, theft, and motor vehicle theft, and for violent crime, estimates were produced for rape and sexual assault, aggravated assault, robbery, and simple assault.
Because of the complex nature of the NCVS household sample design, it was not feasible to incorporate the actual design into the simulation. Therefore, a simple random sample was used to select households from the population. To get appropriate standard errors, design effects from the population were estimated from the unbounded 1999-2011 data. For each design, only responses from the corresponding TISs were used to estimate the design effects. For example, for robbery, the design effect for the 4-TIS design was based only on robbery victims in TIS 1 through TIS 4. Table 4 and Table 5 present the design effects for the property crime and violent crime types, respectively, that were analyzed. Furthermore, because no bounding adjustment (i.e., an adjustment applied to respondents in TIS 1 who report a victimization to account for potential recall bias) can be applied to a 1-TIS (cross-sectional) design, no bounding adjustment was applied to any of the designs.
The simulation of samples was conducted 1,000 times. For each simulation, households were assigned a rotation group and survey characteristics, as follows. Within each rotation group, response and participation characteristics were assigned based on the cost model assumptions. Given these characteristics, and the simulated TIS for a household, a household (or person) was assigned a victimization status. Among those identified as victims, the number of victimizations reported was simulated on the basis of the simulated household (or person) characteristics.
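The core of the simulation loop, together with averaging over realizations, can be sketched as below. This is a deliberately stripped-down illustration, not the study's simulation: it omits rotation groups, response and participation status, TIS effects, and the count of victimizations, and the group shares and propensities are made-up values. The function names are hypothetical.

```python
import random


def simulate_rate(n_households, propensities, group_shares, rng):
    """One realization: assign each household a propensity group, then a victim status."""
    groups = rng.choices(range(len(propensities)), weights=group_shares, k=n_households)
    victims = sum(rng.random() < propensities[g] for g in groups)
    return victims / n_households


def monte_carlo_rate(n_sims, n_households, propensities, group_shares, seed=0):
    """Average the simulated victimization rate over n_sims realizations."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    rates = [
        simulate_rate(n_households, propensities, group_shares, rng)
        for _ in range(n_sims)
    ]
    return sum(rates) / n_sims


# Made-up propensity groups: shares of the population and victimization propensities.
GROUP_SHARES = [0.5, 0.3, 0.2]
PROPENSITIES = [0.05, 0.10, 0.20]

estimate = monte_carlo_rate(
    n_sims=200, n_households=2_000,
    propensities=PROPENSITIES, group_shares=GROUP_SHARES,
)
expected = sum(s * p for s, p in zip(GROUP_SHARES, PROPENSITIES))
print(f"simulated rate {estimate:.4f}, population rate {expected:.4f}")
```

Because every subject in a group shares one propensity, the population rate is the share-weighted mean of the group propensities, and the Monte Carlo average should settle close to it as the number of realizations grows.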
Estimates were the average victimization rate over the 1,000 simulations. In other words, if the victimization rate for one realization of the simulation for TOC V is $\hat{V}_s$, s = 1, 2, ..., 1000, then the average victimization rate across all simulations was calculated as
$$\bar{V} = \frac{1}{1000} \sum_{s=1}^{1000} \hat{V}_s$$
Table 6 presents the results of the cost models by alternative design when KNIC. As seen in the table, as the number of TISs decreases, the cost to maintain the same number of interviews increases compared with the current design. For instance, with the 4-TIS design, the cost increases to $12.8 M (a 4.9% increase). For a 1-TIS design, the increase in cost is 37.7%, but for all other designs, the change in cost is less than 10%. This change is because as the number of households per sample and rotation group increases, the number of in-person interviews increases (i.e., there are more first interviews with an address), which increases the total cost.
Table 7 presents the results of the cost models by alternative design when KCC. As seen in the table, as the number of TISs decreases, the number of interviews per 6-month period decreases. For instance, for the 4-TIS design, only 48,400 interviews can be conducted for the $12.2 M cost of the current design (a 4.8% decrease). For the 1-TIS design, the decrease in the number of interviews is 27.4%, but for all other designs, the change is less than 10%.

Victimization Rates
For assessing victimization rates, two types of analyses were conducted: (1) comparing overall victimization rates by design, and (2) comparing subpopulation victimization rates by design.

Comparing Overall Victimization Rates by Design
To assess the quality of the estimates under each design, given that no gold standard (i.e., error-free estimate) of crime victimization exists, only relative comparisons to the current design could be made. Therefore, to compare victimization estimates in each design, the following measures were used: (1) estimates by type of crime, (2) statistical difference of estimates, (3) relative standard errors (RSEs), and (4) nominal and effective sample sizes.
Figure 1 and Figure 2 show, for violent and property crimes, respectively, the victimization rates, nominal and effective sample sizes, and cost for the KNIC designs. In both figures, the victimization rate increases as the number of TISs decreases. This is because as the number of TISs decreases, the influence of the unbounded interview (i.e., the first TIS) is greater. Unbounded interviews have more victimizations reported because of potential recall bias. For violent victimizations, the rates are not significantly different from one another. However, for property victimization, the 1-TIS design has a significantly higher rate than the other designs. This is because, as seen in Table 4 and Table 5, the design effects for property crimes decrease more sharply than for violent victimizations as the number of TISs decreases. Violent crimes have additional correlation as a result of interviewing all persons 12 and older in a household. This additional correlation offsets the benefits, in terms of variance reduction, of having fewer repeat interviews over time. These findings are similar when breaking violent and property crimes into more-detailed types of crimes (e.g., aggravated assault, household theft). Moreover, although the nominal sample size is intentionally the same for each design, because of the decreasing design effects, the effective sample size increases as the number of TISs decreases for both violent and property crime.
Table 8 presents the RSEs (100 × standard error/estimate) for violent and property crime by design when KNIC. In both cases, the RSE decreases as the number of TISs decreases. This is mostly because of the reduction in design effect as the number of TISs decreases.
Figure 3 and Figure 4 show, for violent and property crimes, respectively, the victimization rates, nominal and effective sample sizes, and cost for the KCC designs. As with the KNIC designs, the victimization rates increase as the number of TISs decreases. For violent victimizations, the rates are not significantly different from one another. However, for property victimization, the 1-TIS design has a significantly higher rate than the other designs. These findings are similar for more-detailed types of crimes (e.g., aggravated assault, household theft). Furthermore, the nominal sample size decreases as the number of TISs decreases in order to keep costs fixed. Nonetheless, for 3-, 4-, and 5-TIS designs the effective sample size increases relative to the 7-TIS design for both types of crime. However, because of the smaller decrease in the design effect and the large (27.4%) decrease in the nominal sample size for violent crime, the effective sample size is lower for the 1-TIS design than the 7-TIS design. The 1-TIS effective sample size is larger than the 7-TIS design for property crime because the design effect is much smaller for the 1-TIS design than for the 7-TIS design.
Table 9 presents the RSEs for violent and property crimes by design when KCC. For violent crimes, the RSEs remain relatively flat across design options, whereas for property crime, the RSEs decrease as the number of TISs decreases.
In the KCC designs, the nominal sample size must decrease to hold costs fixed because the number of in-person interviews increases. When the change in the design effect is negligible (as is the case with violent crime), the standard errors therefore increase rather than remain flat (i.e., the negative impact of the decrease in sample size on the standard errors is greater than the positive impact of the smaller design effect); because the victimization estimates also increase, the RSEs remain flat. However, for property crime, the decrease in the design effect has greater influence on the standard errors than the decrease in nominal sample size, leading to lower RSEs as the number of TISs decreases. These findings were consistent across the more detailed types of crime.
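The interplay between nominal sample size, design effect, effective sample size, and RSE can be made concrete with a short sketch. It assumes the standard relationships for a proportion estimated under simple random sampling with a design-effect inflation, which matches the simulation setup described earlier but is an assumption about the exact variance formula used; the design-effect values below are hypothetical, and only the 27.4% reduction in nominal sample size comes from the text.

```python
import math


def effective_sample_size(n, deff):
    """Nominal sample size deflated by the design effect."""
    return n / deff


def rse(rate, n, deff):
    """Relative standard error (percent) of a rate: 100 * SE / estimate."""
    se = math.sqrt(deff * rate * (1.0 - rate) / n)
    return 100.0 * se / rate


def effective_size_ratio(nominal_change, deff_alt, deff_current):
    """Effective sample size of an alternative design relative to the current one.

    nominal_change is the relative change in nominal sample size (e.g., -0.274).
    """
    return (1.0 + nominal_change) * deff_current / deff_alt


# KCC 1-TIS design: nominal sample size falls by 27.4% (from the text).
# The effective sample size still grows only if deff_alt / deff_current < 0.726.
small_drop = effective_size_ratio(-0.274, deff_alt=0.90, deff_current=1.00)  # hypothetical
large_drop = effective_size_ratio(-0.274, deff_alt=0.50, deff_current=1.00)  # hypothetical
print(f"small design-effect drop: ratio {small_drop:.2f}")
print(f"large design-effect drop: ratio {large_drop:.2f}")
```

A ratio below 1 corresponds to the violent-crime pattern described above (the design effect falls too little to offset the smaller sample), while a ratio above 1 corresponds to the property-crime pattern.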

Comparing Subpopulation Victimization Rates by Design
In addition to comparing overall victimization rates by design, it is necessary to know whether the alternative designs impact subpopulation estimates. For this comparison, subpopulation estimates for key population characteristics (e.g., age category, income) were computed for each design and statistically compared to the current 7-TIS design. Table 10 and Table 11 present the violent and property victimization rates by gender and race for KNIC and KCC models, respectively, by design alternative. For both the KNIC and KCC models, the only differences (except for whites in the 3-TIS design) between the alternative designs and the current 7-TIS design were in the 1-TIS design. This finding held true for all other characteristics compared.

Conclusions
For both violent and property crimes, the 4-TIS design achieves the largest effective sample sizes while still ensuring that the (overall) victimization estimates are not significantly different from the current estimates. The 3- and 1-TIS designs, on the other hand, sometimes achieve larger effective sample sizes, but both designs produce estimates that are significantly different from the current estimates at either the subpopulation level (3-TIS design) or the subpopulation and overall levels (1-TIS design). Moreover, the 4-TIS design reduces the RSEs for all types of crime.
The conclusion that the 4-TIS design has preferred properties compared with all other designs holds for both the KNIC and KCC models. In general, it would be recommended to use a design that maintains the current number of interviews per year. When maintaining the same number of interviews, a lower number of TISs results in higher effective sample sizes. Designs with fewer TISs produce lower design effects in general. However, maintaining the same number of interviews costs more because of the increase in the number of in-person interviews. Therefore, the decision as to which 4-TIS design is preferred (KNIC or KCC) comes down to a matter of cost. When KNIC, the cost of the 4-TIS design is 4.8% greater ($1.2 M per year) than the 7-TIS design.
However, for the additional cost of $1.2 M, the 4-TIS design only reduces the RSE for violent crime by 7.9% and for property crime by 11.0% compared with the 7-TIS design. The KCC 4-TIS design provides improvements in the effective sample size, although not as great as the KNIC 4-TIS design, while having only a slightly smaller reduction in the RSE (violent victimization has a reduction of 5.25%, whereas the reduction in the RSE for property crime is the same as the KNIC reduction). Moreover, unlike the 3- and 1-TIS designs, when costs are kept constant, the 4-TIS design produces estimates that are not significantly different from the 7-TIS design for all subpopulation characteristics and types of crimes.