Population Assessment of Tobacco and Health

Tips for Using the PATH Study Data User Forum

Cross-Sectional Comparisons between Waves


I am wondering if it is appropriate to compare prevalence estimates between waves.

For example, can I compare the prevalence of cigarette smoking among 12-17 year olds at Wave 1 with the prevalence of cigarette smoking among 12-17 year olds at Wave 2 using the two data sets?

Thank you!


Cross-Sectional Comparisons between Waves (Reply)

Hi Hui,

Thanks for posting your question!

Cross-sectional comparisons such as the example of the prevalence of cigarette smoking among 12-17 year olds at Wave 1 with the prevalence of cigarette smoking among 12-17 year olds at Wave 2 can indeed be made using the PATH Study data files. Details on how to do this are given in Section 5.4.3 of the Restricted Use Files User Guide, available at http://doi.org/10.3886/ICPSR36231.userguide .

The prevalence estimates for Waves 1 and 2 are computed using the wave-specific weights provided with the data files, and their standard errors are obtained using the corresponding replicate weights. However, because the data were collected longitudinally, such comparisons differ in a number of ways from comparisons based on truly cross-sectional survey data. First, the respondents in Wave 2 represent the population from which the sample was selected at Wave 1 and that remained in scope at Wave 2, which is not exactly the same as the population that is in scope at Wave 2. The difference is subtle, especially since the waves are only 1 year apart, so the Wave 2 sample can be viewed as approximately representative of a cross-sectional population. This topic is described further in Section 5.4.2 of the User Guide.
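As a rough illustration of the weighting step described above, the Python sketch below computes a weighted prevalence and a replicate-weight standard error for a single wave. The function names, the toy data, and the `fay_coefficient` parameter are assumptions made for illustration only. The replication method and any Fay adjustment that actually apply to the PATH Study files should be taken from the User Guide.

```python
from math import sqrt


def weighted_prevalence(outcome, weights):
    """Weighted proportion of respondents whose outcome is 1 (e.g. smoker)."""
    return sum(y * w for y, w in zip(outcome, weights)) / sum(weights)


def replicate_se(full_estimate, replicate_estimates, fay_coefficient=0.0):
    """Standard error from replicate estimates.

    Uses the generic balanced repeated replication formula; a non-zero
    fay_coefficient applies Fay's adjustment. The coefficient is an
    ASSUMED parameter here, not a value taken from the PATH documentation.
    """
    r = len(replicate_estimates)
    scale = 1.0 / (r * (1.0 - fay_coefficient) ** 2)
    variance = scale * sum((e - full_estimate) ** 2 for e in replicate_estimates)
    return sqrt(variance)


# Hypothetical toy data: 4 respondents, 1 full-sample weight, 2 replicate weights.
smoker = [1, 0, 0, 1]
full_weight = [100, 200, 300, 400]
replicate_weights = [
    [150, 150, 300, 400],
    [50, 250, 300, 400],
]

estimate = weighted_prevalence(smoker, full_weight)
replicate_estimates = [weighted_prevalence(smoker, w) for w in replicate_weights]
standard_error = replicate_se(estimate, replicate_estimates)
print(estimate, standard_error)
```

In practice, survey software with built-in support for replicate weights would normally do this; the sketch only shows what the wave-specific weight and its replicate weights each contribute.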

Second, because the respondents in Waves 1 and 2 overlap, the two samples are not independent, so it is generally incorrect to compare estimates between waves by computing separate confidence intervals and checking whether they overlap. This is true even for domains of interest: in the comparison mentioned above, for instance, most but not all of the 12-17 year old respondents in Wave 1 are also 12-17 year old respondents in Wave 2.
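One common way to handle overlapping samples with replicate weights is to compute the between-wave difference within each replicate and then take the variance of those differences, so that the dependence between waves is carried through. The Python sketch below illustrates that idea under stated assumptions: it assumes that replicate r of the Wave 1 weights corresponds to replicate r of the Wave 2 weights, and the function names, toy data, and `fay_coefficient` parameter are hypothetical. Section 5.4.3 of the User Guide remains the authoritative description of the procedure for the PATH Study files.

```python
from math import sqrt


def prevalence(outcome, weights):
    """Weighted proportion of respondents whose outcome is 1."""
    return sum(y * w for y, w in zip(outcome, weights)) / sum(weights)


def difference_with_se(y1, w1, reps1, y2, w2, reps2, fay_coefficient=0.0):
    """Wave 2 minus Wave 1 prevalence, with a replicate-based standard error.

    ASSUMES reps1[r] and reps2[r] belong to the same replicate, so that the
    per-replicate differences reflect the overlap between the two samples.
    """
    if len(reps1) != len(reps2):
        raise ValueError("both waves need the same number of replicates")
    diff = prevalence(y2, w2) - prevalence(y1, w1)
    rep_diffs = [
        prevalence(y2, r2) - prevalence(y1, r1) for r1, r2 in zip(reps1, reps2)
    ]
    scale = 1.0 / (len(rep_diffs) * (1.0 - fay_coefficient) ** 2)
    se = sqrt(scale * sum((d - diff) ** 2 for d in rep_diffs))
    return diff, se


# Hypothetical toy data for the two waves (not PATH Study values).
wave1_smoker = [1, 0, 0, 1]
wave1_weight = [100, 200, 300, 400]
wave1_reps = [[150, 150, 300, 400], [50, 250, 300, 400]]

wave2_smoker = [1, 0, 1, 1]
wave2_weight = [100, 200, 300, 400]
wave2_reps = [[120, 180, 300, 400], [80, 220, 300, 400]]

diff, se = difference_with_se(
    wave1_smoker, wave1_weight, wave1_reps,
    wave2_smoker, wave2_weight, wave2_reps,
)
print(diff, se)
```

A confidence interval or test for the change would then be based on `diff` and `se` directly, rather than on whether the two wave-specific intervals overlap.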

