I am interested in comparing cross-sectional prevalence estimates across waves of PATH data. Per the PATH user guide (p. 61), it is not appropriate to compute separate estimates for each wave and compare them by assessing confidence interval overlap. The guide recommends creating a stacked (or long) dataset with a wave indicator, which I have done. The user guide goes on to say "the subsequent analyses must include the newly created wave indicator variable and the design correctly specified in a software package designed to capture sample variability described in Appendix A. ... Manipulating the files as described above and using the appropriate variance estimation will correctly reflect these correlations."

After applying the recommended survey set command in Stata, I have tested out several lines of code (some from Appendix A) to compare estimates by wave indicator (e.g., chi-square, regression with categorical wave as a predictor). However, it is unclear to me whether performing these simple tests (for example, a chi-square test comparing prevalence of smoking by wave indicator) with the recommended survey specification (weighted with BRR variance estimation) is accounting for the correlation between participants across waves (since they are mostly the same people). Is it? If not, what is the recommended statistical approach to do so? Several statisticians have recommended using mixed models but it seems to me that this would have been specified in the user guide if needed. There is no specific guidance on this type of analysis in Appendix A.

Please refer to section 5.4.3.2 “Cross-sectional Analyses Comparing Different (or Partially Overlapping) Sets of Persons between Waves” in the user guide. As indicated in the user guide, to create cross-sectional estimates for comparing waves, the PATH Study data user will have to perform the following data manipulation steps:

As stated in the user guide, “even though there is not complete overlap between the two sets of respondents, there are still correlations between the two groups that should be reflected due to partial overlap and because some persons are in the same PSUs. This correlation serves to reduce the estimated variance of the comparison; manipulating the files as described above and using the appropriate variance estimation methods will correctly reflect these correlations.”
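For illustration only, and not as PATH Study guidance, the quoted statement can be sketched with a toy calculation. The idea is that each replicate weight is applied to the whole stacked file, so the between-wave difference is recomputed within every replicate and the overlap between waves carries through to the variance. Everything below is a hypothetical assumption: the four-person dataset, the names (`STACKED`, `replicate_weights`, `brr_se`), the four-replicate sign pattern, and the Fay coefficient of 0.3, which should be checked against the user guide.

```python
# Toy illustration of Fay-adjusted BRR on a stacked (long) file.
# All data, names, and the Fay coefficient are illustrative assumptions.

FAY = 0.3  # assumed Fay coefficient; confirm the real value in the user guide

# Balanced sign pattern: one tuple per replicate, one entry per stratum.
SIGNS = [(+1, +1), (-1, +1), (+1, -1), (-1, -1)]

# One person per PSU: (person, stratum, psu, smoker_wave1, smoker_wave2)
PEOPLE = [("P1", 0, 0, 1, 1), ("P2", 0, 1, 0, 0),
          ("P3", 1, 0, 1, 0), ("P4", 1, 1, 0, 0)]


def replicate_weights(stratum, psu, base=1.0):
    """Full-sample weight followed by one Fay-perturbed weight per replicate."""
    weights = [base]
    for signs in SIGNS:
        up = (signs[stratum] == +1) == (psu == 0)
        weights.append(base * ((2 - FAY) if up else FAY))
    return weights


# Stacked file: one row per person per wave, with a wave indicator.
# A person's replicate pattern is the same in both waves.
STACKED = [
    {"person": p, "wave": wave, "smoker": status, "w": replicate_weights(s, u)}
    for (p, s, u, s1, s2) in PEOPLE
    for wave, status in ((1, s1), (2, s2))
]


def prevalence(rows, wave, k):
    """Weighted smoking prevalence in one wave, using weight column k."""
    rows = [r for r in rows if r["wave"] == wave]
    return sum(r["w"][k] * r["smoker"] for r in rows) / sum(r["w"][k] for r in rows)


def difference(rows, k):
    """Wave 2 minus wave 1 prevalence, using weight column k."""
    return prevalence(rows, 2, k) - prevalence(rows, 1, k)


def brr_se(statistic, rows, n_reps=len(SIGNS)):
    """Fay BRR standard error: replicate deviations around the full-sample value."""
    full = statistic(rows, 0)
    sq = sum((statistic(rows, k) - full) ** 2 for k in range(1, n_reps + 1))
    return (sq / (n_reps * (1 - FAY) ** 2)) ** 0.5


# SE of the difference, recomputed within each replicate of the stacked file.
se_stacked = brr_se(difference, STACKED)

# SE obtained by treating the two wave estimates as independent.
se_wave1 = brr_se(lambda rows, k: prevalence(rows, 1, k), STACKED)
se_wave2 = brr_se(lambda rows, k: prevalence(rows, 2, k), STACKED)
se_naive = (se_wave1 ** 2 + se_wave2 ** 2) ** 0.5

print(f"difference = {difference(STACKED, 0):+.3f}")
print(f"SE from stacked replicates = {se_stacked:.3f}")
print(f"SE ignoring the overlap    = {se_naive:.3f}")
```

In this toy file the stacked-replicate standard error (0.250) is smaller than the one that ignores the overlap (about 0.433), which is the direction the user guide describes. Whether a given survey procedure in a statistical package computes the comparison this way is a separate question that the sketch does not answer.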

Claire Cepuran

National Addiction and HIV Data Archive Program

Inter-university Consortium for Political and Social Research (ICPSR)

734-615-1959

Thank you for your quick response, Claire.

These are the instructions I have followed. However, they stop short of actually explaining how to analyze the newly created stacked dataset. I cannot find this information anywhere in the user guide.

Per my original post, after applying the recommended survey set command in Stata, I have tested out several lines of code (some from Appendix A) to compare estimates by wave indicator (e.g., chi-square, regression with categorical wave as a predictor). However, it is unclear to me whether performing these simple tests (for example, a chi-square test comparing prevalence of smoking by wave indicator) with the recommended survey specification (weighted with BRR variance estimation) is accounting for the correlation between participants across waves (since they are mostly the same people). Is it? If not, what is the recommended statistical approach to do so? Several statisticians have recommended using mixed models but it seems to me that this would have been specified in the user guide if needed. There is no specific guidance on this type of analysis in Appendix A.

Can you please provide guidance on how to produce and statistically compare cross-sectional estimates with the stacked data file (created per user guide instructions)? I'd like a p-value comparing smoking prevalence estimates from waves 1, 2, and 3. Is it sufficient to run the survey set command and then do a chi-square by indicator or regression with the categorical indicator? Or is a mixed model required that accounts for the clustering by PERSONID?

Jana

Thank you for contacting PATH Study Support. The PATH Study is unable to assist with individual analytic questions from researchers or provide any other type of personal assistance as the study does not endorse specific statistical approaches. The RUF and PUF User Guides, codebooks, and annotated instruments are valuable resources that may help address this question, located at

https://doi.org/10.3886/Series606

You may also wish to consult with statisticians and analysts at your institution with more specific questions.

Claire Cepuran

National Addiction and HIV Data Archive Program

Inter-university Consortium for Political and Social Research (ICPSR)

734-615-1959