Publications

Peer-Reviewed

Scaling Open-Ended Survey Coding: An LLM Pipeline Where Definitions Do the Heavy Lifting

Journal of Open Source Software, 2026

Peer-reviewed software paper introducing CatLLM, an open-source Python and R toolkit for reproducible LLM-powered text classification, with defaults calibrated against expert human coders across multiple survey datasets.

Recommended citation: Soria C. Scaling Open-Ended Survey Coding: An LLM Pipeline Where Definitions Do the Heavy Lifting. Journal of Open Source Software. 2026. doi:10.21105/joss.09678 https://doi.org/10.21105/joss.09678

The Caribbean American Dementia and Aging Study: Protocol for a Population-Based Study of Older Adult Health and Dementia in Cuba, the Dominican Republic, and Puerto Rico

BMC Geriatrics, 2025

CADAS is a multi-purpose household study of aging focused on the life course determinants and consequences of health and dementia in Puerto Rico, the Dominican Republic, and Cuba.

Recommended citation: Liu MM, Llibre-Guerra J, Soria C, Li J, Zayas Llerena T, Rodriguez G, Acosta D, Jiménez Velázquez I, Llibre-Rodriguez JJ, Dow WH. The Caribbean American Dementia and Aging Study: protocol for a population-based study of older adult health and dementia in Cuba, the Dominican Republic, and Puerto Rico. BMC Geriatr. 2025;25(1). doi:10.1186/s12877-025-06131-0 https://bmcgeriatr.biomedcentral.com/articles/10.1186/s12877-025-06131-0

Assessing the 10/66 Dementia Classification Algorithm for International Comparative Analyses with the U.S.

American Journal of Epidemiology, 2025

Cross-national comparisons of dementia prevalence are essential for identifying unique determinants and culture-specific risk factors, but methodological differences in dementia classification across countries hinder global comparisons. This study maps the 10/66 algorithm for dementia classification, widely used and validated in low- and middle-income countries (LMICs), to the U.S. Aging, Demographics, and Memory Study (ADAMS), the dementia sub-study of the Health and Retirement Study, and assesses its performance in ADAMS.

Recommended citation: Jorge J Llibre Guerra, Jordan Weiss, Jing Li, Chris Soria, Ana Rodriguez-Salgado, Juan de Jesús Llibre Rodriguez, Ivonne Z Jiménez Velázquez, Daisy Acosta, Mao-Mei Liu, William H Dow, Assessing the 10/66 Dementia Classification Algorithm for International Comparative Analyses with the U.S., American Journal of Epidemiology, 2024; kwae470, https://doi.org/10.1093/aje/kwae470 https://pubmed.ncbi.nlm.nih.gov/39745806/

Commentary: Examining Contextual Factors Contributing to Differentials in COVID-19 Mortality in U.S. vs. India

Frontiers in Public Health, 2022

This commentary examines the disparities in COVID-19 mortality rates between the U.S. and India, exploring demographic dynamics and contextual factors contributing to the “Indian death paradox.”

Recommended citation: Zanwar PP, Wallace KL, Soria C, Perianayagam A. Commentary: Examining contextual factors contributing to differentials in COVID-19 mortality in U.S. vs. India. Front Public Health. 2022;10:995751. doi:10.3389/fpubh.2022.995751 https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2022.995751/full

Pre-Print

Scaling Open-Ended Survey Coding: An LLM Pipeline Where Definitions Do the Heavy Lifting

SocArXiv, 2026

CatLLM is an open-source Python and R package offering a three-stage pipeline for coding open-ended survey responses with large language models, with defaults calibrated by a systematic empirical study evaluating 21 LLMs across three capability tiers, six providers, and four survey questions.

Recommended citation: Soria C. Scaling Open-Ended Survey Coding: An LLM Pipeline Where Definitions Do the Heavy Lifting. SocArXiv. 2026. https://osf.io/preprints/socarxiv/gjvcf_v1

Partisan differences in health behaviors can impact respiratory disease dynamics

medRxiv, 2026

This study examines how partisan differences in contact rates, mask usage, and vaccination patterns shape respiratory disease transmission dynamics.

Recommended citation: Soria C, Dorelien A, Feehan D, Mahmud A. Partisan differences in health behaviors can impact respiratory disease dynamics. medRxiv. 2026. doi:10.64898/2026.01.14.26344076 https://www.medrxiv.org/content/10.64898/2026.01.14.26344076v1

Social Network Structure Rivals Smoking and Income as a Predictor of U.S. County Mortality

SocArXiv, 2025

This study examines how U.S. county-level social network structure relates to mortality disparities using measures from 21 billion Facebook friendships.

Recommended citation: Soria C, Feehan DM. Social Network Structure Rivals Smoking and Income as a Predictor of U.S. County Mortality. SocArXiv. 2025. https://osf.io/preprints/socarxiv/kvmx6_v3