Scaling Open-Ended Survey Coding: An LLM Pipeline Where Definitions Do the Heavy Lifting
Journal of Open Source Software, 2026
Peer-reviewed software paper introducing CatLLM, an open-source Python and R toolkit for reproducible LLM-powered text classification, with defaults calibrated against expert human coders across multiple survey datasets.
Recommended citation: Soria C. Scaling Open-Ended Survey Coding: An LLM Pipeline Where Definitions Do the Heavy Lifting. Journal of Open Source Software. 2026. doi:10.21105/joss.09678 https://doi.org/10.21105/joss.09678
