General-purpose LLM text classification engine for R. This is the base package in the CatLLM R ecosystem, providing domain-agnostic classification, extraction, exploration, and summarization of text, images, and PDFs using large language models.
cat.stack wraps the Python cat-stack package via reticulate. It makes no domain assumptions and can be used for any classification task.
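Because the package calls Python through reticulate, it can help to confirm that reticulate is installed and can locate a Python interpreter before installing the backend. The following is a minimal sketch, not part of cat.stack: `find_python()` is a hypothetical helper name, and it relies only on `reticulate::py_discover_config()`.

```r
# Hypothetical helper (not part of cat.stack): return the path of the Python
# interpreter that reticulate would use, or NULL if none can be determined.
find_python <- function() {
  # reticulate may not be installed yet; treat that as "no Python found".
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(NULL)
  }
  # Discovery can fail on systems without Python, so errors are mapped to NULL.
  cfg <- tryCatch(reticulate::py_discover_config(), error = function(e) NULL)
  if (is.null(cfg)) NULL else cfg$python
}

python_path <- find_python()
if (is.null(python_path)) {
  message("No Python interpreter detected by reticulate.")
} else {
  message("reticulate would use Python at: ", python_path)
}
```

If no interpreter is reported, the one-time backend setup described under Installation is the place to start.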
Installation
# From R-universe (recommended)
install.packages("cat.stack",
  repos = c("https://chrissoria.r-universe.dev",
            "https://cloud.r-project.org"))
# Or from a local clone
devtools::install("path/to/cat.stack")
# Install the Python backend (one-time setup)
cat.stack::install_cat_stack()
# With PDF support
cat.stack::install_cat_stack(pdf = TRUE)
Quick Start
Extract categories from data
result <- extract(
  input_data = df$responses,
  description = "Why did you move to this city?",
  api_key = Sys.getenv("OPENAI_API_KEY")
)
print(result$top_categories)
Summarize text or PDFs
results <- summarize(
  input_data = df$articles,
  description = "News articles",
  instructions = "Provide a 2-sentence summary of each article",
  api_key = Sys.getenv("OPENAI_API_KEY")
)
Multi-model ensemble
results <- classify(
  input_data = df$responses,
  categories = c("Positive", "Negative", "Neutral"),
  models = list(
    c("gpt-4o", "openai", Sys.getenv("OPENAI_API_KEY")),
    c("claude-sonnet-4-5-20250929", "anthropic", Sys.getenv("ANTHROPIC_API_KEY"))
  ),
  consensus_threshold = "unanimous"
)
Functions
| Function | Description |
|---|---|
| `classify()` | Classify text, images, or PDFs into categories |
| `extract()` | Discover and extract categories from data |
| `explore()` | Get raw category extractions for saturation analysis |
| `summarize()` | Summarize text, images, or PDFs |
| `install_cat_stack()` | Install the Python backend |
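
The Quick Start examples read API keys with `Sys.getenv()`, which returns an empty string when a variable is unset, so a missing key may only surface later as a failed request. The sketch below shows one way to fail early. `require_env_key()` and the `CAT_STACK_DEMO_KEY` variable are hypothetical names used for illustration and are not part of cat.stack.

```r
# Hypothetical helper (not part of cat.stack): fetch an environment variable
# and stop with a clear message if it is unset or empty.
require_env_key <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable '", name, "' is not set.", call. = FALSE)
  }
  value
}

# Demonstration with a throwaway variable so the snippet runs anywhere.
Sys.setenv(CAT_STACK_DEMO_KEY = "demo-value")
key <- require_env_key("CAT_STACK_DEMO_KEY")
Sys.unsetenv("CAT_STACK_DEMO_KEY")
```

In practice the result could be passed wherever the examples use `Sys.getenv("OPENAI_API_KEY")`, for instance `api_key = require_env_key("OPENAI_API_KEY")`.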