cat.stack: General-purpose LLM text classification for R

General-purpose LLM text classification engine for R. This is the base package in the CatLLM R ecosystem, providing domain-agnostic classification, extraction, exploration, and summarization of text, images, and PDFs using large language models.

cat.stack wraps the Python cat-stack package via reticulate. It makes no domain assumptions and can be used for any classification task.

Installation

# From R-universe (recommended)
install.packages("cat.stack",
                 repos = c("https://chrissoria.r-universe.dev",
                          "https://cloud.r-project.org"))

# Or from a local clone
devtools::install("path/to/cat.stack")

# Install the Python backend (one-time setup)
cat.stack::install_cat_stack()

# With PDF support
cat.stack::install_cat_stack(pdf = TRUE)

Quick Start

Classify text

library(cat.stack)

results <- classify(
  input_data  = c("I love this product!", "Terrible experience.", "It was fine."),
  categories  = c("Positive", "Negative", "Neutral"),
  description = "Customer feedback sentiment",
  api_key     = Sys.getenv("OPENAI_API_KEY")
)
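
The examples in this README read API keys with `Sys.getenv()`, which returns an empty string when the variable is unset, so a missing key may only surface later as a failed request. A minimal sketch of a guard is shown below; `require_key()` is a hypothetical helper and is not part of cat.stack.

```r
# Hypothetical helper (not part of cat.stack): read an API key from the
# environment and fail early with a clear message if it is missing.
require_key <- function(var) {
  key <- Sys.getenv(var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  key
}

# Usage (uncomment once the variable is set in your session or .Renviron):
# api_key <- require_key("OPENAI_API_KEY")
```

The returned value can then be passed as `api_key` in the calls shown here.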

Extract categories from data

# df is assumed to be a data frame with a character column `responses`
result <- extract(
  input_data  = df$responses,
  description = "Why did you move to this city?",
  api_key     = Sys.getenv("OPENAI_API_KEY")
)
print(result$top_categories)

Summarize text or PDFs

results <- summarize(
  input_data   = df$articles,
  description  = "News articles",
  instructions = "Provide a 2-sentence summary of each article",
  api_key      = Sys.getenv("OPENAI_API_KEY")
)

Multi-model ensemble

# Each model is given as c(model_name, provider, api_key)
results <- classify(
  input_data  = df$responses,
  categories  = c("Positive", "Negative", "Neutral"),
  models      = list(
    c("gpt-4o",                     "openai",    Sys.getenv("OPENAI_API_KEY")),
    c("claude-sonnet-4-5-20250929", "anthropic", Sys.getenv("ANTHROPIC_API_KEY"))
  ),
  consensus_threshold = "unanimous"
)
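
This README does not describe how `consensus_threshold` is applied internally. Purely to illustrate what a unanimous rule means, the sketch below uses made-up labels from two hypothetical models and keeps a label only when every model agrees. It does not call cat.stack, and the package's actual behavior may differ.

```r
# Illustration only: made-up labels, not output from cat.stack.
votes <- data.frame(
  model_a = c("Positive", "Negative", "Neutral"),
  model_b = c("Positive", "Neutral",  "Neutral"),
  stringsAsFactors = FALSE
)

# A row is unanimous when every model produced the same label.
unanimous <- unname(apply(votes, 1, function(row) length(unique(row)) == 1))

# Keep the agreed label, or NA where the models disagree.
votes$consensus <- ifelse(unanimous, votes$model_a, NA_character_)
print(votes)
```

Rows with `NA` in `consensus` are the ones a reviewer would want to inspect by hand.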

Functions

| Function | Description |
|---|---|
| `classify()` | Classify text, images, or PDFs into categories |
| `extract()` | Discover and extract categories from data |
| `explore()` | Get raw category extractions for saturation analysis |
| `summarize()` | Summarize text, images, or PDFs |
| `install_cat_stack()` | Install the Python backend |

License

MIT