Academic paper classification with LLMs. cat.ademic is a domain wrapper around cat.stack that adds journal and topic sourcing parameters for classifying, extracting, exploring, and summarizing academic literature.

cat.ademic wraps the Python catademic package via reticulate.

Installation

# From R-universe (recommended)
install.packages("cat.ademic",
                 repos = c("https://chrissoria.r-universe.dev",
                          "https://cloud.r-project.org"))

# Or from a local clone
devtools::install("path/to/cat.stack")
devtools::install("path/to/cat.ademic")

# Install the Python backend (one-time setup).
# Run this in a terminal, not in the R console:
# pip install cat-ademic
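
Because cat.ademic reaches the Python backend through reticulate, it can help to confirm that the backend is visible from R before running anything. The following is a minimal sketch, not part of the package: `backend_available()` is a hypothetical helper, and it assumes the Python package is importable under the module name `catademic`.

```r
# Hypothetical helper (not exported by cat.ademic).
# Assumption: the Python backend is importable as the module "catademic".
backend_available <- function(module = "catademic") {
  # requireNamespace() returns FALSE instead of erroring if reticulate is missing
  requireNamespace("reticulate", quietly = TRUE) &&
    reticulate::py_module_available(module)
}

if (!backend_available()) {
  message("Python module 'catademic' not found; try `pip install cat-ademic` ",
          "in the Python environment that reticulate uses.")
}
```

If the check fails even after installing, reticulate may be bound to a different Python environment than the one pip installed into; `reticulate::py_config()` reports which interpreter is in use.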

Quick Start

Classify papers by journal

library(cat.ademic)

results <- classify(
  categories   = c("Quantitative", "Qualitative", "Mixed Methods"),
  journal_name = "American Sociological Review",
  paper_limit  = 100L,
  polite_email = "you@university.edu",
  api_key      = Sys.getenv("OPENAI_API_KEY")
)
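
The examples in this README read the API key with `Sys.getenv()`, which returns an empty string when the variable is unset, so a missing key would be passed along silently. A small guard along the following lines may surface the problem earlier. This is an illustrative sketch: `require_api_key()` is a hypothetical helper, not a cat.ademic function.

```r
# Hypothetical helper (not part of cat.ademic): fail early with a clear
# message when the environment variable holding the API key is not set.
require_api_key <- function(var = "OPENAI_API_KEY") {
  key <- Sys.getenv(var)  # "" when the variable is unset
  if (!nzchar(key)) {
    stop("Environment variable ", var, " is not set. ",
         "Set it in .Renviron or with Sys.setenv() before calling cat.ademic.",
         call. = FALSE)
  }
  key
}
```

With this in place, `api_key = require_api_key()` could replace `api_key = Sys.getenv("OPENAI_API_KEY")` in the calls shown here.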

Extract categories by topic

result <- extract(
  topic_name   = "climate change adaptation",
  paper_limit  = 200L,
  polite_email = "you@university.edu",
  api_key      = Sys.getenv("OPENAI_API_KEY")
)
print(result$top_categories)

Summarize papers

# `df` is assumed to be an existing data frame with a character column `abstracts`
results <- summarize(
  input_data   = df$abstracts,
  description  = "Sociology journal abstracts",
  instructions = "Summarize the key findings in 2 sentences",
  api_key      = Sys.getenv("OPENAI_API_KEY")
)

Functions

Function      Description
classify()    Classify academic papers into categories
extract()     Discover and extract categories from paper data
explore()     Get raw category extractions for saturation analysis
summarize()   Summarize academic papers

License

MIT