Skip to contents

Wraps the Python cat_survey.extract() function. Discovers and returns a normalised, deduplicated set of categories found in survey response data.

Usage

extract(
  input_data,
  api_key,
  survey_question = "",
  description = "",
  input_type = "text",
  max_categories = 12L,
  categories_per_chunk = 10L,
  divisions = 12L,
  user_model = "gpt-4o",
  creativity = NULL,
  specificity = "broad",
  research_question = NULL,
  mode = "text",
  filename = NULL,
  model_source = "auto",
  iterations = 8L,
  random_state = NULL,
  focus = NULL,
  chunk_delay = 0
)

Arguments

input_data

A character vector, list, or data.frame column of survey responses.

api_key

Character. API key for the model provider.

survey_question

Character. The survey question text. Default "".

description

Character. Additional context. Default "".

input_type

Character. Type of input. Default "text".

max_categories

Integer. Maximum final categories. Default 12L.

categories_per_chunk

Integer. Default 10L.

divisions

Integer. Number of data chunks. Default 12L.

user_model

Character. Model name. Default "gpt-4o".

creativity

Numeric or NULL. Temperature. Default NULL.

specificity

Character. "broad" or "specific". Default "broad".

research_question

Character or NULL. Optional research context.

mode

Character. Processing mode. Default "text".

filename

Character or NULL. Output CSV filename.

model_source

Character. Provider hint. Default "auto".

iterations

Integer. Number of passes. Default 8L.

random_state

Integer or NULL. Random seed.

focus

Character or NULL. Optional focus.

chunk_delay

Numeric. Seconds between API calls. Default 0.0.

Value

A named list with counts_df, top_categories, and raw_top_text.

Examples

if (FALSE) { # \dontrun{
result <- extract(
  input_data      = c("Took a new job in Chicago",
                      "Wanted to be closer to grandkids",
                      "Couldn't afford rent in the Bay Area"),
  survey_question = "Why did you move?",
  api_key         = Sys.getenv("OPENAI_API_KEY"),
  user_model      = "gpt-4o-mini"
)
print(result$top_categories)
} # }