Wraps the Python catvader.extract() function. Discovers and returns a
normalised, deduplicated set of categories from social media data.
Usage
extract(
  input_data = NULL,
  api_key = NULL,
  sm_source = NULL,
  sm_limit = 50L,
  sm_months = NULL,
  sm_credentials = NULL,
  platform = NULL,
  handle = NULL,
  hashtags = NULL,
  post_metadata = NULL,
  description = "",
  max_categories = 12L,
  categories_per_chunk = 10L,
  divisions = 12L,
  user_model = "gpt-4o",
  creativity = NULL,
  specificity = "broad",
  research_question = NULL,
  mode = "text",
  filename = NULL,
  model_source = "auto",
  iterations = 8L,
  random_state = NULL,
  focus = NULL,
  chunk_delay = 0
)
Arguments
- input_data
A character vector, list, or NULL to fetch from social media. Default NULL.
- api_key
Character or NULL. API key for the LLM provider.
- sm_source
Character or NULL. Social media source.
- sm_limit
Integer. Max posts to fetch. Default 50L.
- sm_months
Integer or NULL. Fetch posts from last N months.
- sm_credentials
Named list or NULL. API credentials.
- platform
Character or NULL. Alias for sm_source.
- handle
Character or NULL. Social media handle.
- hashtags
Character vector or NULL. Hashtags to filter by.
- post_metadata
Named list or NULL. Additional post metadata.
- description
Character. Context description. Default "".
- max_categories
Integer. Default 12L.
- categories_per_chunk
Integer. Default 10L.
- divisions
Integer. Default 12L.
- user_model
Character. Default "gpt-4o".
- creativity
Numeric or NULL. Default NULL.
- specificity
Character. Default "broad".
- research_question
Character or NULL.
- mode
Character. Default "text".
- filename
Character or NULL.
- model_source
Character. Default "auto".
- iterations
Integer. Default 8L.
- random_state
Integer or NULL.
- focus
Character or NULL.
- chunk_delay
Numeric. Default 0.0.
Examples
if (FALSE) { # \dontrun{
  result <- extract(
    input_data = df$posts,
    api_key = Sys.getenv("OPENAI_API_KEY"),
    user_model = "gpt-4o-mini"
  )
  print(result$top_categories)
} # }
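Several arguments above are documented as Integer with `L`-suffixed defaults. A plain R literal such as `8` is a double, so the sketch below coerces the integer-typed arguments before the call. It assumes the wrapper hands values to Python through a reticulate-style conversion, where an R double would arrive as a Python float; that is an assumption about the implementation, not something this page states. The helper `coerce_integer_args()` and the vector `integer_args` are hypothetical names introduced for illustration and are not part of the package.

```r
# Hypothetical helper, not part of the package: coerce arguments that the
# reference above documents as Integer, leaving NULL values untouched.
integer_args <- c("sm_limit", "sm_months", "max_categories",
                  "categories_per_chunk", "divisions", "iterations",
                  "random_state")

coerce_integer_args <- function(args) {
  for (name in intersect(names(args), integer_args)) {
    # Skip NULLs: assigning NULL via [[<- would drop the element entirely.
    if (!is.null(args[[name]])) {
      args[[name]] <- as.integer(args[[name]])
    }
  }
  args
}

args <- coerce_integer_args(list(
  input_data = c("post one", "post two"),
  max_categories = 8,  # plain numeric (double) in R
  iterations = 4
))
str(args)

# The call itself needs the package, its Python backend, and an API key,
# so it is left commented out here:
# result <- do.call(extract, c(args, list(api_key = Sys.getenv("OPENAI_API_KEY"))))
```

Building the argument list first and calling through `do.call()` keeps the coercion in one place; writing `8L` directly at the call site works just as well when the values are literals.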