Changelog
Source: NEWS.md
cat.stack 0.2.2
Python engine floor raised to cat-stack 2.0.1
- install_cat_stack() (and SystemRequirements) now require cat-stack >= 2.0.1, the stable 2.0 engine. This release carries the fixes that R users on the newest Anthropic generation need: creativity and thinking_budget no longer trigger 400 errors on Opus 4.7+ / Sonnet 5 / Fable 5 (adaptive thinking and parameter gating are handled per model), thinking_budget grades consistently across providers, and description context routing is fixed in classify() / prompt_tune(). Existing installs: run install_cat_stack(upgrade = TRUE) once.
Deprecation-proof survey_question forwarding
- classify() and prompt_tune() now forward survey_question to the engine as the canonical description parameter, so R callers never trigger the Python-level DeprecationWarning. The R signatures are unchanged; survey_question remains a documented soft-deprecated alias.
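The forwarding pattern described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the actual wrapper or engine code: engine_classify and wrapper_classify are hypothetical stand-ins, and the rule that description wins when both arguments are supplied is an assumption.

```python
import warnings


def engine_classify(text, description=None, survey_question=None):
    """Hypothetical stand-in for an engine function (not the cat-stack API).

    It warns whenever the deprecated alias is passed directly.
    """
    if survey_question is not None:
        warnings.warn(
            "survey_question is deprecated; use description",
            DeprecationWarning,
            stacklevel=2,
        )
        if description is None:
            description = survey_question
    return {"text": text, "description": description}


def wrapper_classify(text, description=None, survey_question=None):
    """Hypothetical wrapper that keeps the alias in its own signature.

    The alias is resolved here and only the canonical name is forwarded,
    so the engine never sees survey_question and never warns.
    Assumption: an explicit description takes precedence over the alias.
    """
    if description is None:
        description = survey_question
    return engine_classify(text, description=description)
```

Calling wrapper_classify("some text", survey_question="Why did you move?") reaches the stand-in engine with description set and raises no DeprecationWarning, while calling engine_classify with survey_question directly does warn.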
cat.stack 0.2.1
New parameter in classify()
- multi_label: logical, default TRUE (unchanged behavior). The Python engine has always supported single-label classification, but the R wrapper did not forward the flag, so classify() could only ever request multi-label output. Setting multi_label = FALSE now reaches the engine and switches the prompt to “assign the single most appropriate category” (one 1, the rest 0), for mutually exclusive coding frames.
cat.stack 0.2.0
Tracks the Python cat-stack 1.6.0 release. The version floor in install_cat_stack() is now cat-stack >= 1.6.0 — older Python installs will hit unexpected keyword argument errors when R forwards the new parameters listed below.
New parameters in classify()
- batch_mode, batch_poll_interval, batch_timeout: opt into the async batch APIs (OpenAI / Anthropic / Google / Mistral / xAI) for ~50% cost savings and higher rate limits. HuggingFace / Perplexity / Ollama fall back to synchronous calls.
- json_retries: per-row retries when the LLM returns JSON that fails schema validation. On the final attempt the formatter fallback fires (if json_formatter is enabled).
- json_formatter: three-state (TRUE / FALSE / NULL) control for the local JSON-repair model. Default NULL triggers an interactive consent prompt on the first malformed row; non-TTY contexts decline silently. Requires cat-stack[formatter].
- two_step_classify: split classification into reasoning + JSON formatting steps for weaker models (lower-tier API + local Ollama). Auto-enables json_formatter when set.
- embedding_tiebreaker + min_centroid_size: resolve true ensemble 50/50 ties via embedding-centroid similarity instead of the default “tie → 0”. Adds a category_N_resolved_by audit column ("vote" or "centroid"). Multi-model ensemble + text input only; not yet supported in batch_mode. Requires cat-stack[embeddings].
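To make the embedding-centroid tiebreak above more concrete, here is a rough Python sketch of the general technique. It is not the engine's implementation. Several details are assumptions made for illustration: that centroids are built from the embeddings of rows the ensemble already settled by vote, that min_centroid_size is the minimum number of such rows needed per class, that similarity is cosine similarity, and that an unresolvable tie falls back to 0 with "vote" as the audit value.

```python
import math


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def resolve_tie(row_vec, positive_vecs, negative_vecs, min_centroid_size):
    """Return (label, resolved_by) for a row whose ensemble vote tied.

    positive_vecs / negative_vecs: embeddings of rows already decided by
    vote as 1 or 0 for this category (assumed source of the centroids).
    """
    too_small = (
        len(positive_vecs) < min_centroid_size
        or len(negative_vecs) < min_centroid_size
    )
    if too_small:
        # Not enough decided rows to trust a centroid: keep the default
        # tie -> 0 outcome (assumed fallback behavior).
        return 0, "vote"
    sim_pos = cosine(row_vec, centroid(positive_vecs))
    sim_neg = cosine(row_vec, centroid(negative_vecs))
    return (1 if sim_pos > sim_neg else 0), "centroid"
```

With toy two-dimensional embeddings, a tied row that sits closer to the positive centroid comes back as (1, "centroid"), which mirrors the kind of value the category_N_resolved_by column is described as recording.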
Behavior changes
- consensus_threshold = "majority" is now strict majority. A 50/50 tie on an even-model ensemble (2-2 of 4, 3-3 of 6, 1-1 of 2) resolves to "0", not "1". This matches sklearn’s VotingClassifier default and standard ensemble literature. Numeric thresholds (e.g. consensus_threshold = 0.5) keep >= semantics: the user picked a number, so they get the literal interpretation. For 2-model ensembles, "majority" now effectively requires both models to agree on positive. Use 3+ models for a non-degenerate majority vote, or pair with embedding_tiebreaker = TRUE.
- batch_retries default lowered from 2L to 1L to match the Python default (changed in cat-stack 1.4.1).
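The difference between the strict "majority" rule and a numeric threshold described above can be sketched in a few lines of Python. This is an illustration of the stated semantics, not the engine's code: the consensus function is hypothetical, and it uses integer 0/1 votes where the engine reports "0" / "1".

```python
def consensus(votes, threshold="majority"):
    """Combine per-model 0/1 votes for one category into a single label.

    "majority" is strict: more than half of the models must vote 1.
    A numeric threshold is taken literally: share of 1-votes >= threshold.
    """
    positives = sum(votes)
    total = len(votes)
    if threshold == "majority":
        # Strict majority, so an even split (2-2, 1-1) resolves to 0.
        return 1 if positives * 2 > total else 0
    # Numeric threshold keeps >= semantics.
    return 1 if positives >= threshold * total else 0
```

Under this sketch a 2-2 split returns 0 with the default and 1 with threshold=0.5, and a 2-model ensemble only returns 1 under "majority" when both models vote 1.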
Documentation
- consensus_threshold man page now describes the strict-majority semantics, the 2-model degeneracy, the numeric-input escape hatch, and the embedding_tiebreaker companion.
- Added a new example block showing strict-majority + tiebreaker and batch_mode = TRUE.
Python-side fixes that flow through automatically
These are in Python cat-stack 1.6.0 and pass through the reticulate bridge without R-side changes — listed here so users know the behaviors they’re getting after install_cat_stack(upgrade = TRUE):
- Google preflight 400 fixed (additionalProperties stripped before reaching responseSchema).
- Per-model batch-job failure isolation: one model’s batch failure no longer kills the ensemble run.
- Anthropic batch terminal-state inspection: all-errored batches raise instead of silently returning empty results.
- system_prompt is no longer silently dropped in batch_mode.
- PDF summary synthesis grounds on actual page text instead of the page label.
- HuggingFace small-model strip-on-5xx (Llama-3.2-1B / response_format).
- Image directory loading is case-insensitive; large images warn.
- prompt_tune returns system_prompt = "" when no improvement was found (instead of returning a non-improving prompt as the default).
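As an illustration of the Google preflight fix listed above, the following Python sketch removes additionalProperties keywords from a JSON-schema-like dict before it would be sent as a responseSchema. It is a guess at the general shape of such a fix, not the engine's code; the function name is hypothetical, and preserving user fields that happen to be named additionalProperties is a design choice made for this sketch.

```python
def strip_additional_properties(schema):
    """Return a copy of a JSON-schema-like structure without the
    additionalProperties keyword at any nesting level.

    The input is not modified.
    """
    if isinstance(schema, list):
        return [strip_additional_properties(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    cleaned = {}
    for key, value in schema.items():
        if key == "additionalProperties":
            continue  # drop the keyword itself
        if key == "properties" and isinstance(value, dict):
            # Keys under "properties" are user field names, not schema
            # keywords, so keep every one and only clean the sub-schemas.
            cleaned[key] = {
                name: strip_additional_properties(sub)
                for name, sub in value.items()
            }
        else:
            cleaned[key] = strip_additional_properties(value)
    return cleaned
```

Treating the "properties" mapping separately means a field the user named additionalProperties survives, while the keyword is still removed from nested object schemas.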
cat.stack 0.1.0
- Initial release of the R interface to the Python catstack package.
- Exposes classify(), extract(), explore(), and summarize() for domain-agnostic LLM-powered text, image, and PDF classification.
- install_cat_stack() installs the Python dependency via reticulate.
- Internal helpers (.strip_quotes(), .as_py_int(), .convert_models(), .validate_add_other()) exported for reuse by sibling domain packages (cat.survey, cat.ademic, cat.pol, cat.web, etc.).