Create and manage cached context for repeated queries against large payloads. Use caching when making multiple queries against the same large dataset or document: you pay the full input-token cost once when the cache is created, and subsequent queries bill the cached tokens at a reduced rate.
Usage
gemini_cache_create(
  userdata,
  ml = NULL,
  instructions = NULL,
  ttl_seconds = NULL,
  displayName = NULL
)

gemini_cache_delete(cache)

gemini_cache_get(cache)

gemini_cache_list()

gemini_chat_cached(
  prompt,
  cache,
  ml = NULL,
  max_think = FALSE,
  temp = 1,
  timeout = 60
)
.estTokenUsage(userdata)

Arguments
- userdata
Character or list. Data to cache - either JSON string or R object (will be serialized to JSON). Must be >= ~4096 estimated tokens. Use for large context like documents, datasets, or conversation history.
- ml
Character. Model ID to use. If NULL (default), uses the ART_GEMINI_MODEL env var. The cache is tied to this model.
- instructions
Character. System instruction for cached context. Default "You are a helpful agent." Sets behavioral context for all queries using this cache.
- ttl_seconds
Numeric. Time-to-live in seconds. Default 43200 (12 hours). The cache is automatically deleted after this time. Override via ART_GEMINI_TTL_SEC.
- displayName
Character. Human-friendly label for the cache (e.g., "artist-portfolio-2024"). For metadata only; use the returned name field for operations.
- cache
Character. Cache name/ID returned from gemini_cache_create(). Format: alphanumeric string. Used for get/delete/chat operations.
- prompt
Character. User message for cached chat. Must be a single string.
- max_think
Logical. Enable extended reasoning (thinkingLevel = "high") for Gemini 3 models. Silently ignored for other models. Default FALSE. See the sketch after this list.
- temp
Numeric. Temperature setting (0-2). Default 1.
- timeout
Numeric. Request timeout in seconds. Default 60.
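A minimal sketch of a cached chat call combining these arguments (assumes a cache object returned earlier by gemini_cache_create(); the prompt and values are illustrative):

# Assumes `cache` was returned by gemini_cache_create() earlier.
gemini_chat_cached(
  "Compare documents 3 and 7.",
  cache = cache$name,
  max_think = TRUE,  # extended reasoning on Gemini 3 models; ignored elsewhere
  temp = 0.2,        # lower temperature for more deterministic answers
  timeout = 120      # allow extra time for long cached contexts
)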
Value
gemini_cache_create(): List with name, model, tokenUsage, and optional displayName.
gemini_cache_delete(): Invisible; errors on non-2xx responses.
gemini_cache_get(): List with cache metadata (name, model, tokenUsage, displayName).
gemini_cache_list(): List with cachedContents entries (each carries name/model/tokenUsage).
gemini_chat_cached(): Character model reply with attributes modelVersion/usageMetadata.
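For example, the metadata attached to a gemini_chat_cached() reply can be read with base R's attr() (a sketch, assuming the attribute names listed above):

reply <- gemini_chat_cached("Summarize the corpus.", cache = cache$name)
cat(reply)                   # the model's text reply
attr(reply, "modelVersion")  # model that served the request
attr(reply, "usageMetadata") # token accounting for the call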
Details
Minimum cache size: Payloads must be at least ~4096 estimated tokens
(calculated as nchar(json) / 4). Smaller payloads will error before the
API call is made: "Payload too small for explicit caching (min 4096 est. tokens)."
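As an illustration of that heuristic, the size check can be reproduced with a few lines of base R plus jsonlite (this re-implementation is an assumption for clarity, not the package's internal code):

# Estimate tokens the way the docs describe: nchar(json) / 4.
est_tokens <- function(userdata) {
  json <- if (is.character(userdata)) userdata else jsonlite::toJSON(userdata, auto_unbox = TRUE)
  nchar(json) / 4
}
est_tokens(list(doc = strrep("Sample text. ", 2000)))  # ~6500 est. tokens, above the 4096 floor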
displayName: Optional human-friendly label set only at cache creation.
It is returned on gemini_cache_get() and gemini_cache_list() but cannot
be used for direct lookup - always use the cache name (ID) for operations.
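If you only have the label, one workaround is to scan gemini_cache_list() for a matching displayName and use its name (a sketch; the cachedContents shape is assumed from the Value section above):

# Hypothetical helper: resolve a displayName to the operational cache name.
find_cache_name <- function(label) {
  entries <- gemini_cache_list()$cachedContents
  hit <- Filter(function(e) identical(e$displayName, label), entries)
  if (length(hit) == 0) NULL else hit[[1]]$name
}
nm <- find_cache_name("artist-portfolio-2024")
if (!is.null(nm)) gemini_cache_get(nm)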
TTL: Default is 12 hours (43200 seconds). Override via ttl_seconds
parameter or ART_GEMINI_TTL_SEC environment variable.
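For example, to set a session-wide default TTL of one hour (assuming the env var is read at call time and an explicit ttl_seconds argument takes precedence when supplied):

Sys.setenv(ART_GEMINI_TTL_SEC = "3600")  # session default: 1 hour
cache <- gemini_cache_create(large_context)                     # picks up the env var
cache <- gemini_cache_create(large_context, ttl_seconds = 600)  # explicit per-call override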
Functions
- gemini_cache_create(): Create an explicit cache entry
- gemini_cache_delete(): Delete a cached content entry
- gemini_cache_get(): Get cached content metadata
- gemini_cache_list(): List cached content entries
- gemini_chat_cached(): Chat using a cached context
- .estTokenUsage(): Estimate token usage for a cached payload
Examples
if (FALSE) { # \dontrun{
# Create a large payload (must be >= ~4096 tokens)
large_context <- list(
  documents = lapply(1:100, function(i) {
    list(id = i, content = paste(rep("Sample text content.", 50), collapse = " "))
  })
)

# Create cache with 10-minute TTL
cache <- gemini_cache_create(large_context, ttl_seconds = 600)

# Query against cached context
gemini_chat_cached("Summarize document 42.", cache = cache$name)

# Clean up when done
gemini_cache_delete(cache$name)
} # }
