Send text and image requests to Gemini's API

Core functions for interacting with Google's Gemini generateContent endpoint. Use gemini_chat() for text conversations and gemini_describe_image() for vision tasks. These are the primary entry points for most use cases.

For multi-turn conversations, use gemini_continue(). For tool-augmented responses (Google Search, code execution), use gemini_with_tools().

Usage

gemini_generate(
  contents,
  ml = NULL,
  temp = 1,
  genconf = list(),
  safety = NULL,
  instruction = NULL,
  timeout = 60,
  resp_fmt = NULL,
  expect_fields = NULL,
  cache = NULL
)

gemini_describe_image(
  img_path,
  prompt = NULL,
  ml = NULL,
  temp = 0.7,
  timeout = 60
)

Arguments

contents: List. Gemini content blocks in the format expected by the API. Each block should have role ("user" or "model") and parts (list of text/image parts). Built automatically by gemini_chat() and related functions.
ml: Character. Model ID to use (e.g., "gemini-3-pro-preview", "gemini-2.5-flash"). If NULL (default), falls back to the ART_GEMINI_MODEL env var. Pass the ID without the "models/" prefix; the package prepends it internally. See README for available models.
temp: Numeric. Temperature setting (0-2). Lower values (0-0.3) produce more consistent, near-deterministic output suited to structured data; higher values (0.7-1.5) produce more creative, varied responses. Default 1.
genconf: List. Additional generationConfig values to pass to the API. Merged with defaults (temperature, maxOutputTokens). Override max output tokens with list(maxOutputTokens = 16384). Default maxOutputTokens uses ART_GEMINI_MAX_OUTPUT_TOKENS env var (default 8192). See README.
safety: List. Safety settings to override defaults. See Gemini API docs for available categories and thresholds.
instruction: List. System instruction block with role = "system" and parts containing the system prompt. Sets behavioral context for the model.
timeout: Numeric. Request timeout in seconds. Default 60. Increase for complex queries or slow networks. Override via ART_GEMINI_TIMEOUT env var.
resp_fmt: List. Response format specification. Use list(type = "json_object") to enable JSON mode with responseMimeType.
expect_fields: Character vector. Names of JSON fields that must be present in the response. Only checked when resp_fmt requests JSON. An error is raised if any field is missing.
cache: Character. Cache name/ID to reuse for repeated queries against large context. Obtain from gemini_cache_create(). Sent as cachedContent.
img_path: Character. Path to local image file or URL. Supports common formats (PNG, JPEG, WebP). Images are resized to 1000px width and converted to PNG for transmission. For CDN images, pass the full URL.
prompt: Character. Custom prompt for image analysis. If NULL (default), uses "Describe this image in detail." Use for specialized analysis tasks.
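
The shapes described for contents, instruction, and genconf can be sketched in base R. This is a minimal illustration, not the package's implementation: user_turn() and merge_genconf() are hypothetical helpers, and the merge rule (caller values override defaults) is an assumption based on the genconf description above. No API call is made.

```r
# Illustrative only: these helpers are NOT part of the package.

# One user turn in the shape described for `contents`:
# a role plus a list of parts.
user_turn <- function(text) {
  list(role = "user", parts = list(list(text = text)))
}

contents <- list(
  user_turn("Summarise the Impressionist movement in two sentences.")
)

# System instruction block in the shape described for `instruction`.
instruction <- list(
  role = "system",
  parts = list(list(text = "You are a concise art historian."))
)

# Assumed merge behaviour: values supplied in `genconf` override the defaults.
merge_genconf <- function(temp = 1, genconf = list()) {
  max_tokens <- as.integer(Sys.getenv("ART_GEMINI_MAX_OUTPUT_TOKENS", "8192"))
  defaults <- list(temperature = temp, maxOutputTokens = max_tokens)
  utils::modifyList(defaults, genconf)
}

cfg <- merge_genconf(temp = 0.2, genconf = list(maxOutputTokens = 16384))
str(cfg)
```

Objects built this way could then be passed as the contents, instruction, and genconf arguments of gemini_generate().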

Value

gemini_generate(): Character string containing the text of the first candidate. The attributes modelVersion and usageMetadata are attached when present in the API response.

gemini_describe_image(): Character string containing the image description, with the attributes modelVersion and usageMetadata.
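
Because the metadata travels as attributes on a plain character string, it can be read with attr(). The sketch below uses a mocked return value with invented placeholder numbers rather than a real API response; total_tokens() is a hypothetical helper, not a package function.

```r
# Mock of the documented return shape; no API call is made.
# The attribute values are invented placeholders.
res <- structure(
  "Hello from Gemini.",
  modelVersion = "gemini-2.5-flash",
  usageMetadata = list(
    promptTokenCount = 5L,
    candidatesTokenCount = 4L,
    totalTokenCount = 9L
  )
)

# attr() returns NULL when an attribute is absent, so guard before use.
total_tokens <- function(x) {
  usage <- attr(x, "usageMetadata", exact = TRUE)
  if (is.null(usage)) NA_integer_ else usage$totalTokenCount
}

total_tokens(res)
total_tokens("no metadata attached")

# as.character() strips the attributes, leaving only the text.
plain_text <- as.character(res)
```

Guarding against a missing attribute matters here because the metadata is documented as attached only "when present".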

Functions

gemini_generate(): Low-level Gemini generateContent API call
gemini_describe_image(): Describe an artwork with Gemini vision

Examples

if (FALSE) { # \dontrun{
gemini_chat("Say hello from Gemini.", temp = 0.2)

# Model Selection Strategy
#
# Speed/cost priority
gemini_chat("Quick question", ml = "gemini-2.5-flash")
#
# Quality priority
gemini_chat("Complex analysis", ml = "gemini-2.5-pro")
#
# Multimodal with reasoning
gemini_chat("Reasoning", ml = "gemini-3-pro-preview", max_think = TRUE)

# Monitor token usage for cost control
usage <- gemini_chat("hello") |> attr("usageMetadata")
usage$totalTokenCount
} # }

if (FALSE) { # \dontrun{
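# `tmp` is assumed to hold a path to a local image file (or an image URL)
gemini_describe_image(tmp, temp = 0)

# Vision: allow extra time by raising the timeout above the 60-second default
gemini_describe_image(tmp, timeout = 120)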
} # }