Data Modification Workflows
Source:vignettes/data-modification-workflows.Rmd
This vignette covers functions that create, update, and delete records. These operations modify the database and require careful use.
Note: Demo mode enforcement is handled at the application layer, not in artutils. These functions will execute modifications directly - ensure you’re connected to the correct database environment.
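Since nothing in artutils stops a modification from reaching the wrong database, a defensive check before running any of the examples below can help. This is a sketch only; the ART_DB_ENV variable name is an assumption for illustration, not part of artutils:
# Guard against accidental production writes. The ART_DB_ENV variable
# name is hypothetical - substitute whatever your deployment actually sets.
env <- Sys.getenv("ART_DB_ENV", unset = "unknown")
if (env == "production") {
  stop("Refusing to run ad-hoc modifications against production")
}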
The Modification Ecosystem
Data modifications in artutils follow a pipeline pattern:
External Input → artpipelines → artutils → Database
                 (processing)    (storage)
- artpipelines processes raw data (images, replay files) into structured records
- artutils handles database operations with transaction safety
- Direct modifications via artutils are rare outside of the pipeline
Understanding this helps explain why some functions (like addArtwork()) expect pre-processed data.tables rather than raw inputs.
Collection Management
Collections group related artworks. Every artwork must belong to a collection.
Creating Collections
library(artutils)
# Create a new collection for an artist
artist_uuid <- "746b8207-72f5-4ab6-8d19-a91d03daec3d"
collection_uuid <- addCollection(
artist = artist_uuid,
collection_name = "Abstract Landscapes 2024"
)
cat("Created collection:", collection_uuid, "\n")
What addCollection() Does
addCollection() creates two records in a transaction:
- app.collection_index - The collection record itself
- settings.collection_settings - Default visibility settings
# Pseudocode of internal logic
DBI::dbWithTransaction(cn, {
# 1. Insert collection record
DBI::dbAppendTable(cn, "app.collection_index", data.table(
artist_uuid = artist,
collection_uuid = new_uuid,
collection_name = collection_name
))
# 2. Insert default settings (WIP status, no visibility override)
DBI::dbAppendTable(cn, "settings.collection_settings", data.table(
collection_uuid = new_uuid,
status = "wip",
visibility_override = NA_character_,
extended_settings = "{}"
))
})
The transaction ensures both records exist or neither does - no orphan collections.
Managing Collection Visibility
Collections have a visibility lifecycle controlled by two factors:
- status: "wip" (work-in-progress) or "complete"
- visibility_override: NA (inherit), "visible", or "hidden"
Visibility resolution priority:
1. Explicit override takes precedence
2. Completed collections are visible by default
3. WIP collections inherit from the artist’s default_wip_collection_visibility preference
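The priority order can be sketched as a pure function. This is illustrative only; the actual resolution happens inside artutils and may differ in detail:
# Illustrative sketch of the visibility resolution order
resolve_visibility <- function(status, visibility_override, artist_wip_default) {
  if (!is.na(visibility_override)) {
    return(visibility_override == "visible")  # 1. explicit override wins
  }
  if (status == "complete") {
    return(TRUE)                              # 2. complete collections are visible
  }
  artist_wip_default == "visible"             # 3. WIP inherits the artist preference
}

resolve_visibility("wip", "visible", "hidden")  # TRUE - override wins
resolve_visibility("complete", NA, "hidden")    # TRUE - complete by default
resolve_visibility("wip", NA, "hidden")         # FALSE - inherits preference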
# Get current settings
settings <- get_collection_settings(collection_uuid)
cat("Status:", settings$status, "\n")
cat("Override:", settings$visibility_override, "\n")
# Mark collection as complete (makes it visible)
update_collection_settings(
collection_uuid,
status = "complete"
)
# Or explicitly hide a collection (overrides status)
update_collection_settings(
collection_uuid,
visibility_override = "hidden"
)
# Clear the override (inherit from status/artist prefs again)
update_collection_settings(
collection_uuid,
visibility_override = NA_character_
)
Batch Visibility Queries
For gallery views with multiple collections, use batch queries:
# Get all collections for artist
collections <- get_artist_collections_summary(artist_uuid)
# Get visibility for all in one query
visibility <- get_collect_visibility(
artists = artist_uuid,
collects = collections$collection_uuid
)
# Merge and filter
collections <- merge(collections, visibility, by = "collection_uuid")
visible_collections <- collections[is_visible == TRUE]
Artist Preferences
Artist preferences control default behaviors for new collections and artworks.
# Set default visibility for new WIP collections
# "hidden" means WIP collections won't appear in public galleries
upsert_artist_preferences(
artist_uuid,
default_wip_collection_visibility = "hidden"
)
# Future WIP collections will be hidden by default
# Complete collections are always visible unless explicitly hidden
Extended Settings
The extended_settings field stores arbitrary JSON for future features:
upsert_artist_preferences(
artist_uuid,
extended_settings = list(
theme = "dark",
notifications_enabled = TRUE,
custom_colors = list(primary = "#FF5733", secondary = "#33FF57")
)
)
Artwork Creation Pipeline
Adding artwork is a multi-table operation typically driven by artpipelines.
Understanding the Data Flow
Raw Files (images, replay.json)
↓
artpipelines::processArtwork()
↓
Prepared data.tables for each table:
- artwork_index
- artwork_stats
- artwork_meta
- artwork_colors
- artwork_profiles
- artwork_frame_analytics
- artwork_hash
- artwork_paths
- artwork_styles
- global_styles
↓
artutils::addArtwork()
↓
Database (all tables updated atomically)
Direct addArtwork() Usage
While typically called by the pipeline, here’s the expected structure:
# This shows the expected data.table structure
# In practice, artpipelines prepares these from processing results
artwork_index <- data.table::data.table(
artist_uuid = artist_uuid,
art_uuid = artwork_uuid,
collection_uuid = collection_uuid,
art_name = "sunrise-over-mountains",
art_title = "Sunrise Over Mountains",
is_nft = FALSE,
created_utc = lubridate::now()
)
artwork_stats <- data.table::data.table(
artist_uuid = artist_uuid,
art_uuid = artwork_uuid,
brush_strokes = 45000L,
drawing_hours = 12.5,
ave_bpm = 60.0,
n_unique_colors = 1250L,
share_of_spectrum = 0.42,
# ... additional stats columns
created_utc = lubridate::now()
)
artwork_meta <- data.table::data.table(
artist_uuid = artist_uuid,
art_uuid = artwork_uuid,
image_width = 4096L,
image_height = 3072L,
format = "png",
filesize_bytes = 15000000L,
created_utc = lubridate::now()
)
# Frame analytics - one row per frame
artwork_frame_analytics <- data.table::data.table(
artist_uuid = rep(artist_uuid, 100),
art_uuid = rep(artwork_uuid, 100),
frame = 1:100,
elapsed_hours = seq(0, 12.5, length.out = 100),
cumulative_strokes = cumsum(rpois(100, 450)),
unique_colors = cumsum(rpois(100, 12)),
estimated_bpm = rnorm(100, 60, 10),
technique_phase = sample(c("sketch", "base", "detail"), 100, replace = TRUE),
# ... additional frame columns
created_utc = lubridate::now()
)
# And so on for other tables...
Transaction Safety
addArtwork() wraps all inserts in a transaction. If any insert fails:
- All changes are rolled back
- Database remains in consistent state
- Error is propagated to caller
tryCatch({
addArtwork(
artist = artist_uuid,
artwork = artwork_uuid,
artwork_colors = artwork_colors,
artwork_frame_analytics = artwork_frame_analytics,
artwork_hash = artwork_hash,
artwork_index = artwork_index,
artwork_meta = artwork_meta,
artwork_paths = artwork_paths,
artwork_profiles = artwork_profiles,
artwork_stats = artwork_stats,
artwork_styles = artwork_styles,
global_styles = global_styles
)
cat("Artwork added successfully\n")
}, error = function(e) {
# Transaction was rolled back - no partial data
cat("Failed to add artwork:", e$message, "\n")
# Investigate and fix the issue
})
Statistics Recalculation
After adding or removing artworks, aggregate statistics need recalculation.
updateArtistStats()
Recalculates aggregate statistics stored in app.artist_stats:
# After adding/removing artwork
updateArtistStats(artist_uuid)
# Verify the update
stats <- get_artist_stats(artist_uuid)
cat("Total artworks:", stats$artworks, "\n")
cat("Total collections:", stats$collections, "\n")
What Gets Calculated
# artist_stats columns updated:
# - total_brushes: sum of all artwork brush_strokes
# - total_hours: sum of all artwork drawing_hours
# - total_artworks: count of artworks
# - total_collections: count of unique collections
# - ave_bpm: total_brushes / (total_hours * 60)
# - ave_brushes: total_brushes / total_artworks
# - ave_hours: total_hours / total_artworks
# - updated_utc: timestamp
Benchmark Recalculation
Benchmarks are percentile scores comparing each artwork to others in the artist’s portfolio. They enable “this artwork is in the top 10% for time investment” type insights.
When to Recalculate
Benchmarks should be recalculated when:
- A new artwork is added
- An artwork is deleted
- Artwork stats are corrected
- Benchmark data is missing or corrupt
Do not recalculate on every page load - benchmarks are cached in the database.
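One way to honour this rule is to compare the cached artwork count against a live count before recalculating. The helper below is hypothetical, built from functions documented in this vignette, and the app.artwork_index table name is an assumption:
# Only refresh expensive benchmarks when the portfolio actually changed
needs_benchmark_refresh <- function(artist_uuid) {
  cached <- get_artist_stats(artist_uuid)$artworks
  live <- dbArtGet(stringr::str_glue(
    "SELECT COUNT(*) AS n FROM app.artwork_index
     WHERE artist_uuid = '{artist_uuid}'"
  ), unlist = TRUE)$n
  !identical(as.integer(cached), as.integer(live))
}

if (needs_benchmark_refresh(artist_uuid)) {
  updateArtistStats(artist_uuid)
  updateArtistBenchmarks(artist_uuid)
}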
updateArtistBenchmarks()
# Recalculate all benchmarks for an artist
# This is expensive - queries all artwork stats, calculates percentiles,
# deletes old benchmarks, inserts new ones
updateArtistBenchmarks(artist_uuid)
Understanding Benchmark Categories
Benchmarks are grouped into three categories:
# 1. TIME & EFFORT
# How much time and work went into this piece?
time_effort_metrics <- c(
"drawing_hours", # Total hours spent
"brush_strokes", # Total strokes
"ave_bpm", # Average strokes per minute
"color_generation_rate", # New colors per hour
"early_late_color_ratio" # Color exploration timing
)
# 2. SKILL & ARTISTRY
# Technical proficiency indicators
skill_artistry_metrics <- c(
"ave_blend_rate", # Smooth transitions
"n_unique_colors", # Color vocabulary
"share_of_spectrum", # Color range utilization
"strokes_per_unique_color", # Color efficiency
"frame_color_stability" # Consistency
)
# 3. COMPLEXITY & DETAIL
# How intricate is the work?
complex_detail_metrics <- c(
"ave_colors_pstroke", # Colors per stroke
"brush_density", # Strokes per pixel
"q75_color_freq", # Color distribution
"frame_color_variance", # Color variation
"technique_phase_count" # Phase transitions
)
Benchmark Confidence
Benchmarks include a confidence level based on portfolio size:
- low: 1-3 artworks (percentiles less meaningful)
- medium: 4-10 artworks (reasonable comparison)
- high: 11+ artworks (statistically significant)
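The thresholds above translate directly to code. The percentile helper is an assumption about how scores might be derived, not the exact internals of updateArtistBenchmarks():
# Confidence from portfolio size, per the thresholds above
benchmark_confidence <- function(n_artworks) {
  if (n_artworks >= 11) "high"
  else if (n_artworks >= 4) "medium"
  else "low"
}

# Illustrative percentile score for one metric against the portfolio
percentile_score <- function(value, portfolio_values) {
  mean(portfolio_values <= value)
}

benchmark_confidence(3)                        # "low"
percentile_score(12.5, c(2, 4, 8, 12.5, 20))   # 0.8 - top 20% of the portfolio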
appdata <- getAppdata(artist_uuid, artwork_uuid)
benchmarks <- appdata$artwork$benchmarks
cat("Time & Effort score:", benchmarks$time_effort$score, "\n")
cat("Confidence:", benchmarks$time_effort$confidence, "\n")
# Only show benchmarks to users when confidence is medium+
if (benchmarks$time_effort$confidence != "low") {
# Render benchmark visualization
}
Low-Level Database Operations
For operations not covered by helper functions, use the db-interface functions.
SELECT Queries
# Basic query - returns data.table
artists <- dbArtGet("
SELECT artist_uuid, artist_name, created_utc
FROM app.artist_index
ORDER BY created_utc DESC
LIMIT 10
")
# Single row as list (for config lookups)
artist <- dbArtGet(
stringr::str_glue(
"SELECT * FROM app.artist_index WHERE artist_uuid = '{artist_uuid}'"
),
unlist = TRUE
)
artist$artist_name # Direct field access
INSERT Rows
# Prepare data.table
new_style <- data.table::data.table(
artist_uuid = artist_uuid,
tag = "impressionism",
category = "style"
)
# Insert into table
dbArtAppend(new_style, table = "artist_style_map", schema = "app")
UPDATE Statements
# Direct SQL update
dbArtUpdate(stringr::str_glue("
UPDATE app.artwork_profiles
SET description = 'Updated description'
WHERE art_uuid = '{artwork_uuid}'
"))
Transaction Pattern
For multi-step operations, use explicit transactions:
cn <- artcore::..dbc()
on.exit(artcore::..dbd(cn))
DBI::dbWithTransaction(cn, {
# Step 1: Insert parent record
dbArtAppend(parent_data, "parent_table", cn = cn)
# Step 2: Insert child records
dbArtAppend(child_data, "child_table", cn = cn)
# Step 3: Update summary
dbArtUpdate(summary_query, cn = cn)
# If any step fails, entire transaction rolls back
})
Functions Under Review
Some modification functions are in R/considering.R with lifecycle::deprecate_soft():
# These may be removed in future versions
# Use with caution - file an issue if you depend on them
# deleteCollection() - No known callers
# update_has_nft() - Should be handled by pipeline
# getExternArtIndex() - DeviantArt integration
# pathArtVaultImage() - Vault access pattern
# pathCanvasSign() - Signature detection path
# pathLottieJSON() - Animation assets
# pathPackageCSS() - Package CSS loading
Best Practices
1. Validate Before Modifying
# Use artcore::validate_uuid() to catch bad input early
artcore::validate_uuid(artist_uuid, "artist UUID")
artcore::validate_uuid(artwork_uuid, "artwork UUID")
# Then proceed with modification
updateArtistStats(artist_uuid)
2. Recalculate Stats After Changes
# After any artwork modification
addArtwork(...) # or delete
updateArtistStats(artist_uuid)
# Only recalculate benchmarks if portfolio changed
updateArtistBenchmarks(artist_uuid)
3. Share Connections for Batch Operations
cn <- artcore::..dbc()
on.exit(artcore::..dbd(cn))
# Multiple operations share connection
for (collection_name in collection_names) {
addCollection(artist_uuid, collection_name, cn = cn)
}
updateArtistStats(artist_uuid, cn = cn)
4. Log Modifications
All artutils functions log via rdstools::log_*(). In production:
# Enable info-level logging to see modification events
# rdstools::log_inf() calls show:
# - "Adding new collection to collection_index"
# - "Appending to artwork_index"
# - "Updated artist preferences for {uuid}"
# etc.