
Repository Overview

artutils is a proprietary R package (v0.9.3) by Artalytics Inc. containing shared tools and data for the Artalytics Shiny application ecosystem. It follows the standard R package structure, uses data.table as the preferred data framework, and avoids tidyverse unless absolutely necessary.

  • Language: R (>= 4.1.0)
  • Type: R package with Shiny modules and database utilities
  • Size: ~90 files, ~12 R source files, comprehensive test suite
  • License: Proprietary (see LICENSE file)

Purpose

  • Application data management and reactive patterns
  • Database operations for artists, artworks, and collections
  • Shiny module components and UI utilities
  • File path management and asset routing
  • Data processing and transformation utilities
  • Integration layer between core infrastructure and applications

Architecture Role

This is the middle layer of the Artalytics platform architecture:

  • Depends on artcore for foundational services
  • Extended by application-specific packages
  • Provides common application functionality
  • Bridges infrastructure and user-facing features

Environment Setup & Dependencies

System Requirements

  • R >= 4.1.0
  • ImageMagick++ (libmagick++-dev) - CRITICAL: Required for magick package
  • PostgreSQL client libraries (libpq-dev)
  • Standard build tools (build-essential, gfortran, pkg-config)

R Package Dependencies

Core Dependencies: artcore (private), data.table, DBI, fs, jsonlite, lubridate, magick, rlang, shiny, stringr, tidyjson

Remote Packages: rdstools, rpgconn (from r-data-science), artcore (from artalytics)

Environment Setup

ALWAYS run the environment setup script first on new systems:

chmod +x inst/setup-env.sh
sudo ./inst/setup-env.sh

This script installs R, system dependencies, and all required R packages. It takes roughly 10-15 minutes.

Required Environment Variables (for full functionality)

export ART_USE_PG_CONF="production_config"
export ART_PG_USER_PASSWD_PRD="database_password"
export ART_BUCKETS_KEY_ID="aws_key_id"
export ART_BUCKETS_KEY_SECRET="aws_secret"
export CODECOV_TOKEN="coverage_token"
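
Before running database-backed code, it can help to confirm from R that these variables are actually set. The following is a minimal sketch in base R; `missing_env_vars()` is a hypothetical helper, not part of artutils, and the list of names simply mirrors the exports above.

```r
# Hypothetical helper (not part of artutils): return the names of variables
# that are unset or set to an empty string
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = NA_character_)
  vars[is.na(values) | !nzchar(values)]
}

required <- c(
  "ART_USE_PG_CONF",
  "ART_PG_USER_PASSWD_PRD",
  "ART_BUCKETS_KEY_ID",
  "ART_BUCKETS_KEY_SECRET"
)

missing <- missing_env_vars(required)
if (length(missing) > 0) {
  message("Unset environment variables: ", paste(missing, collapse = ", "))
}
```

The helper only reports names and never prints values, so secrets stay out of logs.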

Build, Test & Validation Commands

Essential Command Sequence

ALWAYS follow this exact order for reliable builds:

  1. Install Dependencies (required before any other commands):

    Rscript -e 'pak::pkg_install(".", ask = FALSE, dependencies=TRUE)'
  2. Lint Code (run before tests to catch style issues):

    Rscript -e 'lintr::lint_package()'
    • Uses 100-character line length limit (see .lintr config)
    • Must pass with no errors before proceeding
  3. Run Tests:

    Rscript -e 'testthat::test_local()'
    • Uses testthat edition 3
    • Some tests require database connection (will skip if unavailable)
    • Tests with skip_on_ci() will skip in CI environment
  4. Package Check (comprehensive validation):

    R CMD build . --no-manual --compact-vignettes=gs+qpdf
    R CMD check artutils_*.tar.gz --no-manual
    • Timeout: Allow 40 minutes for completion
    • Must pass with 0 errors, 0 warnings
  5. Coverage Analysis (optional):

    Rscript -e 'covr::package_coverage(quiet = FALSE)'

Build Troubleshooting

Common Issues & Solutions:

  • Missing system dependencies: Run inst/setup-env.sh script
  • artcore package not found: Ensure GitHub PAT is configured for private repos
  • ImageMagick errors: Install libmagick++-dev system package
  • PostgreSQL connection errors: Set ART_PGHOST and database credentials
  • Timeout on R CMD check: Increase timeout to 40+ minutes

Project Architecture & Layout

Directory Structure

artutils/
├── R/                    # Main source code (12 files)
│   ├── artutils-package.R    # Package definition & globals
│   ├── app-data.R           # Reactive data functions for Shiny
│   ├── app-paths.R          # Asset path utilities
│   ├── app-waiter.R         # UI waiting components
│   ├── db-*.R               # Database interaction layers
│   └── mod-statsBox.R       # Shiny module for statistics
├── tests/testthat/       # Unit tests (9 test files)
├── man/                  # Generated documentation
├── inst/                 # Installation files
│   └── setup-env.sh          # Environment setup script
├── data-raw/             # Raw data processing scripts
├── .github/workflows/    # CI/CD configuration
└── [standard R package files]

Key Configuration Files

  • DESCRIPTION: Package metadata, dependencies, system requirements
  • .lintr: Linting configuration (100-char line length)
  • codecov.yml: Coverage reporting configuration
  • .Rprofile: Development helper functions
  • NAMESPACE: Generated exports (do not edit manually)

GitHub Workflows (CI/CD)

  1. R-CMD-check.yaml: Runs R CMD check on Ubuntu (40min timeout)
  2. test-coverage.yaml: Generates coverage reports for codecov
  3. lint.yaml: Runs lintr package checking
  4. build-info.yaml: Syncs DESCRIPTION to public build-info repo

All workflows require these environment variables: GITHUB_PAT, CODECOV_TOKEN, ART_* secrets

Key Functional Areas

Application Data Management

  • reactiveAppData() - Reactive values for Shiny applications
  • getAppdata() - Structured application data retrieval
    • Returns comprehensive nested list with artist and artwork data
    • NEW: artwork$frame_stats provides per-frame analytics (replaces deprecated brushesDT)
    • Includes benchmarks, stats, metadata, paths, and configuration
  • sampleArt() - Sample artwork selection
  • Reactive patterns for artist/artwork data flow

Database Operations

Artist Management

  • addArtist(), updateArtist(), deleteArtist() - CRUD operations
  • getArtist*() functions - Various artist data retrievals
  • getArtistStats(), getArtistBenchmarks() - Analytics
  • list_artists() - Artist enumeration

Artwork Management

  • addArtwork(), updateArtwork(), deleteArtwork() - CRUD operations
  • getArtwork*() functions - Artwork data and metadata
  • listArtworkUUIDs(), getArtworksTable() - Bulk operations
  • artHasNFT(), update_has_nft() - NFT status tracking

Benchmark System (15-Metric Framework)

The calcArtworkBenchmarks() function implements a comprehensive 15-metric system organized into 3 categories:

Time & Effort (5 metrics):

  • Drawing Hours - Total active creation time
  • Brush Strokes - Number of strokes applied
  • Average BPM - Brushstrokes per minute pace
  • Color Generation Rate - Speed of introducing new colors
  • Early/Late Color Ratio - Color introduction patterns across timeline

Skill & Artistry (5 metrics):

  • Average Blend Rate - Color transition smoothness (0-1 scale)
  • Unique Colors - Palette diversity
  • Share of Spectrum - Color spectrum coverage
  • Strokes Per Unique Color - Technique signature (deliberate vs explosive)
  • Frame Color Stability - Consistency of palette evolution

Complexity & Detail (5 metrics):

  • Average Colors Per Stroke - Within-stroke color complexity
  • Brush Density - Strokes relative to canvas size
  • Q75 Color Frequency - Color distribution patterns
  • Frame Color Variance - Variation in color diversity across frames
  • Technique Phase Count - Number of distinct technique phases

Each category score is calculated by averaging the percentile rankings of its 5 component metrics, providing comprehensive portfolio-relative evaluation (0-100 scale).
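
The scoring rule described above can be sketched in base R. This is an illustration under stated assumptions, not the implementation of calcArtworkBenchmarks(): the percentile rank is taken here as the share of portfolio values less than or equal to the artwork's value, and the function names, column names, and sample numbers are made up. Only two metrics are used to keep the arithmetic easy to follow; the package averages five per category.

```r
# Percentile rank (0-100) of `value` within `portfolio`:
# share of portfolio values less than or equal to `value` (assumed definition)
percentile_rank <- function(value, portfolio) {
  100 * mean(portfolio <= value, na.rm = TRUE)
}

# Category score: mean percentile rank across the category's metrics
category_score <- function(artwork, portfolio) {
  ranks <- vapply(
    names(artwork),
    function(metric) percentile_rank(artwork[[metric]], portfolio[[metric]]),
    numeric(1)
  )
  mean(ranks)
}

# Made-up portfolio of four artworks and two metrics
portfolio <- data.frame(
  drawing_hours = c(2, 4, 6, 8),
  brush_strokes = c(1000, 3000, 2000, 4000)
)
artwork <- list(drawing_hours = 6, brush_strokes = 2000)

category_score(artwork, portfolio)  # (75 + 50) / 2 = 62.5
```

Because each metric is ranked against the portfolio before averaging, metrics on very different scales (hours versus stroke counts) contribute equally to the category score.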

Collection Management

  • addCollection(), deleteCollection() - Collection CRUD
  • getCollections(), list_collections() - Collection listing
  • getCollectionSummary() - Collection analytics

Frame Analytics (Per-Frame Statistics)

  • getFrameAnalytics(artist, artwork, cn = NULL) - Retrieve per-frame analytics data
  • Returns data.table with 21 columns of frame-by-frame statistics:
    • Temporal: elapsed_minutes, elapsed_hours, cumulative_strokes, estimated_bpm
    • Color Composition: unique_colors, total_pixels, dominant_hex, color_diversity, avg_red/green/blue
    • Delta Metrics: colors_added, colors_removed, pixels_added, palette_change_score
    • Phase Detection: technique_phase (opening, buildup, refinement, detailing)
  • Data Integrity: All artworks in artwork_index MUST have frame analytics
  • Fail-Fast Philosophy: Returns an empty data.table if no data exists; treat an empty result as a signal of pipeline failure
  • Generated by artpipelines::createFrameAnalytics() from artwork_colors data
  • Used in getAppdata() as artwork$frame_stats for rich per-frame visualization
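
A caller can act on the fail-fast convention by checking for an empty result before using the data. The sketch below is an assumption-laden illustration: `require_frame_stats()` is a hypothetical guard, not part of artutils, and a plain data.frame with three of the documented columns stands in for the data.table that getFrameAnalytics() returns.

```r
# Hypothetical guard (not part of artutils): stop when frame analytics are missing
require_frame_stats <- function(frame_stats) {
  if (is.null(frame_stats) || nrow(frame_stats) == 0) {
    stop("No frame analytics found; the upstream pipeline may have failed")
  }
  invisible(frame_stats)
}

# Stand-in for a getFrameAnalytics() result, using a few documented columns
frame_stats <- data.frame(
  elapsed_minutes = c(5, 10, 15, 20),
  cumulative_strokes = c(120, 260, 390, 450),
  technique_phase = c("opening", "buildup", "buildup", "refinement")
)

require_frame_stats(frame_stats)

# Count frames per technique phase
table(frame_stats$technique_phase)
```

The same `nrow()` check works unchanged on a data.table, so the guard would not need to change when applied to real results.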

Path Management

  • path*() functions for asset routing
  • prefix*() functions for CDN path prefixes
  • Local and remote path resolution
  • Asset type-specific path generators

Shiny Components

  • statsBoxUI(), statsBoxServer() - Statistics display module
  • wait_art_html() - Loading state management
  • Reactive utilities for application state

Data Processing

  • getImageRaster() - Image processing utilities
  • Database modification helpers (dbArt*() functions)
  • Data transformation and aggregation

Coding Guidelines & Patterns

Code Style Requirements

  • Prefer data.table over tidyverse for data manipulation
  • Use stringr for string operations
  • Line length limit: 100 characters
  • Follow existing function documentation patterns
  • Use artcore::..dbc and artcore::..dbd for database connections

Function Patterns

  • Database functions: Use dbArtGet(), dbArtAppend(), dbArtUpdate() wrappers
  • Path functions: Follow path*() naming convention
  • Reactive functions: Use reactive*() naming for Shiny components
  • Always handle cn = NULL parameter for database functions

Function Design Principles

  • Database functions should accept optional cn parameter
  • Use data.table for performance-critical operations
  • Implement consistent error handling
  • Return structured, predictable data formats
  • Use reactive patterns for Shiny integration

Path Management Standards

  • All path functions should be environment-aware
  • Support both local and remote asset resolution
  • Use consistent naming conventions
  • Validate path parameters

Testing Patterns

  • Use testthat with mocking via local_mocked_bindings()
  • Mark tests requiring external services with skip_on_ci()
  • Use expect_*() functions consistently
  • TODO items in tests indicate incomplete test coverage

Testing Requirements

  • Mock database connections for unit tests
  • Test reactive behaviors with testServer()
  • Validate path resolution across environments
  • Test error conditions and edge cases
  • Maintain >85% code coverage

Documentation Standards

  • Document all exported functions with examples
  • Use @importFrom artcore for core dependencies
  • Include parameter validation examples
  • Document reactive behavior and dependencies
  • Use @family tags for related functions

Development Commands

Package Development

# Load development version with dependencies
devtools::load_all()

# Check package (includes dependency checks)
devtools::check()

# Run tests (may require database connection)
devtools::test()

# Generate documentation
devtools::document()

# Install development dependencies (does not install the package itself)
pak::local_install_dev_deps(".")

Testing Strategy

  • Unit tests for pure functions
  • Integration tests for database operations
  • Shiny module testing with testServer()
  • Mock database connections for isolated testing
  • Path resolution testing across environments

Key Test Files

  • test-app-data.R - Application data flow
  • test-app-paths.R - Path resolution
  • test-db-*.R - Database operations
  • test-app-waiter.R - UI components

Integration Patterns

Database Operations Pattern

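# Standard database operation pattern
my_db_function <- function(artist_uuid, cn = NULL) {
  # Open a connection only when the caller did not supply one, and close it
  # on exit; connections passed in by the caller are left open
  if (is.null(cn)) {
    cn <- artcore::..dbc()
    on.exit(artcore::..dbd(cn), add = TRUE)
  }

  # Database operations here (illustrative query; the artist_uuid column
  # name is an assumption)
  query <- "SELECT * FROM artwork_index WHERE artist_uuid = $1"
  result <- DBI::dbGetQuery(cn, query, params = list(artist_uuid))
  return(result)
}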
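
The connection-ownership rule in this pattern can be exercised without a database. In the sketch below, `fake_connect()` and `fake_disconnect()` are stand-ins for artcore::..dbc() and artcore::..dbd() (assuming those open and close a connection), and `count_rows()` is a hypothetical function that follows the same shape.

```r
# Stubs standing in for artcore::..dbc() / artcore::..dbd()
open_count <- 0
close_count <- 0
fake_connect <- function() {
  open_count <<- open_count + 1
  structure(list(), class = "fake_connection")
}
fake_disconnect <- function(cn) {
  close_count <<- close_count + 1
  invisible(TRUE)
}

count_rows <- function(cn = NULL) {
  if (is.null(cn)) {
    cn <- fake_connect()
    on.exit(fake_disconnect(cn), add = TRUE)
  }
  0L  # placeholder for a real query result
}

# No connection supplied: the function opens and closes its own
count_rows()

# Connection supplied: the function leaves it open for the caller to close
shared <- fake_connect()
count_rows(cn = shared)
count_rows(cn = shared)
fake_disconnect(shared)
```

After this runs, two connections were opened and two were closed: one pair owned by the first call, one pair owned by the caller. Passing a shared connection is how several calls can reuse a single connection.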

Reactive Data Pattern

# Shiny server usage
server <- function(input, output, session) {
  app_data <- reactiveAppData(
    artist = reactive(input$artist_uuid),
    artwork = reactive(input$artwork_uuid)
  )

  # Use reactive data
  output$artist_info <- renderText({
    app_data$appdata$artist$info$name
  })
}

Path Resolution Pattern

# Asset path resolution
get_artwork_assets <- function(artist_uuid, artwork_uuid) {
  list(
    main_image = pathArtMainImage(artist_uuid, artwork_uuid),
    thumbnail = pathArtworkThumb(artwork_uuid),
    vault_image = pathArtVaultImage(artist_uuid, artwork_uuid)
  )
}

Common Development Tasks

Adding Database Function

  1. Create function in appropriate R/db-*.R file
  2. Follow connection pattern with optional cn
  3. Add comprehensive tests with mocked connections
  4. Document with database requirements
  5. Export if needed by applications

Adding Shiny Module

  1. Create *UI() and *Server() functions
  2. Add to appropriate R/mod-*.R file
  3. Test with testServer() and integration tests
  4. Document module interface and dependencies
  5. Export both UI and server functions

Adding Path Function

  1. Add to R/app-paths.R
  2. Use consistent naming: path*() for assets, prefix*() for CDN
  3. Test across different environments
  4. Document path structure and usage
  5. Export if used by applications

Updating Data Processing

  1. Use data.table for performance
  2. Handle missing data gracefully
  3. Validate input parameters
  4. Add unit tests with sample data
  5. Document data transformations

Critical Dependencies & Gotchas

Private Package Dependencies

  • artcore: Private Artalytics package, requires GitHub PAT
  • rdstools, rpgconn: From r-data-science GitHub org

External Service Dependencies

  • PostgreSQL database
  • AWS S3 buckets (for asset storage)
  • OpenAI API (for some features)
  • OpenSea API (for NFT features)

Development vs Production

  • Database functions require valid connection credentials
  • Tests use mocking to avoid external dependencies in CI
  • Demo mode enforcement is handled at the application layer, not in artutils

Environment Configuration

Required Environment Variables

  • ART_PGHOST, ART_PG_USER_PASSWD_PRD - Database configuration (inherited from artcore)

Optional Configuration

  • Image processing settings for magick
  • Shiny application settings
  • Path resolution overrides

Performance Considerations

Database Operations

  • Reuse connections when possible
  • Use parameterized queries
  • Batch operations for bulk updates
  • Consider connection pooling for applications

Data Processing

  • Prefer data.table for large datasets
  • Cache expensive computations
  • Use vectorized operations
  • Profile memory usage for large operations

Shiny Integration

  • Minimize reactive dependencies
  • Use isolate() for non-reactive reads
  • Debounce user inputs
  • Cache rendered outputs when appropriate

Integration with Applications

For Shiny Applications

# In application server
app_data <- artutils::reactiveAppData(
  artist = reactive(session$userData$artist),
  artwork = reactive(session$userData$artwork)
)

# Use in modules
artutils::statsBoxServer("stats", app_data)

For Data Processing

# Batch operations
artists <- artutils::list_artists()
results <- lapply(artists$uuid, function(uuid) {
  artutils::getArtistStats(uuid)
})

Security & Data Protection

  • Validate all UUID parameters
  • Sanitize database inputs
  • Use prepared statements
  • Log data access appropriately
  • Handle sensitive artwork data carefully
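
As one way to validate UUID parameters, inputs can be checked against the canonical 8-4-4-4-12 hexadecimal layout before they reach a query. This is a sketch: `is_valid_uuid()` is a hypothetical helper, not part of artutils, and it assumes the package's UUIDs use the canonical textual form, which the source does not confirm.

```r
# Hypothetical validator (not part of artutils): TRUE for canonical
# 8-4-4-4-12 hexadecimal UUID strings, FALSE otherwise (including NA)
is_valid_uuid <- function(x) {
  pattern <- "^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
  is.character(x) & !is.na(x) & grepl(pattern, x, ignore.case = TRUE)
}

is_valid_uuid("123e4567-e89b-12d3-a456-426614174000")
is_valid_uuid("1; DROP TABLE artwork_index")
```

Base R's grepl() keeps the sketch dependency-free; stringr::str_detect() would match the package's style guidelines. Validation complements parameterized queries and does not replace them.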

Quick Reference Commands

# Complete setup from scratch
sudo ./inst/setup-env.sh
Rscript -e 'pak::pkg_install(".", ask = FALSE)'

# Development workflow
Rscript -e 'lintr::lint_package()'
Rscript -e 'testthat::test_local()'
R CMD check . --no-manual

# Generate documentation
Rscript -e 'roxygen2::roxygenise()'

# Coverage report
Rscript -e 'covr::package_coverage()'

Troubleshooting

Common Issues

  • Database connection failures - check environment variables
  • Path resolution issues - verify asset existence
  • Shiny reactivity issues - check dependency chains
  • Image processing errors - verify ImageMagick installation

Debug Helpers

Trust these instructions - they are validated and complete. Only search for additional information if commands fail or requirements change.

Notes

Outstanding tasks are marked with TODO comments throughout the codebase.

This package provides the application foundation - changes should maintain backward compatibility and coordinate with both artcore and dependent applications.