Operations Overview#

YT Framework supports several types of operations for processing data on YTsaurus clusters.

Operation Types#

When to Use Each Operation#

Operation	Best For	Input/Output	Parallelization
Map	Row-by-row processing, transformations	Table → Table	Automatic (per row)
Vanilla	Setup, cleanup, standalone tasks	None	Single job
YQL	SQL-like queries, joins, aggregations	Table(s) → Table	Automatic (query-level)
S3	External data integration	S3 → Table	File-level

Quick Comparison#

Map vs YQL#

Use Map when you need custom Python logic per row
Use YQL when you need SQL-like operations (joins, aggregations)

Vanilla vs Map#

Use Vanilla when you don’t need table input/output
Use Map when processing table rows

S3 Integration#

Use S3 when working with external data sources
Often combined with Map or YQL operations

Common Patterns#

Pattern 1: Extract → Transform → Load#

stages:
  enabled_stages:
    - extract_from_s3      # S3 operation
    - transform_data       # Map operation
    - load_to_table        # YQL operation

Pattern 2: Setup → Process → Validate#

stages:
  enabled_stages:
    - setup_environment    # Vanilla operation
    - process_data         # Map operation
    - validate_results     # Vanilla operation

Pattern 3: Join → Filter → Aggregate#

stages:
  enabled_stages:
    - join_tables         # YQL operation
    - filter_data         # YQL operation
    - aggregate_results   # YQL operation

See Also#

Map Operations - Detailed map operation guide
Vanilla Operations - Detailed vanilla operation guide
YQL Operations - Detailed YQL operation guide
S3 Operations - Detailed S3 operation guide
Multiple Operations - Running multiple operations in one stage