Operations#
Stages call helpers (run_map, run_vanilla, YT client methods, etc.) declared under client.operations in YAML. This section documents each pattern.
Guides#
| Topic | Link |
|---|---|
| Map | |
| Map-reduce (TypedJob) | |
| Map-reduce (command mode) | |
| Vanilla | |
| YQL | |
| S3 | |
| Table helpers | |
| Sort | |
Picking a tool#
| Pattern | Input / output | Parallelism |
|---|---|---|
| Map | Table → table | YT splits input across tasks |
| Map-reduce / reduce | Sorted or grouped table work | Map + reduce phases |
| Vanilla | None required | Single job |
| YQL | One or more tables → table | Query planner |
| S3 | Object store → table (typical) | Driver listing + cluster for follow-up |
| Table helpers | Driver-side Python | None on cluster |
| Sort | Table → sorted table | YT sort operation |
Map vs YQL#
Custom Python per row → map.
Declarative SQL shape → YQL.
Vanilla vs map#
No table contract → vanilla.
Row stream → map.
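The rules of thumb above (map versus YQL, vanilla versus map) can be condensed into a small decision helper. This is only an illustrative sketch: `Workload` and `pick_pattern` are hypothetical names invented here, not part of the framework, and the two boolean fields are an assumed simplification of what "table contract" and "declarative SQL shape" mean in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a task; not a framework type."""
    reads_tables: bool     # does the task consume table rows?
    declarative_sql: bool  # can the logic be written as a SQL-shaped query?


def pick_pattern(workload: Workload) -> str:
    """Apply the rules of thumb from this section, in order."""
    if not workload.reads_tables:
        return "vanilla"  # no table contract
    if workload.declarative_sql:
        return "yql"      # declarative SQL shape
    return "map"          # custom Python per row over a row stream


print(pick_pattern(Workload(reads_tables=True, declarative_sql=False)))
```

The check for a table contract comes first because vanilla is the only pattern in the table that requires no input at all.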
S3 plus tables#
S3 stages often feed map or YQL. Compose them either as separate stages or as multiple operations in one stage (see Multiple operations).
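To make the two layouts concrete, here is a minimal in-memory sketch. Nothing in it touches S3 or a cluster: `list_objects`, `add_size_kb`, and `run_stage` are hypothetical stand-ins written for this example, and the rows are made-up sample data. It only illustrates that, under these assumptions, running the same operations as two stages or as one stage yields the same rows.

```python
from typing import Callable

Rows = list[dict]
Operation = Callable[[Rows], Rows]


def list_objects(_: Rows) -> Rows:
    # Stand-in for an S3 listing step; returns hard-coded sample rows.
    return [{"key": "a.json", "size": 2048}, {"key": "b.json", "size": 512}]


def add_size_kb(rows: Rows) -> Rows:
    # Stand-in for a map step: one output row per input row.
    return [{**row, "size_kb": row["size"] / 1024} for row in rows]


def run_stage(operations: list[Operation], rows: Rows) -> Rows:
    # Run a stage's operations in order, feeding each output to the next.
    for operation in operations:
        rows = operation(rows)
    return rows


# Layout 1: two stages, one operation each.
listed = run_stage([list_objects], [])
separate = run_stage([add_size_kb], listed)

# Layout 2: one stage holding both operations.
combined = run_stage([list_objects, add_size_kb], [])

print(separate == combined)
```

In a real pipeline the choice between the layouts would more likely depend on whether the intermediate result needs to be inspected or re-run on its own, which this sketch does not model.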
Example stage lists#
Extract / transform / load:
stages:
  enabled_stages:
    - extract_from_s3
    - transform_data
    - load_to_table
Setup / process / validate:
stages:
  enabled_stages:
    - setup_environment
    - process_data
    - validate_results
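As a rough illustration of how such a list might be consumed, the sketch below runs only the stages named in `enabled_stages`. It assumes the stages run in the order listed and that an unknown name should be an error. Both are assumptions made for this example, as are `REGISTRY`, `make_stage`, and `run_enabled`; the framework's actual loader may behave differently.

```python
from typing import Callable

# Records which stages ran, so the effect of the config is visible.
log: list[str] = []


def make_stage(name: str) -> Callable[[], None]:
    # Hypothetical stage body: it only records that it was called.
    def stage() -> None:
        log.append(name)
    return stage


# Hypothetical registry of available stages (more than are enabled below).
REGISTRY = {
    name: make_stage(name)
    for name in ("extract_from_s3", "transform_data", "load_to_table", "validate_results")
}

# Same shape as the first YAML example above, written as a Python dict.
config = {
    "stages": {
        "enabled_stages": ["extract_from_s3", "transform_data", "load_to_table"],
    }
}


def run_enabled(config: dict, registry: dict[str, Callable[[], None]]) -> list[str]:
    names = config["stages"]["enabled_stages"]
    unknown = [name for name in names if name not in registry]
    if unknown:
        # Fail before running anything if the config names a missing stage.
        raise KeyError(f"unknown stages: {unknown}")
    for name in names:
        registry[name]()
    return names


run_enabled(config, REGISTRY)
print(log)
```

Validating every name before running any stage avoids a half-finished pipeline when the list contains a typo.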