Secrets#

Put credentials in configs/secrets.env (key=value, one per line). Non-secret runtime knobs are listed in Environment variables.

# configs/secrets.env
YT_PROXY=https://your-proxy.example
YT_TOKEN=your-token

# Optional: S3
S3_ENDPOINT=https://your-s3-endpoint.example
S3_DOWNLOAD_ACCESS_KEY=...
S3_DOWNLOAD_SECRET_KEY=...
S3_UPLOAD_ACCESS_KEY=...
S3_UPLOAD_SECRET_KEY=...

YT#

Prod mode needs a reachable proxy and a valid token:

YT_PROXY=your-yt-proxy-url
YT_TOKEN=your-yt-token

Get values from whoever runs your YT cell.

S3#

Reads typically use the download pair; writes use the upload pair unless one credential set has both roles.
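The selection rule above could look roughly like this, assuming the secrets are already available as a plain dict. `pick_s3_credentials` is a hypothetical helper, not part of yt_framework. Falling back to the other pair when the requested one is unset is also an assumption about how a single dual-role credential set would be configured.

```python
from typing import Mapping, Tuple


def pick_s3_credentials(secrets: Mapping[str, str], role: str) -> Tuple[str, str]:
    """Return (access_key, secret_key) for role 'download' or 'upload'.

    Hypothetical helper: tries the pair matching the role first, then
    falls back to the other pair (assumed to cover both roles).
    """
    if role not in ("download", "upload"):
        raise ValueError(f"unknown role: {role!r}")
    other = "upload" if role == "download" else "download"
    for prefix in (role, other):
        access = secrets.get(f"S3_{prefix.upper()}_ACCESS_KEY")
        secret = secrets.get(f"S3_{prefix.upper()}_SECRET_KEY")
        if access and secret:
            return access, secret
    raise KeyError(f"no S3 credentials configured for {role}")
```

With only the download pair set, both `pick_s3_credentials(secrets, "download")` and `pick_s3_credentials(secrets, "upload")` return that pair. Once the upload pair is set as well, writes use it.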

How secrets reach your code#

The pipeline loader reads secrets.env early. For values you must pass explicitly into helpers (for example constructing S3Client), call load_secrets on the configs directory:

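from yt_framework.utils.env import load_secrets

# Method of your stage class; DebugContext is imported wherever the stage module already gets it.
def run(self, debug: DebugContext) -> DebugContext:
    # Read configs/secrets.env from the pipeline's configs directory.
    secrets = load_secrets(self.deps.configs_dir)
    # Pass individual values explicitly to helpers that need them.
    _proxy = secrets.get("YT_PROXY")
    return debug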

Map, vanilla, map-reduce, and reduce jobs (production)#

build_operation_environment still merges secrets.env, client.operations.<name>.env, and framework helpers into one dict. On a real cluster, YTProdClient does not send most of those keys to the job’s plain environment field (the one the YT web UI shows). They go to the operation-level secure_vault instead. The cluster exposes values to the job as YT_SECURE_VAULT_<KEY>; for string commands, the framework prepends a small stdlib-only Python snippet that copies those into the usual names (YT_TOKEN, S3_DOWNLOAD_ACCESS_KEY, etc.) before your script runs.
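The promotion step described above can be sketched with the standard library only. This is not the framework's actual snippet: the name `promote_vault_variables` is hypothetical, and leaving already-set names untouched is an assumption made for the illustration.

```python
import os
from typing import List, MutableMapping

VAULT_PREFIX = "YT_SECURE_VAULT_"


def promote_vault_variables(environ: MutableMapping[str, str] = os.environ) -> List[str]:
    """Copy YT_SECURE_VAULT_<KEY> entries to <KEY>; return the promoted names.

    Illustrative only. Assumption: a name that is already set in the
    environment is not overwritten.
    """
    promoted = []
    # Iterate over a snapshot because the mapping is modified in the loop.
    for name, value in list(environ.items()):
        if not name.startswith(VAULT_PREFIX):
            continue
        key = name[len(VAULT_PREFIX):]
        if key and key not in environ:
            environ[key] = value
            promoted.append(key)
    return sorted(promoted)
```

Called at the top of a job script, a function like this would make `os.environ["YT_TOKEN"]` readable whenever the cluster provided `YT_SECURE_VAULT_YT_TOKEN`.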

A short allowlist keeps clearly non-secret keys in plain environment (for example YT_STAGE_NAME and tokenizer artifact path variables). To expose additional non-secret keys in the UI, set environment_public_keys on that operation’s config. The insecure rollback is use_plain_environment_for_secrets: true (not recommended).
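As a rough model of the split described above, the sketch below partitions a merged environment dict into a plain part and a vault part. The function name `split_environment` and the one-entry default allowlist are assumptions for illustration; the framework's real allowlist and internals may differ. `LOG_LEVEL` in the usage note is a made-up non-secret key.

```python
from typing import Dict, Iterable, Mapping, Tuple

# Assumed allowlist for illustration; only YT_STAGE_NAME is named in the text above.
DEFAULT_PUBLIC_KEYS = frozenset({"YT_STAGE_NAME"})


def split_environment(
    merged: Mapping[str, str],
    environment_public_keys: Iterable[str] = (),
    use_plain_environment_for_secrets: bool = False,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (plain_environment, secure_vault) for one operation."""
    if use_plain_environment_for_secrets:
        # Insecure rollback: everything stays visible in the plain environment.
        return dict(merged), {}
    public = DEFAULT_PUBLIC_KEYS | set(environment_public_keys)
    plain = {k: v for k, v in merged.items() if k in public}
    vault = {k: v for k, v in merged.items() if k not in public}
    return plain, vault
```

For example, `split_environment(merged, environment_public_keys=["LOG_LEVEL"])` keeps `YT_STAGE_NAME` and `LOG_LEVEL` in the plain dict and moves every other key, including `YT_TOKEN`, into the vault dict.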

TypedJob legs do not get that automatic shim: call promote_secure_vault_environment() from yt_framework.yt.support.operation_secure_env at the start of your job, or use a string command.

Do not put secrets in command lines; upstream discussions (ytsaurus#780, ytsaurus#990) note that commands are another surface that may leak values in the UI.

Common variables#

| Variable | When you need it |
| --- | --- |
| YT_PROXY, YT_TOKEN | Prod driver talking to YT |
| S3_* | S3-backed operations |
| DOCKER_AUTH_USERNAME, DOCKER_AUTH_PASSWORD | Private registry pulls for docker_image |

Hygiene#

Warning: Do not commit secrets.env.

  • Add configs/secrets.env to .gitignore.

  • Commit a secrets.example.env with dummy values for onboarding.

  • Rotate tokens on the schedule your security team expects.

  • In CI, inject the same keys via the environment; the loader also reads process env when the file is absent.

For optional pytest runs against a real cell (separate from pipeline secrets.env), see Real cluster integration tests.

Example ignore rules:

configs/secrets.env
*.env
!*example.env

Example template:

# configs/secrets.example.env
YT_PROXY=https://proxy.example
YT_TOKEN=replace-me
# S3 optional...

CI#

export YT_PROXY="https://..."
export YT_TOKEN="..."
python pipeline.py

If secrets.env is missing, variables already present in the process environment still work.
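The fallback behaviour can be pictured with a small stand-in loader. This is a sketch, not the real `load_secrets`: the names `parse_env_lines` and `load_secrets_sketch` are hypothetical, the explicit `keys` argument is an assumption about which variables are read from the process environment, and quoting or `export` prefixes are not handled.

```python
import os
from pathlib import Path
from typing import Dict, Iterable, Mapping, Union


def parse_env_lines(text: str) -> Dict[str, str]:
    """Parse key=value lines, skipping blanks, comments, and lines without '='."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_secrets_sketch(
    path: Union[str, Path],
    keys: Iterable[str],
    environ: Mapping[str, str] = os.environ,
) -> Dict[str, str]:
    """Read the file if it exists; otherwise pick the listed keys from environ."""
    path = Path(path)
    if path.is_file():
        return parse_env_lines(path.read_text(encoding="utf-8"))
    return {k: environ[k] for k in keys if k in environ}
```

In a CI job without the file, `load_secrets_sketch("configs/secrets.env", ["YT_PROXY", "YT_TOKEN"])` would return whatever the `export` lines above placed in the environment.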

Troubleshooting#

| Symptom | What to check |
| --- | --- |
| File ignored / not found | Path is configs/ next to pipeline.py, filename secrets.env unless you customized loading |
| Auth errors | Typos, expired token, wrong cluster proxy |
| HTTP 403 from S3 | Endpoint URL, bucket policy, wrong access/secret pair for the operation |
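Some of these checks can be run before a pipeline starts. The sketch below is a hypothetical preflight helper (`find_secret_problems` is not part of yt_framework) that reports missing, empty, or whitespace-padded values without printing the values themselves. It cannot detect an expired token or a wrong proxy.

```python
from typing import Iterable, List, Mapping


def find_secret_problems(secrets: Mapping[str, str], required: Iterable[str]) -> List[str]:
    """Return one message per problematic key; an empty list means no problem found."""
    problems = []
    for key in required:
        value = secrets.get(key)
        if value is None:
            problems.append(f"{key}: missing")
        elif not value.strip():
            problems.append(f"{key}: empty")
        elif value != value.strip():
            # Stray spaces are an easy-to-miss cause of auth errors.
            problems.append(f"{key}: leading or trailing whitespace")
    return problems
```

A caller could pass the dict returned by `load_secrets` together with the keys the run needs, for example `["YT_PROXY", "YT_TOKEN"]`, and abort early if the list is not empty.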

See also#