Getting Started
Set up the SWE-gen block and run the first task generation pass
SWE-gen wraps the pinned repos/swegen checkout in the SWE-Lego-Live block
layout. The pinned repository is recorded as a gitlink under
subblock/swegen/repos/swegen and should match commit e804af9.
Prerequisites
- Docker - task validation and task execution use containerized Harbor tasks.
- Python - install the SWE-gen package from repos/swegen.
- GitHub tokens - used for PR collection and PR metadata lookup.
- OpenAI-compatible LLM endpoint - used for PR evaluation and task instruction generation.
- Claude-compatible task model - used by the task completion stage.
1. Enter the block
Run commands from the block root:
cd subblock/swegen
Confirm the pinned repo exists:
git rev-parse HEAD:repos/swegen
The expected pin for this docs site is:
e804af92aad81f42928453959e24e3f5dc666c44
2. Install the environment
Create the block virtual environment and install SWE-gen:
python3 -m venv artifacts/envs/swegen-env
source artifacts/envs/swegen-env/bin/activate
pip install -U pip
pip install -e repos/swegen/
If you already maintain a compatible environment, activate it before running
the scripts. The important requirement is that swegen resolves to the pinned
checkout.
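That requirement has two parts that can be checked separately: the gitlink matches the expected pin, and the imported package lives inside repos/swegen. The following is a minimal sketch and is not part of the block's scripts. The helper names pin_matches and path_is_under are hypothetical, and the import name swegen in the commented usage is an assumption based on the CLI name.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; neither ships with the block.

# Succeeds when the full commit hash starts with the expected (possibly short) pin.
pin_matches() {
  local actual="$1" expected="$2"
  [ -n "$expected" ] && [ "${actual#"$expected"}" != "$actual" ]
}

# Succeeds when path $1 is the directory $2 or lies inside it.
# Both paths are compared as plain strings, so pass them in the same form
# (for example, both absolute and both with symlinks resolved).
path_is_under() {
  local child="${1%/}" parent="${2%/}"
  [ "$child" = "$parent" ] || [ "${child#"$parent"/}" != "$child" ]
}

# Usage from the block root (commented out so the helpers can be sourced safely).
# The module name "swegen" is an assumption.
# pin_matches "$(git rev-parse HEAD:repos/swegen)" e804af9 || echo "pin mismatch"
# pkg_dir="$(python -c 'import os, swegen; print(os.path.dirname(swegen.__file__))')"
# path_is_under "$pkg_dir" "$PWD/repos/swegen" || echo "swegen resolves outside the pinned checkout"
```

Comparing against the short pin keeps the check valid whether the docs quote the abbreviated or the full commit hash.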
3. Configure credentials
Set runtime inputs in the shell or in a local env file that is not committed:
export GITHUB_TOKENS="ghp_xxx,ghp_yyy"
export OPENAI_API_KEY="..."
export ANTHROPIC_API_KEY="$OPENAI_API_KEY"
export OPENAI_API_BASE_URL="https://your-openai-compatible-endpoint/v1"
export ANTHROPIC_BASE_URL="https://your-anthropic-compatible-endpoint"
export OPENAI_MODEL="openai/MiniMax-M2.7"
export ANTHROPIC_MODEL="claude-sonnet-4-6"
GITHUB_TOKEN may also be derived from the first entry in GITHUB_TOKENS.
Do not commit tokens or local env files. Keep the OpenAI-compatible and
Anthropic-compatible variables aligned with the same provider setup you intend
to use for the run. If your shell already contains old ANTHROPIC_API_KEY or
ANTHROPIC_BASE_URL values, overwrite or unset them before launching
generation; stale Anthropic-compatible variables can make the Claude SDK stage
use the wrong endpoint or key.
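Deriving GITHUB_TOKEN from the first entry in GITHUB_TOKENS can be sketched in the shell as follows. The helper name first_token is hypothetical, and this page does not confirm whether the block's scripts already perform the derivation themselves, so treat it as an optional convenience.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first entry of a comma-separated token list.
first_token() {
  local list="$1"
  local first="${list%%,*}"   # drop everything from the first comma onward
  # Trim surrounding whitespace in case the list was written as "a, b".
  first="${first#"${first%%[![:space:]]*}"}"
  first="${first%"${first##*[![:space:]]}"}"
  printf '%s\n' "$first"
}

# Only set GITHUB_TOKEN when a list is present and no single token is set yet.
if [ -n "${GITHUB_TOKENS:-}" ] && [ -z "${GITHUB_TOKEN:-}" ]; then
  GITHUB_TOKEN="$(first_token "$GITHUB_TOKENS")"
  export GITHUB_TOKEN
fi
```

Leaving an already-set GITHUB_TOKEN untouched avoids silently replacing a token that was chosen on purpose.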
For reproducible runs on shared machines, use a clean Claude Code config directory so user-level hooks and plugins do not affect task completion:
export CLAUDE_CONFIG_DIR="$PWD/artifacts/claude-config/swegen-clean"
mkdir -p "$CLAUDE_CONFIG_DIR"
4. Run a dry run
Check the block before starting a large batch:
bash scripts/dryrun.sh
This validates expected directories, config values, and basic runtime dependencies. Fix dry-run failures before launching generation.
5. Run a smoke generation
For a small first pass, use a direct swegen create command with explicit
limits and shared output paths:
swegen create \
  --input-ids-file artifacts/collected_prs/python_pr_ids.txt \
  --max-pr 1 \
  --n-concurrent 1 \
  --output artifacts/swe_tasks/py-cc \
  --state-dir scripts/.swegen-py \
  --timeout 2400 \
  --cc-timeout 1800 \
  --no-require-issue \
  --min-source-files 1 \
  --max-source-files 10 \
  --docker-prune-batch 0
The language scripts are intended for production batches and read concurrency
from config.yaml; prefixing N_CONCURRENT=1 may not override their configured
defaults.
For the full multi-language run:
bash scripts/create_all_bg.sh
Per-language scripts are available for py, js, ts, go, c, cpp,
java, and rust.
6. Resume after moving nodes
If you restored a March state package into a new clone, check the batch state
before running generation. SWE-gen stores batch state under a filename derived
from input_ids_file.resolve(). A new clone path creates a new hash even when
the relative input file is the same.
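To see why the clone path matters, the sketch below derives a key from the absolute path of the input file. The real filename derivation inside SWE-gen is not shown on this page, so the hashing used here (POSIX cksum over the absolute path) and the helper name state_key are stand-ins chosen only to illustrate the effect. The clone paths in the usage comments are placeholders.

```shell
#!/usr/bin/env bash
# Illustration only: SWE-gen's actual hashing scheme is not reproduced here.
# state_key prints a checksum of the absolute path of an input file, which
# mimics "a name derived from the resolved path".
state_key() {
  local file="$1" dir abs
  # Resolve the containing directory to a physical absolute path.
  dir="$(cd "$(dirname "$file")" && pwd -P)" || return 1
  abs="$dir/$(basename "$file")"
  printf '%s' "$abs" | cksum | cut -d' ' -f1
}

# Usage: the same relative path gives different keys in different clones.
# /old/clone and /new/clone are placeholder paths.
# (cd /old/clone && state_key artifacts/collected_prs/python_pr_ids.txt)
# (cd /new/clone && state_key artifacts/collected_prs/python_pr_ids.txt)
```

Because the key depends on the absolute path, state written under one clone path is not found from another clone, even though the relative arguments on the command line are identical.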
For portable recovery, keep these files together:
- artifacts/swe_tasks/<lang>-cc/
- artifacts/swe_tasks/<lang>-cc/verifiable_tasks.txt
- artifacts/swe_tasks/<lang>-cc/.swegen-create-batch/*.json
- the matching PR input files in artifacts/collected_prs/
If the clone path changes, rewrite or regenerate the .swegen-create-batch
state filenames so they match the current resolved input file paths before
launching scripts/create_{lang}.sh.
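Before resuming, the presence of the recovery files listed above can be checked with a small helper. This is a sketch under stated assumptions: the function name missing_recovery_files is hypothetical, it only tests that the paths exist, and it does not verify that the state filenames match the current clone path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each recovery path that is missing for one language.
# $1 is the language prefix (for example py), $2 is the block root (default: .).
missing_recovery_files() {
  local lang="$1" root="${2:-.}"
  local out="$root/artifacts/swe_tasks/${lang}-cc"
  local f found=0

  [ -d "$out" ] || echo "$out/"
  [ -f "$out/verifiable_tasks.txt" ] || echo "$out/verifiable_tasks.txt"

  # An unmatched glob stays literal, so test that each candidate really exists.
  for f in "$out"/.swegen-create-batch/*.json; do
    if [ -e "$f" ]; then found=1; break; fi
  done
  [ "$found" -eq 1 ] || echo "$out/.swegen-create-batch/*.json"

  # Only the directory is checked; matching the PR input files is left to the reader.
  [ -d "$root/artifacts/collected_prs" ] || echo "$root/artifacts/collected_prs/"
  return 0
}

# Usage from the block root:
# missing_recovery_files py
```

Empty output means every listed path exists; any printed line names a path to restore before launching the per-language script.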