
harbor-check

No long description provided.

Installation

dagger install github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0

Entrypoint

Return Type
HarborCheck !
Arguments
Name | Type | Default Value | Description
ws | Workspace | - | No description provided
sourcePath | String | - | No description provided
sourceDir | Directory | - | No description provided
harborPackage | String | - | No description provided
harborExtras | [String ! ] | - | No description provided
pythonVersion | String | - | No description provided
container | Container | - | No description provided
claudeCodeOauthToken | Secret | - | No description provided
openrouterApiKey | Secret | - | No description provided
codexAccessToken | Secret | - | No description provided
miniSweConfig | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call
func (m *MyModule) Example() *dagger.HarborCheck  {
	return dag.
			HarborCheck()
}
@function
def example() -> dagger.HarborCheck:
	return (
		dag.harbor_check()
	)
@func()
example(): HarborCheck {
	return dag
		.harborCheck()
}

Types

HarborCheck 🔗

source() 🔗

The source directory (a task tree or repo), taken from the workspace at sourcePath in the constructor. Harbor commands run against it; the argument-free checks validate it directly.

Return Type
Directory !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 source
func (m *MyModule) Example() *dagger.Directory  {
	return dag.
			HarborCheck().
			Source()
}
@function
def example() -> dagger.Directory:
	return (
		dag.harbor_check()
		.source()
	)
@func()
example(): Directory {
	return dag
		.harborCheck()
		.source()
}

harborPackage() 🔗

Pip/uv install spec for Harbor itself (pin to a commit for reproducibility).

Return Type
String !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 harbor-package
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
			HarborCheck().
			HarborPackage(ctx)
}
@function
async def example() -> str:
	return await (
		dag.harbor_check()
		.harbor_package()
	)
@func()
async example(): Promise<string> {
	return dag
		.harborCheck()
		.harborPackage()
}

harborExtras() 🔗

Extra packages installed alongside Harbor via uv pip install.

Default mirrors the live pipeline: Claude Agent SDK for agent execution plus Modal SDK for the live executorRun --executor=modal path. RewardKit ships inside Harbor as a package (selected via the REWARDKIT_JUDGE=claude-code env var), and mini-swe-agent is an agent installed in Harbor (invoked with -a mini-swe-agent) — neither is a separate install, so do not add them here. “Kimi” is just a model string + the mini-swe-agent config YAML in the task repo.

Return Type
[String ! ] !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 harbor-extras
func (m *MyModule) Example(ctx context.Context) ([]string, error) {
	return dag.
			HarborCheck().
			HarborExtras(ctx)
}
@function
async def example() -> list[str]:
	return await (
		dag.harbor_check()
		.harbor_extras()
	)
@func()
async example(): Promise<string[]> {
	return dag
		.harborCheck()
		.harborExtras()
}

pythonVersion() 🔗

Python version for the default Alpine + uv base.

Return Type
String !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 python-version
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
			HarborCheck().
			PythonVersion(ctx)
}
@function
async def example() -> str:
	return await (
		dag.harbor_check()
		.python_version()
	)
@func()
async example(): Promise<string> {
	return dag
		.harborCheck()
		.pythonVersion()
}

container() 🔗

Optional custom base image. When set, it replaces the default Alpine + uv base; Harbor + extras + harbor_runner are still installed into it.

Return Type
Container 
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 container
func (m *MyModule) Example() *dagger.Container  {
	return dag.
			HarborCheck().
			Container()
}
@function
def example() -> dagger.Container:
	return (
		dag.harbor_check()
		.container()
	)
@func()
example(): Container {
	return dag
		.harborCheck()
		.container()
}

claudeCodeOauthToken() 🔗

Claude Code OAuth token, forwarded to Harbor as a secret env var. Never printed; Dagger scrubs its value from logs.

Return Type
Secret 
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 claude-code-oauth-token
func (m *MyModule) Example() *dagger.Secret  {
	return dag.
			HarborCheck().
			ClaudeCodeOauthToken()
}
@function
def example() -> dagger.Secret:
	return (
		dag.harbor_check()
		.claude_code_oauth_token()
	)
@func()
example(): Secret {
	return dag
		.harborCheck()
		.claudeCodeOauthToken()
}

openrouterApiKey() 🔗

OpenRouter API key (for Kimi trials), forwarded as a secret env var.

Return Type
Secret 
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 openrouter-api-key
func (m *MyModule) Example() *dagger.Secret  {
	return dag.
			HarborCheck().
			OpenrouterApiKey()
}
@function
def example() -> dagger.Secret:
	return (
		dag.harbor_check()
		.openrouter_api_key()
	)
@func()
example(): Secret {
	return dag
		.harborCheck()
		.openrouterApiKey()
}

codexAccessToken() 🔗

Codex (ChatGPT) access token for the RewardKit codex agent judge, forwarded as a secret env var. With REWARDKIT_FORCE_OAUTH=1, RewardKit prefers this over OPENAI_API_KEY (rewardkit/judges.py) — the codex analogue of the Anthropic subscription-token path.

Return Type
Secret 
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 codex-access-token
func (m *MyModule) Example() *dagger.Secret  {
	return dag.
			HarborCheck().
			CodexAccessToken()
}
@function
def example() -> dagger.Secret:
	return (
		dag.harbor_check()
		.codex_access_token()
	)
@func()
example(): Secret {
	return dag
		.harborCheck()
		.codexAccessToken()
}

miniSweConfig() 🔗

Default mini-swe-agent config YAML (a path relative to the source, e.g. the task repo’s mini-swe-agent.yaml) forwarded as harbor run -c <config>.

Constructor-defaulted for the common case and also accepted per call by evidence-emitting primitives, so task-type registries can route each task shape to its own agent config without rebuilding the toolchain accessor.

Return Type
String !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 mini-swe-config
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
			HarborCheck().
			MiniSweConfig(ctx)
}
@function
async def example() -> str:
	return await (
		dag.harbor_check()
		.mini_swe_config()
	)
@func()
async example(): Promise<string> {
	return dag
		.harborCheck()
		.miniSweConfig()
}

validateTask() 🔗

Validate the source as a Harbor task directory (Task / TaskConfig models). Fails the check when Harbor reports the task invalid.

Return Type
Void !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 validate-task
func (m *MyModule) Example(ctx context.Context) error {
	return dag.
			HarborCheck().
			ValidateTask(ctx)
}
@function
async def example() -> None:
	return await (
		dag.harbor_check()
		.validate_task()
	)
@func()
async example(): Promise<void> {
	return dag
		.harborCheck()
		.validateTask()
}

checkTrajectory() 🔗

Validate the source’s trajectory with Harbor’s TrajectoryValidator.

Return Type
Void !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 check-trajectory
func (m *MyModule) Example(ctx context.Context) error {
	return dag.
			HarborCheck().
			CheckTrajectory(ctx)
}
@function
async def example() -> None:
	return await (
		dag.harbor_check()
		.check_trajectory()
	)
@func()
async example(): Promise<void> {
	return dag
		.harborCheck()
		.checkTrajectory()
}

environmentBuilds() 🔗

Build the mounted source’s task environment as a check: fails when harbor task start-env cannot build the environment/ (Dockerfile / docker_image / compose). Confirms the environment builds before trials run.

Return Type
Void !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 environment-builds
func (m *MyModule) Example(ctx context.Context) error {
	return dag.
			HarborCheck().
			EnvironmentBuilds(ctx)
}
@function
async def example() -> None:
	return await (
		dag.harbor_check()
		.environment_builds()
	)
@func()
async example(): Promise<void> {
	return dag
		.harborCheck()
		.environmentBuilds()
}

proofHash() 🔗

Deterministic proof-freshness content hash of the source task tree (generated/cache files excluded). Returns a sha256:-prefixed digest that is byte-identical to the digest harbor sync writes into a dataset manifest (reconciled to Harbor’s native Packager.compute_content_hash, with a marked local fallback).
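
A minimal sketch of how a deterministic tree hash can be built, for illustration only. The hashing scheme (sorted relative paths, each followed by the file bytes) and the exclusion list are assumptions; Harbor's native Packager.compute_content_hash may differ in both, so this sketch will not reproduce its digests.

```python
import hashlib
from pathlib import Path

# Hypothetical exclusion list; the module's real set of generated/cache
# files is not documented on this page.
EXCLUDED_DIRS = {"__pycache__", ".git"}


def tree_hash(root: Path) -> str:
    """Hash every non-excluded file under root in sorted path order."""
    digest = hashlib.sha256()
    files = sorted(
        (
            p
            for p in root.rglob("*")
            if p.is_file()
            and not EXCLUDED_DIRS.intersection(p.relative_to(root).parts)
        ),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        # Feed the relative path and the contents, each NUL-terminated,
        # so renames and edits both change the digest.
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return "sha256:" + digest.hexdigest()
```

Sorting by relative POSIX path keeps the result independent of filesystem iteration order, which is what makes the hash usable as a freshness check.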

Return Type
String !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 proof-hash
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
			HarborCheck().
			ProofHash(ctx)
}
@function
async def example() -> str:
	return await (
		dag.harbor_check()
		.proof_hash()
	)
@func()
async example(): Promise<string> {
	return dag
		.harborCheck()
		.proofHash()
}

proofDigest() 🔗

Content digest of the source task tree as ContentDigest JSON (algorithm, digest, source [“harbor”|“fallback”], files).

Return Type
String !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 proof-digest
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
			HarborCheck().
			ProofDigest(ctx)
}
@function
async def example() -> str:
	return await (
		dag.harbor_check()
		.proof_digest()
	)
@func()
async example(): Promise<string> {
	return dag
		.harborCheck()
		.proofDigest()
}

rewardDetails() 🔗

Parse every reward-details.json under the source into a stable JSON array of criterion observations.
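
As a rough sketch of what "stable JSON array" can mean in practice, the helper below walks a tree, loads each reward-details.json, and serializes the result in sorted order. The wrapper fields path and details are hypothetical; the real criterion-observation schema is not shown on this page.

```python
import json
from pathlib import Path


def collect_reward_details(root: Path) -> str:
    """Return a JSON array with one entry per reward-details.json found."""
    entries = []
    found = sorted(
        root.rglob("reward-details.json"),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in found:
        entries.append(
            {
                # Hypothetical wrapper fields, used only for illustration.
                "path": path.relative_to(root).as_posix(),
                "details": json.loads(path.read_text(encoding="utf-8")),
            }
        )
    # sort_keys and fixed separators keep the output bytes stable.
    return json.dumps(entries, sort_keys=True, separators=(",", ":"))
```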

Return Type
String !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 reward-details
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
			HarborCheck().
			RewardDetails(ctx)
}
@function
async def example() -> str:
	return await (
		dag.harbor_check()
		.reward_details()
	)
@func()
async example(): Promise<string> {
	return dag
		.harborCheck()
		.rewardDetails()
}

solveRates() 🔗

Compute per-criterion solve rates (passed / total) under the source.
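
The passed / total arithmetic can be sketched as follows. The input shape (an iterable of (criterion, passed) pairs) is an assumption made for the example, not the module's actual data model.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def solve_rates(observations: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Return passed / total for each criterion, keyed in sorted order."""
    passed: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for criterion, ok in observations:
        total[criterion] += 1
        if ok:
            passed[criterion] += 1
    return {name: passed[name] / total[name] for name in sorted(total)}


rates = solve_rates([("builds", True), ("builds", False), ("tests", True)])
```

Here builds passes one of two observations and tests passes its only observation, so the rates are 0.5 and 1.0 respectively.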

Return Type
String !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 solve-rates
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
			HarborCheck().
			SolveRates(ctx)
}
@function
async def example() -> str:
	return await (
		dag.harbor_check()
		.solve_rates()
	)
@func()
async example(): Promise<string> {
	return dag
		.harborCheck()
		.solveRates()
}

normalize() 🔗

Normalize a Harbor job directory (relative to the source) into stable JSON.

Return Type
String !
Arguments
Name | Type | Default Value | Description
jobDir | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 normalize
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
			HarborCheck().
			Normalize(ctx)
}
@function
async def example() -> str:
	return await (
		dag.harbor_check()
		.normalize()
	)
@func()
async example(): Promise<string> {
	return dag
		.harborCheck()
		.normalize()
}

manifest() 🔗

Build an artifact manifest (path, size, sha256) of the source as JSON.
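
A small sketch of a (path, size, sha256) manifest builder, assuming one entry per regular file and sorted relative paths. The exact JSON layout the module emits is not shown here, so treat the structure as illustrative.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(root: Path) -> str:
    """List every file under root with its size and SHA-256 digest."""
    entries = []
    files = sorted(
        (p for p in root.rglob("*") if p.is_file()),
        key=lambda p: p.relative_to(root).as_posix(),
    )
    for path in files:
        data = path.read_bytes()
        entries.append(
            {
                "path": path.relative_to(root).as_posix(),
                "size": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            }
        )
    return json.dumps(entries, indent=2, sort_keys=True)
```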

Return Type
String !
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 manifest
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
			HarborCheck().
			Manifest(ctx)
}
@function
async def example() -> str:
	return await (
		dag.harbor_check()
		.manifest()
	)
@func()
async example(): Promise<string> {
	return dag
		.harborCheck()
		.manifest()
}

redact() 🔗

Redaction scan: scrub secret-shaped strings from text.
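
To illustrate the idea of scrubbing secret-shaped strings, here is a regex-based sketch. The two patterns and the placeholder are assumptions chosen for the example; the module's actual pattern set is not documented on this page.

```python
import re

# Illustrative patterns only: an "sk-" style API key and a bearer token.
SECRET_PATTERNS = [
    re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),
    re.compile(r"(?i)(?<=bearer )[A-Za-z0-9._-]{16,}"),
]


def redact(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every match of a secret-shaped pattern with the placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

The word boundary before sk- keeps ordinary words such as "task-" from being treated as the start of a key.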

Return Type
String !
Arguments
Name | Type | Default Value | Description
text | String ! | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 redact --text string
func (m *MyModule) Example(ctx context.Context, text string) (string, error) {
	return dag.
			HarborCheck().
			Redact(ctx, text)
}
@function
async def example(text: str) -> str:
	return await (
		dag.harbor_check()
		.redact(text)
	)
@func()
async example(text: string): Promise<string> {
	return dag
		.harborCheck()
		.redact(text)
}

check() 🔗

Run harbor check against tasks/<slug> from the working directory given by kit, using the rubric given by rubric and the model given by model, and write the report under .dagger-output. Returns the post-run container (pull the output directory from it).

Return Type
Container !
Arguments
Name | Type | Default Value | Description
slug | String ! | - | No description provided
rubric | String ! | - | No description provided
model | String | - | No description provided
kit | String | - | No description provided
outputPath | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 check --slug string --rubric string
func (m *MyModule) Example(slug string, rubric string) *dagger.Container  {
	return dag.
			HarborCheck().
			Check(slug, rubric)
}
@function
def example(slug: str, rubric: str) -> dagger.Container:
	return (
		dag.harbor_check()
		.check(slug, rubric)
	)
@func()
example(slug: string, rubric: string): Container {
	return dag
		.harborCheck()
		.check(slug, rubric)
}

run() 🔗

Generic harbor run. Covers oracle/nop/SKU controls and Kimi difficulty trials — the orchestration decides which agent/model/env/controls to use.

Token-bearing forwarding flags (--verifier-env, --agent-env) are expanded in-shell from the secret env vars set by withSecrets, mirroring how the legacy pipeline forwarded them.
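
The in-shell expansion described above can be sketched like this: the rendered command text carries only a reference to the environment variable, and the shell substitutes the value at run time. The NAME="$NAME" value format for the flag is an assumption, and forwarding_flag is a hypothetical helper, not part of the module.

```python
import re

_ENV_NAME = re.compile(r"[A-Z_][A-Z0-9_]*")


def forwarding_flag(flag: str, var_name: str) -> str:
    """Render a flag whose value is expanded by the shell, not by us."""
    if not _ENV_NAME.fullmatch(var_name):
        raise ValueError(f"not a valid environment variable name: {var_name!r}")
    # Only the variable reference appears in the command text; the secret
    # value itself lives solely in the process environment.
    return f'{flag} {var_name}="${var_name}"'


rendered = forwarding_flag("--agent-env", "OPENROUTER_API_KEY")
```

Because the plaintext never enters the command string, it cannot leak through command logging or cache keys derived from the command text.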

Return Type
Container !
Arguments
Name | Type | Default Value | Description
agent | String ! | - | No description provided
taskPath | String ! | - | No description provided
model | String ! | - | No description provided
jobName | String ! | - | No description provided
env | String | - | No description provided
miniSweConfig | String | - | No description provided
outputDir | String | - | No description provided
reasoningEffort | String | - | Optional reasoning_effort agent kwarg (Kimi trials use “high”).
forwardOpenrouter | Boolean | - | Forward OPENROUTER_API_KEY to the agent (Kimi trials need it).

Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 run --agent string --task-path string --model string --job-name string
func (m *MyModule) Example(agent string, taskPath string, model string, jobName string) *dagger.Container  {
	return dag.
			HarborCheck().
			Run(agent, taskPath, model, jobName)
}
@function
def example(agent: str, task_path: str, model: str, job_name: str) -> dagger.Container:
	return (
		dag.harbor_check()
		.run(agent, task_path, model, job_name)
	)
@func()
example(agent: string, taskPath: string, model: string, jobName: string): Container {
	return dag
		.harborCheck()
		.run(agent, taskPath, model, jobName)
}

runOracle() 🔗

Convenience: harbor run -a oracle (golden control).

Return Type
Container !
Arguments
Name | Type | Default Value | Description
taskPath | String ! | - | No description provided
model | String | - | No description provided
miniSweConfig | String | - | No description provided
env | String | - | No description provided
jobName | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 run-oracle --task-path string
func (m *MyModule) Example(taskPath string) *dagger.Container  {
	return dag.
			HarborCheck().
			RunOracle(taskPath)
}
@function
def example(task_path: str) -> dagger.Container:
	return (
		dag.harbor_check()
		.run_oracle(task_path)
	)
@func()
example(taskPath: string): Container {
	return dag
		.harborCheck()
		.runOracle(taskPath)
}

runNop() 🔗

Convenience: harbor run -a nop (no-op control).

Return Type
Container !
Arguments
Name | Type | Default Value | Description
taskPath | String ! | - | No description provided
model | String | - | No description provided
miniSweConfig | String | - | No description provided
env | String | - | No description provided
jobName | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 run-nop --task-path string
func (m *MyModule) Example(taskPath string) *dagger.Container  {
	return dag.
			HarborCheck().
			RunNop(taskPath)
}
@function
def example(task_path: str) -> dagger.Container:
	return (
		dag.harbor_check()
		.run_nop(task_path)
	)
@func()
example(taskPath: string): Container {
	return dag
		.harborCheck()
		.runNop(taskPath)
}

kimiTrial() 🔗

Convenience: a Kimi difficulty trial via mini-swe-agent with high reasoning effort and OpenRouter forwarding.

Return Type
Container !
Arguments
Name | Type | Default Value | Description
taskPath | String ! | - | No description provided
trial | Integer | - | No description provided
model | String | - | No description provided
miniSweConfig | String | - | No description provided
env | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 kimi-trial --task-path string
func (m *MyModule) Example(taskPath string) *dagger.Container  {
	return dag.
			HarborCheck().
			KimiTrial(taskPath)
}
@function
def example(task_path: str) -> dagger.Container:
	return (
		dag.harbor_check()
		.kimi_trial(task_path)
	)
@func()
example(taskPath: string): Container {
	return dag
		.harborCheck()
		.kimiTrial(taskPath)
}

analyze() 🔗

Run harbor analyze over a jobs directory with the given rubric/model.

Return Type
Container !
Arguments
Name | Type | Default Value | Description
jobsDir | String ! | - | No description provided
rubric | String ! | - | No description provided
model | String | - | No description provided
outputPath | String | - | No description provided
concurrency | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 analyze --jobs-dir string --rubric string
func (m *MyModule) Example(jobsDir string, rubric string) *dagger.Container  {
	return dag.
			HarborCheck().
			Analyze(jobsDir, rubric)
}
@function
def example(jobs_dir: str, rubric: str) -> dagger.Container:
	return (
		dag.harbor_check()
		.analyze(jobs_dir, rubric)
	)
@func()
example(jobsDir: string, rubric: string): Container {
	return dag
		.harborCheck()
		.analyze(jobsDir, rubric)
}

startEnv() 🔗

Build (start) the source task’s environment/ via harbor task start-env, exercising its Dockerfile / docker_image / compose without running an agent. Returns the post-build container (pull logs/artifacts from it).

Return Type
Container !
Arguments
Name | Type | Default Value | Description
taskPath | String ! | - | No description provided
env | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 start-env --task-path string
func (m *MyModule) Example(taskPath string) *dagger.Container  {
	return dag.
			HarborCheck().
			StartEnv(taskPath)
}
@function
def example(task_path: str) -> dagger.Container:
	return (
		dag.harbor_check()
		.start_env(task_path)
	)
@func()
example(taskPath: string): Container {
	return dag
		.harborCheck()
		.startEnv(taskPath)
}

trial() 🔗

Run harbor trial start (a single trial). Mirrors run exactly but invokes the trial subcommand and forwards --trial-name in place of --job-name.

The Python source-of-truth for this exact argv/flag shape is harbor_runner/commands.py (render_harbor_command), which is unit-tested.

Token-bearing forwarding flags (--verifier-env, --agent-env) are expanded in-shell from the secret env vars set by withSecrets; plaintext is never interpolated.

Return Type
Container !
Arguments
Name | Type | Default Value | Description
agent | String ! | - | No description provided
taskPath | String ! | - | No description provided
model | String ! | - | No description provided
trialName | String ! | - | No description provided
miniSweConfig | String | - | No description provided
env | String | - | No description provided
outputDir | String | - | No description provided
reasoningEffort | String | - | Optional reasoning_effort agent kwarg (Kimi trials use “high”).
forwardOpenrouter | Boolean | - | Forward OPENROUTER_API_KEY to the agent (Kimi trials need it).

Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 trial --agent string --task-path string --model string --trial-name string
func (m *MyModule) Example(agent string, taskPath string, model string, trialName string) *dagger.Container  {
	return dag.
			HarborCheck().
			Trial(agent, taskPath, model, trialName)
}
@function
def example(agent: str, task_path: str, model: str, trial_name: str) -> dagger.Container:
	return (
		dag.harbor_check()
		.trial(agent, task_path, model, trial_name)
	)
@func()
example(agent: string, taskPath: string, model: string, trialName: string): Container {
	return dag
		.harborCheck()
		.trial(agent, taskPath, model, trialName)
}

exportTraces() 🔗

Export Harbor agent trajectories under path to a local SFT dataset (parquet by default, or ShareGPT JSON when sharegpt is set), written under outputDir. Returns that directory.

episodes selects which episodes to include (“all” or “last”). Pushing to the Hugging Face Hub (hf://) is opt-in upstream and is intentionally NOT implemented in this primitive — only local parquet/JSON output is produced.

Return Type
Directory !
Arguments
Name | Type | Default Value | Description
path | String ! | - | No description provided
episodes | String | - | No description provided
sharegpt | Boolean | - | No description provided
outputDir | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 export-traces --path string
func (m *MyModule) Example(path string) *dagger.Directory  {
	return dag.
			HarborCheck().
			ExportTraces(path)
}
@function
def example(path: str) -> dagger.Directory:
	return (
		dag.harbor_check()
		.export_traces(path)
	)
@func()
example(path: string): Directory {
	return dag
		.harborCheck()
		.exportTraces(path)
}

download() 🔗

Download a shared Harbor task or dataset (harbor download) and reconcile the fetched bytes against a content digest. allowNetwork defaults to false: with no network/registry-auth egress the function REFUSES without fetching. Pass allowNetwork=true to actually reach the registry; the verified digest reuses the same reconciled native hash as proof-freshness.
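
The refuse-by-default behaviour and the digest check can be sketched as a guard around an injected fetcher. Everything here is hypothetical scaffolding (guarded_download, NetworkRefused, the fetch callable, and a plain SHA-256 over the bytes standing in for the reconciled native hash); it only illustrates the control flow described above.

```python
import hashlib
from typing import Callable


class NetworkRefused(RuntimeError):
    """Raised when a fetch is requested without network permission."""


def guarded_download(
    ref: str,
    fetch: Callable[[str], bytes],
    expected_digest: str,
    allow_network: bool = False,
) -> bytes:
    # Refuse before touching the fetcher, so nothing is fetched by default.
    if not allow_network:
        raise NetworkRefused(f"refusing to fetch {ref!r}: allow_network is false")
    data = fetch(ref)
    actual = "sha256:" + hashlib.sha256(data).hexdigest()
    if actual != expected_digest:
        raise ValueError(f"digest mismatch for {ref!r}: got {actual}")
    return data
```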

Return Type
Container !
Arguments
Name | Type | Default Value | Description
ref | String ! | - | No description provided
outputDir | String | - | No description provided
allowNetwork | Boolean | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 download --ref string
func (m *MyModule) Example(ref string) *dagger.Container  {
	return dag.
			HarborCheck().
			Download(ref)
}
@function
def example(ref: str) -> dagger.Container:
	return (
		dag.harbor_check()
		.download(ref)
	)
@func()
example(ref: string): Container {
	return dag
		.harborCheck()
		.download(ref)
}

screenPool() 🔗

Pool the native reward-details.json grades under gradesDir (relative to the source /work) via screen-pool. Reads every */reward-details.json, folds through screen.pool.pool (errored EXCLUDED, UNWEIGHTED bar, boundary band surfaced, INSUFFICIENT_DATA on < k non-errored) and returns the PoolResult JSON on stdout.

k/threshold/mode are configurable knobs (house-rule 6); the admission verdict is decided inside pool.py, not here (HC5). Int args are stringified via "${n}" string interpolation — dang has no Int.toString, and String! + Int! is a type error; interpolation is the conversion idiom.
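
A sketch of the pooling rules named above, under stated assumptions: "unweighted" is read as a plain mean over non-errored trials, the boundary band width is a made-up default, and the result field names other than pooled_pct, k, and errored_excluded are hypothetical. The real logic lives in pool.py and may differ.

```python
from typing import Dict, List, Optional


def pool(
    grades: List[Dict[str, Optional[float]]],
    k: int,
    threshold: float,
    band: float = 5.0,  # hypothetical boundary-band half-width
) -> dict:
    """Pool per-trial scores (0-100); errored trials are excluded."""
    scores = [g["score"] for g in grades if not g["errored"]]
    errored_excluded = len(grades) - len(scores)
    if len(scores) < k:
        return {
            "status": "INSUFFICIENT_DATA",
            "k": k,
            "errored_excluded": errored_excluded,
        }
    pooled_pct = sum(scores) / len(scores)  # unweighted mean
    return {
        "status": "POOLED",
        "k": k,
        "pooled_pct": pooled_pct,
        "errored_excluded": errored_excluded,
        # Surface results close to the threshold instead of hiding them.
        "in_boundary_band": abs(pooled_pct - threshold) <= band,
    }
```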

Return Type
String !
Arguments
Name | Type | Default Value | Description
gradesDir | String ! | - | No description provided
k | Integer | - | No description provided
threshold | Integer | - | No description provided
mode | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 screen-pool --grades-dir string
func (m *MyModule) Example(ctx context.Context, gradesDir string) (string, error) {
	return dag.
			HarborCheck().
			ScreenPool(ctx, gradesDir)
}
@function
async def example(grades_dir: str) -> str:
	return await (
		dag.harbor_check()
		.screen_pool(grades_dir)
	)
@func()
async example(gradesDir: string): Promise<string> {
	return dag
		.harborCheck()
		.screenPool(gradesDir)
}

screenClassify() 🔗

Classify a finished trial set under trialsDir (relative to /work) via screen-classify: apply the failure taxonomy and env-quality gates and return the disposition/decision JSON on stdout. Gates flag, never accommodate (house-rule 7) — a slow startup surfaces as SKIPPED_SLOW_STARTUP, a slow trial as TRIAL_TOO_SLOW; the measured seconds are a fact, the flag is the verdict (decided in classify.py, not here).
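
The "measured seconds are a fact, the flag is the verdict" split can be sketched as below. The limit values and the result layout are assumptions for the example; only the two flag names come from the description above, and the real gates live in classify.py.

```python
from typing import Optional


def classify(
    startup_seconds: float,
    trial_seconds: Optional[float],
    startup_limit: float = 120.0,  # hypothetical limit
    trial_limit: float = 1800.0,   # hypothetical limit
) -> dict:
    """Report the measurements unchanged and attach flags as the verdict."""
    flags = []
    if startup_seconds > startup_limit:
        flags.append("SKIPPED_SLOW_STARTUP")
    elif trial_seconds is not None and trial_seconds > trial_limit:
        flags.append("TRIAL_TOO_SLOW")
    return {
        "startup_seconds": startup_seconds,
        "trial_seconds": trial_seconds,
        "flags": flags,
    }
```

Note that the function never adjusts the limits to fit a slow trial; it only records what was measured and flags it.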

Return Type
String !
Arguments
NameTypeDefault ValueDescription
trialsDir | String ! | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 screen-classify --trials-dir string
func (m *MyModule) Example(ctx context.Context, trialsDir string) (string, error) {
	return dag.
			HarborCheck().
			ScreenClassify(ctx, trialsDir)
}
@function
async def example(trials_dir: str) -> str:
	return await (
		dag.harbor_check()
		.screen_classify(trials_dir)
	)
@func()
async example(trialsDir: string): Promise<string> {
	return dag
		.harborCheck()
		.screenClassify(trialsDir)
}

screenReport() 🔗

Fold per-task result dicts (tasksJson, a path relative to /work) through screen-report and return the deterministic, sorted report JSON on stdout. Output is byte-identical for identical input (no clock/PID/uuid; the inputs_digest is content-derived in report.py).
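
One way to get byte-identical output for identical input is to sort the inputs, serialize canonically, and derive the digest from that serialization. The sketch below assumes each task dict has a task key and uses a hypothetical report layout; report.py may structure things differently.

```python
import hashlib
import json
from typing import Dict, List


def build_report(tasks: List[Dict]) -> str:
    """Render a report whose bytes depend only on the task contents."""
    ordered = sorted(tasks, key=lambda t: t["task"])
    canonical = json.dumps(ordered, sort_keys=True, separators=(",", ":"))
    # Content-derived digest: no clock, PID, or uuid is involved.
    inputs_digest = "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    report = {"inputs_digest": inputs_digest, "tasks": ordered}
    return json.dumps(report, sort_keys=True, separators=(",", ":"))
```

Feeding the same tasks in a different order yields the same string, because ordering is normalized before anything is hashed or written.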

Return Type
String !
Arguments
Name | Type | Default Value | Description
tasksJson | String ! | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 screen-report --tasks-json string
func (m *MyModule) Example(ctx context.Context, tasksJson string) (string, error) {
	return dag.
			HarborCheck().
			ScreenReport(ctx, tasksJson)
}
@function
async def example(tasks_json: str) -> str:
	return await (
		dag.harbor_check()
		.screen_report(tasks_json)
	)
@func()
async example(tasksJson: string): Promise<string> {
	return dag
		.harborCheck()
		.screenReport(tasksJson)
}

screen() 🔗

Difficulty / admission screener for a single task. Runs the EXISTING trial (mini-swe-agent + the task’s own rewardkit verifier) maxTrials times, collecting each trial’s output directory; then classifies and pools those native reward-details.json grades and folds the result through screen-report, returning a Directory holding report.json + JUnit XML (matching exportTraces, which also returns a Directory).

Screening is difficulty-only and mini-swe-agent only: there is no llm agent path here, and controls=true is rejected (golden/no-op controls stay in runOracle/runNop, not the screener — HC5 boundary). The admission verdict is decided in pool.py from threshold/k; this dang function holds no threshold and embeds no decision. Secrets are forwarded only through the reused trial (withSecrets) — no new secret surface is introduced.

NOTE (lead to confirm the dang loop idiom): the dang module has no observed loop/iteration construct, so the trial fan-out is a FIXED unrolled sequence of trial calls (up to the maxTrials ceiling) collected conditionally — mirroring the existing let x = if (...) {...} idiom. Each attempt past maxTrials is skipped. When the dang loop idiom is confirmed, this unroll collapses to a loop over 1..maxTrials.

Return Type
Directory !
Arguments
Name | Type | Default Value | Description
taskPath | String ! | - | No description provided
model | String | - | No description provided
miniSweConfig | String ! | - | No description provided
k | Integer | - | No description provided
threshold | Integer | - | No description provided
maxTrials | Integer | - | No description provided
env | String | - | No description provided
controls | Boolean | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 screen --task-path string --mini-swe-config string
func (m *MyModule) Example(taskPath string, miniSweConfig string) *dagger.Directory  {
	return dag.
			HarborCheck().
			Screen(taskPath, miniSweConfig)
}
@function
def example(task_path: str, mini_swe_config: str) -> dagger.Directory:
	return (
		dag.harbor_check()
		.screen(task_path, mini_swe_config)
	)
@func()
example(taskPath: string, miniSweConfig: string): Directory {
	return dag
		.harborCheck()
		.screen(taskPath, miniSweConfig)
}

checkTask() 🔗

Run harbor check against task (a task path relative to the source) and emit CheckEvidence JSON ({task, status, exitCode, summaryRef}) on stdout. The check’s exit code is captured (the wrapper itself always exits 0) and folded through check-evidence; status is pass/fail derived from that exit code (the admission verdict is pipeline policy, not encoded here).

HC3: withExec is the only exec surface. HC4: task rides in an env var, never interpolated into the command text.
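
The HC4 rule (values travel in environment variables, never in the command text) and the exit-code capture can be sketched with a stand-in child process. The child here is a hypothetical placeholder for the real harbor check invocation, the TASK variable name is assumed, and summaryRef is omitted because its contents are not described on this page.

```python
import os
import subprocess
import sys

# Stand-in for the real check: succeeds when TASK is non-empty.
# The command text is constant; it never contains the task value.
CHILD = [
    sys.executable,
    "-c",
    "import os, sys; sys.exit(0 if os.environ.get('TASK') else 1)",
]


def check_evidence(task: str) -> dict:
    """Run the child with TASK in its environment and record the exit code."""
    env = dict(os.environ, TASK=task)
    result = subprocess.run(CHILD, env=env)
    return {
        "task": task,
        "status": "pass" if result.returncode == 0 else "fail",
        "exitCode": result.returncode,
    }
```

A task string containing shell metacharacters is handled as plain data here, since it is never parsed as part of a command.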

Return Type
String !
Arguments
Name | Type | Default Value | Description
task | String ! | - | No description provided
rubric | String | - | No description provided
model | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 check-task --task string
func (m *MyModule) Example(ctx context.Context, task string) (string, error) {
	return dag.
			HarborCheck().
			CheckTask(ctx, task)
}
@function
async def example(task: str) -> str:
	return await (
		dag.harbor_check()
		.check_task(task)
	)
@func()
async example(task: string): Promise<string> {
	return dag
		.harborCheck()
		.checkTask(task)
}

runControl() 🔗

Dispatch a control run for task by agent (oracle / nop / any SKU control) and emit ControlEvidence JSON ({task, agent, status, exitCode, reward}). The control run uses the same command builder as run, but captures the Harbor exit code inside the shell that invokes it; the reward is read from the run’s reward.json via the control-evidence emitter (the reward is a FACT; oracle≈1 / nop≈0 being correct is the pipeline’s verdict).

HC3: all execution via withExec. HC4: task/agent ride in env vars.

Return Type
String !
Arguments
Name | Type | Default Value | Description
task | String ! | - | No description provided
agent | String ! | - | No description provided
model | String | - | No description provided
miniSweConfig | String | - | No description provided
env | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 run-control --task string --agent string
func (m *MyModule) Example(ctx context.Context, task string, agent string) (string, error) {
	return dag.
			HarborCheck().
			RunControl(ctx, task, agent)
}
@function
async def example(task: str, agent: str) -> str:
	return await (
		dag.harbor_check()
		.run_control(task, agent)
	)
@func()
async example(task: string, agent: string): Promise<string> {
	return dag
		.harborCheck()
		.runControl(task, agent)
}

runKimiTrials() 🔗

Run Kimi difficulty trials for task (the trials argument sets how many) and emit canonical 0–1 DifficultyEvidence JSON ({task, model, pooled_pct, k, errored_excluded, criteria, meanReward, costUsd, rewardDetailsPresent}; there is NO verdict field).

Orchestrates the EXISTING screen path: fan out the requested number of difficulty trials via screenTrialDir, pool the native grades via screen-pool into pooled.json, then fold through difficulty-evidence (the SOLE 0-100→0-1 converter, HC2). Facts only; the admission verdict is pipeline policy.
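
The 0-100 to 0-1 conversion itself is simple arithmetic; a hedged sketch of a single-purpose converter follows. The function name and the range check are assumptions, and which evidence fields pass through the real difficulty-evidence converter is not specified on this page.

```python
def to_unit_interval(pct: float) -> float:
    """Convert a 0-100 percentage into the canonical 0-1 range."""
    if not 0.0 <= pct <= 100.0:
        raise ValueError(f"percentage out of range: {pct}")
    return pct / 100.0


unit = to_unit_interval(75.0)
```

Keeping the conversion in exactly one place avoids a value being scaled twice, or not at all, as it moves between pooling and evidence emission.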

HC3: all execution via withExec. The trial fan-out is a FIXED unroll up to 10 (the dang loop idiom is not yet confirmed — see the note in screen).

Return Type
String !
Arguments
Name | Type | Default Value | Description
task | String ! | - | No description provided
trials | Integer | - | No description provided
model | String | - | No description provided
miniSweConfig | String | - | No description provided
k | Integer | - | No description provided
threshold | Integer | - | No description provided
env | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 run-kimi-trials --task string
func (m *MyModule) Example(ctx context.Context, task string) string  {
	return dag.
			HarborCheck().
			RunKimiTrials(ctx, task)
}
@function
async def example(task: str) -> str:
	return await (
		dag.harbor_check()
		.run_kimi_trials(task)
	)
@func()
async example(task: string): Promise<string> {
	return dag
		.harborCheck()
		.runKimiTrials(task)
}

analyzeJobs() 🔗

Run harbor analyze over the task’s jobs directory and emit AnalyzeEvidence JSON ({task, status, rewardHackingFindings[]}). The analyze exit code is captured rather than propagated (the wrapper itself always exits 0) and folded through analyze-evidence, which harvests any reward-hacking findings from the written analyze report. Surfaced findings are facts, never an auto-verdict.

HC3: withExec is the only exec surface. HC4: task/jobsDir ride in env vars.

Return Type
String !
Arguments
Name | Type | Default Value | Description
task | String ! | - | No description provided
jobsDir | String | - | No description provided
rubric | String | - | No description provided
model | String | - | No description provided
concurrency | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 analyze-jobs --task string
func (m *MyModule) Example(ctx context.Context, task string) string  {
	return dag.
			HarborCheck().
			AnalyzeJobs(ctx, task)
}
@function
async def example(task: str) -> str:
	return await (
		dag.harbor_check()
		.analyze_jobs(task)
	)
@func()
async example(task: string): Promise<string> {
	return dag
		.harborCheck()
		.analyzeJobs(task)
}

validateJobs() 🔗

Inspect the task’s jobs/ subtree and emit JobsEvidence JSON ({task, present, finalJobs[], staleRetries[]}). Pure filesystem inspection in the container via jobs-evidence — no Harbor invocation, no secrets, no mutation (incomplete trials are surfaced as staleRetries, never quarantined).

HC3: withExec is the only exec surface; jobs_evidence.py is pure pathlib+json.

Return Type
String !
Arguments
Name | Type | Default Value | Description
task | String ! | - | No description provided
taskRoot | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 validate-jobs --task string
func (m *MyModule) Example(ctx context.Context, task string) string  {
	return dag.
			HarborCheck().
			ValidateJobs(ctx, task)
}
@function
async def example(task: str) -> str:
	return await (
		dag.harbor_check()
		.validate_jobs(task)
	)
@func()
async example(task: string): Promise<string> {
	return dag
		.harborCheck()
		.validateJobs(task)
}

platformCompat() 🔗

Check whether the task’s environment/ Dockerfile builds on the given OCI platform. Uses harbor_runner.platform.buildx.buildx_argv (pure, HC3-clean) to produce the docker buildx build --platform argv, runs it via withExec (the toolchain’s only exec surface), and parses the result through platform-parse into PlatformEvidence JSON (String!).

HC3: withExec is the only exec surface; the buildx argv builder and parse modules contain no host-exec calls.
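
A pure argv builder of the kind described above might look like the following sketch. The function name, the default platform, and the validation rule are assumptions; the real buildx_argv may take different parameters and emit additional flags. The point is only that the builder returns a list and executes nothing.

```python
from typing import List


def buildx_argv_sketch(context_dir: str, platform: str = "linux/amd64") -> List[str]:
    """Return the argv for a single-platform buildx build; nothing is executed."""
    # Reject empty or whitespace-containing platform strings up front.
    if not platform or any(ch.isspace() for ch in platform):
        raise ValueError(f"invalid platform: {platform!r}")
    return ["docker", "buildx", "build", "--platform", platform, context_dir]
```

Because the result is a plain list, it can be handed to a structured exec call without passing through a shell.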

Return Type
String !
Arguments
Name | Type | Default Value | Description
taskPath | String ! | - | No description provided
platform | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 platform-compat --task-path string
func (m *MyModule) Example(ctx context.Context, taskPath string) string  {
	return dag.
			HarborCheck().
			PlatformCompat(ctx, taskPath)
}
@function
async def example(task_path: str) -> str:
	return await (
		dag.harbor_check()
		.platform_compat(task_path)
	)
@func()
async example(taskPath: string): Promise<string> {
	return dag
		.harborCheck()
		.platformCompat(taskPath)
}

executorRun() 🔗

Run a single Harbor trial via the local Docker or Modal executor and emit ExecutorEvidence JSON ({task, executor, status}) in EVERY case (A6 fix).

For executor=local: actually run the trial via the existing trial path (mini-swe-agent + the task’s verifier), capture its exit code with a post-exec probe (sh -c writing $? then exiting 0 so withExec does not fail), and shape the evidence from that code through executor-run.

For executor=modal: run the same Harbor trial path with -e modal, then pass the captured exit code to executor-run --executor modal --exit-code ... so the evidence contract stays identical. Modal token creds are forwarded as container env vars via withSecretVariable (never on argv).

Secrets are forwarded via the existing withSecrets pattern plus the optional Modal token pair — no plaintext on argv (HC4). HC3: withExec is the only exec surface; every reached module is host-exec-free.
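
The post-exec probe can be illustrated as below: the wrapped command runs, its exit code is written to a file, and the shell itself exits 0 so the surrounding exec step does not fail. The helper names and the status labels are hypothetical; the actual script used by the module is not shown in this page.

```python
import json
import shlex
from typing import List


def probe_script(argv: List[str], exit_file: str) -> str:
    """Build an `sh -c` script that records argv's exit code, then exits 0."""
    cmd = " ".join(shlex.quote(arg) for arg in argv)
    return f"{cmd}; echo $? > {shlex.quote(exit_file)}; exit 0"


def executor_evidence(task: str, executor: str, exit_code: int) -> str:
    """Shape {task, executor, status} from the recorded exit code."""
    # Placeholder status labels, not the module's vocabulary.
    status = "completed" if exit_code == 0 else "failed"
    return json.dumps({"task": task, "executor": executor, "status": status})
```

Evidence is produced for every exit code, which matches the intent that a record is emitted in every case.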

Return Type
String !
Arguments
Name | Type | Default Value | Description
taskPath | String ! | - | No description provided
executor | String | - | No description provided
platform | String | - | No description provided
jobName | String | - | No description provided
model | String | - | No description provided
miniSweConfig | String | - | No description provided
env | String | - | No description provided
modalTokenId | Secret | - | No description provided
modalTokenSecret | Secret | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 executor-run --task-path string
func (m *MyModule) Example(ctx context.Context, taskPath string) string  {
	return dag.
			HarborCheck().
			ExecutorRun(ctx, taskPath)
}
@function
async def example(task_path: str) -> str:
	return await (
		dag.harbor_check()
		.executor_run(task_path)
	)
@func()
async example(taskPath: string): Promise<string> {
	return dag
		.harborCheck()
		.executorRun(taskPath)
}

selfReviewReport() 🔗

Emit a SelfReviewReport JSON fact and optionally write it to outputPath.

requiredLanesJson is a JSON string array. checksJson is a JSON array of {id,state,evidence} objects. The wrapper passes both as scalar argv values so the Dagger surface stays typed and the Python emitter owns report shaping.
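
A caller could prepare the two JSON arguments roughly as follows. The helper name and any lane names or state values a caller passes are illustrative assumptions; only the shapes (a JSON string array, and a JSON array of {id, state, evidence} objects) come from the description above.

```python
import json
from typing import List, Tuple

REQUIRED_CHECK_KEYS = {"id", "state", "evidence"}


def encode_review_args(lanes: List[str], checks: List[dict]) -> Tuple[str, str]:
    """Return (requiredLanesJson, checksJson) as scalar strings."""
    for check in checks:
        missing = REQUIRED_CHECK_KEYS - check.keys()
        if missing:
            raise ValueError(f"check is missing keys: {sorted(missing)}")
    return json.dumps(lanes), json.dumps(checks)
```

The two returned strings are what would be supplied as requiredLanesJson and checksJson.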

Return Type
String !
Arguments
Name | Type | Default Value | Description
task | String ! | - | No description provided
taskType | String ! | - | No description provided
sourceKind | String ! | - | No description provided
reviewAgent | String ! | - | No description provided
reportPath | String ! | - | No description provided
reviewState | String | - | No description provided
requiredLanesJson | String | - | No description provided
checksJson | String | - | No description provided
outputPath | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 self-review-report --task string --task-type string --source-kind string --review-agent string --report-path string
func (m *MyModule) Example(ctx context.Context, task string, taskType string, sourceKind string, reviewAgent string, reportPath string) string  {
	return dag.
			HarborCheck().
			SelfReviewReport(ctx, task, taskType, sourceKind, reviewAgent, reportPath)
}
@function
async def example(task: str, task_type: str, source_kind: str, review_agent: str, report_path: str) -> str:
	return await (
		dag.harbor_check()
		.self_review_report(task, task_type, source_kind, review_agent, report_path)
	)
@func()
async example(task: string, taskType: string, sourceKind: string, reviewAgent: string, reportPath: string): Promise<string> {
	return dag
		.harborCheck()
		.selfReviewReport(task, taskType, sourceKind, reviewAgent, reportPath)
}

runSweepTrials() 🔗

Fan out the requested number of trials (trials) per (agent, model) cell across the FULL matrix and emit SweepDifficultyEvidence JSON ({task, cells[{agent, model, pooled_pct, k, errored_excluded}]}). Facts only, with no verdict field (HC5: the dang layer holds no thresholds and embeds no decision; pooled_pct is a 0–1 fraction, HC2).

The matrix is built by sweep-matrix (A7 fix): agents/models are passed as base64-encoded JSON via env vars, so a model string containing commas/brackets can never corrupt the argv (no broken ["a,b"] string-concat). Each cell runs its OWN trials into cell-<i>/trial-<n> (A8 fix: one row per matrix cell, not a single hardcoded cell), with the per-cell trial fan-out unrolled to 10 (A9 fix: cap raised from 4).

The matrix fan-out is a FIXED unroll over a bounded 2×2 (agent, model) grid (the dang loop idiom is not confirmed — see the note in screen). Cells beyond the first are guarded by agentCount/modelCount, which default to the 1×1 default matrix so the pipeline’s source+task call runs exactly the default cell; a caller widening agents/models MUST pass matching counts and MUST provide lists whose lengths are at least agentCount / modelCount. The current dang surface does not expose a list-length primitive, so this is an explicit caller contract: agentCount >= 2 authorizes use of agents[1]!, and modelCount >= 2 authorizes use of models[1]!. The Python sweep-matrix step then cross-checks the encoded list source against the expected indexed values before emitting evidence. sweep-collect reports any matrix slot whose trials were not run as INSUFFICIENT_DATA / pooled_pct=null — a fact, never a silent drop.

HC3: all execution is via withExec; the sweep/matrix/collect modules are pure.
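
The base64-encoded JSON transport described above is easy to demonstrate. In this sketch the function names and the agent-major cell ordering are assumptions; the part grounded in the text is that a model string containing commas or brackets survives the round trip unchanged because it never touches argv as raw text.

```python
import base64
import itertools
import json
from typing import List


def encode_list(values: List[str]) -> str:
    """JSON-encode a list, then base64 it so it is safe in an env var."""
    return base64.b64encode(json.dumps(values).encode("utf-8")).decode("ascii")


def decode_list(encoded: str) -> List[str]:
    """Inverse of encode_list."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


def build_matrix(agents_b64: str, models_b64: str) -> List[dict]:
    """One row per (agent, model) cell; the ordering is an assumption."""
    agents, models = decode_list(agents_b64), decode_list(models_b64)
    return [
        {"cell": i, "agent": agent, "model": model}
        for i, (agent, model) in enumerate(itertools.product(agents, models))
    ]
```

A value such as "vendor/model,with[brackets]" decodes to exactly the same string, which is the property a comma-joined argv cannot guarantee.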

Return Type
String !
Arguments
Name | Type | Default Value | Description
taskPath | String ! | - | No description provided
agents | [String ! ] | - | No description provided
models | [String ! ] | - | No description provided
agentsB64 | String | - | No description provided
modelsB64 | String | - | No description provided
agentCount | Integer | - | No description provided
modelCount | Integer | - | No description provided
trials | Integer | - | No description provided
miniSweConfig | String | - | No description provided
env | String | - | No description provided
k | Integer | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 run-sweep-trials --task-path string
func (m *MyModule) Example(ctx context.Context, taskPath string) string  {
	return dag.
			HarborCheck().
			RunSweepTrials(ctx, taskPath)
}
@function
async def example(task_path: str) -> str:
	return await (
		dag.harbor_check()
		.run_sweep_trials(task_path)
	)
@func()
async example(taskPath: string): Promise<string> {
	return dag
		.harborCheck()
		.runSweepTrials(taskPath)
}

screenRetry() 🔗

Retry-resume: remove completed trial dirs whose result.json carries an exception_info.exception_type matching any entry in errorTypes (comma-separated), via harbor_runner.screen.retry_resume.select_trials_to_retry. Returns a pruned Directory! of the trials dir (the removed dirs are gone; surviving trials remain). Secrets are NOT needed: this is a pure filesystem operation inside the container.

HC3: retry_resume.py is pure pathlib+json; withExec is the only exec surface.
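
The selection rule can be sketched with pathlib and json alone. This is not the module's select_trials_to_retry; the function names and the decision to leave unreadable trials untouched are assumptions made for illustration.

```python
import json
import shutil
from pathlib import Path
from typing import List


def trials_matching_errors(trials_dir: Path, error_types: str) -> List[Path]:
    """Find trial dirs whose result.json names one of the given exception types."""
    wanted = {name.strip() for name in error_types.split(",") if name.strip()}
    selected = []
    for trial in sorted(p for p in trials_dir.iterdir() if p.is_dir()):
        try:
            data = json.loads((trial / "result.json").read_text())
        except (OSError, ValueError):
            continue  # missing or invalid result.json: not handled here
        info = data.get("exception_info") if isinstance(data, dict) else None
        if isinstance(info, dict) and info.get("exception_type") in wanted:
            selected.append(trial)
    return selected


def prune_failed_trials(trials_dir: Path, error_types: str) -> List[str]:
    """Remove the selected trial dirs and return their names."""
    removed = trials_matching_errors(trials_dir, error_types)
    for trial in removed:
        shutil.rmtree(trial)
    return [trial.name for trial in removed]
```

Trials that completed without a matching exception, and trials with no readable result, are left in place.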

Return Type
Directory !
Arguments
Name | Type | Default Value | Description
trialsDir | String ! | - | No description provided
errorTypes | String | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 screen-retry --trials-dir string
func (m *MyModule) Example(trialsDir string) *dagger.Directory  {
	return dag.
			HarborCheck().
			ScreenRetry(trialsDir)
}
@function
def example(trials_dir: str) -> dagger.Directory:
	return (
		dag.harbor_check()
		.screen_retry(trials_dir)
	)
@func()
example(trialsDir: string): Directory {
	return dag
		.harborCheck()
		.screenRetry(trialsDir)
}

screenCleanup() 🔗

Cleanup: quarantine trial dirs that are incomplete (missing/invalid config.json or result.json), via harbor_runner.screen.retry_resume.clean_incomplete_trials. Returns the pruned Directory! of the trials dir. Pure filesystem operation — no secrets needed.

HC3: retry_resume.py is pure pathlib+json; withExec is the only exec surface.
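
As an illustration of the completeness rule, the sketch below treats a trial as incomplete when config.json or result.json is missing or does not parse, and moves it aside. It is not the module's clean_incomplete_trials; the function names and the quarantine location (a directory outside the trials dir) are assumptions.

```python
import json
import shutil
from pathlib import Path
from typing import List


def is_complete(trial: Path) -> bool:
    """A trial is complete when both JSON files exist and parse."""
    for name in ("config.json", "result.json"):
        try:
            json.loads((trial / name).read_text())
        except (OSError, ValueError):
            return False
    return True


def quarantine_incomplete(trials_dir: Path, quarantine_dir: Path) -> List[str]:
    """Move incomplete trial dirs into quarantine_dir and return their names."""
    # Assumption: quarantine_dir lives outside trials_dir.
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for trial in sorted(p for p in trials_dir.iterdir() if p.is_dir()):
        if not is_complete(trial):
            shutil.move(str(trial), str(quarantine_dir / trial.name))
            moved.append(trial.name)
    return moved
```

Moving rather than deleting keeps the incomplete trials available for later inspection.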

Return Type
Directory !
Arguments
Name | Type | Default Value | Description
trialsDir | String ! | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 screen-cleanup --trials-dir string
func (m *MyModule) Example(trialsDir string) *dagger.Directory  {
	return dag.
			HarborCheck().
			ScreenCleanup(trialsDir)
}
@function
def example(trials_dir: str) -> dagger.Directory:
	return (
		dag.harbor_check()
		.screen_cleanup(trials_dir)
	)
@func()
example(trialsDir: string): Directory {
	return dag
		.harborCheck()
		.screenCleanup(trialsDir)
}

syncTrials() 🔗

Upsert a Harbor job directory’s job + trial rows to Supabase via harbor_runner.sync.trials.sync_trials. No-op (graceful skip) when supabaseUrl/supabaseKey are not supplied — the call always succeeds, it just writes nothing. Returns {written, skipped} JSON (String!).

HC4: the Supabase URL/key are attached as Dagger secrets via withSecretVariable and read by the CLI from os.environ — they NEVER appear on argv (A11 fix). The argv is a STRUCTURED withExec (no sh -c, no command-substitution that would echo the plaintext URL/key into the final command); the only inputs on the command line are the non-secret --job-dir.

HC3: sync/trials.py contains no host-exec calls; withExec is the sole exec surface.
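
The graceful-skip behaviour can be sketched as follows. The environment variable names, the injected upsert callable, and the meaning of the skipped count (rows not written) are all assumptions; no real Supabase client API is used or implied. What the sketch keeps from the text is that credentials come from the environment, never from argv, and that missing credentials produce a successful no-op.

```python
import json
from typing import Callable, Iterable, Mapping


def sync_rows(
    rows: Iterable[dict],
    env: Mapping[str, str],
    upsert: Callable[[str, str, dict], None],
) -> str:
    """Upsert rows when credentials are present; otherwise skip without failing."""
    pending = list(rows)
    url = env.get("SUPABASE_URL")  # hypothetical variable name
    key = env.get("SUPABASE_KEY")  # hypothetical variable name
    if not url or not key:
        # Graceful skip: the call succeeds and writes nothing.
        return json.dumps({"written": 0, "skipped": len(pending)})
    for row in pending:
        upsert(url, key, row)
    return json.dumps({"written": len(pending), "skipped": 0})
```

In real use the env argument would be os.environ, populated by the secret variables attached to the container.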

Return Type
String !
Arguments
Name | Type | Default Value | Description
jobDir | String ! | - | No description provided
supabaseUrl | Secret | - | No description provided
supabaseKey | Secret | - | No description provided
Example
dagger -m github.com/Kurry/harbor-check@34fb3409d0c79a92daf5345daddeb30f16737fe0 call \
 sync-trials --job-dir string
func (m *MyModule) Example(ctx context.Context, jobDir string) string  {
	return dag.
			HarborCheck().
			SyncTrials(ctx, jobDir)
}
@function
async def example(job_dir: str) -> str:
	return await (
		dag.harbor_check()
		.sync_trials(job_dir)
	)
@func()
async example(jobDir: string): Promise<string> {
	return dag
		.harborCheck()
		.syncTrials(jobDir)
}