evaluator
No long description provided.
Installation
dagger install github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e
Entrypoint
Return Type
Evaluator !
Arguments
| Name | Type | Default Value | Description |
|---|---|---|---|
| model | String | - | Model to use for the evaluator agent. |
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call

Go
func (m *MyModule) Example() *dagger.Evaluator {
	return dag.
		Evaluator()
}

Python
@function
def example() -> dagger.Evaluator:
    return (
        dag.evaluator()
    )

TypeScript
@func()
example(): Evaluator {
  return dag
    .evaluator()
}

Types
Evaluator
docs()
The documentation for the tool calling scheme to generate a prompt for.
Return Type
File !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  docs

Go
func (m *MyModule) Example() *dagger.File {
	return dag.
		Evaluator().
		Docs()
}

Python
@function
def example() -> dagger.File:
    return (
        dag.evaluator()
        .docs()
    )

TypeScript
@func()
example(): File {
  return dag
    .evaluator()
    .docs()
}

systemPrompt()
A system prompt to apply to all evals.
Return Type
File !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  system-prompt

Go
func (m *MyModule) Example() *dagger.File {
	return dag.
		Evaluator().
		SystemPrompt()
}

Python
@function
def example() -> dagger.File:
    return (
        dag.evaluator()
        .system_prompt()
    )

TypeScript
@func()
example(): File {
  return dag
    .evaluator()
    .systemPrompt()
}

disableDefaultSystemPrompt()
Whether to disable the default system prompt.
Return Type
Boolean !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  disable-default-system-prompt

Go
func (m *MyModule) Example(ctx context.Context) (bool, error) {
	return dag.
		Evaluator().
		DisableDefaultSystemPrompt(ctx)
}

Python
@function
async def example() -> bool:
    return await (
        dag.evaluator()
        .disable_default_system_prompt()
    )

TypeScript
@func()
async example(): Promise<boolean> {
  return dag
    .evaluator()
    .disableDefaultSystemPrompt()
}

evaluatorModel()
Return Type
String !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  evaluator-model

Go
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
		Evaluator().
		EvaluatorModel(ctx)
}

Python
@function
async def example() -> str:
    return await (
        dag.evaluator()
        .evaluator_model()
    )

TypeScript
@func()
async example(): Promise<string> {
  return dag
    .evaluator()
    .evaluatorModel()
}

withSystemPrompt()
Set a system prompt to be provided to the evals.
Return Type
Evaluator !
Arguments
| Name | Type | Default Value | Description |
|---|---|---|---|
| prompt | String ! | - | No description provided |
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  with-system-prompt --prompt string

Go
func (m *MyModule) Example(prompt string) *dagger.Evaluator {
	return dag.
		Evaluator().
		WithSystemPrompt(prompt)
}

Python
@function
def example(prompt: str) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_system_prompt(prompt)
    )

TypeScript
@func()
example(prompt: string): Evaluator {
  return dag
    .evaluator()
    .withSystemPrompt(prompt)
}

withSystemPromptFile()
Set a system prompt, read from a file, to be provided to the evals.
Return Type
Evaluator !
Arguments
| Name | Type | Default Value | Description |
|---|---|---|---|
| file | File ! | - | No description provided |
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  with-system-prompt-file --file file:path

Go
func (m *MyModule) Example(file *dagger.File) *dagger.Evaluator {
	return dag.
		Evaluator().
		WithSystemPromptFile(file)
}

Python
@function
def example(file: dagger.File) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_system_prompt_file(file)
    )

TypeScript
@func()
example(file: File): Evaluator {
  return dag
    .evaluator()
    .withSystemPromptFile(file)
}

withoutDefaultSystemPrompt()
Disable Dagger’s built-in system prompt.
You probably don’t need to use this - Dagger’s system prompt provides the fundamentals for how the agent interacts with Dagger objects. This is primarily exposed so that we (Dagger) can iteratively test the default system prompt itself.
Return Type
Evaluator !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  without-default-system-prompt

Go
func (m *MyModule) Example() *dagger.Evaluator {
	return dag.
		Evaluator().
		WithoutDefaultSystemPrompt()
}

Python
@function
def example() -> dagger.Evaluator:
    return (
        dag.evaluator()
        .without_default_system_prompt()
    )

TypeScript
@func()
example(): Evaluator {
  return dag
    .evaluator()
    .withoutDefaultSystemPrompt()
}

withDocs()
Set the full documentation the system prompt intends to effectuate.
Return Type
Evaluator !
Arguments
| Name | Type | Default Value | Description |
|---|---|---|---|
| prompt | String ! | - | No description provided |
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  with-docs --prompt string

Go
func (m *MyModule) Example(prompt string) *dagger.Evaluator {
	return dag.
		Evaluator().
		WithDocs(prompt)
}

Python
@function
def example(prompt: str) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_docs(prompt)
    )

TypeScript
@func()
example(prompt: string): Evaluator {
  return dag
    .evaluator()
    .withDocs(prompt)
}

withDocsFile()
Set the full documentation the system prompt intends to effectuate, read from a file.
Return Type
Evaluator !
Arguments
| Name | Type | Default Value | Description |
|---|---|---|---|
| file | File ! | - | No description provided |
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  with-docs-file --file file:path

Go
func (m *MyModule) Example(file *dagger.File) *dagger.Evaluator {
	return dag.
		Evaluator().
		WithDocsFile(file)
}

Python
@function
def example(file: dagger.File) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_docs_file(file)
    )

TypeScript
@func()
example(file: File): Evaluator {
  return dag
    .evaluator()
    .withDocsFile(file)
}

withEval()
Return Type
Evaluator !
Arguments
| Name | Type | Default Value | Description |
|---|---|---|---|
| eval | Interface ! | - | No description provided |
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  with-eval

Go
func (m *MyModule) Example(eval ) *dagger.Evaluator {
	return dag.
		Evaluator().
		WithEval(eval)
}

Python
@function
def example(eval: ) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_eval(eval)
    )

TypeScript
@func()
example(eval: ): Evaluator {
  return dag
    .evaluator()
    .withEval(eval)
}

withEvals()
Return Type
Evaluator !
Arguments
| Name | Type | Default Value | Description |
|---|---|---|---|
| evals | [Interface ! ] ! | - | No description provided |
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  with-evals

Go
func (m *MyModule) Example(evals []) *dagger.Evaluator {
	return dag.
		Evaluator().
		WithEvals(evals)
}

Python
@function
def example(evals: List[]) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_evals(evals)
    )

TypeScript
@func()
example(evals: []): Evaluator {
  return dag
    .evaluator()
    .withEvals(evals)
}

evalsAcrossModels()
Run evals across models.
Models run in parallel, and evals run in series, with all attempts in parallel.
Return Type
EvalsAcrossModels !
Arguments
| Name | Type | Default Value | Description |
|---|---|---|---|
| evals | [String ! ] | - | Evals to run. Defaults to all. |
| models | [String ! ] | - | Models to run evals across. Defaults to all. |
| attempts | Integer | - | Attempts to run each eval. Defaults to a per-provider value. |
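The scheduling described for this function (models in parallel, evals in series within each model, all attempts of an eval in parallel) can be sketched with plain asyncio. This is only an illustration of that ordering, not the module's implementation: `run_across_models`, `run_model`, `run_eval`, and the `always_pass` stand-in are hypothetical names, and a real attempt would call an LLM rather than return immediately.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

# Hypothetical callback: runs attempt `index` of `eval_name` against `model`
# and reports whether it passed.
Attempt = Callable[[str, str, int], Awaitable[bool]]


async def run_eval(model: str, eval_name: str, attempts: int, attempt: Attempt) -> List[bool]:
    # All attempts of a single eval are started together.
    return list(await asyncio.gather(*(attempt(model, eval_name, i) for i in range(attempts))))


async def run_model(model: str, evals: List[str], attempts: int, attempt: Attempt) -> Dict[str, List[bool]]:
    results: Dict[str, List[bool]] = {}
    # Evals run in series: the next eval starts only after the previous one finishes.
    for eval_name in evals:
        results[eval_name] = await run_eval(model, eval_name, attempts, attempt)
    return results


async def run_across_models(
    models: List[str], evals: List[str], attempts: int, attempt: Attempt
) -> Dict[str, Dict[str, List[bool]]]:
    # Each model gets its own concurrent task.
    per_model = await asyncio.gather(*(run_model(m, evals, attempts, attempt) for m in models))
    return dict(zip(models, per_model))


async def always_pass(model: str, eval_name: str, index: int) -> bool:
    # Stand-in for a real attempt; yields control once so tasks can interleave.
    await asyncio.sleep(0)
    return True


if __name__ == "__main__":
    print(asyncio.run(run_across_models(["model-a", "model-b"], ["eval-1", "eval-2"], 2, always_pass)))
```

Running evals in series per model keeps the number of in-flight requests per model bounded by `attempts`, which is one plausible reason for this shape.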
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  evals-across-models

Go
func (m *MyModule) Example() *dagger.EvaluatorEvalsAcrossModels {
	return dag.
		Evaluator().
		EvalsAcrossModels()
}

Python
@function
def example() -> dagger.EvaluatorEvalsAcrossModels:
    return (
        dag.evaluator()
        .evals_across_models()
    )

TypeScript
@func()
example(): EvaluatorEvalsAcrossModels {
  return dag
    .evaluator()
    .evalsAcrossModels()
}

explore()
Return Type
[String ! ] !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  explore

Go
func (m *MyModule) Example(ctx context.Context) ([]string, error) {
	return dag.
		Evaluator().
		Explore(ctx)
}

Python
@function
async def example() -> List[str]:
    return await (
        dag.evaluator()
        .explore()
    )

TypeScript
@func()
async example(): Promise<string[]> {
  return dag
    .evaluator()
    .explore()
}

generateSystemPrompt()
Return Type
String !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  generate-system-prompt

Go
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
		Evaluator().
		GenerateSystemPrompt(ctx)
}

Python
@function
async def example() -> str:
    return await (
        dag.evaluator()
        .generate_system_prompt()
    )

TypeScript
@func()
async example(): Promise<string> {
  return dag
    .evaluator()
    .generateSystemPrompt()
}

iterate()
Iterate runs all evals across all models in a loop until all of the evals succeed, analyzing the failures and generating a new system prompt to course-correct.
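The loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the module's code: `run_evals` and `regenerate` are hypothetical callables standing in for the eval run and the failure analysis, failures are modeled as a mapping from an eval key to an error string (empty when the eval passed), and the `max_rounds` cap is an addition of this sketch, since the description only says the loop continues until every eval succeeds.

```python
from typing import Callable, Dict

# Hypothetical shapes: an eval run maps "model/eval" keys to an error string
# (empty string when that eval passed).
RunEvals = Callable[[str], Dict[str, str]]
Regenerate = Callable[[str, Dict[str, str]], str]


def iterate(prompt: str, run_evals: RunEvals, regenerate: Regenerate, max_rounds: int = 10) -> str:
    """Re-run evals, rewriting the system prompt after each failing round."""
    for _ in range(max_rounds):
        failures = {key: err for key, err in run_evals(prompt).items() if err}
        if not failures:
            # Every eval succeeded with this prompt, so it is the result.
            return prompt
        # Analyze the failures and produce a corrected prompt for the next round.
        prompt = regenerate(prompt, failures)
    # Assumption of this sketch: give up after max_rounds instead of looping forever.
    raise RuntimeError(f"evals still failing after {max_rounds} rounds")


def demo_run_evals(prompt: str) -> Dict[str, str]:
    # Toy eval: passes only once the prompt mentions the needed hint.
    return {"model-a/eval-1": "" if "use tools" in prompt else "agent never called a tool"}


def demo_regenerate(prompt: str, failures: Dict[str, str]) -> str:
    # Toy analysis: append a fixed correction regardless of the failure text.
    return prompt + " Always use tools."


if __name__ == "__main__":
    print(iterate("You are an agent.", demo_run_evals, demo_regenerate))
```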
Return Type
String !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  iterate

Go
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
		Evaluator().
		Iterate(ctx)
}

Python
@function
async def example() -> str:
    return await (
        dag.evaluator()
        .iterate()
    )

TypeScript
@func()
async example(): Promise<string> {
  return dag
    .evaluator()
    .iterate()
}

compare()
Return Type
String !
Arguments
| Name | Type | Default Value | Description |
|---|---|---|---|
| before | File ! | - | No description provided |
| after | File ! | - | No description provided |
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  compare --before file:path --after file:path

Go
func (m *MyModule) Example(ctx context.Context, before *dagger.File, after *dagger.File) (string, error) {
	return dag.
		Evaluator().
		Compare(ctx, before, after)
}

Python
@function
async def example(before: dagger.File, after: dagger.File) -> str:
    return await (
        dag.evaluator()
        .compare(before, after)
    )

TypeScript
@func()
async example(before: File, after: File): Promise<string> {
  return dag
    .evaluator()
    .compare(before, after)
}

EvalsAcrossModels
traceId()
Return Type
String !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  evals-across-models \
  trace-id

Go
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
		Evaluator().
		EvalsAcrossModels().
		TraceId(ctx)
}

Python
@function
async def example() -> str:
    return await (
        dag.evaluator()
        .evals_across_models()
        .trace_id()
    )

TypeScript
@func()
async example(): Promise<string> {
  return dag
    .evaluator()
    .evalsAcrossModels()
    .traceId()
}

modelResults()
Return Type
[ModelResult ! ] !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  evals-across-models \
  model-results

Go
func (m *MyModule) Example() []*dagger.EvaluatorModelResult {
	return dag.
		Evaluator().
		EvalsAcrossModels().
		ModelResults()
}

Python
@function
def example() -> List[dagger.EvaluatorModelResult]:
    return (
        dag.evaluator()
        .evals_across_models()
        .model_results()
    )

TypeScript
@func()
example(): EvaluatorModelResult[] {
  return dag
    .evaluator()
    .evalsAcrossModels()
    .modelResults()
}

check()
Return Type
Void !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  evals-across-models \
  check

Go
func (m *MyModule) Example(ctx context.Context) error {
	return dag.
		Evaluator().
		EvalsAcrossModels().
		Check(ctx)
}

Python
@function
async def example() -> None:
    return await (
        dag.evaluator()
        .evals_across_models()
        .check()
    )

TypeScript
@func()
async example(): Promise<void> {
  return dag
    .evaluator()
    .evalsAcrossModels()
    .check()
}

analyzeAndGenerateSystemPrompt()
Return Type
String !
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  evals-across-models \
  analyze-and-generate-system-prompt

Go
func (m *MyModule) Example(ctx context.Context) (string, error) {
	return dag.
		Evaluator().
		EvalsAcrossModels().
		AnalyzeAndGenerateSystemPrompt(ctx)
}

Python
@function
async def example() -> str:
    return await (
        dag.evaluator()
        .evals_across_models()
        .analyze_and_generate_system_prompt()
    )

TypeScript
@func()
async example(): Promise<string> {
  return dag
    .evaluator()
    .evalsAcrossModels()
    .analyzeAndGenerateSystemPrompt()
}

csv()
Return Type
String !
Arguments
| Name | Type | Default Value | Description |
|---|---|---|---|
| noHeader | Boolean ! | false | Don't include a header. |
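To illustrate what the `noHeader` flag changes, here is a small sketch of rendering result rows as CSV with an optional header line. The column layout is an assumption made for this example (it borrows the ModelResult and EvalResult field names listed on this page); the module's actual CSV columns are not documented here, and `to_csv` and `HEADER` are hypothetical names.

```python
import csv
import io
from typing import Iterable, Sequence

# Assumed column layout, for illustration only.
HEADER = ("model", "eval", "success_rate", "total_attempts", "input_tokens", "output_tokens")


def to_csv(rows: Iterable[Sequence[object]], no_header: bool = False) -> str:
    """Render rows as CSV text, skipping the header line when no_header is set."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    if not no_header:
        writer.writerow(HEADER)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    sample = [("model-a", "eval-1", 0.5, 2, 1200, 300)]
    print(to_csv(sample), end="")
    print(to_csv(sample, no_header=True), end="")
```

Dropping the header is mainly useful when appending several runs to one file, where only the first run should contribute the header line.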
Example
Shell
dagger -m github.com/Superoldman96/dagger/modules/evaluator@ece2ead6ebe87962d8f1654724dd385c3ce5a23e call \
  evals-across-models \
  csv --no-header boolean

Go
func (m *MyModule) Example(ctx context.Context, noHeader bool) (string, error) {
	return dag.
		Evaluator().
		EvalsAcrossModels().
		Csv(ctx, noHeader)
}

Python
@function
async def example(no_header: bool) -> str:
    return await (
        dag.evaluator()
        .evals_across_models()
        .csv(no_header)
    )

TypeScript
@func()
async example(noHeader: boolean): Promise<string> {
  return dag
    .evaluator()
    .evalsAcrossModels()
    .csv(noHeader)
}

ModelResult
modelName()
Return Type
String !
Example
Function EvaluatorModelResult.modelName is not accessible from the evaluator module

spanId()
Return Type
String !
Example
Function EvaluatorModelResult.spanId is not accessible from the evaluator module

evalReports()
Return Type
[EvalResult ! ] !
Example
Function EvaluatorModelResult.evalReports is not accessible from the evaluator module

check()
Return Type
Void !
Example
Function EvaluatorModelResult.check is not accessible from the evaluator module

EvalResult
name()
Return Type
String !
Example
Function EvaluatorEvalResult.name is not accessible from the evaluator module

spanId()
Return Type
String !
Example
Function EvaluatorEvalResult.spanId is not accessible from the evaluator module

error()
Return Type
String !
Example
Function EvaluatorEvalResult.error is not accessible from the evaluator module

report()
Return Type
String !
Example
Function EvaluatorEvalResult.report is not accessible from the evaluator module

successRate()
Return Type
Float !
Example
Function EvaluatorEvalResult.successRate is not accessible from the evaluator module

totalAttempts()
Return Type
Integer !
Example
Function EvaluatorEvalResult.totalAttempts is not accessible from the evaluator module

inputTokens()
Return Type
Integer !
Example
Function EvaluatorEvalResult.inputTokens is not accessible from the evaluator module

outputTokens()
Return Type
Integer !
Example
Function EvaluatorEvalResult.outputTokens is not accessible from the evaluator module

check()
Return Type
Void !
Example
Function EvaluatorEvalResult.check is not accessible from the evaluator module