evaluator
The Evaluator provides a comprehensive framework for testing AI models against custom evaluations, analyzing failures, and iteratively refining system prompts to improve performance. It supports parallel execution across multiple models, automatic prompt optimization, and detailed reporting with telemetry integration.
Key features:
- Run evaluations across multiple AI models in parallel
- Automatically analyze failures and generate improved system prompts
- Export results to CSV format for further analysis
- Compare evaluation results between different runs
- Integrated with Dagger's telemetry for detailed tracing
More info: https://dagger.io/blog/evals-as-code
Example (RunMyEvals)
// Run the Dagger evals across the major model providers.
func (dev *Examples) Evaluator_RunMyEvals(
  ctx context.Context,
  // Run particular evals, or all evals if unspecified.
  // +optional
  evals []string,
  // Run particular models, or all models if unspecified.
  // +optional
  models []string,
) error {
  myEvaluator := dag.Evaluator().
    WithDocsFile(dev.Source.File("core/llm_docs.md")).
    WithoutDefaultSystemPrompt().
    WithSystemPromptFile(dev.Source.File("core/llm_dagger_prompt.md")).
    WithEvals([]*dagger.EvaluatorEval{
      // FIXME: ideally this list would live closer to where the evals are
      // defined, but it's not possible for a module to return an interface type
      // https://github.com/dagger/dagger/issues/7582
      dag.Evals().Basic().AsEvaluatorEval(),
      dag.Evals().BuildMulti().AsEvaluatorEval(),
      dag.Evals().BuildMultiNoVar().AsEvaluatorEval(),
      dag.Evals().WorkspacePattern().AsEvaluatorEval(),
      dag.Evals().ReadImplicitVars().AsEvaluatorEval(),
      dag.Evals().UndoChanges().AsEvaluatorEval(),
      dag.Evals().CoreAPI().AsEvaluatorEval(),
      dag.Evals().ModuleDependencies().AsEvaluatorEval(),
      dag.Evals().Responses().AsEvaluatorEval(),
    })
  return myEvaluator.
    EvalsAcrossModels(dagger.EvaluatorEvalsAcrossModelsOpts{
      Evals:  evals,
      Models: models,
    }).
    Check(ctx)
}
Installation
dagger install github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc
Entrypoint
Return Type
Evaluator !
Arguments
Name | Type | Default Value | Description |
---|---|---|---|
model | String | - | The AI model name to use for the evaluator agent (e.g., "gpt-4o", "claude-sonnet-4-0"). If not specified, uses the default model configured in the environment. |
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
func (m *MyModule) Example() *dagger.Evaluator {
  return dag.
    Evaluator()
}
@function
def example() -> dagger.Evaluator:
    return (
        dag.evaluator()
    )
@func()
example(): Evaluator {
  return dag
    .evaluator()
}
Types
Evaluator 🔗
docs() 🔗
The documentation that defines expected model behavior and serves as the reference for evaluations.
Return Type
File !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
docs
func (m *MyModule) Example() *dagger.File {
  return dag.
    Evaluator().
    Docs()
}
@function
def example() -> dagger.File:
    return (
        dag.evaluator()
        .docs()
    )
@func()
example(): File {
  return dag
    .evaluator()
    .docs()
}
systemPrompt() 🔗
A system prompt file that will be applied to all evaluations to provide consistent guidance.
Return Type
File !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
system-prompt
func (m *MyModule) Example() *dagger.File {
  return dag.
    Evaluator().
    SystemPrompt()
}
@function
def example() -> dagger.File:
    return (
        dag.evaluator()
        .system_prompt()
    )
@func()
example(): File {
  return dag
    .evaluator()
    .systemPrompt()
}
disableDefaultSystemPrompt() 🔗
Whether to disable Dagger’s built-in default system prompt (usually not recommended).
Return Type
Boolean !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
disable-default-system-prompt
func (m *MyModule) Example(ctx context.Context) bool {
  return dag.
    Evaluator().
    DisableDefaultSystemPrompt(ctx)
}
@function
async def example() -> bool:
    return await (
        dag.evaluator()
        .disable_default_system_prompt()
    )
@func()
async example(): Promise<boolean> {
  return dag
    .evaluator()
    .disableDefaultSystemPrompt()
}
evaluatorModel() 🔗
The AI model to use for the evaluator agent that performs analysis and prompt generation.
Return Type
String !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
evaluator-model
func (m *MyModule) Example(ctx context.Context) string {
  return dag.
    Evaluator().
    EvaluatorModel(ctx)
}
@function
async def example() -> str:
    return await (
        dag.evaluator()
        .evaluator_model()
    )
@func()
async example(): Promise<string> {
  return dag
    .evaluator()
    .evaluatorModel()
}
withSystemPrompt() 🔗
Set a system prompt to be provided to all evaluations.
The system prompt provides foundational instructions and context that will be applied to every evaluation run. This helps ensure consistent behavior across all models and evaluations.
Return Type
Evaluator !
Arguments
Name | Type | Default Value | Description |
---|---|---|---|
prompt | String ! | - | The system prompt text to use for all evaluations. |
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
with-system-prompt --prompt string
func (m *MyModule) Example(prompt string) *dagger.Evaluator {
  return dag.
    Evaluator().
    WithSystemPrompt(prompt)
}
@function
def example(prompt: str) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_system_prompt(prompt)
    )
@func()
example(prompt: string): Evaluator {
  return dag
    .evaluator()
    .withSystemPrompt(prompt)
}
withSystemPromptFile() 🔗
Set a system prompt from a file to be provided to all evaluations.
This allows you to load a system prompt from an external file, which is useful for managing longer prompts or when the prompt content is maintained separately from your code.
Return Type
Evaluator !
Arguments
Name | Type | Default Value | Description |
---|---|---|---|
file | File ! | - | The file containing the system prompt to use for all evaluations. |
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
with-system-prompt-file --file file:path
func (m *MyModule) Example(file *dagger.File) *dagger.Evaluator {
  return dag.
    Evaluator().
    WithSystemPromptFile(file)
}
@function
def example(file: dagger.File) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_system_prompt_file(file)
    )
@func()
example(file: File): Evaluator {
  return dag
    .evaluator()
    .withSystemPromptFile(file)
}
withoutDefaultSystemPrompt() 🔗
Disable Dagger’s built-in system prompt.
You probably don’t need to use this - Dagger’s system prompt provides the fundamentals for how the agent interacts with Dagger objects. This is primarily exposed so that we (Dagger) can iteratively test the default system prompt itself.
Return Type
Evaluator !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
without-default-system-prompt
func (m *MyModule) Example() *dagger.Evaluator {
  return dag.
    Evaluator().
    WithoutDefaultSystemPrompt()
}
@function
def example() -> dagger.Evaluator:
    return (
        dag.evaluator()
        .without_default_system_prompt()
    )
@func()
example(): Evaluator {
  return dag
    .evaluator()
    .withoutDefaultSystemPrompt()
}
withDocs() 🔗
Set the documentation content that the system prompt should enforce.
This documentation serves as the reference material that evaluations will test against. The system prompt should guide the model to follow the principles and patterns defined in this documentation.
Return Type
Evaluator !
Arguments
Name | Type | Default Value | Description |
---|---|---|---|
prompt | String ! | - | The documentation content as a string. |
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
with-docs --prompt string
func (m *MyModule) Example(prompt string) *dagger.Evaluator {
  return dag.
    Evaluator().
    WithDocs(prompt)
}
@function
def example(prompt: str) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_docs(prompt)
    )
@func()
example(prompt: string): Evaluator {
  return dag
    .evaluator()
    .withDocs(prompt)
}
withDocsFile() 🔗
Set the documentation file that the system prompt should enforce.
This allows you to load documentation from an external file. The documentation serves as the reference material for what behavior the evaluations should test, and the system prompt should guide the model to follow these principles.
Return Type
Evaluator !
Arguments
Name | Type | Default Value | Description |
---|---|---|---|
file | File ! | - | The file containing the documentation to reference. |
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
with-docs-file --file file:path
func (m *MyModule) Example(file *dagger.File) *dagger.Evaluator {
  return dag.
    Evaluator().
    WithDocsFile(file)
}
@function
def example(file: dagger.File) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_docs_file(file)
    )
@func()
example(file: File): Evaluator {
  return dag
    .evaluator()
    .withDocsFile(file)
}
withEval() 🔗
WithEval adds a single evaluation to the evaluator.
Return Type
Evaluator !
Arguments
Name | Type | Default Value | Description |
---|---|---|---|
eval | Interface ! | - | The evaluation to add to the list of evals to run. |
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
with-eval
func (m *MyModule) Example(eval *dagger.EvaluatorEval) *dagger.Evaluator {
  return dag.
    Evaluator().
    WithEval(eval)
}
@function
def example(eval: dagger.EvaluatorEval) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_eval(eval)
    )
@func()
example(eval: EvaluatorEval): Evaluator {
  return dag
    .evaluator()
    .withEval(eval)
}
withEvals() 🔗
WithEvals adds multiple evaluations to the evaluator.
Return Type
Evaluator !
Arguments
Name | Type | Default Value | Description |
---|---|---|---|
evals | [Interface ! ] ! | - | The list of evaluations to add to the evaluator. |
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
with-evals
func (m *MyModule) Example(evals []*dagger.EvaluatorEval) *dagger.Evaluator {
  return dag.
    Evaluator().
    WithEvals(evals)
}
@function
def example(evals: List[dagger.EvaluatorEval]) -> dagger.Evaluator:
    return (
        dag.evaluator()
        .with_evals(evals)
    )
@func()
example(evals: EvaluatorEval[]): Evaluator {
  return dag
    .evaluator()
    .withEvals(evals)
}
evalsAcrossModels() 🔗
Run evals across models.
Models run in parallel, and evals run in series, with all attempts in parallel.
Return Type
EvalsAcrossModels !
Arguments
Name | Type | Default Value | Description |
---|---|---|---|
evals | [String ! ] | - | Evals to run. Defaults to all. |
models | [String ! ] | - | Models to run evals across. Defaults to all. |
attempts | Integer | - | Attempts to run each eval. Defaults to a per-provider value. |
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
evals-across-models
func (m *MyModule) Example() *dagger.EvaluatorEvalsAcrossModels {
  return dag.
    Evaluator().
    EvalsAcrossModels()
}
@function
def example() -> dagger.EvaluatorEvalsAcrossModels:
    return (
        dag.evaluator()
        .evals_across_models()
    )
@func()
example(): EvaluatorEvalsAcrossModels {
  return dag
    .evaluator()
    .evalsAcrossModels()
}
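To scope a run, the optional arguments map to the generated options struct used in the RunMyEvals example at the top of this page. A minimal Go sketch (the Attempts field name is an assumption following the same codegen pattern as Evals and Models; the eval and model names are illustrative):
func (m *MyModule) ExampleWithOpts(ctx context.Context) error {
  // Run two specific evals against two models, three attempts each.
  return dag.
    Evaluator().
    EvalsAcrossModels(dagger.EvaluatorEvalsAcrossModelsOpts{
      Evals:    []string{"Basic", "CoreAPI"},
      Models:   []string{"gpt-4o", "claude-sonnet-4-0"},
      Attempts: 3, // assumed field name for the optional attempts argument
    }).
    Check(ctx)
}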
explore() 🔗
Explore evaluations across models to identify patterns and issues.
This function uses an LLM agent to act as a quality assurance engineer, automatically running evaluations across different models and identifying interesting patterns. It focuses on finding evaluations that work on some models but fail on others, helping to identify model-specific weaknesses or strengths.
The agent will avoid re-running evaluations that fail consistently across all models, but will retry evaluations that show partial success to gather more insights.
Return Type
[String ! ] !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
explore
func (m *MyModule) Example(ctx context.Context) []string {
  return dag.
    Evaluator().
    Explore(ctx)
}
@function
async def example() -> List[str]:
    return await (
        dag.evaluator()
        .explore()
    )
@func()
async example(): Promise<string[]> {
  return dag
    .evaluator()
    .explore()
}
generateSystemPrompt() 🔗
Generate a new system prompt based on the provided documentation.
This function uses an LLM to analyze the documentation and generate a system prompt that captures the key rules and principles. The process involves first interpreting the documentation to extract all inferable rules, then crafting a focused system prompt that provides proper framing without being overly verbose or turning into meaningless word salad.
The generated prompt aims to establish foundation and context while allowing the model flexibility to apply the guidelines appropriately.
Return Type
String !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
generate-system-prompt
func (m *MyModule) Example(ctx context.Context) string {
  return dag.
    Evaluator().
    GenerateSystemPrompt(ctx)
}
@function
async def example() -> str:
    return await (
        dag.evaluator()
        .generate_system_prompt()
    )
@func()
async example(): Promise<string> {
  return dag
    .evaluator()
    .generateSystemPrompt()
}
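The generated prompt is typically fed back into the evaluator for a verification run. A hedged Go sketch of that round trip, assuming the Go SDK's usual (value, error) returns:
func (m *MyModule) ExampleGenerateAndRun(ctx context.Context, docs *dagger.File) error {
  evaluator := dag.Evaluator().WithDocsFile(docs)
  // Let the evaluator agent draft a system prompt from the docs...
  prompt, err := evaluator.GenerateSystemPrompt(ctx)
  if err != nil {
    return err
  }
  // ...then run the evals with that prompt applied.
  return evaluator.
    WithSystemPrompt(prompt).
    EvalsAcrossModels().
    Check(ctx)
}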
iterate() 🔗
Iterate runs all evals across all models in a loop until all of the evals succeed, analyzing the failures and generating a new system prompt to course-correct.
Return Type
String !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
iterate
func (m *MyModule) Example(ctx context.Context) string {
  return dag.
    Evaluator().
    Iterate(ctx)
}
@function
async def example() -> str:
    return await (
        dag.evaluator()
        .iterate()
    )
@func()
async example(): Promise<string> {
  return dag
    .evaluator()
    .iterate()
}
compare() 🔗
Compare two CSV evaluation reports and generate an analysis.
This function takes two CSV files containing evaluation results (typically from different runs or with different system prompts) and generates a detailed comparison report. The comparison includes success rate changes, token usage differences, and trace links for debugging.
The generated report is analyzed by an LLM to provide insights into the differences and their potential causes.
Return Type
String !
Arguments
Name | Type | Default Value | Description |
---|---|---|---|
before | File ! | - | The CSV file containing the baseline evaluation results. |
after | File ! | - | The CSV file containing the new evaluation results to compare against. |
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
compare --before file:path --after file:path
func (m *MyModule) Example(ctx context.Context, before *dagger.File, after *dagger.File) string {
  return dag.
    Evaluator().
    Compare(ctx, before, after)
}
@function
async def example(before: dagger.File, after: dagger.File) -> str:
    return await (
        dag.evaluator()
        .compare(before, after)
    )
@func()
async example(before: File, after: File): Promise<string> {
  return dag
    .evaluator()
    .compare(before, after)
}
EvalsAcrossModels 🔗
EvalsAcrossModels represents the results of running evaluations across multiple models.
traceId() 🔗
Return Type
String !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
evals-across-models \
trace-id
func (m *MyModule) Example(ctx context.Context) string {
  return dag.
    Evaluator().
    EvalsAcrossModels().
    TraceId(ctx)
}
@function
async def example() -> str:
    return await (
        dag.evaluator()
        .evals_across_models()
        .trace_id()
    )
@func()
async example(): Promise<string> {
  return dag
    .evaluator()
    .evalsAcrossModels()
    .traceId()
}
modelResults() 🔗
Return Type
[ModelResult ! ] !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
evals-across-models \
model-results
func (m *MyModule) Example() []*dagger.EvaluatorModelResult {
  return dag.
    Evaluator().
    EvalsAcrossModels().
    ModelResults()
}
@function
def example() -> List[dagger.EvaluatorModelResult]:
    return (
        dag.evaluator()
        .evals_across_models()
        .model_results()
    )
@func()
example(): EvaluatorModelResult[] {
  return dag
    .evaluator()
    .evalsAcrossModels()
    .modelResults()
}
check() 🔗
Return Type
Void !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
evals-across-models \
check
func (m *MyModule) Example(ctx context.Context) error {
  return dag.
    Evaluator().
    EvalsAcrossModels().
    Check(ctx)
}
@function
async def example() -> None:
    return await (
        dag.evaluator()
        .evals_across_models()
        .check()
    )
@func()
async example(): Promise<void> {
  return dag
    .evaluator()
    .evalsAcrossModels()
    .check()
}
analyzeAndGenerateSystemPrompt() 🔗
AnalyzeAndGenerateSystemPrompt performs comprehensive failure analysis and generates an improved system prompt.
This function implements a sophisticated multi-stage analysis process:
1. Report Generation: Collects all evaluation reports from different models and organizes them for analysis, providing a comprehensive view of successes and failures.
2. Initial Analysis: Generates a summary of current understanding, grading overall results and focusing on failure patterns. Uses specific examples from reports to support the analysis.
3. Cross-Reference Analysis: Compares the analysis against the original documentation and system prompt, suggesting improvements without over-specializing for specific evaluations. Focuses on deeper, systemic issues rather than superficial fixes.
4. Success Pattern Analysis: Compares successful results with failed ones to identify what made the successful cases work. Extracts generalizable principles from the documentation and prompts that led to success.
5. Prompt Generation: Creates a new system prompt incorporating all insights, focusing on incremental improvements rather than complete rewrites unless absolutely necessary.
The process emphasizes finding general, root-cause issues over specific evaluation failures, ensuring that improvements help broadly rather than just fixing individual test cases.
Return Type
String !
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
evals-across-models \
analyze-and-generate-system-prompt
func (m *MyModule) Example(ctx context.Context) string {
  return dag.
    Evaluator().
    EvalsAcrossModels().
    AnalyzeAndGenerateSystemPrompt(ctx)
}
@function
async def example() -> str:
    return await (
        dag.evaluator()
        .evals_across_models()
        .analyze_and_generate_system_prompt()
    )
@func()
async example(): Promise<string> {
  return dag
    .evaluator()
    .evalsAcrossModels()
    .analyzeAndGenerateSystemPrompt()
}
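For a single manual improvement round (iterate() automates this loop), the improved prompt can be applied directly to a fresh run. A hedged Go sketch, assuming the usual (string, error) return:
func (m *MyModule) ExampleOneRound(ctx context.Context) error {
  evaluator := dag.Evaluator()
  // Analyze the failures from one run across models...
  improved, err := evaluator.
    EvalsAcrossModels().
    AnalyzeAndGenerateSystemPrompt(ctx)
  if err != nil {
    return err
  }
  // ...and re-check the evals with the improved prompt applied.
  return evaluator.
    WithSystemPrompt(improved).
    EvalsAcrossModels().
    Check(ctx)
}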
csv() 🔗
CSV exports evaluation results to CSV format for analysis and comparison.
This function generates a CSV representation of all evaluation results across models, including performance metrics, token usage, and trace information for debugging. The CSV includes the following columns:
- model: The name of the AI model tested
- eval: The name of the evaluation that was run
- input_tokens: Number of input tokens used
- output_tokens: Number of output tokens generated
- total_attempts: Total number of evaluation attempts made
- success_rate: Success rate as a decimal (0.0 to 1.0)
- trace_id: Unique identifier for the trace
- model_span_id: Span ID for the model execution
- eval_span_id: Span ID for the specific evaluation
The CSV format makes it easy to import results into spreadsheet applications, databases, or data analysis tools for further processing.
Return Type
String !
Arguments
Name | Type | Default Value | Description |
---|---|---|---|
noHeader | Boolean ! | false | Don't include a header row in the CSV output. |
Example
dagger -m github.com/sipsma/dagger/modules/evaluator@5ada199a8c8ea5348306106c1a8d34557b8d72cc call \
evals-across-models \
csv --no-header boolean
func (m *MyModule) Example(ctx context.Context, noHeader bool) string {
  return dag.
    Evaluator().
    EvalsAcrossModels().
    Csv(ctx, noHeader)
}
@function
async def example(no_header: bool) -> str:
    return await (
        dag.evaluator()
        .evals_across_models()
        .csv(no_header)
    )
@func()
async example(noHeader: boolean): Promise<string> {
  return dag
    .evaluator()
    .evalsAcrossModels()
    .csv(noHeader)
}
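A hedged Go sketch of a before/after workflow that ties csv() to compare(): export a baseline CSV, re-run with a candidate system prompt, and compare the two reports. It assumes the usual (value, error) returns, passes false for noHeader to match the generated example above, and uses illustrative file names.
func (m *MyModule) ExampleCompareRuns(ctx context.Context, docs *dagger.File, newPrompt *dagger.File) (string, error) {
  base := dag.Evaluator().WithDocsFile(docs)
  // Export baseline results to CSV (header row included).
  beforeCSV, err := base.EvalsAcrossModels().Csv(ctx, false)
  if err != nil {
    return "", err
  }
  // Re-run with a candidate system prompt and export again.
  afterCSV, err := base.WithSystemPromptFile(newPrompt).EvalsAcrossModels().Csv(ctx, false)
  if err != nil {
    return "", err
  }
  // Wrap both CSV strings in Files and ask the evaluator to compare them.
  scratch := dag.Directory().
    WithNewFile("before.csv", beforeCSV).
    WithNewFile("after.csv", afterCSV)
  return base.Compare(ctx, scratch.File("before.csv"), scratch.File("after.csv"))
}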
ModelResult 🔗
ModelResult represents the evaluation results for a single model.
modelName() 🔗
Return Type
String !
Example
Function EvaluatorModelResult.modelName is not accessible from the evaluator module
spanId() 🔗
Return Type
String !
Example
Function EvaluatorModelResult.spanId is not accessible from the evaluator module
evalReports() 🔗
Return Type
[EvalResult ! ] !
Example
Function EvaluatorModelResult.evalReports is not accessible from the evaluator module
check() 🔗
Return Type
Void !
Example
Function EvaluatorModelResult.check is not accessible from the evaluator module
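Although these getters are not accessible from inside the evaluator module itself, a consuming module can reach them through evalsAcrossModels() and modelResults(). A hedged Go sketch, assuming the Go SDK's usual context-taking accessors and (value, error) returns:
func (m *MyModule) ExampleModelNames(ctx context.Context) ([]string, error) {
  results, err := dag.Evaluator().EvalsAcrossModels().ModelResults(ctx)
  if err != nil {
    return nil, err
  }
  // Collect the name of every model that was evaluated.
  var names []string
  for _, result := range results {
    name, err := result.ModelName(ctx)
    if err != nil {
      return nil, err
    }
    names = append(names, name)
  }
  return names, nil
}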
EvalResult 🔗
EvalResult represents the results of a single evaluation.
name() 🔗
Return Type
String !
Example
Function EvaluatorEvalResult.name is not accessible from the evaluator module
spanId() 🔗
Return Type
String !
Example
Function EvaluatorEvalResult.spanId is not accessible from the evaluator module
error() 🔗
Return Type
String !
Example
Function EvaluatorEvalResult.error is not accessible from the evaluator module
report() 🔗
Return Type
String !
Example
Function EvaluatorEvalResult.report is not accessible from the evaluator module
successRate() 🔗
Return Type
Float !
Example
Function EvaluatorEvalResult.successRate is not accessible from the evaluator module
totalAttempts() 🔗
Return Type
Integer !
Example
Function EvaluatorEvalResult.totalAttempts is not accessible from the evaluator module
inputTokens() 🔗
Return Type
Integer !
Example
Function EvaluatorEvalResult.inputTokens is not accessible from the evaluator module
outputTokens() 🔗
Return Type
Integer !
Example
Function EvaluatorEvalResult.outputTokens is not accessible from the evaluator module
check() 🔗
Return Type
Void !
Example
Function EvaluatorEvalResult.check is not accessible from the evaluator module
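As with ModelResult, these fields are read from a consuming module rather than from within the evaluator module. A hedged Go sketch that flags every eval which did not pass on all attempts, under the same signature assumptions as the ModelResult sketch above:
func (m *MyModule) ExampleFlagFailures(ctx context.Context) ([]string, error) {
  results, err := dag.Evaluator().EvalsAcrossModels().ModelResults(ctx)
  if err != nil {
    return nil, err
  }
  var flagged []string
  for _, model := range results {
    reports, err := model.EvalReports(ctx)
    if err != nil {
      return nil, err
    }
    for _, report := range reports {
      rate, err := report.SuccessRate(ctx)
      if err != nil {
        return nil, err
      }
      if rate < 1.0 {
        // Record the eval name so its failure can be inspected via the trace IDs.
        name, err := report.Name(ctx)
        if err != nil {
          return nil, err
        }
        flagged = append(flagged, name)
      }
    }
  }
  return flagged, nil
}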