# Evalite

This guide shows how to integrate [Scorable](https://scorable.ai) LLM-as-a-Judge evaluators into your [Evalite](https://evalite.dev) test suites.

## Installation

```bash
npm install @root-signals/scorable evalite
```

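## Setup

Import both libraries and create a Scorable client. The example reads the API key from the `SCORABLE_API_KEY` environment variable:
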
```typescript
import { evalite, createScorer } from "evalite";
import { Scorable } from "@root-signals/scorable";

// Initialize Scorable Client
const scorable = new Scorable({
  apiKey: process.env.SCORABLE_API_KEY,
});
```
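If `SCORABLE_API_KEY` is unset, the client above is constructed with `apiKey: undefined`, and the problem would likely surface only when the first judge call fails. A minimal sketch of a fail-fast check follows; `requireEnv` is a hypothetical helper, not part of Scorable or Evalite:

```typescript
// Hypothetical helper (not part of either library): return the named
// variable from the given environment object, or throw if it is missing
// or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

With this in place, the client could be created with `apiKey: requireEnv("SCORABLE_API_KEY", process.env)`.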

## Creating a Scorable Scorer

Define a scorer factory that you can reuse across test suites. It runs the named judge on each input/output pair and reports the average of the judge's evaluator scores:

```typescript
export const createScorableScorer = (judgeName: string) => {
  return createScorer<string, string>({
    name: "Scorable Judge",
    description: `Evaluates output using Scorable Judge: ${judgeName}`,
    scorer: async ({ input, output }) => {
      try {
        const result = await scorable.judges.executeByName(judgeName, {
          request: input,
          response: output,
          // Replace the "<git-hash>" placeholder with your own value, e.g. the current commit hash
          tags: ["test", "<git-hash>"],
        });

        // Alternatively, call an evaluator directly
        // const result = await scorable.evaluators.executeByName("Accuracy", {
        //   request: input,
        //   response: output,
        //   tags: ["test", "<git-hash>"]
        // });

        // Average the scores of all evaluator results (0 if there are none)
        // and collect their justifications as the rationale
        const scores = result.evaluator_results.map((r) => r.score);
        return {
          score: scores.length > 0
            ? scores.reduce((a, b) => a + b, 0) / scores.length
            : 0,
          metadata: {
            rationale: result.evaluator_results
              .map((r) => r.justification)
              .join("\n"),
          },
        };
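      } catch (error) {
        // Log the failure and fall back to a score of 0 instead of throwing.
        // Note that a failed call is then indistinguishable from a genuine 0.
        console.error("Scorable evaluation failed:", error);
        return 0;
      }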
    },
  });
};
```
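The averaging step inside the scorer can also be pulled out into a pure function, which makes it easy to unit test without calling the Scorable API. The sketch below is an illustration under assumptions: `EvaluatorResult` is a hypothetical type that mirrors only the two fields the scorer reads, and treating `score` and `justification` as possibly missing is a defensive choice rather than documented SDK behavior.

```typescript
// Hypothetical shape: only the two fields the scorer above reads.
// Nullable fields are a defensive assumption, not the SDK's declared types.
interface EvaluatorResult {
  score?: number | null;
  justification?: string | null;
}

interface AggregatedScore {
  score: number;
  metadata: { rationale: string };
}

// Average all finite numeric scores (0 if there are none) and join the
// non-empty justifications with newlines.
function aggregateEvaluatorResults(
  results: EvaluatorResult[],
): AggregatedScore {
  const scores = results
    .map((r) => r.score)
    .filter((s): s is number => typeof s === "number" && Number.isFinite(s));

  const score =
    scores.length > 0
      ? scores.reduce((a, b) => a + b, 0) / scores.length
      : 0;

  const rationale = results
    .map((r) => r.justification)
    .filter((j): j is string => typeof j === "string" && j.length > 0)
    .join("\n");

  return { score, metadata: { rationale } };
}
```

Inside the scorer, the `try` block could then end with `return aggregateEvaluatorResults(result.evaluator_results);`, assuming the SDK's result type is compatible with the shape above.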

## Using in Evalite Test Suites

Pass the scorer to the `scorers` array of an `evalite` suite, naming the judge to run:

```typescript
evalite("AI Assistant Multi-Task Evaluation", {
  data: async () => [
    {
      input: "Archive my last 3 newsletters and let me know when done.",
    },
    {
      input: "Create a label called 'Receipts' and apply it to my latest Amazon email.",
    },
    {
      input: "Summarize the thread from 'Travel Booking' about my flight.",
    },
  ],
  task: async (input) => {
    // myAiWorkflow is a placeholder for your own LLM call
    const response = await myAiWorkflow(input);
    return response;
  },
  scorers: [
    createScorableScorer("Gmail Assistant Response Auditor")
  ],
});
```
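The suite above calls `myAiWorkflow`, which stands in for your own application code. To try the wiring before connecting a real model, a stand-in like the following could be used; the function body is purely illustrative and performs no LLM call:

```typescript
// Hypothetical stand-in for the real workflow: echoes the request so the
// suite can run end to end. Replace the body with your model or agent call.
async function myAiWorkflow(input: string): Promise<string> {
  return `Received request: ${input}`;
}
```

Because the stand-in only echoes the request, a judge will likely score it poorly; it is useful for checking that the suite and scorer are connected, not for meaningful results.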
