# Evaluate a multi-turn chatbot conversation

This cookbook shows how to build a chatbot that evaluates conversation quality in real time using Scorable. The example is a cooking assistant that generates replies through the OpenAI Responses API and scores the full conversation after each interaction.

## Setup

Install the required packages:

```bash
pip install openai scorable
```

## Building an Evaluated Chatbot

This chatbot evaluates the quality of its responses using Scorable. It tracks the conversation history and assesses the helpfulness of the conversation after each interaction.

```python
from openai import OpenAI
from scorable import Scorable
from scorable.multiturn import Turn

class EvaluatedChat:
    def __init__(self, model="gpt-5.2", scorable_api_key=None, openai_api_key=None):
        self.system_prompt = (
            "You are a helpful cooking assistant that answers questions about recipes and cooking."
        )
        self.model = model
        self.openai_client = OpenAI(api_key=openai_api_key)
        self.scorable_client = Scorable(api_key=scorable_api_key)
        self.conversation_history = []

    def add_message(self, user_message):
        # Add user message to history
        self.conversation_history.append({"role": "user", "content": user_message})

        # Get response from OpenAI using Responses API
        response = self.openai_client.responses.create(
            model=self.model,
            instructions=self.system_prompt,
            input=self.conversation_history,
        )

        # Extract assistant response
        assistant_message = response.output_text
        self.conversation_history.append({"role": "assistant", "content": assistant_message})

        # Evaluate the conversation
        evaluation = self.evaluate_conversation()

        return {"response": assistant_message, "evaluation": evaluation}

    def evaluate_conversation(self):
        # Convert conversation history to Scorable Turns format
        turns = [Turn(role=m["role"], content=m["content"]) for m in self.conversation_history]

        # Evaluate helpfulness
        result = self.scorable_client.evaluators.Helpfulness(turns=turns)
        return {"score": result.score, "justification": result.justification}
```
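Because `add_message` calls the evaluator on every turn, an exception raised during evaluation would also propagate out of the chat call. The following is a minimal sketch of one way to guard against that. The helper name `safe_evaluate` and the fallback dictionary are illustrative assumptions, not part of the Scorable SDK.

```python
from typing import Any, Callable, Dict


def safe_evaluate(evaluate: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
    """Run an evaluation callable and fall back to an empty result on failure.

    `evaluate` is any zero-argument callable that returns a dict with
    "score" and "justification" keys, such as a bound
    `evaluate_conversation` method.
    """
    try:
        return evaluate()
    except Exception as exc:  # deliberately broad: the chat reply should still be returned
        return {"score": None, "justification": f"Evaluation failed: {exc}"}
```

Inside `add_message`, the call would then read `evaluation = safe_evaluate(self.evaluate_conversation)`. Code that formats the score needs to handle `None` in that case.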

## Example Usage

```python
# Initialize the chatbot
chat = EvaluatedChat(
    # Alternatively, you can use the SCORABLE_API_KEY environment variable
    scorable_api_key="your-scorable-api-key",
    openai_api_key="your-openai-api-key"
)

# First interaction
result = chat.add_message("How do I make a perfect scrambled egg?")
print("Assistant:", result['response'])
print(f"Helpfulness: {result['evaluation']['score']:.2f}")

# Second interaction
result = chat.add_message("What temperature should I use?")
print("Assistant:", result['response'])
print(f"Helpfulness: {result['evaluation']['score']:.2f}")
```
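Since every call to `add_message` returns a score for the conversation so far, the results can be collected and checked for interactions after which quality dropped. The sketch below works on plain dictionaries shaped like the return value of `add_message`. The function name `low_scoring_interactions`, the sample data, and the `0.5` threshold are assumptions chosen for illustration, so pick a threshold that matches the score range you observe.

```python
from typing import Any, Dict, List


def low_scoring_interactions(
    results: List[Dict[str, Any]], threshold: float = 0.5
) -> List[int]:
    """Return 1-based indices of interactions whose score is below `threshold`."""
    flagged = []
    for index, result in enumerate(results, start=1):
        score = result["evaluation"]["score"]
        # Skip interactions that have no score (for example, a failed evaluation)
        if score is not None and score < threshold:
            flagged.append(index)
    return flagged


# Hypothetical results, shaped like the dicts returned by add_message
sample_results = [
    {"response": "...", "evaluation": {"score": 0.82, "justification": "..."}},
    {"response": "...", "evaluation": {"score": 0.41, "justification": "..."}},
]
print(low_scoring_interactions(sample_results))
```

In the example usage above, appending each `result` to a list and passing that list to this function would show which interactions to review.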

### Using Judges for Multiple Evaluators

To run multiple evaluators at once (e.g., helpfulness, clarity, politeness, custom evaluators), use a judge instead:

```python
# Replaces the evaluate_conversation method of the EvaluatedChat class above
def evaluate_conversation(self):
    turns = [Turn(role=m["role"], content=m["content"]) for m in self.conversation_history]

    # Run a judge with multiple evaluators
    result = self.scorable_client.judges.run(
        judge_id="your-judge-id",
        turns=turns
    )

    return {"evaluator_results": result.evaluator_results}
```
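A judge returns one result per evaluator, so it can help to flatten them into a name-to-score mapping for logging or display. The sketch below assumes each item in `evaluator_results` exposes `evaluator_name` and `score` attributes. Those attribute names are an assumption and are not confirmed by this page, so inspect the object returned by your SDK version and adjust them. The stand-in objects are built with `SimpleNamespace` purely for illustration.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable


def summarize_evaluator_results(evaluator_results: Iterable[Any]) -> Dict[str, float]:
    """Map each evaluator's name to its score, skipping incomplete items."""
    summary: Dict[str, float] = {}
    for item in evaluator_results:
        # Attribute names are assumed; getattr avoids an AttributeError if they differ
        name = getattr(item, "evaluator_name", None)
        score = getattr(item, "score", None)
        if name is not None and score is not None:
            summary[name] = score
    return summary


# Hypothetical stand-ins for the items in result.evaluator_results
fake_results = [
    SimpleNamespace(evaluator_name="Helpfulness", score=0.9),
    SimpleNamespace(evaluator_name="Clarity", score=0.7),
]
print(summarize_evaluator_results(fake_results))
```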


