# Why Anything?

## The Challenge: Unreliable Software

In traditional software engineering, we rely on unit tests and deterministic outputs. `assert(2 + 2 == 4)` always passes.

**Generative AI breaks this paradigm.** 💥

LLMs are:

1. **Non-deterministic**: The same input can yield different outputs.
2. **Unstructured**: By default they output free text, not structured data, so there is often no fixed schema to assert against.
3. **Hard to Control**: Their behavior depends on prompts and context, which are often ambiguous.

> **This makes LLM-powered applications inherently unpredictable and hard to test.**
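One way to picture the shift is to replace a single exact-match assertion with a property check that is run over many samples and summarized as a pass rate. The sketch below is illustrative only: `fake_llm` is a hypothetical stand-in that simulates a non-deterministic model with a seeded random choice, and the `sentiment` field and the candidate outputs are assumptions made up for the example.

```python
import json
import random


def fake_llm(prompt: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a model call (not a real API).

    It returns one of several phrasings at random to mimic non-determinism.
    """
    candidates = [
        '{"sentiment": "positive"}',
        '{"sentiment": "positive", "confidence": 0.9}',
        "The sentiment is positive.",  # free text, not valid JSON
    ]
    return rng.choice(candidates)


def is_valid(output: str) -> bool:
    """Property check: the output parses as JSON and carries the expected label."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and data.get("sentiment") == "positive"


def pass_rate(prompt: str, samples: int = 200, seed: int = 0) -> float:
    """Sample the model repeatedly and report the fraction of valid outputs."""
    rng = random.Random(seed)  # seeded so the simulation itself is repeatable
    passed = sum(is_valid(fake_llm(prompt, rng)) for _ in range(samples))
    return passed / samples


rate = pass_rate("Classify: 'I love this product'")
print(f"pass rate: {rate:.2f}")
```

The result is a rate rather than a boolean, so the test question becomes "is the pass rate above an acceptable threshold?" instead of "does the output equal X?".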

## Why not just use public benchmarks?

Public leaderboards (like the Hugging Face Open LLM Leaderboard) measure **generic model capabilities**, not **your application's performance**.

* **Relevance**: Knowing a model is good at high-school math doesn't tell you if it will be polite to your customers.
* **Context**: Benchmarks don't know about your RAG context, your system prompts, or your specific business rules.
* **Data Leakage**: Public benchmark questions often end up in training data, so a high score can reflect memorization rather than real capability.

> **You need to measure and monitor your specific use case. 🎯**

However, finding the exact metrics to measure often takes time and iteration. You might start with generic checks and evolve into highly specific business rules as you learn more about your model's failure modes.

In short,

> **You want to measure and monitor your specific LLM-powered automation, not the generic academic capabilities of an LLM.**
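The progression from generic checks to specific business rules can be sketched as a small set of named evaluators applied to each reply. This is a minimal illustration, not a prescribed implementation: the support-bot scenario, the refund rule, the `TICKET-` identifier format, and all function names are hypothetical.

```python
import re
from typing import Callable, Dict

Check = Callable[[str], bool]


# Generic checks: a reasonable starting point for almost any application.
def not_empty(reply: str) -> bool:
    return bool(reply.strip())


def within_length(reply: str, max_chars: int = 600) -> bool:
    return len(reply) <= max_chars


# Business-specific checks: added once real failure modes are known.
def no_refund_promise(reply: str) -> bool:
    """Hypothetical rule: support replies must not guarantee or promise refunds."""
    pattern = r"\b(?:guarantee|promise)\w*\b.*\brefund"
    return re.search(pattern, reply, re.IGNORECASE) is None


def mentions_ticket_id(reply: str) -> bool:
    """Hypothetical rule: every reply must reference a ticket such as TICKET-1042."""
    return re.search(r"\bTICKET-\d+\b", reply) is not None


CHECKS: Dict[str, Check] = {
    "not_empty": not_empty,
    "within_length": within_length,
    "no_refund_promise": no_refund_promise,
    "mentions_ticket_id": mentions_ticket_id,
}


def evaluate(reply: str) -> Dict[str, bool]:
    """Run every check and return a per-check pass/fail report."""
    return {name: check(reply) for name, check in CHECKS.items()}


good = ("Thanks for reaching out. I've logged this as TICKET-1042 "
        "and a specialist will review your refund request.")
bad = "We guarantee a full refund within 24 hours."

print(evaluate(good))
print(evaluate(bad))
```

Both replies pass the generic checks; only the business-specific rules separate them, which is the kind of signal a public benchmark cannot provide.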


