# Frequently Asked Questions

### Terminology

<details>

<summary>What is <em>Intent</em> for?</summary>

Intent is the high-level, human-understandable description of the attribute an Evaluator measures. For example: “To measure how clearly the returns handler explains the 20% discount offer on the next purchase”.

</details>

<details>

<summary>What are <em>Datasets</em>?</summary>

Datasets let you bring your own test data for benchmarking evaluators (both *Root* and *Custom*) and for optimizing *Custom* evaluators.

</details>

### Behaviour

<details>

<summary>Does <em>Intent</em> change the behaviour of the evaluator?</summary>

Yes. An evaluator's *Intent* directly shapes its behaviour.

</details>

<details>

<summary>Does Calibration change the behaviour of the evaluator?</summary>

No. Calibration is for benchmarking (testing) evaluators to check whether they are "calibrated" to your expected behaviour. Calibration samples do not alter the behaviour of the evaluators.

</details>

<details>

<summary>How do <em>Demonstration</em>s work?</summary>

Demonstrations are used as in-context few-shot samples combined with our well-tuned meta-prompt. They are not utilized for supervised fine-tuning (SFT).
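
As a rough illustration, an in-context few-shot judge prompt can be assembled as below. The structure, field names, and wording are invented for this sketch and do not reflect Scorable's actual meta-prompt, which is tuned separately:

```python
# Hypothetical demonstrations: annotated (response, score) pairs.
demonstrations = [
    {"response": "Returns accepted within 30 days; 20% off your next purchase.",
     "score": 0.9},
    {"response": "We might take it back, not sure about discounts.",
     "score": 0.2},
]

def build_judge_prompt(intent, demonstrations, candidate):
    """Assemble a few-shot evaluation prompt (illustrative only)."""
    lines = [f"You are an evaluator. Intent: {intent}", "Examples:"]
    for d in demonstrations:
        lines.append(f"- Response: {d['response']} -> Score: {d['score']}")
    lines.append(f"Now score this response: {candidate}")
    return "\n".join(lines)

prompt = build_judge_prompt(
    "Measure how clearly the returns handler explains the discount offer",
    demonstrations,
    "Items can be returned within 30 days for a full refund.",
)
print(prompt)
```

The key point is that demonstrations steer the judge purely through the prompt context; no model weights are updated.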

</details>

### Usage

<details>

<summary>Our stack is not in Python; can we still use Scorable?</summary>

Absolutely. We have a [REST API](https://api.docs.scorable.ai/reference/v1_evaluators_execute_by_name_create) that you can run from your favourite tech stack.

<picture><source srcset="https://1145415225-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUoDNiw7ySSaFXXkaGCic%2Fuploads%2Fgit-blob-abf32462aa1d3ac14aa1782421664730bf3d8774%2FScreenshot%202025-03-26%20at%2015.50.55.png?alt=media" media="(prefers-color-scheme: dark)"><img src="https://1145415225-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FUoDNiw7ySSaFXXkaGCic%2Fuploads%2Fgit-blob-e56c4a1a9dfad8f888e6deab57a75fd1b890e9cc%2FScreenshot%202025-03-26%20at%2015.51.07.png?alt=media" alt=""></picture>
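
As a sketch, an execution request over HTTP might look like the `curl` call below. The endpoint path, header, and payload field names here are assumptions made for illustration; consult the API reference linked above for the authoritative schema.

```shell
# Hypothetical sketch: execute an evaluator by name over plain HTTP.
# Endpoint path and field names are illustrative, not the actual schema.
curl -X POST "https://api.scorable.ai/v1/evaluators/execute/by-name/" \
  -H "Authorization: Api-Key $SCORABLE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "evaluator_name": "Clarity",
        "request": "What is your returns policy?",
        "response": "You can return items within 30 days."
      }'
```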

</details>

<details>

<summary>Do I need to have Calibrations for all Custom Evaluators?</summary>

You do not have to bring *Calibration* samples, but we strongly recommend at least a handful so you can understand how your evaluators behave.

</details>

<details>

<summary>Can I change the behaviour of the evaluator by bringing labeled data?</summary>

You can change the behaviour of your Custom Evaluators by bringing annotated samples as *Demonstration*s. The behaviour of *Root Evaluators* cannot be altered.

</details>

<details>

<summary>Can I run a previous version of a Custom Evaluator?</summary>

Yes.

</details>

<details>

<summary>If we already have a ground truth expected output, can we use your evaluators?</summary>

Yes. Several of our evaluators support reference-based evaluation, where you bring your own ground-truth expected responses. See our [evaluator catalogue here](https://docs.scorable.ai/quick-start/usage/evaluators#list-of-evaluators-maintained-by-root-signals).

</details>

<details>

<summary>How can I differentiate evaluations and related statistics for different applications (or versions) of mine?</summary>

You can use arbitrary tags for evaluation executions. See the [example here](https://sdk.rootsignals.ai/en/latest/examples.html#monitoring-llm-pipelines-with-tags).
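
To illustrate why tags help, here is a minimal, hypothetical sketch that aggregates scores per tag. The `score` and `tags` fields are illustrative, not the exact shape returned by the API:

```python
from collections import defaultdict

# Hypothetical execution results, one tag per application and per version.
results = [
    {"score": 0.9, "tags": ["checkout-bot", "v2.3"]},
    {"score": 0.7, "tags": ["checkout-bot", "v2.2"]},
    {"score": 0.8, "tags": ["support-bot", "v1.0"]},
]

def mean_score_by_tag(results):
    """Group evaluation scores by tag and average each group."""
    buckets = defaultdict(list)
    for r in results:
        for tag in r["tags"]:
            buckets[tag].append(r["score"])
    return {tag: sum(s) / len(s) for tag, s in buckets.items()}

print(mean_score_by_tag(results))
```

Because tags are arbitrary strings, the same mechanism works for A/B variants, prompt versions, or customer segments.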

</details>

<details>

<summary>Can I integrate Scorable evaluators with experiment tracking tools such as MLflow?</summary>

Yes. Our evaluators return a structured response (e.g. a dictionary) with scores, justifications, tags, and so on. These results can be logged to any experiment tracking system or database, just like any other metric, metadata, or attribute.
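
As a sketch, assuming a result dictionary like the one below (the field names are illustrative, not the exact Scorable schema), you could split out the numeric fields for metric logging; the commented MLflow calls show one way to record the rest:

```python
# Hypothetical evaluator result; field names are illustrative only.
result = {
    "score": 0.85,
    "justification": "The answer clearly states the 20% discount terms.",
    "tags": ["returns-handler", "v1.4"],
}

def numeric_metrics(result):
    """Extract the numeric fields, which trackers log as metrics."""
    return {k: v for k, v in result.items() if isinstance(v, (int, float))}

# With MLflow installed, this could be logged roughly as:
#   import mlflow
#   with mlflow.start_run():
#       mlflow.log_metrics(numeric_metrics(result))
#       mlflow.log_text(result["justification"], "justification.txt")
#       mlflow.set_tags({"eval_tags": ",".join(result["tags"])})
print(numeric_metrics(result))  # {'score': 0.85}
```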

</details>

### Models

<details>

<summary>What is the LLM that powers ready-made Root evaluators? Can I change it?</summary>

Root Evaluators are powered by various LLMs under the hood. This cannot be changed, except in on-premise deployments.

</details>

<details>

<summary>Can I see which models are GDPR compliant?</summary>

Yes, you can see model metadata under *<mark style="color:purple;">Settings > LLM Accounts</mark>*. More info can be found in the [Control & Compliance](https://docs.scorable.ai/usage/usage/models#control-and-compliance) section of our docs.

</details>

<details>

<summary>Are Evaluators/Judges deterministic?</summary>

No. Scores have tight confidence intervals for the same input, but small fluctuations are to be expected. Expected standard deviations can be found [in our docs](https://docs.scorable.ai/usage/usage/evaluators#determinism).

</details>
