MCP-compatible LLM G-Eval guardrails checker based on:
- OpenAI Cookbook "How to implement LLM guardrails";
- Promptfoo G-Eval implementation.
- G-Eval based evaluation;
- Customizable guardrails;
- Different providers and models based on the ai-sdk toolkit;
- MCP-compatible API;
- Both local and server mode.
- `guardrails({ server, provider, model, criteria, threshold=0.5 })` - creates an instance for local usage, or one connected to a server if `server` is defined. Options:
  - `server` - URL of the `guardrails` server;
  - `provider` - name of the provider;
  - `model` - name of the model;
  - `criteria` - guardrail criteria, with or without G-Eval `steps`. Ignored if `server` is defined. If `steps` are not defined, they will be generated on the fly with an additional LLM request. In server mode criteria are loaded from the file at `process.env.CRITERIA_PATH`. In client-service usage it is better to define the steps explicitly; see the sketch after this list and the examples below.
  - `threshold=0.5` - threshold applied to the G-Eval score to decide whether the guardrail is valid: a score below the threshold means valid, above means not valid.
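A minimal sketch of a local instance with explicit steps, assuming the `criteria` option accepts the same `description`/`steps` shape used in the server-mode criteria file shown below:

```js
import guardrails from 'guardrails';

// Local instance with explicit G-Eval steps, so no extra LLM request
// is needed to generate the steps on the fly.
const gd = guardrails({
  provider: 'openai',
  model: 'gpt-4o-mini',
  threshold: 0.5,
  criteria: {
    harm: {
      description: 'Text is about deliberate injury or damage to someone or something.',
      steps: [
        'Identify content that depicts or encourages violence or self-harm.',
        'Check for derogatory or hateful language targeting individuals or groups.'
      ]
    }
  }
});
```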
- `async listTools()` - returns the MCP definition of available guardrails.
- `async callTool({ name, arguments })` - calls guardrail validation in the MCP manner. Returns JSON like:

```
{
  "name": "harm",   // name of the called guardrail
  "valid": false,   // whether the guardrail is valid, comparing the score with the threshold
  "score": 0.8,     // G-Eval score
  "reason": "seems provided text is slightly harmful" // LLM reason for the G-Eval score
}
```
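For example, listing the tool definitions of the configured guardrails from a local instance like the `gd` sketched above (the exact response shape is whatever the package emits for the MCP tool list):

```js
// Log the MCP tool definitions for every configured guardrail.
const tools = await gd.listTools();
console.log(JSON.stringify(tools, null, 2));
```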
- `GUARDRAILS_PORT` - listening port of the `guardrails` server;
- `CRITERIA_PATH` - path to the criteria file; must be provided in server mode.
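For example, before starting the server (the port value is illustrative and matches the client example below):

```sh
export GUARDRAILS_PORT=3000
export CRITERIA_PATH=./criteria.json
```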
- openai,
- azure,
- anthropic,
- bedrock,
- google,
- mistral,
- deepseek,
- perplexity.
To add your own provider, `import { PROVIDERS } from 'guardrails/local';` and extend the dictionary with an ai-sdk compatible provider, as sketched below.
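A minimal sketch of such an extension, assuming each `PROVIDERS` entry is an ai-sdk provider instance callable as `provider(modelId)` (that shape, the `myprovider` name, and the endpoint are illustrative assumptions, not the package's documented contract):

```js
import { PROVIDERS } from 'guardrails/local';
import { createOpenAI } from '@ai-sdk/openai';

// Register an OpenAI-compatible endpoint under a custom provider name.
// Assumption: guardrails resolves models via PROVIDERS[provider](model).
PROVIDERS.myprovider = createOpenAI({
  baseURL: 'https://my-llm.example.com/v1', // hypothetical endpoint
  apiKey: process.env.MY_PROVIDER_API_KEY
});
```

A local instance created with `provider: 'myprovider'` would then resolve its model through this entry.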
```js
import guardrails from 'guardrails';

const gd = guardrails({ provider: 'openai', model: 'gpt-4o-mini', criteria: { harm: 'text is harmful' } });
await gd.callTool({ name: 'harm', arguments: { prompt: 'Who is John Galt?' } });
```
- create a file with criteria, for example `criteria.json`:

```json
{
  "harm": {
    "description": "Text is about deliberate injury or damage to someone or something.",
    "steps": [
      "Identify content that depicts or encourages violence or self-harm.",
      "Check for derogatory or hateful language targeting individuals or groups.",
      "Assess if the text contains misleading or false information that could cause real-world harm.",
      "Determine the severity and potential impact of the harmful content."
    ]
  }
}
```
- set the environment variable with the criteria path:

```sh
export CRITERIA_PATH=./criteria.json
```
- run the server:

```sh
./node_modules/.bin/guardrails
```
- use the client:

```js
import guardrails from 'guardrails';

const gd = guardrails({ server: 'http://localhost:3000', provider: 'openai', model: 'gpt-4o-mini' });
await gd.callTool({ name: 'harm', arguments: { prompt: 'Who is John Galt?' } });
```
Can be found here.