Best Eval AI Skills & MCP Servers
64 curated Eval skills and MCP servers — install any of them into Claude, Cursor, ChatGPT, n8n, or any AI stack with one command.
Workpaper
WorkPaper API, CLI evaluator, and MCP server for headless spreadsheet formulas in Node.js services and agents.
Cogmemai
CogmemAi: Autonomous Cognitive Memory for Any AI System. 95.10% on LongMemEval (top published score on the field's hardest long-term memory benchmark) and 91% on LoCoMo (above human performance). Autonomous memory capture: your AI's work is saved even whe
Recourse Cli
MCP server for AI agents to evaluate consequences before destructive actions. Analyzes Terraform plans, shell commands, and MCP tool calls.
Md Feedback
MCP server for markdown plan review — companion to the MD Feedback VS Code extension. AI agents read annotations, mark tasks done, evaluate quality gates, and generate session handoffs. 27 tools for Claude Code, Cursor, and other MCP-compatible clients.
Formulon
MCP server for Formulon Excel-compatible formula and workbook evaluation
Vulcn
Security evals for the AI era. Probes · Targets · Graders · Proof. Confirmed XSS / SQLi / BOLA / prompt-injection / MCP-RCE with reproducible proof attached to every finding.
Ori Memory
Cognitive architecture for persistent AI agent memory. Knowledge graph with learning retrieval, ACT-R decay, and spreading activation. Markdown-native, local-first, zero cloud. MCP server + CLI.
Sigil
Persistent memory for AI coding agents. Local-first knowledge engine with atomic facts, entity graph, and hybrid retrieval. Auto-integrated with Claude Code via hooks; MCP-native for Cursor, Continue, Cline, Windsurf, and any other MCP client.
Lightrag
Model Context Protocol (MCP) server for LightRAG - 30 fully working tools with complete RAG and Knowledge Graph integration
Skar
Skar turns a captured AI agent trace into a committed pytest regression test. MCP server + CLI. Use when a tool-using agent run fails and you want to lock the failure as an executable test.
Pdf Reader
MCP server for efficient PDF text extraction, search, and metadata retrieval for Claude Code
Paper Search Agent
MCP server for paper-search-agent: academic paper discovery, access planning, and full-text retrieval via campus network
Memory Lancedb
MCP server for LanceDB-backed long-term memory with hybrid retrieval (Vector + BM25), cross-encoder rerank, multi-scope isolation, and memory lifecycle management
Adaptive Recall
Adaptive memory system for AI applications. Multi-strategy retrieval, cognitive scoring, knowledge graph, and self-improving ML. Connects via MCP or REST API.
Mcp
Model Context Protocol server for digitalcalculator.info financial calculators. v0.3.0 ships 9 calculator tools (mortgage monthlyPayment, compound-interest futureValue, retirement401k projection, Social Security estimatedBenefit, paycheck netPay, IRA cont
Ds Rag
MCP server for searching UI components using RAG built on LanceDB and GigaChat
Merch Connector
MCP server that gives merchandising agents eyes on any storefront — scrape, audit, compare, roundtable analysis, and eval tracking via 11 tools.
Ask262
MCP server for understanding JavaScript internals from the ECMAScript specification. Provides vector search over the ECMAScript spec, section content retrieval, and JavaScript code execution with spec section tracing via engine262.
Agentdb
Self-learning vector memory for AI agents — single-file .rvf cognitive container with HNSW search, episodic Reflexion memory, causal graph + Cypher, 9 RL algorithms, Thompson Sampling bandit, 41 MCP tools, hybrid (BM25 + dense) retrieval, GNN attention. 1
Node Webrtc
MCP server for @agentdance/node-webrtc — lets AI agents discover, evaluate, and get started with the pure-TypeScript WebRTC stack
Cf Memory
Cloudflare-hosted MCP server for code indexing, retrieval, and assistant memory with a direct remote MCP endpoint and local stdio bridge.
Sdk
MCP server unit testing, end-to-end (e2e) testing, and server evals
Server
A Model Context Protocol (MCP) server for Ragie
Nia Web Eval Agent
NIA AI Web Evaluation Agent MCP Server - Autonomous browser testing and debugging
About Eval skills on iClaude
iClaude is the universal install layer for AI skills. Every Eval skill on this page can be installed into Claude Code, Claude Desktop, Cursor, ChatGPT, n8n, Codex, and more — using a single copy-paste command. No config drift, no per-stack adapters, no manual MCP wiring.