Labelbox
Labelbox is a data-centric AI platform for labeling training data, managing datasets, and evaluating model quality. Used by…
Vellum AI
Vellum is an AI product development platform with prompt versioning, side-by-side comparisons, and evaluation workflows. Product and engineering…
Mirascope
Mirascope is a Python toolkit for building LLM applications with clean abstractions for prompts, calls, and extractions. Type-safe…
HoneyHive
HoneyHive is an AI evaluation and observability platform for teams building LLM applications. Dataset management, automated evaluations, and…
Agenta
Agenta is an open-source LLMOps platform for prompt management, evaluation, and deployment. Teams collaborate on prompts, run systematic…
Eden AI
Eden AI provides a unified API for 100+ AI models across text, image, audio, and video. Test and…
Portkey AI
Portkey is an AI gateway providing unified access to 200+ LLMs with built-in observability, caching, and fallbacks. Production-grade…
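The fallback-with-caching pattern such a gateway provides can be sketched in plain Python (a conceptual sketch, not Portkey's actual API; the `call_with_fallbacks` helper and the stand-in providers are hypothetical):

```python
# Conceptual sketch of gateway-style fallbacks with a response cache.
# Not Portkey's actual API -- the provider functions here are stand-ins.

cache = {}

def call_with_fallbacks(prompt, providers):
    """Try each (name, fn) provider in order; cache the first success."""
    if prompt in cache:                 # serve repeated prompts from cache
        return cache[prompt]
    errors = []
    for name, provider in providers:
        try:
            result = provider(prompt)
            cache[prompt] = result
            return result
        except Exception as exc:        # provider down or rate-limited
            errors.append((name, exc))
    raise RuntimeError(f"all providers failed: {errors}")

# Stand-in providers: the first always fails, the second succeeds.
def flaky(prompt):
    raise TimeoutError("upstream timeout")

def stable(prompt):
    return f"answer to: {prompt}"

print(call_with_fallbacks("hello", [("flaky", flaky), ("stable", stable)]))
# → answer to: hello
```

A real gateway layers retries, timeouts, and per-provider rate limits on top of this loop, but the control flow is the same: fall through the provider list until one call succeeds.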
Unify AI
Unify automatically routes LLM requests to the cheapest or fastest provider based on your optimization criteria. Benchmark any…
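Routing on an optimization criterion reduces to a minimization over provider stats. A minimal sketch (the providers and their numbers are invented for illustration; Unify's real router draws on live benchmark data, not a static table):

```python
# Conceptual sketch of criterion-based LLM routing.
# Provider names, costs, and latencies below are made up for illustration.

PROVIDERS = [
    {"name": "provider-a", "cost_per_1k_tokens": 0.50, "latency_ms": 900},
    {"name": "provider-b", "cost_per_1k_tokens": 1.20, "latency_ms": 250},
    {"name": "provider-c", "cost_per_1k_tokens": 0.80, "latency_ms": 400},
]

def route(criterion):
    """Pick the provider that minimizes the chosen metric."""
    key = {"cheapest": "cost_per_1k_tokens", "fastest": "latency_ms"}[criterion]
    return min(PROVIDERS, key=lambda p: p[key])["name"]

print(route("cheapest"))  # → provider-a
print(route("fastest"))   # → provider-b
```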
DeepEval
DeepEval is an open-source LLM evaluation framework with 14+ evaluation metrics including hallucination, answer relevancy, and faithfulness. pytest-style…
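The pytest-style pattern means evaluation metrics are run as plain assertions against a threshold. A toy stand-in (not DeepEval's implementation, whose metrics use LLM judges; this word-overlap score only illustrates the assert-on-a-threshold shape):

```python
# Toy stand-in for a pytest-style faithfulness metric.
# Not DeepEval's implementation -- a simple word-overlap score that
# illustrates asserting an evaluation score against a threshold.

def faithfulness_score(answer, context):
    """Fraction of answer words that also appear in the source context."""
    answer_words = set(answer.lower().split())
    context_words = set(context.lower().split())
    if not answer_words:
        return 0.0
    return len(answer_words & context_words) / len(answer_words)

def assert_faithful(answer, context, threshold=0.7):
    score = faithfulness_score(answer, context)
    assert score >= threshold, f"score {score:.2f} below threshold {threshold}"

context = "the eiffel tower is in paris and was completed in 1889"
assert_faithful("the eiffel tower is in paris", context)  # passes: score 1.0
```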
TruLens
TruLens is an open-source framework for evaluating and tracking LLM applications. Feedback functions assess truthfulness, harmlessness, and helpfulness…