Description:
AI EVAL Engineering: Azure OpenAI; Eval; Benchmarking

- Strong understanding of LLMs and generative AI concepts, including model behavior and output evaluation
- Experience with AI evaluation and benchmarking methodologies, including baseline creation and model comparison
- Hands-on expertise in eval testing, creating structured test suites to measure accuracy, relevance, safety, and performance (a sketch follows this list)
- Ability to define and apply evaluation metrics (precision/recall, BLEU/ROUGE, F1, hallucination rate, latency, ...)
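For illustration only (not part of the posting): a minimal Python sketch of the kind of structured eval suite described above. Everything here is a hypothetical assumption made for the example: fake_model stands in for a real model call (e.g., an Azure OpenAI completion), and EvalCase and the "unknown" abstention convention are invented. It scores exact-match accuracy, precision/recall, F1, and mean latency; BLEU/ROUGE or hallucination-rate scoring would slot into the same loop.

# Hypothetical sketch of a structured eval suite; all names are illustrative.
import time
from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str
    expected: str  # gold answer; "unknown" means the model should abstain

def fake_model(prompt: str) -> str:
    # Stand-in for a real model call (e.g., an Azure OpenAI completion).
    return "Paris" if "France" in prompt else "unknown"

def run_suite(cases: list[EvalCase]) -> dict[str, float]:
    tp = fp = fn = 0
    correct = 0
    latencies = []
    for case in cases:
        start = time.perf_counter()
        output = fake_model(case.prompt)
        latencies.append(time.perf_counter() - start)
        if output == case.expected:
            correct += 1
        # Precision/recall over "answered" vs "abstained":
        answered = output != "unknown"
        should_answer = case.expected != "unknown"
        if answered and output == case.expected:
            tp += 1  # answered correctly
        elif answered:
            fp += 1  # answered, but wrong
        elif should_answer:
            fn += 1  # abstained when a gold answer existed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(cases),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mean_latency_s": sum(latencies) / len(latencies),
    }

if __name__ == "__main__":
    suite = [
        EvalCase("What is the capital of France?", "Paris"),
        EvalCase("What is the capital of Atlantis?", "unknown"),
    ]
    print(run_suite(suite))

Framing precision/recall around answer-versus-abstain is one common convention for QA-style evals; a real suite would also persist per-case results so baseline and candidate models can be compared run over run.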
Dec 15, 2025
from: dice.com