We're looking for tools to monitor the quality of our bots at scale. This request is for a module where our QA analysts can audit and correct LLM-generated evaluations across batches of conversations.
Created by Monica Rangel