Refine the GenAI system based on evaluation insights
The end goal of GenAI output evaluation is to use its insights to refine your LLM system until it meets your criteria and is ready for production.
Now that you have a trustworthy benchmark, you can use a variety of techniques to improve your GenAI system. These include:
- LLM fine-tuning: Fine-tuning updates the LLM's parameters to adapt its behavior to your criteria. Snorkel integrates with Amazon SageMaker, one fine-tuning option. Our LLM fine tuning and alignment tutorial shows how to use SageMaker to fine-tune your LLM; a minimal sketch follows this list.
- RAG tuning: On request, Snorkel can provide an example notebook with instructions for using Snorkel to tune a RAG system. A generic parameter-sweep sketch follows this list.
- Prompt development: Snorkel Flow's prompt development features will be released in early 2025. In the meantime, a generic prompt-iteration sketch follows this list.
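For orientation, here is a minimal sketch of launching a fine-tuning job with the SageMaker Python SDK. The training script, S3 paths, IAM role, base model, and hyperparameters are placeholders rather than values from the tutorial; see the LLM fine tuning and alignment tutorial for the full Snorkel workflow.

```python
# A minimal sketch of launching a fine-tuning job with the SageMaker Python SDK.
# The entry-point script, S3 URIs, IAM role, base model, and hyperparameters are
# placeholders; consult the LLM fine tuning and alignment tutorial for specifics.
from sagemaker.huggingface import HuggingFace

role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # hypothetical role

estimator = HuggingFace(
    entry_point="train.py",          # hypothetical training script
    source_dir="./scripts",
    instance_type="ml.g5.2xlarge",
    instance_count=1,
    role=role,
    transformers_version="4.36",
    pytorch_version="2.1",
    py_version="py310",
    hyperparameters={
        "model_name_or_path": "meta-llama/Llama-2-7b-hf",  # example base model
        "epochs": 3,
        "learning_rate": 2e-5,
    },
)

# Train on curated examples exported from your evaluation workflow (hypothetical S3 path).
estimator.fit({"train": "s3://my-bucket/finetune/train.jsonl"})
```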
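RAG tuning typically means searching over retrieval parameters such as chunk size and top-k, then scoring each configuration against your benchmark. The sketch below illustrates that loop with hypothetical `retrieve`, `generate`, and `score_against_benchmark` helpers and placeholder benchmark data; it is not Snorkel's API, which the example notebook covers.

```python
# A generic sketch of RAG tuning as a parameter sweep scored against your benchmark.
# retrieve(), generate(), score_against_benchmark(), and `benchmark` are hypothetical
# stand-ins for your retriever, LLM call, evaluation harness, and benchmark data.
from itertools import product

def retrieve(question: str, chunk_size: int, top_k: int) -> list[str]:
    raise NotImplementedError("replace with your retriever")

def generate(question: str, contexts: list[str]) -> str:
    raise NotImplementedError("replace with your LLM call")

def score_against_benchmark(answers: dict) -> float:
    raise NotImplementedError("replace with your evaluation harness")

benchmark = [{"id": "q1", "question": "..."}]  # your benchmark examples

best_config, best_score = None, float("-inf")
for chunk_size, top_k in product([256, 512, 1024], [3, 5, 10]):
    # Answer every benchmark question with this retrieval configuration.
    answers = {
        ex["id"]: generate(ex["question"], retrieve(ex["question"], chunk_size, top_k))
        for ex in benchmark
    }
    score = score_against_benchmark(answers)
    if score > best_score:
        best_config, best_score = (chunk_size, top_k), score

print(f"Best (chunk_size, top_k): {best_config} with score {best_score:.3f}")
```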
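Until Snorkel Flow's prompt development features ship, the same benchmark-driven loop applies to prompts: evaluate each candidate template and keep the best performer. The sketch below is a generic illustration, not a Snorkel Flow API; `call_llm` is hypothetical, and it reuses the hypothetical scoring harness and benchmark from the RAG sketch above.

```python
# A generic sketch of prompt iteration: score each prompt variant against the same
# benchmark and keep the best. call_llm() is hypothetical, and score_against_benchmark()
# and `benchmark` are the same hypothetical helpers used in the RAG sketch above.
PROMPT_VARIANTS = [
    "Answer the question concisely.\n\nQuestion: {question}",
    "You are a domain expert. Answer and cite the source passage.\n\nQuestion: {question}",
]

def call_llm(prompt: str) -> str:
    raise NotImplementedError("replace with your LLM client")

def evaluate_prompt(template: str) -> float:
    answers = {
        ex["id"]: call_llm(template.format(question=ex["question"]))
        for ex in benchmark
    }
    return score_against_benchmark(answers)

scores = {template: evaluate_prompt(template) for template in PROMPT_VARIANTS}
best_template = max(scores, key=scores.get)
```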
To follow along with an example of how to use Snorkel's evaluation framework, see the Evaluate GenAI output tutorial.
Once the model has been sufficiently improved, it can undergo another round of evaluation. Continue to track your evaluation progress until the system meets your performance thresholds.
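Conceptually, that cycle looks like the loop below, where `evaluate_system` and `refine_system` are hypothetical stand-ins for your evaluation harness and whichever refinement technique you apply, and the thresholds are example values rather than recommended targets.

```python
# A sketch of the refine-and-re-evaluate loop: iterate until every tracked metric
# clears its threshold. evaluate_system() and refine_system() are hypothetical
# stand-ins for your evaluation harness and your chosen refinement technique.
THRESHOLDS = {"correctness": 0.90, "faithfulness": 0.85}  # example criteria

def meets_thresholds(metrics: dict) -> bool:
    return all(metrics.get(name, 0.0) >= floor for name, floor in THRESHOLDS.items())

metrics = evaluate_system()    # hypothetical: run the benchmark, return metric scores
while not meets_thresholds(metrics):
    refine_system(metrics)     # hypothetical: fine-tune, tune RAG, or adjust prompts
    metrics = evaluate_system()  # re-evaluate after each refinement round
```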