In the last article, we explained the core concepts of using Synthetic Data to test LLM applications at a high level. In this article, we will walk through a case study of the end-to-end workflow.
There are two ways to generate synthetic data on our platform: through the open-source continuous-eval library or through the Relari API. This case study uses continuous-eval.
The goal of this case study is to illustrate how simple it is to systematically benchmark design choices and make informed decisions using synthetic datasets.
We are going to walk through 3 steps: (1) build a simple RAG application with three different retrieval strategies, (2) generate a synthetic testing dataset from the application's knowledge corpus, and (3) evaluate and compare the three RAG variants on that dataset.
You can run all the code in this case study through this notebook.
We first define a simple RAG use case: digesting a long-form blog post. We use LangChain to load and chunk the content, then store the chunks in Milvus Lite, a lightweight vector database, via the langchain-milvus package.
from langchain_community.document_loaders import WebBaseLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_milvus import Milvus
from langchain_openai import OpenAIEmbeddings
# Load and split documents from a long article
loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
)
documents = loader.load()
text_splitter = RecursiveCharacterTextSplitter(chunk_size=256, chunk_overlap=0)
docs = text_splitter.split_documents(documents)

# Load chunks into a vector database
vectorstore = Milvus(
    embedding_function=OpenAIEmbeddings(),
    connection_args={
        "uri": "milvus_vectorstore.db",
    },
    auto_id=True,
    drop_old=True,
)
vectorstore.add_documents(docs)
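As a quick sanity check (not part of the original walkthrough), you can run a similarity search against the populated store; the query string below is just an illustrative example:

# Optional sanity check: fetch the top chunks for a sample query
sample_hits = vectorstore.similarity_search("How do LLM agents decompose tasks?", k=2)
for hit in sample_hits:
    print(hit.page_content[:120], "...")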
Now let’s build simple RAG systems with 3 different retrieval strategies that we will evaluate later with synthetic data.
A simple answer_generator function then uses gpt-3.5-turbo to generate an answer.
from langchain_community.retrievers import BM25Retriever
from langchain_voyageai import VoyageAIRerank
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_core.runnables import RunnablePassthrough
# Note: 'rag_utils' is a locally defined module from Milvus with a variety of advanced RAG functions (check the notebook for more details).
from rag_utils.hybrid_and_rerank import RerankerRunnable
# 1. Vectorstore-based retriever
vectorstore_retriever = vectorstore.as_retriever()
# 2. Keyword-search-based retriever
keyword_retriever = BM25Retriever.from_documents(docs)
# 3. Hybrid (Keyword + Vectorstore) => Reranker retriever
reranker = RerankerRunnable(
    compressor=VoyageAIRerank(model="rerank-lite-1"),
    top_k=4,
)
hybrid_and_rerank_retriever = {
    "milvus_retrieved_doc": vectorstore_retriever,
    "bm25_retrieved_doc": keyword_retriever,
    "query": RunnablePassthrough(),
} | reranker
# RAG LLM answer generator
def answer_generator(query, retrieved_docs):
    llm = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0.5)
    system_prompt = (
        "You are a helpful assistant. Answer the question using the context provided."
    )
    user_prompt = f"Question: {query}\n\n"
    user_prompt += "Contexts:\n" + "\n".join(
        [doc.page_content for doc in retrieved_docs]
    )
    return llm.invoke(system_prompt + user_prompt).content
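Before wiring these into an evaluation, it helps to smoke-test each retriever end to end. The snippet below is an illustrative example (the sample query is made up); all three retrievers expose the same invoke interface, which is what the evaluation harness relies on later.

# Illustrative smoke test: run one query through each retriever and the answer generator
sample_query = "How does the agent use external tools?"
retrievers = {
    "vectorstore": vectorstore_retriever,
    "keyword": keyword_retriever,
    "hybrid_and_rerank": hybrid_and_rerank_retriever,
}
for name, retriever in retrievers.items():
    retrieved_docs = retriever.invoke(sample_query)
    print(f"{name}: {answer_generator(sample_query, retrieved_docs)[:120]}")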
Below is a refresher from our last article on what's needed for the Generator:
Now let’s generate a synthetic testing dataset specific to the Knowledge Corpus that the RAG systems are built on. We use the continuous-eval library to generate a dataset for RAG (Application Logic) with 20 questions from the Milvus Vectorstore (Environment Data) created earlier.
The Relari API supports more complex types of application logic, as well as the ability to ingest Seed Example Data and additional Environment Data to create more realistic and higher-quality synthetic data.
Here’s how to use the SimpleDatasetGenerator class in continuous-eval:
import json
from continuous_eval.generators import SimpleDatasetGenerator
from continuous_eval.llm_factory import LLMFactory
generator = SimpleDatasetGenerator(
    vector_store_index=vectorstore,
    generator_llm=LLMFactory("gpt-4o"),
)
synthetic_dataset = generator.generate(
    embedding_vector_size=1536,
    num_questions=20,
)
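To reuse the generated data points in the evaluation step later, save them to a JSONL file. This is a minimal sketch that assumes generate returns a list of dict-like data points, as in the example shown next (the notebook has the exact save step):

# Persist the synthetic data points so they can be reloaded as a Dataset later
# (assumes each datum is a plain dict, as in the example below)
with open("synthetic_dataset.jsonl", "w") as f:
    for datum in synthetic_dataset:
        f.write(json.dumps(datum) + "\n")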
In a few minutes, a small dataset will be created. Let’s take a look at an example data point.
{
  "question": "How does CoH address overfitting and missing information in its model?",
  "answer": "CoH addresses overfitting by adding a regularization term to maximize the log-likelihood of the pre-training dataset. To handle missing information in its model, the agent learns to call external APIs for extra information that is missing from the model weights, including current information, code execution capability, access to proprietary information sources, and more.",
  "contexts": [
    "To avoid overfitting, CoH adds a regularization term to maximize the log-likelihood of the pre-training dataset. To avoid shortcutting and copying (because there are many common words in feedback sequences), they randomly mask 0% - 5% of past tokens",
    "The agent learns to call external APIs for extra information that is missing from the model weights (often hard to change after pre-training), including current information, code execution capability, access to proprietary information sources and more."
  ],
  "metadata": [
    {
      "source": "https://lilianweng.github.io/posts/2023-06-23-agent/",
      "title": "LLM Powered Autonomous Agents | Lil'Log",
      "description": "Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview In a LLM-powered autonomous agent system, LLM functions as the agent\u2019s brain, complemented by several key components:",
      "language": "en",
      "pk": 450063656769030980
    },
    {
      "source": "https://lilianweng.github.io/posts/2023-06-23-agent/",
      "title": "LLM Powered Autonomous Agents | Lil'Log",
      "description": "Building agents with LLM (large language model) as its core controller is a cool concept. Several proof-of-concepts demos, such as AutoGPT, GPT-Engineer and BabyAGI, serve as inspiring examples. The potentiality of LLM extends beyond generating well-written copies, stories, essays and programs; it can be framed as a powerful general problem solver.\nAgent System Overview In a LLM-powered autonomous agent system, LLM functions as the agent\u2019s brain, complemented by several key components:",
      "language": "en",
      "pk": 450063656769030951
    }
  ],
  "question_type": "Multi Hop Fact Seeking",
  "uid": 19
}
Each Synthetic Datum contains the synthetic inputs and reference outputs that can be used for evaluation. Other variables such as metadata and question_type can be used for further analysis.
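For example, here is a minimal sketch (using only the fields shown in the example datum above) of how to check how the question types are distributed across the 20 generated questions:

import json
from collections import Counter

# Count how many questions of each type the generator produced
with open("synthetic_dataset.jsonl") as f:
    data = [json.loads(line) for line in f]
print(Counter(d["question_type"] for d in data))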
Here’s a refresher from our last article on how to use the synthetic dataset for evaluation:
Here we select the following metrics to evaluate the RAG systems:
Retriever: context Precision, Recall, and F1, along with rank-aware retrieval metrics
LLM Answer Generator: LLM-based Answer Correctness, Answer Relevance, and Faithfulness
from typing import Dict, List
from continuous_eval.eval import Dataset, Module, ModuleOutput, Pipeline
from continuous_eval.metrics.retrieval import PrecisionRecallF1, RankedRetrievalMetrics
from continuous_eval.metrics.generation.text import LLMBasedAnswerCorrectness, LLMBasedAnswerRelevance, LLMBasedFaithfulness
# Use existing dataset
synthetic_dataset = Dataset("synthetic_dataset.jsonl")
Documents = List[Dict[str, str]]
DocumentsContent = ModuleOutput(lambda x: [z["page_content"] for z in x])
retriever = Module(
    name="retriever",
    input=synthetic_dataset.question,
    output=Documents,
    eval=[
        PrecisionRecallF1().use(
            retrieved_context=DocumentsContent,
            ground_truth_context=synthetic_dataset.contexts,
        ),
        RankedRetrievalMetrics().use(
            retrieved_context=DocumentsContent,
            ground_truth_context=synthetic_dataset.contexts,
        ),
    ],
)

llm = Module(
    name="llm",
    input=retriever,
    output=str,
    eval=[
        LLMBasedAnswerCorrectness().use(
            question=synthetic_dataset.question,
            answer=ModuleOutput(),
            ground_truth_answers=synthetic_dataset.answer,
        ),
        LLMBasedAnswerRelevance().use(
            question=synthetic_dataset.question,
            answer=ModuleOutput(),
        ),
        LLMBasedFaithfulness().use(
            question=synthetic_dataset.question,
            retrieved_context=ModuleOutput(DocumentsContent, module=retriever),
            answer=ModuleOutput(),
        ),
    ],
)
pipeline = Pipeline([retriever, llm], dataset=synthetic_dataset)
from continuous_eval.eval.logger import PipelineLogger
from tqdm import tqdm
def run_app(retriever_function, retriever_name):
    pipelog = PipelineLogger(pipeline=pipeline)
    for datum in tqdm(pipeline.dataset.data, total=len(pipeline.dataset.data), desc=f"Running {retriever_name}-based RAG"):
        q = datum["question"]
        # Run retrieval and log retrieved docs
        retrieved_docs = retriever_function.invoke(q)
        pipelog.log(
            uid=datum["uid"],
            module="retriever",
            value=[doc.__dict__ for doc in retrieved_docs],
        )
        # Run generation and log llm answer
        response = answer_generator(q, retrieved_docs)
        pipelog.log(uid=datum["uid"], module="llm", value=response)
    pipelog.save(f"{retriever_name}_RAG_outputs.jsonl")
run_app(vectorstore_retriever, "vectorstore")
run_app(keyword_retriever, "keyword")
run_app(hybrid_and_rerank_retriever, "hybrid_and_rerank")
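To turn the logged outputs into aggregate metric scores, continuous-eval provides an evaluation runner. The exact invocation may differ by version (see the notebook for the exact calls); the sketch below assumes an EvaluationRunner that scores a saved PipelineLogger against the pipeline's metrics:

from continuous_eval.eval.runner import EvaluationRunner

# Assumed API (check the notebook / continuous-eval docs for the exact calls):
# reload one retriever's logged outputs and compute the pipeline metrics over them
pipelog = PipelineLogger(pipeline=pipeline)
pipelog.load("vectorstore_RAG_outputs.jsonl")
evaluator = EvaluationRunner(pipeline)
results = evaluator.evaluate(pipelog)
print(results.aggregate())  # aggregate per-module metric scores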
Evaluation over this dataset shows that the Vectorstore-based RAG system and the Hybrid and Rerank RAG system both have close to 90% recall, meaning ~90% of the relevant context was retrieved. The Hybrid and Rerank RAG is slightly better in recall and rank-aware metrics, but trades off precision by adding more noise to the retrieved context. The Keyword-based RAG system performed much worse in comparison.
If we look at the generation metrics, the three retrieval strategies actually yielded similar correctness and relevance. But we can see from faithfulness that much of the Keyword-based RAG's output is hallucinated. This is a dangerous situation where the LLM generates answers based on the knowledge it was trained on rather than the source material the RAG system is supposed to provide.
You can further analyze the distribution of results at a finer granularity, for example by inspecting performance per question type, per context type, and so on. Lastly, it is also important to increase the size of the dataset to obtain more statistically significant comparisons.