We started a thread to discuss these kinds of engineering tasks at Data engineering best practices.
One quick thought: an ever-growing CSV that is loaded entirely into memory is not a scalable solution. Ideally you would load your data into a true database and query only what you need from it.
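To make that concrete, here is a minimal sketch of the idea using SQLite from the Python standard library. The table name `readings`, its columns `sensor` and `value`, and the helper functions are all hypothetical, since I don't know what your data looks like. The in-memory database is only there to keep the example self-contained; in practice you would pass a file path to `sqlite3.connect` or use whatever database server you already have.

```python
import csv
import io
import sqlite3


def append_rows(conn, rows):
    """Insert new (sensor, value) rows; the table grows on disk, not in RAM."""
    conn.executemany(
        "INSERT INTO readings (sensor, value) VALUES (?, ?)", rows
    )
    conn.commit()


def mean_for_sensor(conn, sensor):
    """Let the database compute the aggregate instead of loading every row."""
    row = conn.execute(
        "SELECT AVG(value) FROM readings WHERE sensor = ?", (sensor,)
    ).fetchone()
    return row[0]


# Assumption: ":memory:" is used for the demo only; use a file path in practice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS readings (sensor TEXT, value REAL)")

# Stand-in for a new batch of CSV data (hypothetical columns).
csv_batch = io.StringIO("sensor,value\na,1.0\nb,4.0\na,3.0\n")
reader = csv.DictReader(csv_batch)

# Stream the rows into the database one at a time.
append_rows(conn, ((r["sensor"], float(r["value"])) for r in reader))

print(mean_for_sensor(conn, "a"))
```

The point of the design is that each new batch is appended to the table and each question becomes a query, so memory use depends on the size of the result rather than on the size of the full history.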