Wine voice search using Whisper, DuckDB, and Metaphone

Howdy Streamlit folks,

I prototyped a hands-free inventory counting system for sommeliers recently.

A thorny part of this problem is inaccurate transcriptions in voice search. How do you handle situations where “Chateau Champignon” is transcribed as “shadow champagne on”?!

I have a demo of the voice search on Streamlit cloud:

The code is all on GitHub in voberoi/voice-search-with-whisper-duckdb-and-metaphone, a voice search demo using OpenAI Whisper, DuckDB, and the Metaphone algorithm.

I also wrote a blog post that puts the problem in context and explains how this works: Helping sommeliers inventory wines faster with Whisper, DuckDB, and Metaphone
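
For a rough feel of the phonetic-matching idea, here is a minimal, standard-library-only Python sketch. It is not the code from the repo: `phonetic_key` is a hypothetical, heavily simplified stand-in for Metaphone, the rule tables are made up for illustration, and there is no Whisper or DuckDB involved. It only shows why comparing sound-alike keys instead of raw text can recover a garbled transcription.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, crude sound rules. This is NOT the real Metaphone algorithm.
# Here "x" stands for the "sh" sound, so a literal letter x is expanded first.
DIGRAPHS = [("x", "ks"), ("ch", "x"), ("sh", "x"), ("ph", "f"), ("gn", "n"), ("ck", "k")]
# Fold similar-sounding consonants together: b->p, d->t, g->k, z->s, c->k, q->k, v->f.
SOUNDS = str.maketrans("bdgzcqv", "ptkskkf")
# Letters ignored entirely (vowels and weak consonants).
SILENT = set("aeiouhwy")


def phonetic_key(text: str) -> str:
    """Reduce text to a rough consonant skeleton, ignoring spaces and case."""
    s = re.sub(r"[^a-z]", "", text.lower())  # keep ASCII letters only
    for old, new in DIGRAPHS:
        s = s.replace(old, new)
    s = s.translate(SOUNDS)
    s = "".join(ch for ch in s if ch not in SILENT)
    return re.sub(r"(.)\1+", r"\1", s)  # collapse repeated letters


def best_match(transcript: str, inventory: list[str]) -> str:
    """Return the inventory name whose phonetic key is closest to the transcript's."""
    query = phonetic_key(transcript)
    return max(
        inventory,
        key=lambda name: SequenceMatcher(None, query, phonetic_key(name)).ratio(),
    )


# Made-up inventory for illustration.
INVENTORY = ["Chateau Margaux", "Chateau Champignon", "Dom Perignon"]

print(best_match("shadow champagne on", INVENTORY))
```

With these toy rules, "shadow champagne on" and "Chateau Champignon" both reduce to the same key, so the garbled transcript still finds the right bottle. A real system would need a proper phonetic algorithm and ranking across a much larger inventory.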



This is wild! Great work

Thank you @samthedataman!

Nice app, and a smart solution to the difficult task of mispronunciation, etc. Have you thought of wine label OCR as a possible solution too (similar to Vivino)?

Thanks for all the extra info… very handy.



Yes, we had thought of that as a solution.

It would likely have slowed things down in this case: storage rooms and cellars are poorly lit, taking a photo requires your hands, and the strings in the restaurant’s inventory systems pull from both the front and back labels.
