Howdy Streamlit folks,
I prototyped a hands-free inventory counting system for sommeliers recently.
A thorny part of this problem is inaccurate transcriptions in voice search. How do you handle situations where “Chateau Champignon” is transcribed as “shadow champagne on”?!
I have a demo of the voice search on Streamlit Cloud: https://voice-search-with-whisper-duckdb-and-metaphone.streamlit.app/
The code is all on GitHub: voberoi/voice-search-with-whisper-duckdb-and-metaphone
I also wrote a blog post to contextualize the problem and explain how this works: Helping sommeliers inventory wines faster with Whisper, DuckDB, and Metaphone (https://vikramoberoi.com/helping-sommeliers-inventory-wine-faster-with-whisper-duckdb-and-metaphone/)
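For anyone who wants the gist before clicking through, here is a rough sketch of the general idea of phonetic matching. It is not the code from the repo and it does not implement real Metaphone: `crude_phonetic_key`, `search`, and the sample inventory are made up for illustration, and the key is a deliberately crude stand-in that folds similar-sounding letters together and drops vowels. The real demo uses Whisper, DuckDB, and Metaphone; this only uses the Python standard library.

```python
import re
from difflib import SequenceMatcher

# Hypothetical, simplified rules (NOT the Metaphone algorithm):
# fold a few similar-sounding digraphs and consonants together.
DIGRAPHS = [("sh", "x"), ("ch", "x"), ("ph", "f"), ("gn", "n"), ("ck", "k")]
SINGLE = str.maketrans("dbgzvcq", "tpksfkk")


def crude_phonetic_key(text: str) -> str:
    """Reduce text to a rough consonant skeleton for sound-alike comparison."""
    s = re.sub(r"[^a-z]", "", text.lower())   # keep ASCII letters only
    for src, dst in DIGRAPHS:
        s = s.replace(src, dst)               # fold digraphs first
    s = s.translate(SINGLE)                   # fold similar consonants
    s = re.sub(r"[aeiouywh]", "", s)          # drop vowels and weak letters
    return re.sub(r"(.)\1+", r"\1", s)        # collapse repeated letters


def search(query: str, inventory: list[str], limit: int = 3) -> list[str]:
    """Rank inventory names by similarity of their phonetic keys to the query's."""
    query_key = crude_phonetic_key(query)
    scored = [
        (SequenceMatcher(None, query_key, crude_phonetic_key(name)).ratio(), name)
        for name in inventory
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:limit]]


# Made-up inventory for illustration only.
inventory = ["Domaine de la Truffe", "Chateau Champignon", "Cuvee Morille"]
results = search("shadow champagne on", inventory)
print(results[0])
```

With these rules, the mis-transcription and the real wine name reduce to the same key, so the right bottle ranks first even though the raw strings barely overlap. A real phonetic algorithm handles far more spelling patterns than this toy does.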
Cheers,
Vikram
Nice app, and a smart solution to the difficult task of mispronunciation, etc. Have you thought of wine label OCR as a possible solution too (similar to Vivino)?
Thanks for all the extra info… very handy.
Arvindra
Thanks!
Yes, we had thought of that as a solution.
It would likely have slowed things down in this case: storage rooms and cellars are poorly lit, taking a photo requires your hands, and the strings in the restaurant’s inventory systems pull from both the front and back labels.