Wine voice search using Whisper, DuckDB, and Metaphone

Howdy Streamlit folks,

I prototyped a hands-free inventory counting system for sommeliers recently.

A thorny part of this problem is inaccurate transcriptions in voice search. How do you handle situations where “Chateau Champignon” is transcribed as “shadow champagne on”?!

I have a demo of the voice search on Streamlit cloud: https://voice-search-with-whisper-duckdb-and-metaphone.streamlit.app/.

The code is all on GitHub: voberoi/voice-search-with-whisper-duckdb-and-metaphone. The repository is a voice search demo using OpenAI Whisper, DuckDB, and the Metaphone algorithm. The associated blog post is here: https://vikramoberoi.com/helping-sommeliers-inventory-wine-faster-with-whisper-duckdb-and-metaphone/

I also wrote a blog post that gives context on the problem and explains how this works: "Helping sommeliers inventory wines faster with Whisper, DuckDB, and Metaphone".
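To give a flavor of the idea, here is a minimal sketch of phonetic matching. It is not the code from the repository. It assumes a crude, hand-rolled phonetic key standing in for the real Metaphone algorithm, a hypothetical three-item inventory, and Python's `difflib` for ranking instead of DuckDB. The Whisper transcription step is left out, so the transcript is just a string.

```python
import re
from difflib import SequenceMatcher

# Hypothetical digraph rewrites. Real Metaphone has many more rules;
# this is only a crude stand-in for illustration.
_DIGRAPHS = [("ch", "x"), ("sh", "x"), ("ph", "f"), ("gn", "n")]


def phonetic_key(text: str) -> str:
    """Reduce text to a rough consonant skeleton of how it sounds."""
    s = re.sub(r"[^a-z]", "", text.lower())  # keep ASCII letters only
    for src, dst in _DIGRAPHS:
        s = s.replace(src, dst)
    s = s.replace("d", "t")                  # treat d and t as the same sound
    s = re.sub(r"[aeiouwhy]", "", s)         # drop vowels and weak letters
    return re.sub(r"(.)\1+", r"\1", s)       # collapse repeated letters


def search(transcript: str, inventory: list[str]) -> str:
    """Return the inventory name whose phonetic key best matches the transcript."""
    query = phonetic_key(transcript)
    scored = [
        (SequenceMatcher(None, query, phonetic_key(name)).ratio(), name)
        for name in inventory
    ]
    return max(scored)[1]


# Hypothetical inventory, for illustration only.
inventory = ["Chateau Champignon", "Domaine de la Romanee", "Ridge Monte Bello"]

print(phonetic_key("shadow champagne on"))  # → xtxmpn
print(phonetic_key("Chateau Champignon"))   # → xtxmpn
print(search("shadow champagne on", inventory))  # → Chateau Champignon
```

Both the mis-transcription and the real wine name collapse to the same key, which is why matching on sound rather than spelling can recover the right item.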

Cheers,
Vikram


This is wild! Great work

Thank you @samthedataman!

Nice app, and a smart solution to the difficult task of mispronunciation, etc. Have you thought of wine label OCR as a possible solution too (similar to Vivino)?

Thanks for all the extra info… very handy.

Arvindra

Thanks!

Yes, we had thought of that as a solution.

It would likely have slowed things down in this case: storage rooms and cellars are poorly lit, taking a photo requires your hands, and the strings in the restaurant’s inventory systems draw on both the front and back labels.
