[Showcase] 20centAI: A Minimal AI Chat Client That Switches Models Mid-Chat (and Saves 90% Tokens)

Hi Streamlit Community!

I wanted to see if I could build a chat client that doesn’t just talk to one AI, but switches between Claude, Mistral, and DeepSeek on the fly – without losing context.

:cow: What is 20centAI?

It starts as a simple chat interface. But under the hood, it demonstrates three patterns for building robust AI integrations:

  • Provider Abstraction: Swap models mid-conversation (no code changes)
  • Rolling Compression: Older messages get summarized automatically (~90% token savings)
  • Explicit State: No Streamlit magic – everything is transparent and testable
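To make the provider-abstraction idea concrete, here’s a minimal sketch of how a Protocol-based interface could look. The names (`ChatProvider`, `EchoProvider`, `send`) are illustrative, not the actual identifiers from 20centAI; the toy echo provider stands in for a real API client so the sketch runs without keys:

```python
from typing import Protocol


class ChatProvider(Protocol):
    """Anything that can complete a chat turn satisfies this interface."""
    name: str

    def complete(self, messages: list[dict]) -> str:
        """Take the full message history, return the assistant reply."""
        ...


class EchoProvider:
    """Toy provider (hypothetical) so the sketch is self-contained."""
    name = "echo"

    def complete(self, messages: list[dict]) -> str:
        return f"echo: {messages[-1]['content']}"


def send(provider: ChatProvider, messages: list[dict]) -> str:
    # Switching models mid-chat is just passing a different provider object;
    # the message history belongs to the conversation, not the provider.
    return provider.complete(messages)
```

A Claude, Mistral, or DeepSeek client would each be its own class with the same `complete` signature, so swapping them is a one-line change in the UI.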

:hammer_and_wrench: The Tech Stack:

  • Streamlit: UI and session management
  • Python Protocols: Clean provider interface
  • Heuristic Compression: Simple but effective context management
  • ~600 lines total: Built to be read, not just run
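The rolling-compression heuristic could be sketched like this. Again the names (`compress_history`, `keep_last`) and the naive truncation-based summarizer are my assumptions for illustration – in the real app a cheap model call would produce the summary:

```python
def compress_history(messages: list[dict], keep_last: int = 4, summarize=None) -> list[dict]:
    """Replace everything but the newest turns with one summary message.

    `summarize` would normally call a cheap model; here a naive
    truncation stands in so the sketch runs without an API key.
    """
    if len(messages) <= keep_last:
        return messages
    old, recent = messages[:-keep_last], messages[-keep_last:]
    if summarize is None:
        summarize = lambda msgs: " / ".join(m["content"][:40] for m in msgs)
    summary = {
        "role": "system",
        "content": f"Summary of earlier conversation: {summarize(old)}",
    }
    return [summary] + recent
```

Sending `[summary] + recent` instead of the full history is where the token savings come from: the prompt stays a near-constant size no matter how long the chat runs.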

:new_moon: What I learned:

  • Context management is harder than it looks
  • Token costs add up fast – compression isn’t optional for long chats
  • Sometimes the simplest heuristic beats a complex algorithm

:link: Check it out:

:new_moon: The question for you:
If you were building a multi-provider chat app, what pattern would you try first?
Or: What’s the one thing you’d change about this approach?

No pressure – just curious what other Streamlit devs think.

P.S. If you find a bug: That’s not a feature. Please tell me.