GPT-4-omni-mini is incredible

I’ve been doing a lot of testing with the new GPT-4-omni-mini model, and I’m blown away. It’s hard to believe it’s cheaper than either GPT 3.5 Turbo or Claude 3 Haiku.

I updated all my toy Streamlit apps to use it, and they are all the better for it. Emily (https://emilytarot.com) is capable of having longer conversations, offers a more in-depth tarot session, and just feels a lot “smarter”. The stories generated at https://littlecattales.com definitely keep to the intended plot better, and it is certainly better at producing the stories without errors in general. I can’t notice any difference after just a few attempts at https://thetroublewithbridges.com, but it’s faster and cheaper at least.

I’ve been doing lots of testing of it within the context of a QA bot that must return reliable, accurate data as well, and it is performing really well under my particular conditions. Just a great model.

Anyone else tried it out yet?


I totally agree - for my use case it is just as fast and accurate as 4o. The most impressive part is the OpenAI API usage chart: my costs are 1/10th what they were with 4o.

As I’ve continued to build with and utilize omni mini, I just continue to be blown away by the performance for the price point. This is seriously opening a lot of doors for including more AI features where it would have been cost-prohibitive before.

It’s very, very good at function calling and instruction following. I’m currently prototyping a natural language data prep feature for a client, and the accuracy that 4o-mini achieves in function calling is quite striking, particularly when compared to 3.5 Turbo. I’m still early in the prototyping phase so I don’t have any quantitative assessments, but after a couple hundred manual tests I’ve only seen a handful of minor mistakes, even when dealing with pretty wide datasets and complex instructions requiring multiple function calls to achieve the requested data transformations. I spent considerably more on 7 requests to 4o than on 158 requests to 4o-mini.
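The post doesn’t show its actual tool definitions, so here’s a minimal sketch of the pattern being described: a tool schema in the shape the OpenAI chat completions API expects, plus a local dispatcher that executes the model’s parsed tool call against an in-memory dataset. The `filter_rows` tool and its parameters are hypothetical stand-ins for the data prep operations.

```python
import json

# Hypothetical tool schema in the shape the OpenAI chat completions API expects;
# filter_rows and its parameters are illustrative, not from the original post.
FILTER_ROWS_TOOL = {
    "type": "function",
    "function": {
        "name": "filter_rows",
        "description": "Keep only rows where `column` compares to `value`.",
        "parameters": {
            "type": "object",
            "properties": {
                "column": {"type": "string"},
                "op": {"type": "string", "enum": ["eq", "gt", "lt"]},
                "value": {},
            },
            "required": ["column", "op", "value"],
        },
    },
}

OPS = {"eq": lambda a, b: a == b, "gt": lambda a, b: a > b, "lt": lambda a, b: a < b}

def dispatch(tool_name: str, arguments_json: str, rows: list[dict]) -> list[dict]:
    """Execute a tool call the model returned against an in-memory dataset."""
    if tool_name != "filter_rows":
        raise ValueError(f"unknown tool: {tool_name}")
    args = json.loads(arguments_json)
    op = OPS[args["op"]]
    return [r for r in rows if op(r[args["column"]], args["value"])]

# Simulate what the model might emit for "show me rows where amount is over 100":
rows = [{"id": 1, "amount": 50}, {"id": 2, "amount": 150}]
result = dispatch("filter_rows", '{"column": "amount", "op": "gt", "value": 100}', rows)
```

Chaining several such calls in sequence is what the “multiple function calls per transformation” scenario above boils down to.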

This is still by far my favorite model. I have plenty of custom apps and tools using LLMs now, but 4o mini has proven itself to be the most capable and cost-efficient. Here are a few months of my usage on my note-taking / daily-logging app

I’ve built tools for it to do a number of things, and it’s just incredible at using tools to accomplish the tasks I set it. It pretty much exclusively handles my calendar “stuff” now. I’ll put in some example snippets of how I’m interacting with it and what it does for me regularly. I’ve read plenty of arguments about how LLMs cannot perform true reasoning, but it sure looks awfully close to me…
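The poster’s actual snippets aren’t included above, so here’s a hedged sketch of what the calendar tool-use loop might look like: a couple of hypothetical tools (`create_event`, `list_events`) backed by an in-memory list, and a small runner that applies a sequence of tool calls as a model would return them (name plus JSON-encoded arguments).

```python
import json

# Illustrative only: these calendar tools are hypothetical stand-ins for
# whatever the poster's real calendar integration exposes.
CALENDAR: list[dict] = []

def create_event(title: str, day: str) -> dict:
    """Add an event to the in-memory calendar and return it."""
    event = {"title": title, "day": day}
    CALENDAR.append(event)
    return event

def list_events(day: str) -> list[dict]:
    """Return all events on a given day."""
    return [e for e in CALENDAR if e["day"] == day]

TOOLS = {"create_event": create_event, "list_events": list_events}

def run_tool_calls(tool_calls: list[dict]) -> list:
    """Apply a sequence of tool calls in order and collect each result."""
    results = []
    for call in tool_calls:
        fn = TOOLS[call["name"]]
        results.append(fn(**json.loads(call["arguments"])))
    return results

# A multi-step request like "book a dentist visit on the 2nd, then show that day"
# might come back from the model as two chained calls:
out = run_tool_calls([
    {"name": "create_event", "arguments": '{"title": "Dentist", "day": "2025-05-02"}'},
    {"name": "list_events", "arguments": '{"day": "2025-05-02"}'},
])
```

The “reasoning-like” behavior described above is the model deciding which calls to chain and in what order; the code here only executes whatever sequence it picks.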


Another six months later and this is still by far my favorite and most used model. I’m doing millions of tokens a day now, and don’t get me wrong, I love Haiku 3.5, Sonnet 3.7, Mistral Small, GPT-4o, and several others, but GPT-4o mini is a task-following, tool-using BEAST and 100% my first choice.

We still deploy almost entirely GPT-4o into production, but getting whatever I’m building working with mini in the prototype stage has consistently translated into higher-quality responses from the more costly models than I got before tuning my prompts.

Streamlit is still my go-to prototyping tool, along with a locally running instance of Arize Phoenix for very rapid iteration on small tweaks to the prompts / tool wordings.

Wow, have you not tried Gemini 2.5 yet?

No, I haven’t. Most of my clients have only approved AWS Bedrock and OpenAI for deployments, so I haven’t ventured much beyond what they offer. However, looking at the pricing of Gemini 2.5, it doesn’t really seem to be in the same “class” of model:

Input price per million tokens
GPT-4o mini - $0.15 ($0.075 for cached inputs)
Gemini 2.5 Pro - $1.25

Output price per million tokens
GPT-4o mini - $0.60
Gemini 2.5 Pro - $10.00

Those are pretty dramatically different!
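For a sense of scale, a quick back-of-envelope calculation at the prices quoted above (the 1M-in / 1M-out workload is just an example, not from the thread):

```python
# USD per 1M tokens, taken from the prices quoted above.
PRICES = {
    "gpt-4o-mini": {"in": 0.15, "out": 0.60},
    "gemini-2.5-pro": {"in": 1.25, "out": 10.00},
}

def cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total cost in USD for a given token workload."""
    p = PRICES[model]
    return (input_tokens * p["in"] + output_tokens * p["out"]) / 1_000_000

# Example workload: 1M input tokens + 1M output tokens.
mini = cost("gpt-4o-mini", 1_000_000, 1_000_000)    # $0.75
pro = cost("gemini-2.5-pro", 1_000_000, 1_000_000)  # $11.25
ratio = pro / mini                                  # 15x
```

So on an output-heavy workload like this, the gap works out to roughly 15x.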

Have you had good success with Gemini?

Gemini 2.0 Flash is amazing for the price, and Flash 2.5 should be here soon, likely priced about the same ($0.10/million In, $0.40/million Out).