With the latest openai updates, you can now get streaming token counts at the end of the stream.
When you enable usage tracking in streaming, your last response includes the token count. For example, the following content shows the last two responses from a streaming with the usage enabled.
ChatCompletionChunk(id='chatcmpl-9W9qvyei3ygljRjHtBI2a4UNmC8bz',
choices=[Choice(delta=ChoiceDelta(content=None, function_call=None,
role=None, tool_calls=None), finish_reason='stop', index=0,
logprobs=None)], created=1717451397, model='gpt-4o-2024-05-13',
object='chat.completion.chunk', system_fingerprint='fp_319be4768e',
usage=None)
ChatCompletionChunk(id='chatcmpl-9W9qvyei3ygljRjHtBI2a4UNmC8bz',
choices=[], created=1717451397, model='gpt-4o-2024-05-13',
object='chat.completion.chunk', system_fingerprint='fp_319be4768e',
usage=CompletionUsage(completion_tokens=100, prompt_tokens=641,
total_tokens=741))
To enable it add a stream_options parameter to the chat.completions.create (not sure this would work if you are using the assistant call)
messages=messages,
stream=True,
stream_options={"include_usage": True},
This changes the previous fix I mentioned above to watch for an empty stream result… so I added another if nesting to identify the end of the stream and capture the usage results.
if "ChoiceDelta" in str(response):
if response.choices[0].delta.content:
full_response += response.choices[0].delta.content
if response.usage:
completion_tokens = response.usage.completion_tokens
prompt_tokens = response.usage.prompt_tokens
message_placeholder.write(full_response + "▌")
There is a max_retries and timeout option that I am looking forward to try. GitHub - openai/openai-python: The official Python library for the OpenAI API
Also, if you have not visited your openAI dev account recently, you can now make projects to organize your API keys. This is great because you can configure limits and budgets for keys in a project, give names that prepend the keys with test / prod, and select which models an API key can use. For me this helps organize costs better.
OpenAI changelogs
https://platform.openai.com/docs/changelog