OSError: It looks like the config file at 'roberta-base/config.json' is not a valid JSON file

I have been trying to deploy my Question Answering model on Streamlit Cloud.
As per the steps, I pushed the local model contents to GitHub in a folder named ‘roberta-base’, which contains the .bin model and config.json files.

It runs fine on my local machine, but on Streamlit it throws an error during inference.
Repo: Ayush1702/ESG-Question-Answering (github.com)

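import streamlit as st
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

@st.cache(allow_output_mutation=True)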
def question_answering():
    model_name = "/app/question-answering/roberta-base/"
    model = AutoModelForQuestionAnswering.from_pretrained(model_name)
    return model

if st.button('Submit'):
    context_input = st.session_state.context
    question_input = st.session_state.question_default
    with st.spinner('Loading Model'):
        my_model = question_answering()
    tokenizer_path = "/app/question-answering/roberta-base/"
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)
    question_answerer = pipeline("question-answering", model=my_model, tokenizer=tokenizer)
    result = question_answerer(question=question_input, context=context_input)
    st.write(HTML_WRAPPER.format(result['answer']), unsafe_allow_html=True)

>>> OSError: It looks like the config file at '/app/question-answering/roberta-base/config.json' is not a valid JSON file.

Hi @Ayush :wave:

Could you please link to your GitHub repo so we can help debug?


Here

I’d first suggest using relative paths instead of passing a GitHub URL as a file path. URLs are not valid file paths.

Before

model_name = "https://github.com/Ayush1702/ESG-Question-Answering/blob/main/roberta-base/"
tokenizer_path = "/app/esg-question-answering/roberta-base/"

After

model_name = "roberta-base/"
tokenizer_path = "roberta-base/"
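
Note that a relative path such as "roberta-base/" is resolved against the process's working directory, which may not be the same on a local run and in a deployed app. A minimal sketch of one way to sidestep that, assuming the model folder sits next to the app script (the helper name local_model_dir is hypothetical, not part of Streamlit or transformers):

```python
from pathlib import Path


def local_model_dir(script_file: str, folder: str = "roberta-base") -> str:
    """Return the absolute path of `folder` located next to `script_file`."""
    # Resolve the script's own location first, so the result does not
    # depend on the directory the app happens to be launched from.
    return str(Path(script_file).resolve().parent / folder)


# In the app script this would be called as:
#   model_name = local_model_dir(__file__)
#   tokenizer_path = local_model_dir(__file__)
```

The returned string can be passed to from_pretrained in place of the hard-coded path.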

Next, you likely need to fix your config.json file. Specifically, check the value of the _name_or_path key, whose current (and possibly incorrect) value is https://github.com/Ayush1702/ESG-Question-Answering/blob/main/roberta-base/.

If you’re still running into the error after the above changes, it may be an issue with Git LFS as described here:

In that case, you would have to store your model elsewhere (e.g. Dropbox, AWS, GCP) and download and cache it from your app.
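
One way to narrow this down is to inspect what the deployed app actually finds at the config path. The sketch below is only a diagnostic aid, not a confirmed fix: it assumes the file may have been checked out as a Git LFS pointer (a small text file whose first line starts with "version https://git-lfs.github.com/spec/v1") rather than as real JSON. The function name diagnose_config is hypothetical.

```python
import json
from pathlib import Path

# First line of a Git LFS pointer file, as defined by the Git LFS spec.
LFS_POINTER_PREFIX = "version https://git-lfs.github.com/spec/v1"


def diagnose_config(path: str) -> str:
    """Classify a config file as 'missing', 'lfs_pointer', 'invalid_json', or 'ok'."""
    p = Path(path)
    if not p.is_file():
        return "missing"
    text = p.read_text(encoding="utf-8", errors="replace")
    # An LFS pointer is plain text, so parsing it as JSON would fail.
    if text.startswith(LFS_POINTER_PREFIX):
        return "lfs_pointer"
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return "invalid_json"
    return "ok"
```

Calling st.write(diagnose_config("roberta-base/config.json")) near the top of the app would show which case applies in the deployed environment.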

I modified the paths, but nothing changed, and the config file is fine too.
Git LFS should not be the issue: the model is ~500MB, which is under the 1GB limit.

I’m not sure what else to troubleshoot. Could you please check by cloning the repo?

I’ve raised this issue with our team. It may or may not be an issue with Git LFS; we’re not certain :confused:


Hi Snehan,

Any luck?
Could you try checking with Python 3.7? I’m unable to change the Python version from the settings.

The Python version is likely not an issue. To change Python versions, you will have to delete the app, and select the appropriate Python version from the Advanced settings modal before you re-deploy your app:

We still don’t have an update on the Git LFS issue.