Hello good people,
I have a specific use case where I need the user to upload xls files that might use one of two different encodings.
Unfortunately, using encoding = “auto” doesn’t do the trick with one of them, so I would like to try opening the file with one encoding and then move on to the next one if the first fails. Right now the only workaround I could think of is the following:
I think your try/except solution could work structured a little differently. Could the file_uploader widget be moved out of the try block, and instead you take those bytes and try to convert the Excel file? Meaning, take the file upload in an encoding that covers both the GB18030 and utf-16-le encoding ranges (UTF-8?), then convert to each of the encodings you might get and see if it gives you the right answer?
If I do that, the upload doesn’t give any error, and if the first try is successful I get my data and everything works.
The problem is when the right encoding is the second one (‘gb18030’ in the example above). When that is the case, I get this error:
“EmptyDataError: No columns to parse from file”
Then, if I try ‘gb18030’ first, everything works again.
It is almost like after the first attempt the variable uploaded_file is lost somehow (I’m sure this is not the right technical explanation).
Do I need somehow to cache the uploaded_file in order to try several things after?
Thanks again
F.
Yes, this is what I meant, and it looks like you are close. I think what’s happening here is that file_uploader returns a BytesIO buffer, which in most cases functions the same way as having the file itself. The one difference is that once you read the buffer, its read position sits at the end, so a second read returns nothing. That would explain why the second attempt sees an empty file.
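To make that concrete, here is a small sketch of the buffer behaviour using a plain io.BytesIO as a stand-in for the uploaded file (the sample bytes and variable names are made up for illustration, and I'm assuming the uploader's return value behaves like BytesIO here):

```python
import io

# Hypothetical sample content; a real upload would come from file_uploader.
data = b"col_a\tcol_b\n1\t2\n"
uploaded_file = io.BytesIO(data)  # stand-in for the uploader's return value

first_read = uploaded_file.read()   # consumes the buffer, position moves to the end
second_read = uploaded_file.read()  # nothing left to read from the current position

uploaded_file.seek(0)               # rewind to the start of the buffer
third_read = uploaded_file.read()   # the full content is available again

print(len(first_read), len(second_read), len(third_read))
```

So rewinding with seek(0) before the second attempt is another possible way around the problem, alongside keeping a copy of the bytes.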
Try putting a statement like file_bytes = uploaded_file.read() after the uploaded_file line, then try to read the file_bytes object instead. My theory here is that file_bytes will be a bytestring, and that will persist across the try/except block.
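Roughly, the idea could look like the sketch below. It is only an illustration: parse_with_fallback is a hypothetical helper name, the tab delimiter is a guess about the file layout, and I'm using the standard csv module rather than your actual reader. The point is that file_bytes is an immutable bytes object with no read position, so every attempt starts from the full content.

```python
import csv
import io


def parse_with_fallback(file_bytes, encodings=("utf-16-le", "gb18030"), delimiter="\t"):
    """Return (rows, encoding) for the first encoding that decodes file_bytes."""
    last_error = None
    for encoding in encodings:
        try:
            # bytes have no read position, so each attempt sees all the data
            text = file_bytes.decode(encoding)
        except UnicodeDecodeError as exc:
            last_error = exc
            continue  # move on to the next candidate encoding
        reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
        return list(reader), encoding
    raise ValueError(f"none of {encodings} could decode the file") from last_error


# Usage (assumed names): read the upload once, then try the encodings.
# file_bytes = uploaded_file.read()
# rows, used_encoding = parse_with_fallback(file_bytes)
```

One caveat: a wrong encoding does not always raise an error, it can also decode into garbage, so it is worth checking the parsed result as well. If you stay with a reader that expects a file-like object, wrapping the bytes in a fresh io.BytesIO(file_bytes) for each attempt should follow the same pattern.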