Reading binary file using numpy in Streamlit

I have the following code snippet that works perfectly well in plain Python. However, when I try to upload the binary file through Streamlit, I can’t get it to work. Here is the working Python code:

import numpy as np
import pandas as pd

def read_bin():
    # Each record holds four float64 ('d') fields.
    dt = np.dtype([('col1','d'),('col2','d'),
                   ('col3','d'),('col4','d')])
    data = np.fromfile('Files/Bin Files/myfile.bin',dtype=dt,sep='')
    df = pd.DataFrame(data)
    return df

Now I want the user to upload the binary file and run the same operation through the Streamlit interface. Here is the code that doesn’t work for me:

if options == ls:
    st.sidebar.title('Upload Binary File')
    bin_file = st.sidebar.file_uploader('Upload File', key = 'ls')
    if bin_file:
        st.sidebar.success('The file was uploaded successfully!', icon="✅")     
 
def read_bin():
     dt = np.dtype([('col1','d'),('col2','d'),
                 ('col3','d'),('col4','d')])
     data = np.fromfile(bin_file,dtype=dt,sep='')
     df = pd.DataFrame(data)
     return df

if options == ls:
    if bin_file:
        displayBin = st.checkbox('Display File')
        if displayBin:
            df = read_bin()
            st.write(df)

So, basically, instead of passing the path to the folder where the bin file is located, I am passing the uploaded file object to np.fromfile. But it doesn’t seem to work for me.
I’d appreciate any help.

Hello @serdar_bay, you might try creating a NamedTemporaryFile from the returned value of the file_uploader, and passing the filename from that to np.fromfile instead of passing bin_file itself. You can see this post for an example.

Hi @blackary - Thanks for the feedback. I tried what you suggested, but it only returned an empty df with the header names. I guess I am not passing the arguments to read_bin() properly, so the function cannot reference the uploaded file. Could you please have a look at my code below and let me know if you spot what I might be doing wrong?

def read_bin(file):
     dt = np.dtype([('col1','d'),('col2','d'),
                 ('col3','d'),('col4','d')])               

     with NamedTemporaryFile(dir='.',suffix='.bin') as f:
        f.write(file.getbuffer())      
        data = np.fromfile(f.name, dtype=dt, sep='')
        df = pd.DataFrame(data)
        return df

if options == ls:
    if bin_file:
        displayBin = st.checkbox('Display File')
        if displayBin:
            df = read_bin(bin_file)
            st.write(df)

I also tried to use the shutil library, as suggested in this post, and it returned an empty df as well. Please see my code below, where I attempted to use shutil.

def read_bin(fl):  

    dt = np.dtype([('col1','d'),('col2','d'),
                 ('col3','d'),('col4','d')])   
    
    with open('par.bin', 'wb') as buffer:
        shutil.copyfileobj(fl, buffer)
        data = np.fromfile('par.bin', dtype=dt, sep='')
        df = pd.DataFrame(data)
        return df

if options == ls:
    if bin_file:
        displayBin = st.checkbox('Display File')
        if displayBin:
            df = read_bin(bin_file)
            st.write(df)
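
A possible explanation for the empty DataFrame in both attempts above (an assumption, not confirmed in the thread): np.fromfile reopens the file by name while the writing handle is still open, so for small files the bytes may still be sitting in Python's write buffer and not yet on disk. The sketch below closes the temporary file before reading it. The io.BytesIO object is a hypothetical stand-in for the uploaded file, and read_bin_via_tempfile is an illustrative name.

```python
import io
import os
import tempfile

import numpy as np

# Record layout from the thread: four float64 columns.
dt = np.dtype([('col1', 'd'), ('col2', 'd'), ('col3', 'd'), ('col4', 'd')])


def read_bin_via_tempfile(uploaded):
    """Copy an in-memory upload to disk, then read it with np.fromfile."""
    # delete=False lets the file be reopened by name after it is closed.
    with tempfile.NamedTemporaryFile(suffix='.bin', delete=False) as f:
        f.write(uploaded.getbuffer())
    # Leaving the with-block closes the file, which flushes the bytes to disk.
    try:
        return np.fromfile(f.name, dtype=dt)
    finally:
        os.remove(f.name)


# Hypothetical stand-in for the uploaded file: three records in a BytesIO.
records = np.zeros(3, dtype=dt)
records['col1'] = [1.0, 2.0, 3.0]
fake_upload = io.BytesIO(records.tobytes())
result = read_bin_via_tempfile(fake_upload)
```

Calling f.flush() before np.fromfile inside the with-block would be another way to get the same effect on platforms that allow the file to be opened twice.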

Also, is there a way to read bin files directly from the uploaded file, just like we can read CSV or other file types, without creating a temporary file?

Ah, good point. Yes, you should be able to do it without any temporary files, and in fact you can if you use np.load instead of np.fromfile. Here is a simplified version of your script that works fine for me if I upload a npy file:


def read_bin(f):
    data = np.load(f)
    df = pd.DataFrame(data)
    return df

bin_file = st.file_uploader("Upload binary file", type="npy")
if bin_file is not None:
    displayBin = st.checkbox("Display File")
    if displayBin:
        df = read_bin(bin_file)
        st.write(df)

@blackary thanks!
I am actually trying to have the user upload a bin file, not an npy file. I can’t seem to make this work for a bin file.

I tried @blackary’s code, just removing the type="npy" restriction, and uploaded a bin file. It read and displayed fine for me. Do you have an example bin file that you are having trouble reading? (Or a few lines of code to create a bin file that represents the way you have it formatted?) For my test I just created a bin file in Python by opening a file in wb mode, and I didn’t have trouble reading it when I fed it into the file uploader.
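
For anyone wanting to reproduce the problem, here is a minimal sketch of how a test bin file matching the dtype from the original post could be created. It assumes the file is a headerless dump of fixed-size records, which is what np.fromfile expects; the file name sample.bin and the values are made up for illustration.

```python
import os
import tempfile

import numpy as np

# Record layout from the original post: four float64 columns.
dt = np.dtype([('col1', 'd'), ('col2', 'd'), ('col3', 'd'), ('col4', 'd')])

# Build five illustrative records.
sample = np.zeros(5, dtype=dt)
sample['col1'] = np.arange(5)
sample['col2'] = np.arange(5) * 0.5

# tofile writes the raw bytes only: no header, no dtype information.
path = os.path.join(tempfile.mkdtemp(), 'sample.bin')
sample.tofile(path)

file_size = os.path.getsize(path)
round_trip = np.fromfile(path, dtype=dt)
```

A file written this way is not in the npy format, so it carries none of the header information that np.load relies on.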

Aha. I’m getting messages about a pickled binary. Playing around, I think I got it to work by using a buffer. With the given dt you posted, see if this works for you:

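import numpy as np
import pandas as pd
import streamlit as st

# Record layout from the original post: four float64 columns.
dt = np.dtype([('col1','d'),('col2','d'),('col3','d'),('col4','d')])

bin_file = st.file_uploader("Upload binary file", type='bin')
if bin_file is not None:
    displayBin = st.checkbox("Display File")
    if displayBin:
        try:
            # Interpret the uploaded bytes in memory; no temporary file needed.
            data = np.frombuffer(bin_file.getbuffer(),dtype=dt)
            st.write(pd.DataFrame(data))
        except ValueError:
            st.write('Could not read bin.')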

Thanks @mathcatsand - It works well!

Just wondering whether np.frombuffer can give me the same functionality as np.fromfile. For example, I also have to read very large binary files in chunks, and the way I do it now is with the count and offset parameters of np.fromfile.

numpy.fromfile(file, dtype=float, count=-1, sep='', offset=0, *, like=None)

As I understand it, np.frombuffer will not accept these offset and count parameters. Any suggestion on how to read large binary files in chunks without using np.fromfile?

numpy.fromfile doc
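
For reference, the chunked np.fromfile pattern described above might look like the sketch below. The helper name iter_file_chunks, the chunk size, and the demo file are assumptions for illustration, not the poster's actual code.

```python
import os
import tempfile

import numpy as np

# Record layout from the thread: four float64 columns.
dt = np.dtype([('col1', 'd'), ('col2', 'd'), ('col3', 'd'), ('col4', 'd')])


def iter_file_chunks(path, dtype, records_per_chunk):
    """Yield successive arrays of at most records_per_chunk records."""
    total = os.path.getsize(path) // dtype.itemsize
    for start in range(0, total, records_per_chunk):
        count = min(records_per_chunk, total - start)
        # offset is given in bytes, count in records (items of dtype).
        yield np.fromfile(path, dtype=dtype, count=count,
                          offset=start * dtype.itemsize)


# Hypothetical demo file with 10 records.
demo = np.zeros(10, dtype=dt)
demo['col1'] = np.arange(10)
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
demo.tofile(demo_path)

chunk_lengths = [len(c) for c in iter_file_chunks(demo_path, dt, 4)]
```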

frombuffer also has count and offset optional keyword arguments, so I’d imagine it’s a pretty straightforward adaptation. Let me know if you have a problem, though.

https://numpy.org/doc/stable/reference/generated/numpy.frombuffer.html
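
As a rough sketch of that adaptation, the helper below walks over an in-memory buffer in chunks with np.frombuffer. The name iter_buffer_chunks and the chunk size are illustrative, and the io.BytesIO object is a hypothetical stand-in for the uploaded file. Note that an uploaded file is already held fully in memory, so chunking here limits how much is processed at once, not how much is loaded.

```python
import io

import numpy as np

# Record layout from the thread: four float64 columns.
dt = np.dtype([('col1', 'd'), ('col2', 'd'), ('col3', 'd'), ('col4', 'd')])


def iter_buffer_chunks(buf, dtype, records_per_chunk):
    """Yield successive views of at most records_per_chunk records."""
    view = memoryview(buf)
    total = view.nbytes // dtype.itemsize
    for start in range(0, total, records_per_chunk):
        count = min(records_per_chunk, total - start)
        # As with np.fromfile, offset is in bytes and count is in records.
        yield np.frombuffer(view, dtype=dtype, count=count,
                            offset=start * dtype.itemsize)


# Hypothetical stand-in for the uploaded file: 10 records in a BytesIO.
demo = np.zeros(10, dtype=dt)
demo['col1'] = np.arange(10)
fake_upload = io.BytesIO(demo.tobytes())

chunks = list(iter_buffer_chunks(fake_upload.getbuffer(), dt, 4))
chunk_lengths = [len(c) for c in chunks]
col1_values = np.concatenate([c['col1'] for c in chunks]).tolist()
```

The arrays returned by np.frombuffer are views onto the original buffer, so no bytes are copied until a chunk is converted, for example with pd.DataFrame.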

Thanks @mathcatsand. I appreciate it. 🙂