Reading binary file using numpy in Streamlit

serdar_bay · January 5, 2023, 2:16pm

I have the following code snippet that works perfectly well in plain Python. However, I am trying to use Streamlit to upload the binary files, and I can’t seem to make it work. Here is the working code in Python:

import numpy as np
import pandas as pd

def read_bin():
    # Each record holds four float64 ('d') fields
    dt = np.dtype([('col1','d'),('col2','d'),
                   ('col3','d'),('col4','d')])
    data = np.fromfile('Files/Bin Files/myfile.bin',dtype=dt,sep='')
    df = pd.DataFrame(data)
    return df

Now, I want the user to upload the binary file and perform the operation through the Streamlit interface. Here is the code that doesn’t work for me:

if options == ls:
    st.sidebar.title('Upload Binary File')
    bin_file = st.sidebar.file_uploader('Upload File', key = 'ls')
    if bin_file:
        st.sidebar.success('The file was uploaded successfully!', icon="✅")     
 
def read_bin():
    dt = np.dtype([('col1','d'),('col2','d'),
                   ('col3','d'),('col4','d')])
    data = np.fromfile(bin_file,dtype=dt,sep='')
    df = pd.DataFrame(data)
    return df

if options == ls:
    if bin_file:
        displayBin = st.checkbox('Display File')
        if displayBin:
            df = read_bin()
            st.write(df)

So, basically, instead of showing the path to my folder where the bin file is located, I am showing the path to the uploaded file. But it doesn’t seem to work for me.
Appreciate any help.

blackary · January 5, 2023, 2:47pm

Hello @serdar_bay, you might try creating a NamedTemporaryFile from the returned value of the file_uploader, and passing the filename from that to np.fromfile instead of passing bin_file itself. You can see this post for an example.

serdar_bay · January 6, 2023, 10:49am

Hi @blackary - Thanks for the feedback. I tried as you suggested, but it only returned an empty df with the header names. I guess I am not passing the arguments to read_bin() properly, so it cannot reference the uploaded file. Could you please have a look at my code below and let me know if you spot what I might be doing wrong?

def read_bin(file):
     dt = np.dtype([('col1','d'),('col2','d'),
                 ('col3','d'),('col4','d')])               

     with NamedTemporaryFile(dir='.',suffix='.bin') as f:
        f.write(file.getbuffer())      
        data = np.fromfile(f.name, dtype=dt, sep='')
        df = pd.DataFrame(data)
        return df

if options == ls:
    if bin_file:
        displayBin = st.checkbox('Display File')
        if displayBin:
            df = read_bin(bin_file)
            st.write(df)

I also tried the shutil library, as suggested in this post, and it returned an empty df as well. Please see my code below, where I attempted to use shutil.

def read_bin(fl):  

    dt = np.dtype([('col1','d'),('col2','d'),
                 ('col3','d'),('col4','d')])   
    
    with open('par.bin', 'wb') as buffer:
        shutil.copyfileobj(fl, buffer)
        data = np.fromfile('par.bin', dtype=dt, sep='')
        df = pd.DataFrame(data)
        return df

if options == ls:
    if bin_file:
        displayBin = st.checkbox('Display File')
        if displayBin:
            df = read_bin(bin_file)
            st.write(df)

Also, is there no way to read bin files from the uploaded_files directly just like we can read csv or other file types without the need for creating a temporary file in buffer?
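One possible explanation for the empty DataFrame in both attempts above: np.fromfile opens the path on its own while the writing handle is still open, so bytes still sitting in Python's write buffer may not have reached the disk yet. The sketch below assumes that is the cause and closes the file before reading it back. The helper name read_bin_via_tempfile and the file name upload.bin are hypothetical, and io.BytesIO stands in for the uploaded file (the only assumption about that object is that it has getbuffer(), as used above).

```python
import io
import os
import tempfile

import numpy as np

# Same record layout as in the posts above: four float64 fields.
DT = np.dtype([('col1', 'd'), ('col2', 'd'), ('col3', 'd'), ('col4', 'd')])


def read_bin_via_tempfile(uploaded, dtype=DT):
    """Write the uploaded bytes to disk, close the file, then read it by path.

    `uploaded` is any object exposing getbuffer(), e.g. io.BytesIO.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, 'upload.bin')  # hypothetical file name
        with open(path, 'wb') as f:
            f.write(uploaded.getbuffer())
        # The inner `with` has exited, so the file is closed and flushed
        # before np.fromfile opens it again by name.
        data = np.fromfile(path, dtype=dtype)
    return data


# Stand-in for an uploaded file: three records built in memory.
sample = np.arange(12, dtype='d').view(DT)
records = read_bin_via_tempfile(io.BytesIO(sample.tobytes()))
```

The returned structured array can be passed to pd.DataFrame(...) as in the original read_bin(). Calling f.flush() before np.fromfile inside the original NamedTemporaryFile block may also work, but reopening a NamedTemporaryFile by name is platform-dependent, which is why this sketch uses a plain file in a temporary directory.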

blackary · January 6, 2023, 2:27pm

Ah, good point. Yes, you should be able to do it without any temporary files, and in fact you can if you use np.load instead of np.fromfile. Here is a simplified version of your script that works fine for me if I upload a npy file:


def read_bin(f):
    data = np.load(f)
    df = pd.DataFrame(data)
    return df

bin_file = st.file_uploader("Upload binary file", type="npy")
if bin_file is not None:
    displayBin = st.checkbox("Display File")
    if displayBin:
        df = read_bin(bin_file)
        st.write(df)

serdar_bay · January 6, 2023, 4:51pm

@blackary thanks!
I am actually trying to have a user upload a bin file, not an npy file. I can’t seem to make this work for reading a bin file.

mathcatsand · January 7, 2023, 10:35pm

I tried @blackary’s code, just removing the type="npy" restriction, and uploaded a bin file. It read and displayed fine for me. Do you have an example bin file that you are having trouble reading? (Or a few lines of code to create a bin file that represents the way you have it formatted?) For my test I just created a bin file in Python using a file opened in wb mode, and I didn’t have trouble reading it when I fed it into the file uploader.
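For anyone who wants to reproduce the problem, a sample file in the layout described earlier in the thread could be generated along these lines. This is only a sketch: it assumes the file is a headerless sequence of records with four float64 fields, and the helper name write_sample_bin, the file name sample.bin, and the values written are all made up for illustration.

```python
import os
import tempfile

import numpy as np

# Record layout taken from the dtype posted earlier in the thread.
DT = np.dtype([('col1', 'd'), ('col2', 'd'), ('col3', 'd'), ('col4', 'd')])


def write_sample_bin(path, n_rows=5):
    """Write n_rows raw records (no header) to `path` and return them."""
    records = np.zeros(n_rows, dtype=DT)
    for i, name in enumerate(DT.names):
        # Arbitrary but recognizable values: column i starts at 10 * i.
        records[name] = np.arange(n_rows) + 10 * i
    records.tofile(path)  # raw bytes only, unlike np.save
    return records


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample.bin')  # hypothetical name
    written = write_sample_bin(sample_path)
    file_size = os.path.getsize(sample_path)
    read_back = np.fromfile(sample_path, dtype=DT)
```

A file written this way has no header, so np.load is not expected to parse it; it needs np.fromfile or np.frombuffer together with the matching dtype.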

mathcatsand · January 8, 2023, 11:56am

Aha. I’m getting messages about a pickled binary. Playing around, I think I got it to work by using a buffer. With the given dt you posted, see if this works for you:

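import numpy as np
import streamlit as st

# Record layout from the earlier posts
dt = np.dtype([('col1','d'),('col2','d'),
               ('col3','d'),('col4','d')])

bin_file = st.file_uploader("Upload binary file", type='bin')
if bin_file is not None:
    displayBin = st.checkbox("Display File")
    if displayBin:
        try:
            # Interpret the uploaded bytes in memory; no temporary file needed
            df = np.frombuffer(bin_file.getbuffer(),dtype=dt)
            st.write(df)
        except ValueError:
            # Raised when the byte count is not a multiple of the record size
            st.write('Could not read bin.')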

serdar_bay · January 8, 2023, 2:01pm

Thanks @mathcatsand - It works well!

Just wondering whether np.frombuffer can give me the same functionality I get from np.fromfile. For example, I also have to read very large binary files in chunks, and the way I do it now is by using the count and offset parameters of np.fromfile.

numpy.fromfile(file, dtype=float, count=-1, sep='', offset=0, *, like=None)

As I understand it, np.frombuffer will not accept these offset and count parameters. Any suggestion on how to read large binary files in chunks without using np.fromfile?

numpy.fromfile doc

mathcatsand · January 8, 2023, 3:42pm

frombuffer also has count and offset optional keyword arguments, so I’d imagine it’s a pretty straightforward adaptation. Let me know if you have a problem, though.

https://numpy.org/doc/stable/reference/generated/numpy.frombuffer.html
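A minimal sketch of that adaptation, assuming the whole upload is already in memory as a buffer: count is given in records and offset in bytes, so the offset of each chunk is its starting row multiplied by the record size. The generator name iter_chunks and the chunk size are hypothetical, and io.BytesIO stands in for the uploaded file.

```python
import io

import numpy as np

# Record layout from the earlier posts: four float64 fields per row.
DT = np.dtype([('col1', 'd'), ('col2', 'd'), ('col3', 'd'), ('col4', 'd')])


def iter_chunks(buffer, dtype, rows_per_chunk):
    """Yield arrays of at most rows_per_chunk records from a bytes-like buffer."""
    view = memoryview(buffer)
    # Any trailing bytes that do not form a full record are ignored.
    total_rows = view.nbytes // dtype.itemsize
    for start in range(0, total_rows, rows_per_chunk):
        count = min(rows_per_chunk, total_rows - start)
        yield np.frombuffer(
            view,
            dtype=dtype,
            count=count,                      # number of records to read
            offset=start * dtype.itemsize,    # byte position of the first record
        )


# Stand-in for bin_file.getbuffer(): ten records built in memory.
sample = np.arange(40, dtype='d').view(DT)
uploaded = io.BytesIO(sample.tobytes())
chunks = list(iter_chunks(uploaded.getbuffer(), DT, rows_per_chunk=4))
```

Each chunk is a view onto the uploaded buffer rather than a copy, so this keeps the processing incremental but does not reduce the memory already used by the upload itself.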

serdar_bay · January 9, 2023, 9:24am

Thanks @mathcatsand. I appreciate it.

system · January 9, 2024, 9:24am

This topic was automatically closed 365 days after the last reply. New replies are no longer allowed.
