How to monitor the filesystem and have streamlit updated when some files are modified?

Hi all.
First, thanks a lot for streamlit. I agree with everybody else, it’s awesome.

I’m using watchdog to monitor filesystem events, and I’m struggling to get the events triggered by watchdog to update the streamlit widgets.

I have a standard Watchdog() class:

class Watchdog(FileSystemEventHandler):

    def __init__(self):
        self.last_modified = dt.datetime.now()

    def hook(self, hook):
        self.hook = hook

    def on_modified(self, event):
        if dt.datetime.now() - self.last_modified < dt.timedelta(seconds=1):
            print(event)
            self.hook()
            return
        else:
            self.last_modified = dt.datetime.now()
        print(f'Event type: {event}  path : {event.src_path}')
        print(event.is_directory) # This attribute is also available

And I have the following two methods, which belong to a Dashboard() class in my code: one starts the watchdog when needed, and the other is called when the watchdog detects a modification to the filesystem:

def _monitor(self):
    watchdog = Watchdog()
    watchdog.hook(self._on_modified)
    observer = Observer()
    observer.schedule(watchdog, path='.', recursive=False)
    observer.start()

def _on_modified(self):
    print('From Dash, modified', self.IDNumber, self.page, self.show_statistics)
    self.show_statistics = True
    st.sidebar.markdown(self.IDNumber)
    self.IDNumber += 1

It works well: every time I modify a file in the directory monitored by the watchdog, I get the “From Dash, modified” message printed on stdout, with the values of IDNumber, page and show_statistics.

Which means that the Dashboard._on_modified() method is triggered by the Watchdog.on_modified() one, and that’s what I wanted.
But none of the streamlit widgets are updated, and I’m struggling to find out how to do that. For instance, the value of self.IDNumber is incremented as expected (I see that on stdout), but the streamlit server does not seem to see that the value has changed, so nothing in my browser is updated.
Any help appreciated.
Thanks!

It seems that there were a few things missing in my code, particularly in the way I manage the watchdog.
If I add observer.join() in the _monitor() method, the streamlit server is kept in a perpetual state of running, which seems to indicate that things are indeed constantly changing and that streamlit is aware of it.
That’s not what I want to achieve, so I’ll keep digging to find a way to have both the watchdog and the streamlit server work together.
If anyone has suggestions on how to do it, I’m happy to read them (I did not find anything in the documentation, but maybe streamlit already has a builtin watchdog system?).
Thanks.
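
A possible explanation for the perpetual "running" state, offered as an assumption rather than a diagnosis: watchdog's Observer is a thread, and join() without a timeout blocks the caller until that thread ends, so the script would never reach its last line. The sketch below shows that blocking behaviour with a plain threading.Thread standing in for the Observer. The stand-in thread and the timeouts are only there so the snippet terminates.

```python
import threading

stop = threading.Event()

# Stand-in for a started Observer: a thread that runs until told to stop.
worker = threading.Thread(target=stop.wait, daemon=True)
worker.start()

# join() with no timeout would block here for as long as the thread runs.
# A timeout is used so this demo can continue.
worker.join(timeout=0.2)
still_running = worker.is_alive()

# Only once the thread is told to stop does join() return with it finished.
stop.set()
worker.join(timeout=1)
finished = not worker.is_alive()

print(still_running, finished)
```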

Hey @EricDepagne, welcome to Streamlit!

A few things to know about Streamlit’s execution model (apologies if any of this is already obvious):

  • To render your app to the browser, Streamlit runs your Python script from top to bottom. Various st.foo() function calls cause messages to be sent to the browser, which then draws widgets and text and whatnot.
  • When Streamlit detects that your app needs to update, it reruns your script from top-to-bottom again.
  • There are two situations that cause Streamlit to rerun your script:
    • The script itself (or any module it imports or transitively depends on) changes on disk.
    • The user interacts with a widget in the browser.

This is all to say, Streamlit does not consider your app to be “running” once it finishes its top-to-bottom execution. In your case, the Watchdog thread you spawned was still running in the Streamlit Python process, but Streamlit’s script-runner code was no longer executing your app (and in fact, this probably results in a leak, since that Watchdog thread is still watching files, but those callbacks aren’t doing anything.)

(Note also that if your app is unconditionally creating a watchdog.Observer instance, a new Observer will be created each time the app is re-run.)
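
To make that execution model concrete, here is a toy simulation in plain Python. It is not how Streamlit is implemented; ToyRunner, script, state and on_modified are hypothetical names made up for illustration. It only shows why a value changed by a background callback stays invisible until the script is run again.

```python
import threading


class ToyRunner:
    """Toy stand-in for a script runner: the 'browser' only ever shows
    what the most recent top-to-bottom run of the script wrote."""

    def __init__(self, script):
        self.script = script
        self.browser = []

    def rerun(self):
        output = []
        self.script(output.append)  # run the script top to bottom
        self.browser = output       # replace what the browser displays


state = {"id_number": 0}


def script(write):
    write(f"ID: {state['id_number']}")


def on_modified():
    # Runs on a background thread, outside of any script run.
    state["id_number"] += 1


runner = ToyRunner(script)
runner.rerun()

worker = threading.Thread(target=on_modified)
worker.start()
worker.join()

stale = list(runner.browser)  # state changed, but nothing was redrawn
runner.rerun()
fresh = list(runner.browser)  # the new value appears only after a rerun
print(stale, fresh)
```

In this toy model the callback does its job (the counter goes up), but the displayed text is only refreshed by the second rerun() call, which mirrors the situation described in the question.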

What I think you want to do here is:

  1. Create a watchdog.Observer only once (and not each time Streamlit re-runs the app).
  2. Have that Observer trigger re-runs of your app.

#1 is achievable with some minor abuse of @st.cache, which lets you run a piece of code only once, rather than every time your app is re-run:

@st.cache
def install_monitor():
    watchdog = Watchdog()
    # watchdog.hook = ... <-- We'll deal with this next
    observer = Observer()
    observer.schedule(watchdog, path='.', recursive=False)
    observer.start()
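
The "run only once" effect can be seen in isolation with the sketch below. It uses the standard library's functools.lru_cache as a stand-in for @st.cache, purely as an assumption so that the snippet runs without Streamlit; the two decorators are not the same thing. The run_script function and the started_threads list are hypothetical names for illustration.

```python
import functools
import threading

started_threads = []


@functools.lru_cache(maxsize=None)
def install_monitor():
    # The body runs on the first call only; later calls return the cached value.
    stop = threading.Event()
    thread = threading.Thread(target=stop.wait, daemon=True)
    thread.start()
    started_threads.append(thread)
    return stop


def run_script():
    # Stand-in for one top-to-bottom run of the app script.
    install_monitor()


for _ in range(3):
    run_script()

thread_count = len(started_threads)  # one thread, despite three "reruns"

# Clean up: the cached Event lets us stop the background thread.
install_monitor().set()
started_threads[0].join(timeout=1)
print(thread_count)
```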

#2 is achievable with some major abuse of Streamlit’s rerun logic, which uses Watchdog under the hood to detect when source files have changed and your app should be re-run. As we say in New England, this is wicked sketchy and subject to break. Basically, if you create a dummy module (let’s just say it’s called dummy.py), and import that module from your Streamlit app:

import streamlit as st
import dummy
# ... rest of your app

Then, if the dummy module is modified, Streamlit will re-run your app script, because your app imports dummy. So if your on_modified hook rewrites the contents of dummy.py when a watched file changes, it will trigger a rerun.

Here’s a working example:

import datetime as dt

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

import reload_test.dummy
import streamlit as st


class Watchdog(FileSystemEventHandler):
    def __init__(self, hook):
        self.hook = hook

    def on_modified(self, event):
        self.hook()


def update_dummy_module():
    # Rewrite the dummy.py module. Because this script imports dummy,
    # modifying dummy.py will cause Streamlit to rerun this script.
    dummy_path = reload_test.dummy.__file__
    with open(dummy_path, "w") as fp:
        fp.write(f'timestamp = "{dt.datetime.now()}"')


@st.cache
def install_monitor():
    # Because we use st.cache, this code will be executed only once,
    # so we won't get a new Watchdog thread each time the script runs.
    observer = Observer()
    observer.schedule(
        Watchdog(update_dummy_module),
        path="reload_test/data",
        recursive=False)
    observer.start()


install_monitor()
st.write("data file updated!", dt.datetime.now())

This is the directory structure I’m using for the above app:

reload_test/
  data/  # <-- modifying a file in here will trigger a rerun
  __init__.py
  app.py
  dummy.py

If you edit any file inside the data/ directory, the app’s Watchdog instance will notice that and trigger the update_dummy_module callback. That function will then rewrite dummy.py (I have it just assigning the current timestamp to a dummy variable, so that the contents of dummy.py are different each time the rewrite is triggered). Then Streamlit will notice that dummy.py has been updated, and since your app imports that module, your app will be rerun.
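
The rewrite step can be tried on its own, without Streamlit or watchdog. The sketch below writes to a temporary file rather than a real dummy.py, and rewrite_dummy is a hypothetical helper name. The extra revision counter is my own addition, not part of the example above: it is there so that two rewrites landing in the same clock tick still produce different file contents.

```python
import datetime as dt
import itertools
import tempfile
from pathlib import Path

_revision = itertools.count(1)


def rewrite_dummy(dummy_path):
    """Overwrite the dummy module so that its contents change on every call."""
    stamp = dt.datetime.now().isoformat()
    revision = next(_revision)
    Path(dummy_path).write_text(
        f'timestamp = "{stamp}"\nrevision = {revision}\n'
    )


with tempfile.TemporaryDirectory() as tmp:
    dummy = Path(tmp) / "dummy.py"
    rewrite_dummy(dummy)
    first = dummy.read_text()
    rewrite_dummy(dummy)
    second = dummy.read_text()

contents_differ = first != second
print(contents_differ)
```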

All that said, we have tentative plans to allow triggering re-runs in a less hacky way. As you can tell, this sort of thing isn’t a use-case we were anticipating at launch!

Hi @tim

I expect you will get requests for all kinds of use cases that you did not anticipate at launch, because Streamlit is so powerful and simple to use. :) And it’s not just for ML. Streamlit lowers the barrier to entry so much for creating apps in Python in general.

I have been trying out some of the alternatives, like Voila and Dash, for comparison. In principle they can do more and allow finer control, because they have more advanced widgets and use callbacks. But they require you to spend much more time on front-end work and demand a deeper understanding of HTML, JavaScript and CSS, even though it’s sort of wrapped in Python.

Marc

Hi @tim
Many thanks for your very detailed answer. None of the explanations were obvious to me, so I really appreciate you going into so much detail.
I tried to implement #1, but I could not get what I wanted, very likely because of my poor knowledge of how streamlit works internally (and the explanations you gave confirm that!). So I’ve decided to implement the major abuse of Streamlit’s rerun logic, and it does exactly what I needed.

Once again, thanks a lot!

Hi @tim,
thank you very much for the detailed explanation.
It opened my eyes to how Streamlit’s internals work.
