Today I am thrilled to announce the successful completion of my end-to-end data engineering project on Geospatial Lightning Atmospheric Data.
Overview:
I analyzed geo-located, time-tagged lightning event data using various open-source tools and technologies, including Prefect, Docker, SQLite/Spatialite, and Streamlit.
Components:
Docker Container: Developed a Docker image for portability.
Prefect 2.0: Seamlessly orchestrated and automated the data workflows.
Pandas: Leveraged for data exploration, transformation, and analytics.
SQLite with Spatialite extension: Stored and managed geospatial data for single-user access.
Streamlit Dashboard app: GIS data viewer, filters, summary plots and charts.
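To give a feel for the storage and filtering step, here is a minimal sketch and not the project's actual code. It assumes plain SQLite from the Python standard library as a stand-in for Spatialite, so the spatial filter is a simple latitude/longitude bounding box rather than a true geometry query. The table name, column names, helper functions, and sample events are all hypothetical.

```python
import sqlite3

# Hypothetical sample events: (ISO-8601 UTC timestamp, latitude, longitude).
SAMPLE_EVENTS = [
    ("2023-06-01T00:00:05Z", 29.76, -95.37),
    ("2023-06-01T00:00:09Z", 30.27, -97.74),
    ("2023-06-01T00:01:12Z", 40.71, -74.01),
]


def build_store(events):
    """Create an in-memory SQLite database and load the events into it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lightning_event ("
        "event_time TEXT NOT NULL, lat REAL NOT NULL, lon REAL NOT NULL)"
    )
    conn.executemany("INSERT INTO lightning_event VALUES (?, ?, ?)", events)
    conn.commit()
    return conn


def events_in_bbox(conn, min_lat, max_lat, min_lon, max_lon):
    """Return events inside a lat/lon bounding box, ordered by time."""
    cur = conn.execute(
        "SELECT event_time, lat, lon FROM lightning_event "
        "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? "
        "ORDER BY event_time",
        (min_lat, max_lat, min_lon, max_lon),
    )
    return cur.fetchall()


conn = build_store(SAMPLE_EVENTS)
# Rough bounding box around Texas; the coordinates are illustrative only.
texas = events_in_bbox(conn, 25.0, 37.0, -107.0, -93.0)
```

A dashboard filter could call a helper like events_in_bbox with the bounds the user selects. With the Spatialite extension loaded, the same idea would more naturally be expressed with geometry columns and spatial functions.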
Achievements:
Applied basic data transformations with pandas.
Conducted data analysis & visualization on weather datasets.
Successfully handled large-scale data processing.
Built a portable Python data pipeline.
Implemented robust data engineering workflows.
Skills Enhanced:
Data Engineering
Data Analysis
Containerization
GIS Visualization
Explore the Project:
Dashboard app: https://lightning-containers.streamlit.app
GitHub Repo: BayoAdejare/lightning-containers, a Docker-powered starter for geospatial analysis of lightning atmospheric data.
Acknowledgments:
Thanks to the open-source community for its valuable tools and technologies, and to the US National Oceanic and Atmospheric Administration (NOAA) for the datasets.
Looking Forward:
Excited about the journey ahead, exploring more opportunities, and continuously growing in the fascinating world of data engineering.
Connection:
I am open to feedback and to connecting with fellow data and software engineers. Feel free to explore the project, share your thoughts, or reach out to discuss it further.
Architecture:
