Hi Sergio,
Below is the code:
import streamlit as st
import pandas as pd
import sqlalchemy
import urllib.parse
general_folder = “C:\\Web Apps”
config_file_name = “Client Data Processing Configuration.xlsx”
config_file_location = general_folder + ‘\’ + config_file_name
Load config file into the dataframe
xls = pd.ExcelFile(config_file_location)
config_df = pd.read_excel(xls, “Jobs”)
st.title(“Self Serve Data Processing”)
Identify a client
client_list = list(set(config_df[“Client Name”]))
client = st.selectbox(“Client”, client_list)
Identify the client job to run
config_df_client = config_df[config_df[“Client Name”] == client]
data_processing_jobs_for_client = list(config_df_client[“Data Processing Job”])
data_processing_job = st.selectbox(“Data processing Job”, data_processing_jobs_for_client)
Identify client and etl database server and database
client_db_server = list(config_df_client[“Database Server”])[0]
client_database = list(config_df_client[“Database Name”])[0]
etl_db_server = list(config_df_client[“ETL Server”])[0]
etl_database = “msdb”
def get_connection(db_server, database):
# Database connection
connection_string = (
r'Driver={SQL Server Native Client 11.0};'
r'Server=' + db_server + ';'
r'Database=' + database + ';'
r'Trusted_Connection=yes;'
)
params = urllib.parse.quote(connection_string)
engine = sqlalchemy.create_engine(
"mssql+pyodbc:///?odbc_connect=%s" % params
)
connection = engine.raw_connection()
return connection
client_connection = get_connection(client_db_server, client_database)
etl_connection = get_connection(etl_db_server, etl_database)
if st.button(“Start Job”):
etl_connection.execute(
"SET NOCOUNT ON; EXEC sp_start_job N'"+data_processing_job
+ "'; SET NOCOUNT OFF"
)
st.success("The job has started successfully")
if st.button(“Stop Job”):
etl_connection.execute(
"SET NOCOUNT ON; EXEC sp_stop_job N'"+data_processing_job
+ "'; SET NOCOUNT OFF"
)
st.success("The job has stopped successfully")
if st.button(“View Job Progress”):
sql = "SELECT * FROM [audit].[PackageStatusForLatestBatch] ORDER BY [Package Start Time]"
audit_view = pd.read_sql_query(sql, client_connection)
st.dataframe(audit_view)
Things to note:
-
PackageStatusForLatestBatch is a view built on top of the audit tables which shows SSIS package run progress. Those audit tables are populated by the SSIS packages. If you’d like to know how that works, it is a separate discussion on SSIS Management Framework. Alternatively, you can replace with your own way of identifying/showing progress.
-
There is a configuration excel file (“Client Data Processing Configuration.xlsx”), which contains information on clients and jobs per client as well as database server information. This excel file has 1 tab (Jobs), and within this tab the columns are:
Hope this helps.
Any questions, let me know.
Thanks!