Memory crashes when running multiple apps that each load a large NLP model

I have a few apps that I wish to run locally. Suppose they are called app1.py, app2.py and app3.py. Each of these apps has its own local directory, and if I run them one by one they run on distinct ports, namely http://localhost:8501/, http://localhost:8502/ and http://localhost:8503/.

Each app uses a different NLP model; for instance, in app1.py I have:

from transformers import BertForSequenceClassification, BertTokenizer

def load_model():
    model = BertForSequenceClassification.from_pretrained('ProsusAI/finbert')
    return model

def load_tokenizer():
    tokenizer = BertTokenizer.from_pretrained('ProsusAI/finbert')
    return tokenizer

model = load_model()
tokenizer = load_tokenizer()

while in app2.py I have:

from transformers import AutoModelWithLMHead, AutoTokenizer

def load_model():
    # AutoModelWithLMHead is deprecated in recent transformers releases;
    # AutoModelForSeq2SeqLM is its replacement for T5-style models.
    model = AutoModelWithLMHead.from_pretrained('t5-base', return_dict=True)
    return model

def load_tokenizer():
    tokenizer = AutoTokenizer.from_pretrained('t5-base')
    return tokenizer

model = load_model()
tokenizer = load_tokenizer()

These models can be quite large, and the issue is that once I run more than two apps at the same time, the machine runs out of memory and crashes. Is there a better way to run multiple apps? In particular, instead of loading a model directly inside each app, could I have some sort of backend that serves the models and that the apps call through an API?
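To make the question concrete, this is roughly what I have in mind: a small model server that loads the weights once at startup and exposes a predict endpoint, with each app calling it over HTTP instead of holding its own copy of the model. A minimal stdlib-only sketch of the shape I mean (the lambda is just a stand-in for the real `from_pretrained(...)` call, and the port number and `/predict` path are placeholders I made up):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen


def load_model():
    # Stand-in for the real loader, e.g.
    # BertForSequenceClassification.from_pretrained('ProsusAI/finbert')
    return lambda text: {"label": "neutral", "input": text}


MODEL = load_model()  # loaded once, shared by every request


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON payload and run it through the shared model.
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(MODEL(payload["text"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def serve(port=8000):
    # Start the model server in a background thread and return it.
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def predict(text, port=8000):
    # What each app (app1.py, app2.py, ...) would call instead of
    # loading the model itself.
    req = Request(
        f"http://127.0.0.1:{port}/predict",
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())
```

Is this the right direction, and is there a standard tool for it (I have seen FastAPI and TorchServe mentioned) rather than hand-rolling the server?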