I have a few apps that I wish to run locally; call them app1.py, app2.py, and app3.py. Each app lives in its own local directory, and if I run them one by one they each listen on a distinct port.
Each app uses a different NLP model. For instance, in app1.py I have:
```python
from transformers import BertForSequenceClassification, BertTokenizer

def load_model():
    model = BertForSequenceClassification.from_pretrained('ProsusAI/finbert')
    return model

def load_tokenizer():
    tokenizer = BertTokenizer.from_pretrained('ProsusAI/finbert')
    return tokenizer

model = load_model()
tokenizer = load_tokenizer()
```
and in app2.py I have:
```python
from transformers import AutoModelWithLMHead, AutoTokenizer

def load_model():
    model = AutoModelWithLMHead.from_pretrained('t5-base', return_dict=True)
    return model

def load_tokenizer():
    tokenizer = AutoTokenizer.from_pretrained('t5-base')
    return tokenizer

model = load_model()
tokenizer = load_tokenizer()
```
These models can be quite large, and the issue is that once I start more than two of these servers at the same time, memory runs out and the apps crash. Is there a better way to run multiple apps? For example, instead of loading a model directly in each app, could I have some sort of backend that serves the models, with the apps calling them through an API?
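To make the idea concrete, here is a minimal sketch of what I have in mind: one backend process that loads each model once (lazily, on first request) and exposes it over HTTP, so the individual apps hold no model in memory themselves. This uses only the standard library, and the loader lambdas are placeholders standing in for the real `from_pretrained()` calls above; the `/predict/<name>` route shape is just an assumption for illustration.

```python
import json
import threading
from http.server import HTTPServer, BaseHTTPRequestHandler
from urllib.request import urlopen

# Placeholder loaders standing in for the real from_pretrained() calls,
# e.g. BertForSequenceClassification.from_pretrained('ProsusAI/finbert').
LOADERS = {
    'finbert': lambda: 'finbert-model',
    't5-base': lambda: 't5-model',
}
CACHE = {}  # name -> the single shared model instance

def get_model(name):
    # Load each model at most once; every request shares the cached copy.
    if name not in CACHE:
        CACHE[name] = LOADERS[name]()
    return CACHE[name]

class ModelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # e.g. GET /predict/finbert -> run (here: just report) the cached model
        _, _, name = self.path.partition('/predict/')
        if name not in LOADERS:
            self.send_error(404)
            return
        model = get_model(name)
        body = json.dumps({'model': str(model)}).encode()
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# Port 0 asks the OS for any free port; a real backend would pin one.
server = HTTPServer(('127.0.0.1', 0), ModelHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# An "app" now calls the backend over HTTP instead of loading the model:
url = f'http://127.0.0.1:{server.server_port}/predict/finbert'
reply = json.loads(urlopen(url).read())
print(reply)  # {'model': 'finbert-model'}
server.shutdown()
```

In a real setup the handler would tokenize the request payload and run inference, but the shape is the same: one process owns the weights, and the per-app servers become thin HTTP clients.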