How can I get st.text_area to output a properly formatted URL?

Matt14 · September 12, 2024, 12:03am

Hello,

I am trying to pass a URL or a list of URLs to Firecrawl for scraping using st.text_area, but I am getting the following error message (full traceback at the bottom):

HTTPError: Unexpected error during scrape URL: Status code 400. Bad Request - [{'code': 'custom', 'message': 'URL must have a valid top-level domain or be a valid path', 'path': ['url']}]

The relevant code (as the app is not deployed) is:

urls = []
urls_input = st.text_area("Input one or more urls separated by commas")
for url in urls_input:
    urls.append(url)

If I hard-code a list of URLs and pass it to the Firecrawl loader, it works just fine, so I know the loader is not the issue. What type of object does st.text_area return, and what format is it in?

Any help is greatly appreciated!

Traceback:

File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/.conda/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 88, in exec_func_with_error_handling
result = func()
^^^^^^
File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/.conda/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 590, in code_to_exec
exec(code, module.__dict__)
File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/Project - Streamlit Front-End for Question-Answering App/QA_LLM_Pinecone.py", line 263, in <module>
chunks = website_search(urls, chunk_size = chunk_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/Project - Streamlit Front-End for Question-Answering App/QA_LLM_Pinecone.py", line 63, in website_search
data = [FireCrawlLoader(api_key = 'fc-33e3a9fcc4564af789ba05632267159e',
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/Project - Streamlit Front-End for Question-Answering App/QA_LLM_Pinecone.py", line 66, in <listcomp>
).load() for url in urls]
^^^^^^
File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/.conda/lib/python3.11/site-packages/langchain_core/document_loaders/base.py", line 30, in load
return list(self.lazy_load())
^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/.conda/lib/python3.11/site-packages/langchain_community/document_loaders/firecrawl.py", line 110, in lazy_load
firecrawl_docs = [self.firecrawl.scrape_url(self.url, params=self.params)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/.conda/lib/python3.11/site-packages/firecrawl/firecrawl.py", line 88, in scrape_url
self._handle_error(response, 'scrape URL')
File "/Users/mottzerella/Documents/Coding_Practice/ztm_milestone_projects/heart_disease_project/QA_LLM_APP/.conda/lib/python3.11/site-packages/firecrawl/firecrawl.py", line 391, in _handle_error
raise requests.exceptions.HTTPError(message, response=response)

edsaac · September 12, 2024, 1:48am

It returns a string or None (see st.text_area - Streamlit Docs).

Iterating over that string means getting each character individually. You probably meant to split that string first based on the commas.
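
A minimal sketch of that split, in case it helps. The helper name parse_urls is hypothetical (it is not part of Streamlit or Firecrawl), and it assumes the URLs are separated by commas as the text area label asks; whatever st.text_area returned would be passed in as raw.

```python
def parse_urls(raw):
    """Split comma-separated text into a list of clean URL strings.

    `raw` stands in for the value returned by st.text_area (a str or None).
    """
    if not raw:  # covers both None and the empty string
        return []
    # Strip surrounding whitespace (including newlines) and drop empty pieces
    return [part.strip() for part in raw.split(",") if part.strip()]


# Iterating over a string yields single characters, which is what the
# original loop was appending to the list
print(list("a.io"))

# Splitting on commas first yields one entry per URL
print(parse_urls("https://example.com, https://example.org/docs"))
```

In the app, the for loop would then be replaced by something like urls = parse_urls(urls_input), so each element passed to the loader is a full URL rather than one character.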

