Cookie or proxy problem when the server interacts with your module. #373

Open
marcobaturan opened this issue Jan 23, 2025 · 0 comments


Hi Jdepoix,
Let me explain the context: I am developing a web app in Streamlit that uses your module to fetch transcripts of YouTube videos.
The app is hosted on Render (http://render.com), and the web-app framework is Streamlit. After reading your documentation, I tried adding a properly formatted proxy and a cookies file exported from Firefox. I set the parameters on the server and in the app, following the instructions of both platforms.
But when I run the program and try to extract the transcript, I get this error:
"""
YouTubeRequestFailed: Could not retrieve a transcript for the video https://www.youtube.com/watch?v=429 Client Error: Too Many Requests for url: https://www.google.com/sorry/index?continue=https://www.youtube.com/watch%3Fv%3DBtpBjc6IWfA&q=EgQs49mQGLznybwGIjDf9F_hZQV1QCVtKe974TO7JrTmnVk83aHZAsc8ocp9h6ml0d9PjB8OtJE62aGnoQMyAXJaAUM! This is most likely caused by: Request to YouTube failed: BtpBjc6IWfA If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!
Traceback:

File "/opt/render/project/src/.venv/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 88, in exec_func_with_error_handling
result = func()
^^^^^^
File "/opt/render/project/src/.venv/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 579, in code_to_exec
exec(code, module.__dict__)
File "/opt/render/project/src/pages/YouTube.py", line 44, in <module>
yt_method(response, llm_api_key, language=language, selected_limit=limit)
File "/opt/render/project/src/engine.py", line 273, in yt_method
json = YouTubeTranscriptApi.get_transcript(id_video, languages=['es', 'en', 'fr', 'de', 'it', 'hr', 'pt']
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/render/project/src/.venv/lib/python3.11/site-packages/youtube_transcript_api/_api.py", line 154, in get_transcript
cls.list_transcripts(video_id, proxies, cookies)
File "/opt/render/project/src/.venv/lib/python3.11/site-packages/youtube_transcript_api/_api.py", line 71, in list_transcripts
return TranscriptListFetcher(http_client).fetch(video_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/render/project/src/.venv/lib/python3.11/site-packages/youtube_transcript_api/_transcripts.py", line 49, in fetch
self._extract_captions_json(self._fetch_video_html(video_id), video_id),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/render/project/src/.venv/lib/python3.11/site-packages/youtube_transcript_api/_transcripts.py", line 85, in _fetch_video_html
html = self._fetch_html(video_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/render/project/src/.venv/lib/python3.11/site-packages/youtube_transcript_api/_transcripts.py", line 97, in _fetch_html
return unescape(_raise_http_errors(response, video_id).text)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/render/project/src/.venv/lib/python3.11/site-packages/youtube_transcript_api/_transcripts.py", line 38, in _raise_http_errors
raise YouTubeRequestFailed(error, video_id)
"""

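The URL embedded in the error message can be decoded to see what was actually requested. The sketch below uses only the standard library and the URL copied from the message above. It suggests that the request for the video page was redirected to `www.google.com/sorry/index`, which appears to be Google's rate-limit page, rather than being rejected by YouTube's transcript endpoint itself. The variable names are illustrative only.

```python
from urllib.parse import parse_qs, urlsplit

# URL copied from the error message above (without the trailing "!")
blocked_url = (
    "https://www.google.com/sorry/index"
    "?continue=https://www.youtube.com/watch%3Fv%3DBtpBjc6IWfA"
    "&q=EgQs49mQGLznybwGIjDf9F_hZQV1QCVtKe974TO7JrTmnVk83aHZAsc8ocp9h6ml0d9PjB8OtJE62aGnoQMyAXJaAUM"
)

parts = urlsplit(blocked_url)

# The "continue" parameter holds the percent-encoded URL that was originally requested
original_url = parse_qs(parts.query)["continue"][0]
video_id = parse_qs(urlsplit(original_url).query)["v"][0]

print(parts.netloc, parts.path)  # www.google.com /sorry/index
print(original_url)              # https://www.youtube.com/watch?v=BtpBjc6IWfA
print(video_id)                  # BtpBjc6IWfA
```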
The snippet of code:
"""
# Retrieve the transcript of the video in the specified language
json = YouTubeTranscriptApi.get_transcript(
    id_video,
    languages=['es', 'en', 'fr', 'de', 'it', 'hr', 'pt'],
    proxies={'http': 'https://cgirlzeq:bhlduner1c3x@207.244.217.165:6712'},
    cookies='cookies.txt',
)

time.sleep(3)  # avoid overloading the Google service

"""
As you can see, I set a proxy and a cookies file, and even added a 3-second pause. I got the proxy URL from a free list on webshare.io.
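One thing that may be worth checking: the `proxies` mapping in the snippet has only an `http` key, while the failing request goes to an `https://` URL. Assuming the library hands the mapping to `requests` unchanged, `requests` documents that proxies are selected by the scheme of the target URL, so an `http`-only mapping might leave the YouTube request unproxied. The sketch below is a simplified, standard-library imitation of that lookup. `proxy_for` is a hypothetical helper, not part of `requests` or `youtube_transcript_api`, and the proxy URL is a placeholder.

```python
from urllib.parse import urlsplit


def proxy_for(url, proxies):
    """Hypothetical helper: return the proxy entry that applies to `url`, or None.

    Simplified imitation of a scheme-keyed proxy lookup: most specific key first.
    """
    parts = urlsplit(url)
    candidate_keys = (
        f"{parts.scheme}://{parts.hostname}",  # e.g. "https://www.youtube.com"
        parts.scheme,                          # e.g. "https"
        f"all://{parts.hostname}",
        "all",
    )
    for key in candidate_keys:
        if key in proxies:
            return proxies[key]
    return None


PROXY_URL = "http://user:password@proxy.example:8080"  # placeholder, not a real proxy

only_http = {"http": PROXY_URL}                       # shape used in the snippet
both_schemes = {"http": PROXY_URL, "https": PROXY_URL}
target = "https://www.youtube.com/watch?v=BtpBjc6IWfA"

print(proxy_for(target, only_http))     # None: no entry matches an https URL
print(proxy_for(target, both_schemes))  # the placeholder proxy URL
```

If this is what happens here, the request from the server would leave from the server's own IP address, which would be consistent with the code working locally and failing only after deployment.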
Locally it runs perfectly, but on the Render deployment it fails. It looks like Google detects the automated requests coming from a server.
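The pause in the snippet runs once, after the call, so a request that fails with a 429 is never retried. A minimal sketch of retrying with exponential backoff is shown below. `with_backoff` is a hypothetical helper written for illustration, and it may not help at all if the server's IP address is blocked outright.

```python
import random
import time


def with_backoff(fetch, retries=4, base_delay=3.0, sleep=time.sleep):
    """Hypothetical helper: call fetch(), retrying with exponential backoff."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            # Wait 3s, 6s, 12s, ... plus up to 1s of random jitter
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))


# Demonstration with a stand-in function that fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("429 Too Many Requests")
    return "transcript"


delays = []  # record the delays instead of actually sleeping
result = with_backoff(flaky, sleep=delays.append)
print(result)       # transcript
print(len(delays))  # 2
```

In the app, the wrapped call could be something like `with_backoff(lambda: YouTubeTranscriptApi.get_transcript(id_video, languages=['es', 'en']))`, ideally catching only the library's request-failure exception rather than every `Exception`.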
Thank you for your time and effort.
Regards,
MB
