more flexibility on Audio Transcriber #1

Open
ohjho opened this issue Jan 13, 2025 · 1 comment

ohjho commented Jan 13, 2025

first of all, just want to say this is a great package! 💯 Thanks for putting this out there ❤️

on to the audio transcription issue...

  • have you thought about using a "local" Transcriber like in openscenesense-ollama?
  • or alternatively, loading whisper using the huggingface serverless inference api (e.g. see below)
  • and an option to turn off audio transcription (which might be useful if the video has no audio track)
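
import requests

# Hugging Face serverless Inference API endpoint for the whisper-small model
API_URL = "https://api-inference.huggingface.co/models/openai/whisper-small"
headers = {"Authorization": "Bearer hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"}

def query(filename):
    # Send the raw audio bytes and return the decoded JSON response
    with open(filename, "rb") as f:
        data = f.read()
    response = requests.post(API_URL, headers=headers, data=data)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error body
    return response.json()

output = query("sample1.flac")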
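
The three suggestions above could sit behind one small interface. The following is only a rough sketch of that idea, not this package's actual API: `Transcriber`, `make_hf_transcriber`, and `transcribe_audio` are hypothetical names, and it assumes the endpoint returns a JSON object with a `text` field. It uses only the standard library, and passing `None` as the transcriber is how transcription is switched off.

```python
import json
import urllib.request
from typing import Callable, Optional

# Hypothetical interface: any callable that maps an audio file path to text.
Transcriber = Callable[[str], str]


def make_hf_transcriber(token: str,
                        model: str = "openai/whisper-small",
                        timeout: float = 60.0) -> Transcriber:
    """Build a transcriber backed by the Hugging Face serverless Inference API."""
    url = f"https://api-inference.huggingface.co/models/{model}"
    headers = {"Authorization": f"Bearer {token}"}

    def transcribe(path: str) -> str:
        with open(path, "rb") as f:
            data = f.read()
        request = urllib.request.Request(url, data=data, headers=headers, method="POST")
        # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            payload = json.load(response)
        # Assumption: the response is a JSON object with a "text" field.
        return payload.get("text", "")

    return transcribe


def transcribe_audio(path: Optional[str],
                     transcriber: Optional[Transcriber] = None) -> Optional[str]:
    """Return the transcript, or None when transcription is disabled
    (transcriber is None) or the video has no audio track (path is None)."""
    if transcriber is None or path is None:
        return None
    return transcriber(path)
```

With this shape, a local backend would just be another callable with the same signature, and `transcribe_audio(path, None)` covers the "no audio transcription" case.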

ymrohit (Owner) commented Jan 13, 2025

Hi @ohjho, thank you for the feedback and suggestions. I think the Hugging Face serverless Inference API can be implemented and would help this library a lot, along with the option to turn off audio transcription. I did consider adding local inference, but I didn't want to weigh the library down with heavy dependencies. The main goal of this library is to be lightweight and fast so that it can run on any device. I am open to further feedback, so please let me know what you think, and feel free to contribute any new ideas.
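
One common way to reconcile local inference with a lightweight install is to import the heavy backend only when a user asks for it. This is a minimal sketch of that pattern, not something the library currently does; `load_optional_backend` is a hypothetical helper, and the module name passed to it is up to the caller.

```python
import importlib
from types import ModuleType


def load_optional_backend(module_name: str) -> ModuleType:
    """Import an optional backend lazily (hypothetical helper).

    The base install never imports the module, so the dependency is only
    needed by users who opt in to local transcription.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"Optional backend '{module_name}' is not installed; "
            "install it separately to enable local transcription."
        ) from exc
```

The core package would then depend only on what the default path needs, and the error message tells users what to install if they request the local option.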
