[ISSUE]: I cant hear anything #1397

Open
4 of 6 tasks
mozibites opened this issue Nov 6, 2024 · 10 comments

Comments

@mozibites

Voice Changer Version

MMVCServerSIO_mac_onnxcpu-nocuda_v1.5.3.17b

Operational System

Mac M1 Laptop

GPU

none

Read carefully and check the options

  • I've tried to Clear Settings
  • Sample/Default Models are working
  • I've tried to change the Chunk Size
  • GUI was successfully launched
  • I've read the tutorial
  • I've tried to extract to another folder (or re-extract) the .zip file

Model Type

RVC

Issue Description

The voice changer isn't working for me, and my microphone isn't the problem. When I press start on the voice changer, the volume stays at 0. I've tried running this in both the application and my browser, but nothing works.

Application Screenshot

Screenshot 2024-11-06 at 6 57 34 PM

Logs on console

MMVCServerSIO_mac_onnxcpu-nocuda_v.1.5.3.17b\ (1)/startHttp.command ; exit;
Booting PHASE :main
PYTHON:3.10.9 (main, Jan 15 2023, 23:00:56) [Clang 14.0.0 (clang-1400.0.29.202)]
Activating the Voice Changer.
[Voice Changer] download sample catalog. samples_0004_t.json
[Voice Changer] download sample catalog. samples_0004_o.json
[Voice Changer] download sample catalog. samples_0004_d.json
[Voice Changer] model_dir is already exists. skip download samples.
Internal_Port:18888
protocol: HTTP
-- ---- --
Please open the following URL in your browser.
http://:/
In many cases, it will launch when you access any of the following URLs.
http://127.0.0.1:18888/
2024-11-06 18:58:50.868 voice-changer-native-client[6810:356899] WARNING: Secure coding is not enabled for restorable state! Enable secure coding by implementing NSApplicationDelegate.applicationSupportsSecureRestorableState: and returning YES.
[VCClient] Access http://127.0.0.1:18888/
[VCClient] wait web server...0 http://127.0.0.1:18888/
Booting PHASE :main
Booting PHASE :main
Booting PHASE :MMVCServerSIO
[Voice Changer] VoiceChangerManager initializing...
[Voice Changer] model slot is changed -1 -> 2
................RVC
[Voice Changer] [RVCr2] Creating instance
VoiceChangerV2 Initialized (GPU_NUM(cuda):0, mps_enabled:True, onnx_device:CPU)
[Voice Changer][RVC]: update_settings gpu:0
[Voice Changer][RVCr2] Initializing...
[Voice Changer] generate new embedder. (no embedder)
[Voice Changer] Loading index...
[Voice Changer] Index file is not found
GENERATE INFERENCER<voice_changer.RVC.inferencer.OnnxRVCInferencer.OnnxRVCInferencer object at 0x3131e5390>
GENERATE EMBEDDER<voice_changer.RVC.embedder.OnnxContentvec.OnnxContentvec object at 0x3131e4e80>
GENERATE PITCH EXTRACTOR<voice_changer.RVC.pitchExtractor.RMVPEOnnxPitchExtractor.RMVPEOnnxPitchExtractor object at 0x3131e53f0>
[Voice Changer] [RVC] Initializing... done
[Voice Changer][RVC]: update_settings serverReadChunkSize:512
[Voice Changer][RVC]: update_settings f0Detector:crepe
[Voice Changer][RVC]: update_settings modelSlotIndex:1730879615002
[Voice Changer] VoiceChangerManager initializing... done.
[Voice Changer] MMVC_Rest initializing...
mac model_dir: /Users/moizsarfraz/Desktop/MMVCServerSIO_mac_onnxcpu-nocuda_v.1.5.3.17b (1)/model_dir
[Voice Changer] MMVC_Rest initializing... done.
[Voice Changer] MMVC_SocketIOApp initializing...
[Voice Changer] MMVC_SocketIOApp initializing... done.
[VCClient] wait web server... done 200
[2024-11-06 18:58:58] connet sid : RU77B0jRh3agoi7VAAAB
[2024-11-06 18:58:58] connet sid : aAy8dEwRpeVKsr52AAAD
Generated Strengths: for prev:(4096,), for cur:(4096,)
torch/nn/functional.py:4756: UserWarning: The operator 'aten::im2col' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:13.)
return torch._C._nn.im2col(input, _pair(kernel_size), _pair(dilation), _pair(padding), _pair(stride))
[Voice Changer] warming up... generating sola buffer.

@Kuuko-fokkusugaru

The software shows a processing delay of 12 seconds, so I suspect it's a performance issue. That said, you could first try updating the app to the latest version instead of using an outdated one. The latest version for v1 is v1.5.3.18a. You can also try different F0 detection methods, as some are lighter on the CPU than others. Lastly, you can also try version 2.0.65, which is the new RVC. Maybe it performs better.
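
To make the performance point a bit more concrete, here is a rough sketch (not taken from the app's code) of how one might judge whether conversion keeps up with real time. The function name is hypothetical, and the 1.0 second chunk length is an assumed value for illustration; only the 12-second processing figure comes from the comment above.

```python
def real_time_factor(processing_seconds: float, chunk_seconds: float) -> float:
    """Ratio of time spent converting a chunk to the audio length of that chunk.

    A value above 1.0 means conversion is slower than the audio arrives,
    so the output can never catch up with the input.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")
    return processing_seconds / chunk_seconds


# 12 s is the processing delay mentioned above; the 1.0 s chunk length is an
# assumed value for illustration only.
rtf = real_time_factor(processing_seconds=12.0, chunk_seconds=1.0)
keeps_up = rtf <= 1.0
print(f"real-time factor: {rtf:.1f}, keeps up: {keeps_up}")
```

If the factor stays well above 1.0, a lighter F0 detection method or a different build is probably worth trying before anything else.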

@mozibites
Author

Aight, I'll try that, ty.

@mozibites
Author

> The software shows a processing delay of 12 seconds, so I suspect it's a performance issue. That said, you could first try updating the app to the latest version instead of using an outdated one. The latest version for v1 is v1.5.3.18a. You can also try different F0 detection methods, as some are lighter on the CPU than others. Lastly, you can also try version 2.0.65, which is the new RVC. Maybe it performs better.

I can't run the new version: I found it was only available for Windows. I tried Whisky, but it still did nothing. I haven't tried the other methods you listed yet, so I'll try them tomorrow.

@Kuuko-fokkusugaru

Alcohol is never the answer.

@TimofeyZubashev

TimofeyZubashev commented Nov 6, 2024

OS: Mac Sonoma
Processor: M2

I am facing the same issue. Voice Changer doesn't work: the volume stays at 0 all the time. The same happens when recording voice.

UPD: I was able to hear "something". Below are the things that helped.

  1. I lowered the chunk size (idk what a chunk is in this context, so @Kuuko-fokkusugaru please explain how chunks work)
  2. I switched the F0 Det. dropdown to "duo" (also idk how this setting works)

Then I was able to hear something in my headset; however, the sound is very delayed and not even close to human speech.

What should I try next?

@Kuuko-fokkusugaru

@TimofeyZubashev Chunk size is the amount of audio that is sent to the software to be converted at a time. Lower values mean less delay and less memory usage, but more processing load and lower quality. Higher values mean more delay and more memory usage, but less processing load. It's hard to define a "good" value. Usually, something between 0.3 and 1.0 seconds does a great job, but this also depends on the F0 detection method used.

F0 detection covers the different methods used to detect your voice, pitch, and other details. Some methods are really good at picking up all sorts of noises, while others are not so good and may result in glitchy output or weird accents or sounds. Each F0 detection method leans more on either the CPU or the GPU. Some favor quality at a higher performance cost, while others are more lightweight in exchange for worse quality.

The sound being delayed is kind of normal. When you use a chunk size of 1 second, the software first records one second of your voice before sending it for conversion. The conversion then adds extra time if your hardware is not strong enough to do it in real time, which is your case.

The sound not being close to human speech may be caused by a bad tune setting. First of all, settings like "echo", "sup1", and "sup2" use more CPU on top of the existing conversion, so I'd suggest leaving them off. Echo is meant to suppress echo from your mic if you have a lot of it in your room. Sup1 and sup2 are meant to suppress noise like computer fans, traffic outside, etc. If your headset or mic software comes with noise suppression by default, try using that instead; it just helps avoid triggering "voices" from sounds around your room.

The tune setting changes your pitch to match that of the voice model you use. If you are a girl converting to a male voice, you need a negative number of around -12, depending on your voice and that of the model. If you are a guy using a female voice, you may need a positive number of around 12. For voices of the same gender as yours, it depends on how deep or high your voice and the model's voice are, so the range is usually between -4 and 4. Adjusting the tune should help it sound human-like, but it also depends on the quality of the voice model after all.
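
As a small illustration of the tune values mentioned above: assuming the setting is expressed in semitones (the usual convention for pitch shifting, but an assumption here since the thread does not state the unit), a shift of n semitones corresponds to a frequency ratio of 2^(n/12). The function name below is hypothetical and not part of the app.

```python
def tune_to_frequency_ratio(semitones: float) -> float:
    """Frequency multiplier for a pitch shift of `semitones` (12 per octave).

    Assumes the tune value is measured in semitones.
    """
    return 2.0 ** (semitones / 12.0)


# The values suggested above: around -12 / +12 across genders,
# and roughly -4 to +4 for voices of the same gender.
for tune in (-12, -4, 0, 4, 12):
    print(f"tune {tune:+d}: pitch x{tune_to_frequency_ratio(tune):.3f}")
```

Under that assumption, a tune of 12 doubles the pitch (one octave up) and -12 halves it, which is why values near those extremes are suggested for converting between male and female voices.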

@mozibites
Author

> Alcohol is never the answer.

I have never drunk alcohol and never will! But there is a piece of software called “Whisky” that enables Mac users to run Windows stuff like exes.

@mozibites
Author

> 2.0.65

Tysm, I finally got it working when I downloaded the 2.0.65 beta version. You're goated.

@Kuuko-fokkusugaru

> > Alcohol is never the answer.
>
> I have never drunk alcohol and never will! But there is a piece of software called “Whisky” that enables Mac users to run Windows stuff like exes.

It was a joke lol sorry

@Kuuko-fokkusugaru

I am glad that it worked! :D


3 participants