The current CPU-based version on HuggingFace has slow inference; for faster results, use the GPU-based mirror on ModelScope.

Dataset

Generate by emotion condition

Valence (how negative or positive the emotion is)
Arousal (how calm or intense the emotion is)
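
The two axes above can be combined into a single description of the requested emotion. The sketch below is illustrative only and is not taken from the demo's code: the function name `describe_emotion` is hypothetical, and it assumes both values are centered at 0, so that the sign picks the side of each axis.

```python
def describe_emotion(valence: float, arousal: float) -> str:
    """Describe a (valence, arousal) pair in terms of the two axes.

    Assumption (not from the demo): both inputs are centered at 0,
    so a non-negative value falls on the positive/intense side.
    """
    valence_side = "positive" if valence >= 0 else "negative"
    arousal_side = "intense" if arousal >= 0 else "calm"
    return f"{valence_side}, {arousal_side}"


if __name__ == "__main__":
    # High valence with low arousal: a positive but calm emotion.
    print(describe_emotion(0.5, -0.3))
```

If the demo's controls use a different scale or discrete choices, the comparison threshold would need to change accordingly.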

Generate by feature control

Pitch SD
Mode
Control ranges: 40 to 228; -24 to 24; -5 to 10
The emotion to which the current template belongs
Feedback: the emotion you believe the generated result should belong to