Real-time RE-USE: A real-time universal speech enhancement framework that can explicitly set both algorithmic and computational latency.

The proposed model supports 30 distinct latency configurations. For non-real-time applications, see our RE-USE model: https://huggingface.co/nvidia/RE-USE.

Note 1: For bandwidth extension, performance may depend on the characteristics of the input data, particularly the cutoff pattern. A simple remedy is to low-pass filter the input beforehand.
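
As a rough illustration of the pre-filtering step in Note 1, the sketch below designs a Hamming-windowed sinc low-pass filter and applies it to a mono signal in pure Python. The function names (lowpass_fir, apply_fir), the tap count, and the choice of cutoff are assumptions made for this example; they are not taken from the RE-USE code, and the demo's own filter may differ.

```python
import math


def lowpass_fir(cutoff_hz, sample_rate, num_taps=101):
    """Design a Hamming-windowed sinc low-pass FIR filter (hypothetical helper)."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd so the filter has a center tap")
    if not 0 < cutoff_hz < sample_rate / 2:
        raise ValueError("cutoff_hz must lie between 0 and the Nyquist frequency")
    fc = cutoff_hz / sample_rate          # normalized cutoff (cycles per sample)
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        # Ideal low-pass impulse response (sinc), with the k == 0 limit handled.
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)                      # normalize to unity gain at DC
    return [t / gain for t in taps]


def apply_fir(signal, taps):
    """Convolve signal with taps, zero-padding the edges (same-length output)."""
    mid = (len(taps) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = i + mid - j
            if 0 <= idx < len(signal):
                acc += t * signal[idx]
        out.append(acc)
    return out
```

For example, `apply_fir(samples, lowpass_fir(cutoff_hz=4000, sample_rate=16000))` would suppress content above roughly 4 kHz in 16 kHz audio. For real audio, a vectorized routine from a signal-processing library is likely a better fit than these Python loops.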

Note 2: You can set Exit_layer (between 3 and 12) and look_ahead_frames (between 0 and 2) to achieve different quality–latency trade-offs.
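
The two settings in Note 2 can be range-checked before they are passed to the model. The sketch below is a hypothetical helper, not part of the released code: the LatencyConfig class and all_configs function are invented for illustration, and the defaults (8 and 0) are the ones shown in the demo's input fields. The product of the two ranges gives 10 × 3 = 30 combinations, which matches the 30 configurations mentioned above, although the source does not spell out that correspondence.

```python
from dataclasses import dataclass
from itertools import product

EXIT_LAYER_RANGE = range(3, 13)   # 3..12 inclusive, per Note 2
LOOK_AHEAD_RANGE = range(0, 3)    # 0..2 inclusive, per Note 2


@dataclass(frozen=True)
class LatencyConfig:
    """Hypothetical container for the two latency settings."""
    exit_layer: int = 8           # default shown in the demo
    look_ahead_frames: int = 0    # default shown in the demo

    def __post_init__(self):
        # Reject values outside the ranges stated in Note 2.
        if self.exit_layer not in EXIT_LAYER_RANGE:
            raise ValueError("Exit_layer must be between 3 and 12")
        if self.look_ahead_frames not in LOOK_AHEAD_RANGE:
            raise ValueError("look_ahead_frames must be between 0 and 2")


def all_configs():
    """Enumerate every valid (Exit_layer, look_ahead_frames) pair."""
    return [LatencyConfig(e, f)
            for e, f in product(EXIT_LAYER_RANGE, LOOK_AHEAD_RANGE)]
```

Calling `LatencyConfig()` yields the demo defaults, while `LatencyConfig(exit_layer=2)` raises a ValueError.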

Input Audio

Enter 'Exit_layer' (between 3 and 12, default 8):

Enter 'look_ahead_frames' (between 0 and 2, default 0):

(Optional) Enter 'target sampling rate' for bandwidth extension:

(Optional) Enter target sampling rate for pre-low-pass filtering before bandwidth extension:

Original Input Audio

Enhanced Audio

Spectrograms