Isolating a person speaking from a mixed track of background music and their voice?

I'm working with an audio stream that has someone narrating what they're doing, but there's music in the background. The music is not part of the program, but just happened to be playing while the person was working. We're trying to isolate that down enough so that all of the major content trigger algos don't trigger on it, but I'm failing at computer. Anyone have a workable (not perfect) method for doing something like that? We couldn't care less if it's an online AI bullshit thing.

[–] • 4 pts

There is an AI website that will do this. I think I used it in the past to strip the voice from a music track: https://www.lalal.ai/stem-splitter/

The splitter might let you isolate the voice and remove the music.

[–] • 1 pt

I tried that one first. It works, but my clip is too long.

[–] • 3 pts

This looks kind of shady but maybe it'll work: https://vocalremover.org/

How long is your clip? Maybe break it into smaller samples, then stitch it back together in Audacity.

[–] • 1 pt

Current clip is 1:21:00

[–] • 3 pts

Re-narrate it yourself in a pirate voice.

[–] • 2 pts

I am the narrator as it is, I guess that's probably going to be the easiest way.

[–] • 3 pts

Adobe Podcast has a free service that isolates voice. It requires you to make an account and the max MP3 length is 30 minutes, but it's free. Probably suitable if you're just trying to ditch the background music without installing any modern software.

https://podcast.adobe.com/en/enhance

[–] • 1 pt

It’s over an hour. I think I’ll just renarrate.

[–] • 2 pts

yeah, with over an hour you have to split it into segments and recombine them. if renarrating is an option just do that

[–] • 1 pt

Yeah, that would be easy, just 3 segments.
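
For anyone who does go the split-and-recombine route, here is a rough sketch of the bookkeeping, not a tested workflow. It assumes ffmpeg is installed and that the 30-minute cap is the only limit; the file names (narration.mp3, part01.mp3 and so on) are made up for illustration. It only works out the cut points and prints the ffmpeg commands, it does not run them.

```python
from typing import List, Sequence, Tuple


def plan_segments(total_seconds: int, max_seconds: int) -> List[Tuple[int, int]]:
    """Return (start, duration) pairs covering the clip, none longer than the limit."""
    if total_seconds <= 0 or max_seconds <= 0:
        raise ValueError("lengths must be positive")
    segments = []
    start = 0
    while start < total_seconds:
        duration = min(max_seconds, total_seconds - start)
        segments.append((start, duration))
        start += duration
    return segments


def split_commands(src: str, segments: Sequence[Tuple[int, int]],
                   prefix: str = "part") -> List[List[str]]:
    """Build one ffmpeg command per segment (stream copy, so no re-encode)."""
    commands = []
    for index, (start, duration) in enumerate(segments, start=1):
        out = f"{prefix}{index:02d}.mp3"  # hypothetical output name
        commands.append(["ffmpeg", "-i", src, "-ss", str(start),
                         "-t", str(duration), "-c", "copy", out])
    return commands


def concat_list(names: Sequence[str]) -> str:
    """Text for ffmpeg's concat demuxer: one file line per cleaned segment."""
    return "".join(f"file '{name}'\n" for name in names)


if __name__ == "__main__":
    clip = 1 * 3600 + 21 * 60   # the 1:21:00 clip from this thread
    limit = 30 * 60             # the 30-minute cap mentioned above
    for cmd in split_commands("narration.mp3", plan_segments(clip, limit)):
        print(" ".join(cmd))
```

After each part has been cleaned, the text from concat_list can be saved as list.txt and joined with something like `ffmpeg -f concat -safe 0 -i list.txt -c copy cleaned.mp3`. Stream copy only joins cleanly if the cleaned parts all come back with the same codec settings, which is an assumption here; otherwise drop `-c copy` and let ffmpeg re-encode.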

[–] • 2 pts

Audacity is a good start.

[–] • 2 pts

I'm trying audacity and it's not really doing what I need. I think I need a newer version, but this machine won't run it.

[–] • 2 pts

> I'm trying audacity and it's not really doing what I need. I think I need a newer version, but this machine won't run it.

Upgrade Audacity. Install the free Audacity OpenVINO plugins. They add noise reduction capabilities and, more importantly, a plugin that separates music into up to four "stems": vocals, drums, bass and everything else. You can use it on non-music files and get vocal extractions too. For your purpose, the 2-stem option is probably best since it will separate vocals from everything else. The plugins are AI, but the engine is local and can use GPU or CPU for processing.

EDIT: no you can't run it on older crappy hardware