wav2lip-hq: an upgraded version of Wav2Lip for lip synchronization


  • Windows 10 64-bit
  • wav2lip-hq
  • PyTorch 1.12.1+cu113



The previous blog post covered the lip-synchronization model Wav2Lip. This article introduces wav2lip-hq, its high-definition version, which adds image super-resolution and face segmentation on top of the original to improve the overall result.


First, pull the source code and set up the environment:

 git clone wav2lip-hq.git
 cd wav2lip-hq
 # Create a new virtual environment
 conda create -n wav2liphq python=3.8
 conda activate wav2liphq
 # Install torch
 pip3 install torch torchvision torchaudio --extra-index-url
 # Install the other dependencies; comment out torch and torchvision in requirements.txt first, since the GPU build is already installed
 pip install -r requirements.txt
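After installation, it can be worth confirming that the GPU build of torch is the one actually in use inside the wav2liphq environment. The following is a minimal sketch, not part of the wav2lip-hq repository; the function name describe_torch is made up for illustration.

```python
import importlib.util


def describe_torch():
    """Return a one-line summary of the torch installation (hypothetical helper)."""
    # find_spec avoids an ImportError when torch is not installed at all.
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch

    # torch.cuda.is_available() is False for CPU-only builds or missing drivers.
    return f"torch {torch.__version__}, CUDA available: {torch.cuda.is_available()}"


if __name__ == "__main__":
    print(describe_torch())
```

If the summary reports that CUDA is not available, the CPU build was probably pulled in, for example by requirements.txt overwriting the earlier install.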

Next, download the models. Three are needed:

  • The first model, download address: ; after downloading, copy it to the checkpoints directory.
  • The second is the face detection model, download address: ; after downloading, copy it to the face_detection/detection/sfd directory and rename it to s3fd.pth.
  • The third is the face segmentation model, download address: ; copy it to the checkpoints directory and rename it to face_segmentation.pth.
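Because the files have to be copied and renamed by hand, a quick check that each one ended up in the right place can save a failed run. The sketch below is an illustration, not part of the repository. The paths s3fd.pth and face_segmentation.pth come from the step above; wav2lip_gan.pth and esrgan_yunying.pth are taken from the test command used in this post, and the helper name missing_model_files is hypothetical.

```python
from pathlib import Path

# Paths relative to the wav2lip-hq repository root.
# The last two names are assumed from the test command in this post.
EXPECTED_MODEL_FILES = [
    Path("face_detection/detection/sfd/s3fd.pth"),
    Path("checkpoints/face_segmentation.pth"),
    Path("checkpoints/wav2lip_gan.pth"),
    Path("checkpoints/esrgan_yunying.pth"),
]


def missing_model_files(repo_root="."):
    """Return the expected model files that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_MODEL_FILES if not (root / p).is_file()]


if __name__ == "__main__":
    for path in missing_model_files():
        print(f"missing: {path}")
```

Run it from the repository root; no output means every listed file was found.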

Finally, prepare an audio file and a video file for testing, then run the command:

 python.exe --checkpoint_path checkpoints\wav2lip_gan.pth --segmentation_path checkpoints\face_segmentation.pth --sr_path checkpoints\esrgan_yunying.pth --face test.mp4 --audio test.mp3 --outfile output.mp4
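For repeated tests with different inputs, the same flags can be assembled in Python. This is a rough sketch under stated assumptions: the flags and checkpoint file names are copied from the command above, while the function name build_inference_args and its script parameter are hypothetical. The path of the script to run is not filled in here and must be supplied by the caller.

```python
import sys
from pathlib import Path


def build_inference_args(script, face, audio, outfile, checkpoints_dir="checkpoints"):
    """Build the argument list for the command above (hypothetical helper).

    script is the path of the repository script to run; it is an input here,
    not something this sketch knows.
    """
    ckpt = Path(checkpoints_dir)
    return [
        sys.executable,  # the Python interpreter of the active environment
        str(script),
        "--checkpoint_path", str(ckpt / "wav2lip_gan.pth"),
        "--segmentation_path", str(ckpt / "face_segmentation.pth"),
        "--sr_path", str(ckpt / "esrgan_yunying.pth"),
        "--face", str(face),
        "--audio", str(audio),
        "--outfile", str(outfile),
    ]
```

The returned list can be passed to subprocess.run(args, check=True), which avoids shell quoting problems when file names contain spaces.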

