https://mezha.media/en/2023/01/10/microsoft-s-vall-e-can-simulate-any-voice-with-3-seconds-of-audio/
Microsoft's VALL-E can simulate any voice with 3 seconds of audio