soundfile torchvision torchaudio vector_quantize_pytorch vocos msgpack referencing jsonschema_specifications