Note: Try to keep the audio under 2 minutes long. Long audio files don't work well with a static spectrogram, because the whole recording is squeezed into one fixed-width image and fine detail is lost.
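
One way to check a file against this guideline before generating the spectrogram is to read its duration first. The sketch below is only an illustration: it assumes uncompressed WAV input and uses Python's standard `wave` module. The function names and the `MAX_SECONDS` constant are hypothetical and not part of this project.

```python
import wave

# Assumption: 120 seconds mirrors the 2-minute guideline in the note above.
MAX_SECONDS = 120


def wav_duration_seconds(path):
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        # Duration = number of audio frames / frames per second.
        return wav_file.getnframes() / wav_file.getframerate()


def is_short_enough(path, max_seconds=MAX_SECONDS):
    """Return True if the file is no longer than max_seconds."""
    return wav_duration_seconds(path) <= max_seconds
```

If `is_short_enough("recording.wav")` returns `False`, consider trimming the file or splitting it into shorter clips before rendering. Other formats such as MP3 or FLAC would need a different reader than `wave`.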