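Does PyAudio Drop Frames When Recording?

Consider the following code, which uses the PyAudio library to sample the microphone in real time and record the result (code source):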

import pyaudio
import wave

chunk = 1024  # Record in chunks of 1024 samples
sample_format = pyaudio.paInt16  # 16 bits per sample
channels = 2
fs = 44100  # Record at 44100 samples per second
seconds = 3
filename = "output.wav"

p = pyaudio.PyAudio()  # Create an interface to PortAudio

print('Recording')

stream = p.open(format=sample_format,
                channels=channels,
                rate=fs,
                frames_per_buffer=chunk,
                input=True)

frames = []  # Initialize array to store frames

# Store data in chunks for 3 seconds
for i in range(0, int(fs / chunk * seconds)):
    data = stream.read(chunk)
    frames.append(data)

# Stop and close the stream 
stream.stop_stream()
stream.close()
# Terminate the PortAudio interface
p.terminate()

print('Finished recording')

# Save the recorded data as a WAV file
wf = wave.open(filename, 'wb')
wf.setnchannels(channels)
wf.setsampwidth(p.get_sample_size(sample_format))
wf.setframerate(fs)
wf.writeframes(b''.join(frames))
wf.close()

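This uses the traditional blocking mode: stream.read(chunk) blocks until chunk samples have been collected. That raises a question: if the program does not read the collected samples in time, are those samples kept?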

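A similar situation arises with network sockets. Suppose a program reads only A bytes from a socket at a time, while the receive buffer (in the network card and the operating system) holds B bytes, with A < B. If new data keeps arriving and the program does not read it promptly, the data stays in the receive buffer, and data is lost only once that buffer is full. From the program's point of view, then, the data it reads is contiguous no matter how long it waits between two consecutive reads, because the receive buffer holds the data in the meantime.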
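The buffering behaviour described above can be observed without a real network. The following is a minimal sketch, assuming a local `socket.socketpair()` is an acceptable stand-in for a network connection (here the operating system's socket buffer plays the role of the receive buffer). The payload size, read size, and delays are arbitrary illustrative values.

```python
import socket
import time

# A connected pair of local stream sockets stands in for a network link.
writer, reader = socket.socketpair()

# The sender delivers everything at once, then closes its end.
payload = bytes(range(256)) * 4  # 1024 bytes, small enough to fit in the buffer
writer.sendall(payload)
writer.close()

# The reader is deliberately late and slow: it waits, then reads A = 100 bytes
# at a time with a pause between reads.
time.sleep(0.2)
received = bytearray()
while True:
    piece = reader.recv(100)
    if not piece:  # empty result means the sender closed and the buffer is drained
        break
    received.extend(piece)
    time.sleep(0.01)
reader.close()

# Nothing was lost or reordered even though the reader lagged behind.
print(bytes(received) == payload)
```

Because the payload is smaller than the buffer, the late reader still sees every byte in order, which is the behaviour the recording test below checks for in PyAudio.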

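To test whether real-time recording with PyAudio works the same way, I added a significant delay between reads. Each read covers 1024 ÷ 44100 ≈ 0.0232 seconds of audio, so a delay of 0.1 seconds is significant. The new code is as follows: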

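import time
import wave

import numpy as np
import pyaudio

chunk = 1024
sample_format = pyaudio.paInt16
channels = 1
fs = 44100
read = 0
audio_frames = []  # Recorded chunks, one int16 array per read
p = pyaudio.PyAudio()
stream = p.open(format=sample_format,
                channels=channels,
                rate=fs,
                frames_per_buffer=chunk,
                input=True)

while read < 10 * fs:
    # Recent PyAudio versions raise an error on input overflow by default;
    # disable that here so that late reads still return data.
    data = stream.read(chunk, exception_on_overflow=False)
    read += chunk
    time.sleep(0.1)  # Deliberate delay, much longer than one chunk (~0.0232 s)
    audio_frames.append(np.frombuffer(data, dtype=np.int16))
stream.stop_stream()
stream.close()
p.terminate()

wf = wave.open('delay_test.wav', 'wb')
wf.setnchannels(channels)
wf.setsampwidth(p.get_sample_size(sample_format))
wf.setframerate(fs)
wf.writeframes(b''.join(a.tobytes() for a in audio_frames))
wf.close()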

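Playing back the resulting audio reveals obvious choppiness: frames have been dropped. This suggests that real-time recording does not behave like the socket case: samples that are not read in time are discarded, and the underlying layer appears to buffer only about chunk samples for the program.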
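The size of the gap can also be estimated with simple arithmetic, without any audio hardware. The sketch below assumes that each iteration of the delay test loop lasts at least as long as its 0.1 second sleep, so it gives only a lower bound on the wall-clock time. The variable names are illustrative and not part of the original test.

```python
import math

fs = 44100                # samples per second, as in the test
chunk = 1024              # samples returned by each read
delay = 0.1               # seconds slept after each read
target_samples = 10 * fs  # the loop stops once this many samples are read

# Duration of audio covered by one read (about 0.0232 s).
chunk_duration = chunk / fs

# Number of loop iterations needed to reach the target.
iterations = math.ceil(target_samples / chunk)

# Audio actually stored in the WAV file, in seconds.
recorded_seconds = iterations * chunk / fs

# Lower bound on elapsed real time: every iteration sleeps for `delay`.
min_wall_clock = iterations * delay

# Lower bound on the share of real time that never reached the file.
min_missing_fraction = 1 - recorded_seconds / min_wall_clock

print(f"chunk duration:      {chunk_duration:.4f} s")
print(f"loop iterations:     {iterations}")
print(f"audio in file:       {recorded_seconds:.2f} s")
print(f"elapsed (at least):  {min_wall_clock:.1f} s")
print(f"missing (at least):  {min_missing_fraction:.0%}")
```

Under that assumption, the file holds roughly 10 seconds of audio captured over at least 43 seconds of real time, so most of the incoming audio cannot have been stored, which is consistent with the choppiness heard on playback.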