Real-Time Memory Usage

Dealing with memory in a real-time context…

The reason for writing this post was designing Luppp’s recording functionality. Recording audio data: it can’t be that difficult, right? The issue is that the thread in which the data arrives is real-time, so we cannot allocate a buffer for it as it streams in: malloc() has no bounded execution time and may block.

When allocating memory for use in real-time programs, it’s important to note that we should “lock” all the memory. This means that even when the OS is running out of memory, it will not “swap out” the locked memory to disk. If the memory were swapped out, accessing it would trigger a page fault and take much too long for real-time audio processing. Read up on mlock(), the POSIX call that locks memory into RAM!
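
As a rough illustration of locking, here is a minimal sketch for a POSIX system. The names LockedBuffer, allocateLocked() and freeLocked() are made up for this example and are not taken from Luppp; only calloc(), mlock() and munlock() are real calls. Note that mlock() can be refused when the process exceeds its locked-memory limit, so the sketch records whether the lock succeeded instead of assuming it did.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <sys/mman.h>   // mlock(), munlock() (POSIX)

// Hypothetical helper type: a buffer of floats that we try to pin in RAM.
struct LockedBuffer {
    float*      data   = nullptr;
    std::size_t frames = 0;
    bool        locked = false;  // false if mlock() was refused (e.g. RLIMIT_MEMLOCK)
};

// Call this from a non-real-time thread, before audio processing starts.
inline LockedBuffer allocateLocked(std::size_t frames) {
    assert(frames > 0);
    LockedBuffer b;
    b.data = static_cast<float*>(std::calloc(frames, sizeof(float)));
    if (!b.data) return b;       // allocation failed, nothing to lock
    b.frames = frames;
    // mlock() returns 0 on success. On failure the buffer is still usable,
    // it just is not protected against being swapped out.
    b.locked = (mlock(b.data, frames * sizeof(float)) == 0);
    if (!b.locked) std::perror("mlock");
    return b;
}

// Also non-real-time: unlock (if needed) and release the buffer.
inline void freeLocked(LockedBuffer& b) {
    if (b.locked) munlock(b.data, b.frames * sizeof(float));
    std::free(b.data);
    b = LockedBuffer{};
}
```

Both functions belong in a non-real-time thread: the real-time thread should only ever see buffers that are already allocated and locked.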

This post considers four solutions to this problem. Their strengths and weaknesses are discussed in the sections below:

  • Pre-allocate a fixed size
  • Periodic Buffer Replacement
  • RT-safe memory pool
  • ChunkBuffer Class

Note that some solutions are not considered in this article, such as IPC mechanisms like shared memory (SHM): they are not contained within the application itself, and most are not cross-platform.

1: Pre-allocate a fixed size

Pre-allocating a fixed size buffer is by far the easiest method. It comes with a severe disadvantage though: the length of the recording is limited. Since for live-performance programs we can’t have such a limitation, I’ll stop talking about it, and presume that your use case doesn’t allow for pre-allocating for some reason.

2: Periodic Buffer Replacement

Using a float* for the buffer, and being able to switch out buffers dynamically, lets us pre-allocate a certain fixed size and request a bigger array when we are running out of space. This may be a workable solution: few malloc() calls are made, but when they are, they come with large sizes.

The methodology is as follows:

  1. Record until buffer is almost full
  2. Request new bigger buffer
  3. Copy existing data into new buffer
  4. Return old buffer to be de-allocated

Notes:

  • Advantages include variable-size recording; swapping the buffers is quite trivial.
  • malloc() is called (outside the real-time thread) every time the buffer is almost full, and we create a new buffer of the current size + X. A good value for X creates a balance between the frequency of malloc() calls and wasted space (due to the buffer being larger than required).
  • Quite simple: a pointer swap, and done.
  • There is some unused space, depending on the size of the increments.
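
The four steps above can be sketched roughly as follows. This is not Luppp’s actual code: the class RecordBuffer and all of its method names are invented for illustration, and the sketch assumes the copy in step 3 happens on the real-time side, just before the pointer swap. How the bigger buffer travels between threads (for example through a lock-free queue) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical sketch of a recording buffer that is periodically replaced.
class RecordBuffer {
public:
    // `initial` must hold `capacity` floats and be allocated outside the RT thread.
    // `margin` is how much free space counts as "almost full".
    RecordBuffer(float* initial, std::size_t capacity, std::size_t margin)
        : data_(initial), capacity_(capacity), used_(0), margin_(margin) {}

    // RT thread: append samples. Returns how many were actually stored.
    std::size_t write(const float* in, std::size_t n) {
        const std::size_t room = capacity_ - used_;
        if (n > room) n = room;
        std::memcpy(data_ + used_, in, n * sizeof(float));
        used_ += n;
        return n;
    }

    // RT thread, steps 1 and 2: true once fewer than `margin` samples of
    // space remain, i.e. it is time to request a bigger buffer.
    bool needsReplacement() const { return capacity_ - used_ < margin_; }

    // RT thread, steps 3 and 4: copy the recorded data into the bigger buffer,
    // swap the pointer, and return the old buffer so that a non-RT thread
    // can de-allocate it.
    float* replace(float* bigger, std::size_t biggerCapacity) {
        assert(biggerCapacity >= used_);
        std::memcpy(bigger, data_, used_ * sizeof(float));
        float* old = data_;
        data_      = bigger;
        capacity_  = biggerCapacity;
        return old;
    }

    std::size_t  size()     const { return used_; }
    std::size_t  capacity() const { return capacity_; }
    const float* data()     const { return data_; }

private:
    float*      data_;
    std::size_t capacity_;
    std::size_t used_;
    std::size_t margin_;
};
```

The margin has to be large enough that the replacement buffer arrives before the current one is completely full, otherwise write() starts dropping samples.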

3: RT-safe memory pool

Using an RT-safe memory pool essentially bypasses the problem: since we can now “allocate” (or more accurately: acquire) memory in the RT thread, we don’t have to worry about how big a buffer we are allocating. This approach is usable, but attention must be paid to the size of the pool, and to the maximum size that can be requested from it. A non-RT thread must also add to the pool in order to keep it at the right size. Updating the pool must not block, since the RT thread can’t pause, not even while the pool itself is being “upgraded”.

Notes:

  • Easier code on the RT side
  • Complications on the non-RT side
  • The background thread must poll the pool size regularly and add to it when needed.
  • The space allocated in the pool may not all be contiguous, which causes a whole cascade of complications.
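
One way to get a non-blocking pool is a single-producer, single-consumer ring of pre-allocated blocks. The sketch below is only one possible design, not something this post prescribes. It assumes exactly one non-RT thread calling add() and exactly one RT thread calling acquire(), and it assumes all blocks have the same size. BlockPool and its method names are invented for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical sketch: equally sized blocks are handed from ONE non-RT thread
// (producer) to ONE RT thread (consumer) through a lock-free ring.
// The ring holds at most Slots - 1 blocks at a time.
template <std::size_t Slots>
class BlockPool {
    static_assert(Slots >= 2, "need at least two slots");
public:
    // Non-RT thread: add a pre-allocated (and ideally locked) block.
    // Returns false if the pool is already full.
    bool add(float* block) {
        assert(block != nullptr);
        const std::size_t t    = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (t + 1) % Slots;
        if (next == head_.load(std::memory_order_acquire)) return false;
        ring_[t] = block;
        tail_.store(next, std::memory_order_release);  // publish the block
        return true;
    }

    // RT thread: take a block without blocking or allocating.
    // Returns nullptr if the pool is empty.
    float* acquire() {
        const std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return nullptr;
        float* block = ring_[h];
        head_.store((h + 1) % Slots, std::memory_order_release);
        return block;
    }

    // Either thread: number of blocks currently available (a snapshot,
    // which may already be stale when the other thread is active).
    std::size_t available() const {
        const std::size_t h = head_.load(std::memory_order_acquire);
        const std::size_t t = tail_.load(std::memory_order_acquire);
        return (t + Slots - h) % Slots;
    }

private:
    float*                   ring_[Slots] = {};
    std::atomic<std::size_t> head_{0};  // next slot to read  (RT side)
    std::atomic<std::size_t> tail_{0};  // next slot to write (non-RT side)
};
```

The background thread would poll available() and call add() with fresh blocks whenever it drops too low. Because every block is a separate allocation, the acquired memory is not contiguous, which is exactly the complication noted above.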

4: ChunkBuffer Class

The “ChunkBuffer” concept is to have one class containing lots of little “chunks” of data, so that it can grow dynamically. The implementation requires allocated buffers of a certain size to be supplied to the ChunkBuffer class, which are then added to the collection. Each “Chunk” of data is placed logically after the previous one in order to build a larger buffer. Note that this larger buffer is not contiguous in memory, so pointer++ style reading runs off the end of a chunk and can result in segfaults. Hence get() methods are necessary for the data, and the ChunkBuffer class must have some logic in place to access the right chunk of memory.

Notes:

  • Complexity of different buffers is encapsulated: the exterior doesn’t see the individual chunk instances.
  • Complexity exists in the ChunkBuffer class: it must request, add and read from separate non-contiguous memory.
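
To make the indexing logic concrete, here is a minimal sketch of what such a class could look like. Only the name ChunkBuffer and the idea of a get() method come from the text above; every other name, and the choice of a fixed maximum number of chunks, is an assumption made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch of a ChunkBuffer: many equally sized chunks that
// together behave like one long, non-contiguous buffer.
class ChunkBuffer {
public:
    // Non-RT thread: reserve the chunk table up front so that addChunk()
    // never has to reallocate it later.
    ChunkBuffer(std::size_t chunkFrames, std::size_t maxChunks)
        : chunkFrames_(chunkFrames), maxChunks_(maxChunks), used_(0) {
        assert(chunkFrames > 0);
        chunks_.reserve(maxChunks);
    }

    // Supply one pre-allocated chunk holding `chunkFrames` floats.
    // Returns false if the chunk table is full.
    bool addChunk(float* chunk) {
        if (chunks_.size() == maxChunks_) return false;
        chunks_.push_back(chunk);  // no reallocation: space was reserved
        return true;
    }

    std::size_t capacity() const { return chunks_.size() * chunkFrames_; }
    std::size_t size()     const { return used_; }

    // Append one sample. Returns false when a new chunk is needed first.
    bool write(float sample) {
        if (used_ == capacity()) return false;
        chunks_[used_ / chunkFrames_][used_ % chunkFrames_] = sample;
        ++used_;
        return true;
    }

    // Read sample `index`: pick the chunk, then the offset inside it.
    float get(std::size_t index) const {
        assert(index < used_);
        return chunks_[index / chunkFrames_][index % chunkFrames_];
    }

private:
    std::size_t         chunkFrames_;
    std::size_t         maxChunks_;
    std::size_t         used_;
    std::vector<float*> chunks_;  // the chunks themselves are owned by the caller
};
```

The division and modulo in get() are the “logic to access the right chunk”: callers see a single index range from 0 to size() - 1 and never touch the individual chunks.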

Conclusion

My use case is Luppp, which needs real-time recording of unbounded length: the length of a recording is not known beforehand. Since there can be up to 8 tracks * 10 scenes per track (80 clips), all recorded, wasted space must be considered too.

For the above use case, the chosen solution is Periodic Buffer Replacement: it is simple and totally RT safe. The memory waste is quite low and depends on the size of the buffer increment. Currently the “buffer increment” is set to the sample rate, so buffers are replaced once per second while recording, and the worst-case memory waste is approximately 1 second of audio per recording buffer.
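
To put a number on that worst case, here is a small back-of-the-envelope calculation. It assumes a 44100 Hz sample rate, one float per sample (a single channel) and all 80 clips recorded; none of those figures except the 8 * 10 clip count come from Luppp itself.

```cpp
#include <cassert>
#include <cstddef>

static_assert(sizeof(float) == 4, "the figures below assume 4-byte floats");

// Worst-case wasted bytes if every clip ends right after its buffer was
// replaced, leaving one full increment (one second of audio) unused per clip.
constexpr std::size_t worstCaseWasteBytes(std::size_t sampleRate,
                                          std::size_t clips) {
    return sampleRate * sizeof(float) * clips;
}

// Assumed values: 44100 Hz, mono, 8 tracks * 10 scenes = 80 clips.
constexpr std::size_t kWastePerClip  = worstCaseWasteBytes(44100, 1);   // 176400 bytes
constexpr std::size_t kWasteAllClips = worstCaseWasteBytes(44100, 80);  // 14112000 bytes
```

Under those assumptions the waste is about 176 kB per clip and about 14 MB if all 80 clips hit the worst case at once.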