Mixalot is a grab bag of systems related to audio in Common Lisp, released under an MIT-style license. Currently it consists of a mixer component targeting the Advanced Linux Sound Architecture (ALSA), a CFFI binding to the libmpg123 library, and a system allowing playback of MP3 files through the mixer.
The most recent version of Mixalot can be obtained from http://vintage-digital.com/hefner/software/mixalot/mixalot-current.tar.gz.
Mixalot includes several ASDF systems which should be symlinked into the ASDF central registry in the usual fashion. It depends directly on the following systems:
You may find this library useful for the following purposes:
The CFFI and bordeaux-threads libraries are used to ease porting between lisp implementations. The mixer component of Mixalot relies on ALSA and as such will only run on Linux systems. In order to be useful in an application, the mixer must also run in its own (native) thread. Therefore, the mixalot and mixalot-mp3 systems are currently usable only on SBCL (with sb-thread) and Clozure Common Lisp under Linux. The mpg123-ffi system should be usable on any lisp that supports CFFI.
Mixalot implements an audio mixer which handles the mixing of multiple audio streams and playback through ALSA. Once created, a mixer runs in its own thread of execution, and other threads can add and remove audio streamers at any time. The terms "stream" and "streamer" are used interchangeably here, but the concrete objects in the lisp system are labelled "streamers" to avoid confusion with the Common Lisp stream classes.
All functions in this section reside in the mixalot package.
A mixer is host to zero or more streamers. In the mixalot package, there are functions to create/destroy mixers, and add/remove streamers from a mixer.
Creates a mixer running at the specified sampling rate, and returns it. The default ALSA device is opened and a thread for the mixer is started. The chosen sample rate should match the sampling rate of the data you intend to play through the mixer - the Mixalot mixer does not perform resampling.
Requests that mixer be shut down. This will occur after the current audio buffer has completed playback. When the mixer shuts down, each of its streams will be removed just as if mixer-remove-streamer had been called, so it is not necessary to manually clean up the streams when shutting down a mixer. Attempts to add additional streamers to mixer after calling destroy-mixer on it will result in an error.
Adds a streamer for playback by the given mixer. Once the streamer has been added, the mixer will begin calling the streamer functions (see Basic Streamer Protocol) to generate samples.
Returns two values: the streamer itself, and the time in samples since the start of the mixer when the streamer will begin playback.
Requests that streamer be removed from the mixer. The streamer-cleanup function will be called from within the mixer's thread when this occurs. Removal happens asynchronously because the stream may be generating audio at the time this function is called, and it would not be safe to clean it up until it is assured that the mixer will not call the mix/write functions again.
Removes all streamers belonging to the mixer.
Returns the time, measured in samples, since the mixer was created.
Returns the sampling rate in Hertz of the output and inputs to the mixer.
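Because mixer times are measured in samples, converting to and from seconds requires the sampling rate. The following is a minimal sketch of that arithmetic; samples->seconds and seconds->samples are hypothetical helper names and are not part of Mixalot.

```lisp
;; Hypothetical helpers (not part of Mixalot) for converting between
;; times measured in samples and times measured in seconds.

(defun samples->seconds (samples rate)
  "Return SAMPLES expressed in seconds at RATE Hz, as an exact rational."
  (/ samples rate))

(defun seconds->samples (seconds rate)
  "Return the whole number of samples nearest to SECONDS at RATE Hz."
  (values (round (* seconds rate))))
```

For instance, a time of 507904 samples at a rate of 44100 Hz corresponds to roughly 11.5 seconds.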
A streamer is any object serving as a source of samples which are mixed into the mixer's audio buffer by implementing the Basic Streamer Protocol. From the mixer thread, the functions streamer-mix-into and streamer-write-into are called periodically to generate audio. These two functions correspond to the cases of multiple or a single streamer being mixed, respectively, and it is only necessary to implement streamer-mix-into.
Defining a method on streamer-write-into allows an optimization in the case where only one stream is being mixed into the buffer, where the streamer can (and is expected to) overwrite the specified range of the buffer rather than mixing with the existing audio. The default method on streamer-write-into simply fills the specified range of the buffer with zeros before calling streamer-mix-into.
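The relationship between writing and mixing can be illustrated with a toy model that does not depend on Mixalot. In this sketch, toy-mix-into, toy-write-into and constant-streamer are hypothetical stand-ins, and the buffer holds plain integers rather than packed stereo samples; it is not the library's implementation.

```lisp
;; A toy model of the write/mix relationship, independent of Mixalot.
;; TOY-MIX-INTO and TOY-WRITE-INTO are hypothetical stand-ins for
;; streamer-mix-into and streamer-write-into.

(defgeneric toy-mix-into (streamer buffer offset length)
  (:documentation "Add STREAMER's output to the existing buffer contents."))

(defgeneric toy-write-into (streamer buffer offset length)
  (:documentation "Overwrite the buffer range with STREAMER's output."))

;; Default method: zero the requested range, then fall back on mixing.
(defmethod toy-write-into (streamer buffer offset length)
  (fill buffer 0 :start offset :end (+ offset length))
  (toy-mix-into streamer buffer offset length))

;; A trivial streamer that emits a constant value.
(defclass constant-streamer ()
  ((value :initarg :value :reader streamer-value)))

(defmethod toy-mix-into ((streamer constant-streamer) buffer offset length)
  (loop for index from offset below (+ offset length)
        do (incf (aref buffer index) (streamer-value streamer)))
  buffer)
```

Here constant-streamer only defines the mixing method, yet toy-write-into still works on it because the default method discards whatever was in the range before mixing.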
In many situations, a stream has only finite duration. In this situation, when the stream has reached its end, it can call mixer-remove-streamer from its write/mix method to remove itself from the mixer. This guarantees its mix/write methods will not be called again, and the mixer will call streamer-cleanup to ensure any resources associated with the streamer are released.
The simplest example of a streamer is a function object taking the arguments of streamer-mix-into. This is enabled by pre-existing methods on the streamer functions. Here is an example of a streamer implemented as a lambda function which produces a series of six tones with increasing frequency, then removes itself from the mixer:
(defun make-test-streamer ()
  (let ((n 0)        ; number of calls made so far
        (phase 0.0)) ; current phase of the sine wave, in radians
    (lambda (streamer mixer buffer offset length time)
      (declare (ignore time))
      (loop for index upfrom offset
            repeat length
            with freq = (+ 200 (* n 200))
            with dp = (* 2.0 pi freq 1/44100) ; phase step per sample at 44100 Hz
            as sample = (round (* 5000 (sin phase)))
            do
            (mixalot:stereo-incf (aref buffer index) (mixalot:mono->stereo sample))
            (incf phase dp))
      (incf n)
      ;; After six calls (one tone per call), remove ourselves from the mixer.
      (when (= n 6)
        (mixalot:mixer-remove-streamer mixer streamer)))))
Here is an example of how to create a mixer and play the test streamer defined above:
CL-USER> (defparameter *mixer* (mixalot:create-mixer))
*MIXER*
CL-USER> (mixalot:mixer-add-streamer *mixer* (make-test-streamer))
#<CLOSURE (LAMBDA (STREAMER MIXER BUFFER ...)) {1003F045A9}>
507904
CL-USER>
Here is the same streamer, implemented via classes and methods:
(defclass test-streamer ()
  ((n :initform 0)
   (phase :initform 0.0)))

(defmethod mixalot:streamer-mix-into ((streamer test-streamer) mixer buffer offset length time)
  (declare (ignore time))
  (with-slots (n phase) streamer
    (loop for index upfrom offset
          repeat length
          with freq = (+ 200 (* n 200))
          with dp = (* 2.0 pi freq 1/44100)
          as sample = (round (* 5000 (sin phase)))
          do
          (mixalot:stereo-incf (aref buffer index) (mixalot:mono->stereo sample))
          (incf phase dp))
    (when (= (incf n) 6)
      (mixalot:mixer-remove-streamer mixer streamer))))
These generic functions define the essential behaviors of an audio streamer. Of them, only streamer-mix-into is mandatory to implement.
Generate length samples of audio from streamer. The audio should be mixed into the provided buffer (of type sample-vector) starting from offset. Time indicates the time in samples since the creation of the mixer.
The provided mix-stereo-samples function or stereo-mixf macro can be used for mixing samples.
When a streamer has completed playback, it can call mixer-remove-streamer after generating its last batch of audio to remove itself from the mixer.
Generate length samples of audio from streamer. The contents of buffer (of type sample-vector) may be invalid and must be overwritten (not mixed into) starting from offset. This method exists as an optimization for the case of the first or only playing streamer, to avoid redundant zero-filling of the buffer and unnecessary mixing. The default method fills the buffer with zeros then calls streamer-mix-into, so implementing a more specific method for this is optional. Time indicates the time in samples since the creation of the mixer.
When a streamer has completed playback, it can call mixer-remove-streamer after generating its last batch of audio to remove itself from the mixer.
Notification that streamer has been removed from the mixer, and its streamer functions will not be called again. This allows the streamer to release resources no longer needed, such as open file handles or foreign memory.
Streamers are inherently pauseable, achieved by the mixer electing not to call on them to generate audio while they are paused (the mixer tracks the paused/playing state itself). Therefore, most streamers will have no reason to define more specific methods on these functions. A counterexample might be a streamer streaming audio over the network, which may want to adjust its buffering behavior when the stream is paused and unpaused.
Pause playback of the streamer through the mixer. If streamer is already paused, it remains paused.
Unpause playback of the streamer through the mixer. If streamer is already playing, it continues to play.
Returns true if playback of the streamer through the mixer is currently paused.
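The idea that pausing is handled by the mixer, rather than by the streamer, can be sketched with a toy model. Everything here is hypothetical (toy-mixer, toy-pause, toy-unpause, toy-paused-p, toy-run-pass) and streamers are modelled as functions of no arguments; the real mixer's internals may differ.

```lisp
;; A toy model (not Mixalot's implementation) of a mixer that tracks
;; the paused state itself and skips paused streamers on each pass.

(defstruct toy-mixer
  (streamers '())
  (paused (make-hash-table :test 'eq)))

(defun toy-pause (streamer mixer)
  "Mark STREAMER as paused. Pausing an already paused streamer is harmless."
  (setf (gethash streamer (toy-mixer-paused mixer)) t))

(defun toy-unpause (streamer mixer)
  "Mark STREAMER as playing. Unpausing a playing streamer is harmless."
  (remhash streamer (toy-mixer-paused mixer)))

(defun toy-paused-p (streamer mixer)
  "Return true if STREAMER is currently paused in MIXER."
  (values (gethash streamer (toy-mixer-paused mixer))))

(defun toy-run-pass (mixer)
  "Call every unpaused streamer once and collect the results."
  (loop for streamer in (toy-mixer-streamers mixer)
        unless (toy-paused-p streamer mixer)
          collect (funcall streamer)))
```

The streamers in this model carry no paused flag of their own, which is why most streamer classes need no methods for pausing.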
Some streamers may have the ability to seek to an offset during playback. These streamers must implement the Seekable protocol, comprised of the functions below. Users can call these functions to control and query stream position and length.
Returns true if the streamer is seekable. Attempting to seek or query the stream length or position may signal an error unless this returns true.
Returns the length, in samples, of the streamer. If this cannot be determined, returns nil.
Seek to a position, measured in samples, on streamer. Streamers are permitted to seek approximately (for instance, rounding to the nearest frame of compressed data). Specific streamer types may take additional arguments.
Issue: Should this return a boolean indicating success/failure? It's feasible that on compressed or network streams, it might not always be possible to seek at all, or the circumstances in which it may fail may be subtle. On the other hand, these streams could always roll their own protocol using the condition system.
Returns the position, measured in samples, of playback within the streamer.
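Approximate seeking on compressed data can be pictured as rounding the requested position to a frame boundary. The sketch below assumes a fixed frame size of 1152 samples (the size of an MPEG-1 Layer III frame) purely for illustration; nearest-frame-start is a hypothetical name, and no claim is made that any Mixalot streamer rounds in exactly this way.

```lisp
;; Hypothetical helper: round a sample position to the start of the
;; nearest frame, assuming every frame holds FRAME-SIZE samples.
(defun nearest-frame-start (position &optional (frame-size 1152))
  "Return the multiple of FRAME-SIZE closest to POSITION."
  (* frame-size (round position frame-size)))
```

A caller that needs the exact resulting position should query it after seeking rather than assume the requested value was honored.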
Several classes are provided for streaming audio directly from a lisp vector in memory. These vary in input format and come in regular and fast variants, where the fast version assumes the input vector is an appropriately specialized simple-array. The following functions construct these streamers:
Creates a vector-streamer streaming 16-bit mono audio from a range of vector defined by start and end.
Creates a vector-streamer streaming 16-bit mono audio from a range of vector defined by start and end. Vector is assumed to be of type (simple-array (signed-byte 16) 1).
Creates a vector-streamer streaming 16-bit stereo audio from a range of vector defined by start and end. Samples are read in an interleaved format with separate array elements for the left and right channels, such that one output sample corresponds to two input elements.
Creates a vector-streamer streaming 16-bit stereo audio from a range of vector defined by start and end. Samples are read in an interleaved format with separate array elements for the left and right channels, such that one output sample corresponds to two input elements. Vector is assumed to be of type (simple-array (signed-byte 16) 1).
Creates a vector-streamer streaming 16-bit stereo audio in joint format (see stereo-sample) from a range of vector defined by start and end.
Creates a vector-streamer streaming 16-bit stereo audio in joint format (see stereo-sample) from a range of vector defined by start and end. Vector is assumed to be of type sample-vector.
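The difference between the interleaved and joint layouts can be made concrete with a small conversion routine. This is a sketch in plain Common Lisp, assuming the joint format keeps the left channel in the low 16 bits of each 32-bit element; interleaved->joint is a hypothetical name and not a Mixalot function.

```lisp
;; Hypothetical helper: convert interleaved stereo data (L R L R ...)
;; into packed 32-bit words, one word per stereo sample.
(defun interleaved->joint (vector)
  "Pack each left/right pair of signed 16-bit values in VECTOR into one
32-bit word, left in the low half. A trailing unpaired element is dropped."
  (let* ((frames (floor (length vector) 2))
         (result (make-array frames :element-type '(unsigned-byte 32))))
    (dotimes (i frames result)
      (setf (aref result i)
            (logior (ldb (byte 16 0) (aref vector (* 2 i)))
                    (ash (ldb (byte 16 0) (aref vector (1+ (* 2 i)))) 16))))))
```

An interleaved vector of four elements therefore yields two joint-format samples, which is the sense in which one output sample corresponds to two input elements.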
For simplicity, Mixalot deals uniformly in 16-bit stereo audio, but the sampling rate may vary. The following type definitions relate to samples and audio buffers:
(deftype mono-sample ()
  '(or (signed-byte 16)
       (unsigned-byte 16)))
A mono sample is typically represented as a (signed-byte 16). Occasionally, it is convenient to use an unsigned representation instead.
(deftype stereo-sample () '(unsigned-byte 32))
This is the native representation of audio in Mixalot. Two 16-bit samples are combined into a single 32-bit word, with the left channel stored in the low 16 bits and the right channel stored in the high 16 bits.
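The packing scheme can be sketched in plain Common Lisp as follows. The names pack-stereo, sign-extend-16 and unpack-stereo are hypothetical and chosen to avoid confusion with the library's own functions; in particular, returning signed values when unpacking is an assumption of this sketch.

```lisp
;; A sketch of the packing scheme, independent of Mixalot.

(defun pack-stereo (left right)
  "Pack two signed 16-bit samples: LEFT in bits 0-15, RIGHT in bits 16-31."
  (logior (ldb (byte 16 0) left)
          (ash (ldb (byte 16 0) right) 16)))

(defun sign-extend-16 (x)
  "Interpret the low 16 bits of X as a signed two's-complement integer."
  (let ((low (ldb (byte 16 0) x)))
    (if (logbitp 15 low) (- low 65536) low)))

(defun unpack-stereo (sample)
  "Return two values: the signed left and right components of SAMPLE."
  (values (sign-extend-16 sample)
          (sign-extend-16 (ash sample -16))))
```

Negative samples occupy their 16-bit field in two's-complement form, so the packed word is always a non-negative integer that fits in (unsigned-byte 32).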
(deftype sample-vector () '(simple-array stereo-sample 1))
The sample-vector type denotes specialized one-dimensional simple-arrays of stereo samples.
Various functions are available for manipulating samples. All of the following functions are declared inline, with appropriately declared types, and compiled to reasonably efficient code at the time of this writing (using SBCL 1.0.28.27/x86_64).
Construct a stereo-sample from separate left and right components, each a mono-sample.
Expand a mono-sample to a stereo-sample by duplicating it on the left and right channel.
Returns the left channel component of a stereo-sample.
Returns the right channel component of a stereo-sample.
Returns two values, the left and right components of a stereo-sample.
Clamp an integer to the range -32,768...32,767.
Compute the sum of integers x and y, clamped to the range -32,768...32,767.
Returns the pairwise sum of the stereo-samples x and y, without clipping. The sums of the individual channels may wrap, but will not overflow into the other channel.
Returns the pairwise sum of the stereo-samples x and y, clamped to -32,768...32,767.
Increment the contents of place (a stereo-sample) by sample as if by add-stereo-samples.
Increment the contents of place (a stereo-sample) by sample as if by mix-stereo-samples.
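The distinction between wrapping and clamped sums is easiest to see on concrete numbers. The following sketch reimplements both behaviors on packed 32-bit words in plain Common Lisp, assuming the left channel occupies the low 16 bits; clamp-16, wrap-16, add-packed and mix-packed are hypothetical names, and this is not the library's (optimized) code.

```lisp
;; A sketch of wrapping versus clamped per-channel sums.

(defun clamp-16 (x)
  "Clamp integer X to the range -32768...32767."
  (max -32768 (min 32767 x)))

(defun wrap-16 (x)
  "Wrap integer X into the signed 16-bit range (two's complement)."
  (let ((low (ldb (byte 16 0) x)))
    (if (logbitp 15 low) (- low 65536) low)))

(defun add-packed (x y)
  "Pairwise wrapping sum: each channel may wrap, but no carry crosses over."
  (flet ((channel (shift)
           (ldb (byte 16 0)
                (+ (ldb (byte 16 shift) x) (ldb (byte 16 shift) y)))))
    (logior (channel 0) (ash (channel 16) 16))))

(defun mix-packed (x y)
  "Pairwise sum with each channel clamped to the signed 16-bit range."
  (flet ((channel (shift)
           (ldb (byte 16 0)
                (clamp-16 (+ (wrap-16 (ldb (byte 16 shift) x))
                             (wrap-16 (ldb (byte 16 shift) y)))))))
    (logior (channel 0) (ash (channel 16) 16))))
```

Summing left-channel values of 30000 and 10000 wraps around to -25536 with add-packed, whereas mix-packed saturates at 32767, which is usually the less objectionable result when loud streams overlap.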
The mpg123-ffi system presents a thin wrapper around libmpg123 and should, given the source code, be self explanatory in combination with the original libmpg123 documentation. It provides bindings to most of the API, excluding the advanced parameters API and a few functions dealing with feeding MPEG frames directly into the decoder (and patches for these are welcome).
It also manages library initialization and provides helper functions for retrieving ID3 tags and decoding MP3 files. The ID3 decoding is intended to be robust in the face of the badly formed or badly encoded input common in MP3 files, and any failure to handle such input should be reported as a bug.
The functions are exported from the mpg123 package and comprise a thin wrapper in the sense that most of the C functions are exported directly with no extra lisp wrapper. A few exceptions involving out parameters and other awkward idioms are wrapped to use multiple return values instead; only these are documented explicitly below. For all other functions, see mpg123.lisp.
Ensures that libmpg123 has been initialized.
An error from the mpg123 library.
An error from the mpg123 library in the context of a handle.
Examines the code returned by a call to libmpg123 (excluding those functions which act on handles). If it indicates an error, it translates the error to a descriptive string and signals an error of type mpg123-error. Circumstance is a string, included in the error report, intended to indicate which library call produced the error or what the programmer was attempting to do.
Examines the value returned by a call (in the context of the mpg123_handle pointer handle) to libmpg123. If it indicates an error, it translates the error to a descriptive string and signals an error of type mpg123-handle-error. Circumstance is a string, included in the error report, intended to indicate which library call produced the error or what the programmer was attempting to do. If no error occurred, value is returned.
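The general pattern of checking a return code and signalling a condition that carries the circumstance string can be sketched as follows. This is a toy illustration only: toy-library-error and toy-check-error are hypothetical names, and treating every negative value as an error is an assumption of the sketch rather than a statement about how libmpg123 reports errors.

```lisp
;; A toy version of the check-and-signal pattern, independent of mpg123-ffi.

(define-condition toy-library-error (error)
  ((circumstance :initarg :circumstance :reader error-circumstance)
   (text :initarg :text :reader error-text))
  (:report (lambda (condition stream)
             (format stream "~A: ~A"
                     (error-circumstance condition)
                     (error-text condition)))))

(defun toy-check-error (circumstance value)
  "Signal TOY-LIBRARY-ERROR if VALUE is negative; otherwise return VALUE."
  (when (minusp value)
    (error 'toy-library-error
           :circumstance circumstance
           :text (format nil "library returned code ~D" value)))
  value)
```

Returning the value when no error occurred lets such a check be wrapped directly around a foreign call without disturbing the surrounding code.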
Returns a list of generally available decoders.
Returns a list of decoders supported by the CPU.
Given a handle with an opened file, returns three values indicating the current format: rate (in Hertz), channels (1 or 2), and encoding (an integer, one of MPG123_ENC_... constants).
Examines ID3v1 and ID3v2 tags from an mpg123 handle, returning a plist containing a subset of the following keys:
For all keys except :track, the property value will be a string. If :track is present, it will be an integer.
Prefers taking values from ID3v2 tags, but falls back to the ID3v1 tag when necessary. Some heuristics are applied to correct or reject obviously incorrect property values. Specifically, various forms of year are canonicalized to a four-digit form if a reasonable guess can be made, and track numbers of zero are removed from the plist.
Issue: Currently only extracts track numbers from the ID3v2 TRCK field, but should definitely fall back to the ID3v1.1 track number field if this is not present.
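Since the result is an ordinary property list containing only a subset of the possible keys, callers should be prepared for any key to be absent. A minimal sketch of reading the :track property with getf follows; describe-track is a hypothetical helper, not part of mpg123-ffi.

```lisp
;; Hypothetical helper: read the :track property from a tag plist,
;; tolerating its absence.
(defun describe-track (tags)
  "Return a short description of the :track entry of the plist TAGS."
  (let ((track (getf tags :track)))
    (if track
        (format nil "Track ~D" track)
        "Track unknown")))
```

Because zero track numbers are removed from the plist, a missing :track and an unusable one look the same to the caller.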
Open an MP3 file, examine its tags using get-tags-from-handle, and return the resulting plist. Unless verbose is true, messages from the library are suppressed.
By default, the filename will be encoded as iso-8859-1 (latin-1) before being passed to C; to avoid encoding problems on non-ASCII filenames, it is recommended you obtain the filename string with this in mind. You can change this behavior using the :character-encoding keyword, which can take any valid Babel encoding specifier (another reasonable choice is :utf-8b, but again, this choice should be congruent with how the string was obtained from the filesystem).
Opens and decodes an MP3 file, returning a vector of samples, a (simple-array (signed-byte 16) 1), interleaved, with one array element per channel. The :character-encoding keyword chooses how the filename is encoded as a C string. Additionally, the rate, channels, and encoding of the file are returned (see mpg123-getformat). Unless verbose is true, messages from the library are suppressed.
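Given a decoded vector with one element per channel, the playing time follows from its length, the channel count, and the rate. A minimal sketch of that calculation is shown here; decoded-duration is a hypothetical helper that takes the returned values as ordinary arguments.

```lisp
;; Hypothetical helper: duration of decoded, interleaved audio.
(defun decoded-duration (samples rate channels)
  "Return the duration in seconds (as an exact rational) of SAMPLES, an
interleaved vector holding one element per channel, at RATE Hz."
  (/ (length samples) channels rate))
```

A stereo file decoded at 44100 Hz thus needs 88200 array elements for each second of audio.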
The mixalot-mp3 system builds on mpg123-ffi to implement a seekable streamer class for use with the mixer, allowing incremental decoding and playback of MP3 files from disk. The following functions reside in the mixalot-mp3 package:
A seekable Mixalot streamer for playing MP3 files.
Create and return an mp3-streamer which opens and incrementally decodes from the given filename. If the file cannot be opened, an error of type mpg123-handle-error is signalled. The output is resampled according to output-rate, which for correct playback should match the rate of the mixer the stream will be used with.
The class argument allows specifying an alternate class to make-instance. This class must be a subclass of mp3-streamer. Remaining args and output-rate are passed to make-instance as initargs.
Note: While libmpg123 can perform automatic resampling, the resulting audio quality is very poor, and in practice you shouldn't do it. ALSA's dmix does a much better job, although resampling through dmix has its own issues (namely, unpredictable and often unreasonably high CPU usage when more than one program uses the device simultaneously).
Release foreign resources held by the mp3-streamer. In normal operation, streamer-cleanup calls this automatically when the streamer is removed from its mixer. It's only necessary to call this yourself if the mp3-streamer has never been added to a mixer, but you want to dispose of it.
Returns the native sampling rate of the file behind an mp3-streamer. This may be different from the output sampling rate.
Example of creating and playing an MP3 stream:
CL-USER> (mixalot:mixer-add-streamer *mixer*
           (mixalot-mp3:make-mp3-streamer "/home/hefner/test.mp3"))
#<MIXALOT-MP3:MP3-STREAMER {10029B9A91}>
8463564800
Various ideas for improvement, which might appear on an as-needed basis:
Report bugs to Andy Hefner <ahefner at gmail dot com>