Making Sound A Vision

By Peter Späth

The combination of sound and vision has been an integral part of human nature from the very beginning. In a very simple form of correspondence, people moved their mouths to produce audible speech, and they moved their hands to beat sticks, producing a percussive sound. In nature, thunder follows a lightning strike, and the leaves of a tree move while the wind makes a blowing sound. In the course of our electronic civilization this immediate correspondence somehow got lost, for sound became something more abstract coming out of loudspeakers, with nothing moving while the sound is playing. To compensate for this, and with the age of computation being born, programmers tried to combine sound and vision, either algorithmically producing sound from a video or some pictures, or the other way round, producing a video from the sound. While the former never left the stage of academic curiosity, probably because it is so hard to decide what is good and what is bad music, the latter gained considerable attention. This is an outcome of both increasingly powerful hardware and the ability of the human visual system to find images interesting that show artificial patterns evolving in time. In the end there were several programs that could help with this audio visualization idea.

Ever since I became able to write computer programs, I have been looking for interesting programs that provide flexible and easy-to-understand tools for building such audio visualization projects. The need for flexibility of course came from the desire to make things interesting. The need for ease came, I have to admit, partly from being lazy, but also from the fact that things that are easy and at the same time not uninteresting tend to have a longer lifetime. For a private project two years ago I was once again scanning the software market for a realtime audio visualization suite and, being inclined to prefer my Ubuntu Linux box over my Windows machine, came across VSXu from Vovoid Media Technologies AB (R), which looked extremely promising. It provided a way to graphically place modules on a canvas, connect them, and configure them to build a visualization setup on top of OpenGL, the standard for addressing computer graphics hardware in the Linux world. Although I am not really averse to programming against the OpenGL API directly, this graphical approach, despite its disadvantages, seemed appealing to me, and I wanted to try VSXu for some projects. Some important features were missing, though, a couple of usability improvements were needed, and there had been a year-long period of inactivity concerning updates to the suite. So I dived into the code, found it in parts brilliant and in other parts understandable enough to make my own adjustments, and then decided to work on my own fork of this open-source software. ThMAD was born (Thinking Machines Audio Dreams).

To start with a simple ThMAD visualization project, you basically connect a pixel data producer to a renderer, probably add some modifiers to take light, material, and camera (or eye) position into account, and connect everything to the singleton output module. The beauty of VSXu, and later of ThMAD, is that you can watch the output while you are assembling things, which speeds up the development of your visualization project. And most of the parameters of the modules you add can be driven by controller modules, which connect them to the time, to sequences, to oscillators, of course to sound input, and more. A very simple visualization project needs little more than that.
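For readers who think in code rather than in canvases, the wiring idea described above can be mimicked with a small toy sketch in Python. This is not ThMAD's or VSXu's API, and it draws nothing: the class names (Oscillator, CircleProducer, Renderer, Output) and their parameters are invented purely for illustration. It only shows a producer feeding a renderer feeding a single output, with one parameter driven by a controller instead of a fixed value.

```python
import math
from dataclasses import dataclass
from typing import Callable, Union

# A parameter is either a fixed number or a controller: a callable of time t.
Param = Union[float, Callable[[float], float]]


def resolve(param: Param, t: float) -> float:
    """Return the value of a parameter at time t (in seconds)."""
    return param(t) if callable(param) else param


@dataclass
class Oscillator:
    """Hypothetical controller module: a sine wave mapped to [low, high]."""
    frequency_hz: float
    low: float = 0.0
    high: float = 1.0

    def __call__(self, t: float) -> float:
        wave = math.sin(2.0 * math.pi * self.frequency_hz * t)  # -1 .. 1
        return self.low + (self.high - self.low) * (wave + 1.0) / 2.0


@dataclass
class CircleProducer:
    """Hypothetical data producer: outline points of a circle."""
    radius: Param = 1.0
    points: int = 8

    def evaluate(self, t: float) -> list:
        r = resolve(self.radius, t)
        step = 2.0 * math.pi / self.points
        return [(r * math.cos(i * step), r * math.sin(i * step))
                for i in range(self.points)]


@dataclass
class Renderer:
    """Hypothetical renderer: bundles geometry with a brightness value."""
    source: CircleProducer
    brightness: Param = 1.0

    def evaluate(self, t: float) -> dict:
        return {"vertices": self.source.evaluate(t),
                "brightness": resolve(self.brightness, t)}


@dataclass
class Output:
    """Hypothetical stand-in for the single output module."""
    source: Renderer

    def frame(self, t: float) -> dict:
        return self.source.evaluate(t)


# Wire the modules: oscillator -> producer radius -> renderer -> output.
pulse = Oscillator(frequency_hz=1.0, low=0.5, high=1.5)
screen = Output(Renderer(CircleProducer(radius=pulse, points=4), brightness=0.8))

for t in (0.0, 0.25, 0.75):
    first_x = screen.frame(t)["vertices"][0][0]
    print(f"t={t:.2f}s  radius={first_x:.2f}")
```

In this sketch a sound-driven controller would simply be another callable of time passed in place of the oscillator, for example one that returns the current input level.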

For people used to working with VSXu, considerable effort has been invested to make most VSXu visualizations work in ThMAD as well.

The idea to write a book came while I was documenting the existing functionality and the adjustments I had made, so that others can benefit from them. The book "Audio Visualization Using ThMAD" is the outcome of this idea, and I hope it can be useful for both artists and developers interested in the subject.

Best regards and happy visualizing, your Author.

About the Author

Dr. Peter Späth has worked as an IT consultant with a heavy focus on Java-related development for over 15 years. Recently, Peter has decided to focus on his work as an author and on developing software at his own pace.

Want more? Pick up Peter's book, Audio Visualization Using ThMAD: Realtime Graphics Rendering for Ubuntu Linux, now available on Apress.com.