
Way to convert .wav file waveform into video .mjpeg file

asked 2013-06-11 05:56:52 -0500

odod

Hello, I'm searching for a way to convert an array of values (44100 samples per second, 16 bits each) into a series of plot (graph) images (24 per second), and then to make a video out of them, so that I'd have a converter from .wav to .mjpeg showing the waveform moving as if you were watching the file being read by a DAW with a fixed playhead. I've heard about OpenCV while searching for a way to do this. Do you guys think OpenCV could help me achieve this?


2 answers


answered 2013-06-11 07:23:10 -0500

berak

"Do you guys think OpenCV could help me achieving this?"

No, I'm afraid not; there is no audio support of any kind built in.

OpenCV is a nice machine vision/learning lib, but you probably want something from the ffmpeg universe instead.
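For example, here's a rough sketch of that route in plain Python (no OpenCV): rasterize one waveform image per video frame, which could then be piped to ffmpeg. The frame size, frame rate and the synthesized sine are just placeholders for the example.

```python
import math

SAMPLE_RATE = 44100
FPS = 24
WIDTH, HEIGHT = 320, 120  # size of each video frame

def waveform_frame(samples, width=WIDTH, height=HEIGHT):
    """Rasterize one window of 16-bit samples into a grayscale
    bytearray image (0 = black background, 255 = white trace)."""
    img = bytearray(width * height)
    n = len(samples)
    for x in range(width):
        s = samples[min(x * n // width, n - 1)]
        y = int((s / 32768.0 + 1.0) * 0.5 * (height - 1))
        img[y * width + x] = 255
    return img

# Synthesize one second of a 440 Hz sine instead of reading a .wav,
# so the sketch runs without an input file.
samples = [int(20000 * math.sin(2 * math.pi * 440 * t / SAMPLE_RATE))
           for t in range(SAMPLE_RATE)]

hop = SAMPLE_RATE // FPS  # samples covered by each video frame
frames = [waveform_frame(samples[i:i + hop])
          for i in range(0, SAMPLE_RATE - hop + 1, hop)]

# The raw frames could then be piped on stdin to something like:
#   ffmpeg -f rawvideo -pix_fmt gray -s 320x120 -r 24 -i - -c:v mjpeg out.avi
```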


answered 2013-07-10 15:32:15 -0500

leye0

Very interesting question! But it jumps from an apple to a chicken to a planet.

From my experience with OpenCV, audio signals and spectra, what I will tell you is similar to berak's answer. I've written it out as a recipe for you and other coders.

The path to follow is this one:

  1. Make sure that, whatever language you are using, you have good access to the audio buffer.

  2. To achieve that, either find a good library that can handle any .wav file and convert it to a specific format (e.g. 44100 Hz, mono or stereo depending on your needs), or read up on the specification of the format you want to analyse.

  3. Now make sure you can access your audio data easily.

  4. You'll have to decide what "time window" each spectrum-analysis image will cover: 20 ms, 100 ms or 1 s, it's up to you. What you can do is take something like 10 captures per second, each only 10 or 20 ms long, so you don't process the complete sound and save on performance. On the other hand, if you're not doing live conversion, maybe you can do a full pass over the sound data.

  5. That being said, pass each of these buffers containing a short sound sample through an FFT algorithm. From my experience, you'll have some doubts about whether the FFT snippets you find actually work, but they generally do when well implemented.

  6. Generally, an FFT method receives a chunk of data (one of your sound samples) as input and spits out the amplitude at each frequency. It looks a bit like this: 20 Hz = 2, 40 Hz = 2, 60 Hz = 10, 80 Hz = 20, 100 Hz = 30, 120 Hz = 30, and so on. So you can easily sketch a picture of the spectrum by drawing vertical lines.
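As a rough illustration of steps 4 to 6, here's a minimal sketch in Python (a textbook radix-2 FFT, not production code; the 1 kHz test tone and the 1024-sample window are arbitrary choices):

```python
import cmath
import math

SAMPLE_RATE = 44100

def fft(x):
    """Minimal recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return x
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

# One short "time window" of a 1 kHz sine: 1024 samples (~23 ms at 44.1 kHz).
N = 1024
window = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(N)]

# One amplitude per frequency bin (only the first half is meaningful
# for real input); each bin is SAMPLE_RATE / N ~ 43 Hz wide.
spectrum = [abs(c) for c in fft(window)[:N // 2]]
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * SAMPLE_RATE / N
```

You could then draw one vertical line per bin, with the line height taken from `spectrum`, to get the picture described in step 6.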

I am very limited in my mathematics and physics knowledge, but that's the path I've always followed, whether to draw spectra in media players I've designed or to analyse sound.

I'm pretty sure that with some imagination you can combine it with some OpenCV algorithms and do something really hot.

