Skip to content

Timestretching with FFT

March 19, 2012
tags: ,

Lately I’ve been building an ensemble that can apply both a time-stretch and a pitch-shift effect in Reaktor. Today I’ll outline a basic time-stretch effect using FFT, and tomorrow I’ll show a way that we can extend the algorithm (using the Frequency Detector from a few days ago) to add pitch-shifting.

First things first. Since we’ll be changing the speed of playback, we’ll need to have access to the whole file that we’ll be speeding up or slowing down. There are several reasons for this: taking real time input it would be impossible to speed up the data, not knowing what is coming next; slowing down we would need an infinitely large array, or at least one that could grow dynamically, something that we do not have access to in Reaktor.

For the purposes of this post, I’ve decided to go with a Lookup module to store the sound file. This file is read, and sent thru FFT, the data of which gets stored in some core arrays. Since core arrays have a maximum 1,000,000 values, we can store about 20 seconds of FFT data with a 44.1KHz sample rate. Storing the data is a pretty simple process – we simply read the data from the Lookup module, run it thru an ezFFT 512 module, and then store the data at the appropriate index of the arrays (one for the amplitude, another for the phase).

Now every 512 sample block in these arrays represents the spectral data for that 512 sample point in time. This is called a frame. If we simply play them back at normal speed and send them through an iFFT 512 module we’ll get back our original sound. Of course we want the file to be able to play back at normal speed so it’s worth thinking about doing that properly first. One way we could do this would be have a Clock, (which I’ll call C for simplicity) that gets increased by one every time the FFT index = 0  (IE, at the beginning of a new frame). Then we multiply C by the FFT size (512) and add the total to  the current FFT index. This number will give us the array index to read back.  Of course, this is thoroughly uninteresting – all we’ve done is save a file as FFT data and then play it back using an expensive method that also causes latency.

Fortunately, it is very easy to go from here to time-stretching. Instead of increasing C by 1 each time index = 0, we increase it by a variable, Speed. So if Speed = 1, the clock will advance as before and nothing will change. But let’s say the Speed = 0.5 and consequently at some point C = 1.5 – in this case we would read the data as if C = 1, and also read as if C = 2.  Then we can perform a linear interpolation on the values (simply crossfade between them) using the remainder, in this case 0.5.

Here’s the whole core cell which does all the work:

Core Engine

As you can see, there are only two macros between the factory ezFFT modules (the Vec2Pol and Pol2Vec are a part of the ezFFT distribution). These are incredibly simple macros, and they look like this:

Simple core cells

It’s really that easy! There are a few problems with this technique – if the speed becomes too large, around 2 or more, whole chunks of data start getting dropped entirely, there is latency associated with the iFFT module, etc. But overall it sounds pretty good, take a listen here. The sample starts out with Speed = 2, then drops to 1.5, 1, 0.5 and 0.25, to demonstrate the range of the effect.

I’ll post a copy of the ensemble with the pitchshift added to it tomorrow.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: