The colon operator really is the fastest.

Way back when I was first learning R, I ran across an old listserv post that talked about how the colon (:) operator was the fastest way to generate a sequence. I never really thought about it, but I got in the habit of always using it whenever I needed a sequence.

Anecdotally, I knew from running a few simulations that seq() should be avoided if you’re generating a lot of small sequences repeatedly, but that’s a relatively rare case. Is the colon operator really that much faster than the alternatives — seq(), seq.int(), or seq_len() — in general cases?

Turns out the answer is “yes”, most of the time. Running a simple microbenchmark script, I timed all four approaches on sequences ranging in length from 10^1 to 10^9. Then I plotted the mean run time on a log-log plot, with bars marking the 2.5th and 97.5th percentiles.
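For anyone who wants to try a comparison like this, here is a minimal sketch of one way to set it up. It is not the original script: it assumes the microbenchmark package is installed (the timing step is skipped if it isn't), and the sequence length n and the repetition count are arbitrary values picked for illustration.

```r
# Illustrative sketch only; n and times are arbitrary choices,
# not the values used for the plot described in this post.
n <- 10L

# The four approaches being compared. Each should produce 1, 2, ..., n.
results <- list(
  colon   = 1:n,
  seq     = seq(1, n),
  seq_int = seq.int(1, n),
  seq_len = seq_len(n)
)

# Sanity check: every approach yields the same values as the colon operator.
same <- vapply(results, function(x) isTRUE(all.equal(x, 1:n)), logical(1))
print(same)

# Timing step, run only if the microbenchmark package is available.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  timings <- microbenchmark::microbenchmark(
    colon   = 1:n,
    seq     = seq(1, n),
    seq_int = seq.int(1, n),
    seq_len = seq_len(n),
    times   = 1000L
  )

  # Mean and median per expression, in the unit microbenchmark reports.
  print(summary(timings)[, c("expr", "mean", "median")])

  # 2.5th and 97.5th percentiles of the raw timings (nanoseconds).
  print(sapply(split(timings$time, timings$expr),
               quantile, probs = c(0.025, 0.975)))
}
```

To cover a range of sizes, the same block could be wrapped in a loop over n, keeping in mind that the largest sequences need a lot of memory and time per repetition.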

If you’re generating large sequences, it really doesn’t seem to matter which function you use, but for the common cases (e.g., slicing a vector or enumerating a loop), the colon operator outperforms the others. I’m not really sure there’s a lesson here except to trust R listserv posts and use : as often as possible. Code here.
