The cost of context switching

Whether out of habit or clarity of reading, I often find myself writing shell recipes like:

$ cat file.txt | grep foo

...instead of...

$ grep foo < file.txt

...in order to read a file and then search it for some text. There's something about using cat and piping it into another command that is easier for me to follow.

However, this pattern comes at a price. How expensive is it?
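
As a quick sanity check, here is a minimal sketch showing that the two forms print the same matches; only the plumbing differs. The sample file and its contents are made up for the demonstration.

```shell
# Hypothetical sample file, created only for this demonstration.
sample=$(mktemp)
printf 'foo one\nbar two\nfoo three\n' > "$sample"

# Pipeline form: cat reads the file and copies it into a pipe for grep.
piped=$(cat "$sample" | grep foo)

# Redirect form: the shell opens the file as grep's stdin, so there is
# no extra process and no pipe.
redirected=$(grep foo < "$sample")

# Both variables should now hold the same matching lines.
[ "$piped" = "$redirected" ] && echo "same output"

rm -f "$sample"
```

The first form starts two processes joined by a pipe; the second starts one.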

pv

Enter pv, a tool to monitor the progress of data through a pipe.

This is a neat tool which you can insert between any piped expressions to see the flow of data between processes. It has a number of flags which can alter the output format, or even rate limit the flow of data between said processes!

$ cat file.txt | pv | grep foo

This would not only show the results from grep, but pv would also print a progress bar and data transfer rate. So now we can measure the speed of various CLI recipes.
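
For reference, a small sketch of the kind of flags I mean, assuming pv is installed: -L caps the transfer rate, and -q silences the progress display so pv acts purely as a throttle. The 64KiB size and the 1MiB/s limit are arbitrary values chosen for the example.

```shell
# Only run the demonstration when pv is available.
if command -v pv >/dev/null 2>&1; then
    # Push 64KiB of zeros through pv, capped at about 1MiB/s, and count
    # the bytes that come out the other end.
    pv_bytes=$(head -c 65536 /dev/zero | pv -q -L 1m | wc -c | tr -d ' ')
else
    pv_bytes=skipped
fi
echo "bytes through pv: $pv_bytes"
```

Dropping -q and adding -r, -b, or -t instead selects the rate, byte count, or elapsed time for display.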

Slow to fast

  • Using a Raspberry Pi 2b
  • Transferring 1GiB

I'm not sure exactly why, but yes is pretty slow:

$ yes | head --bytes $((1024**3)) | pv | cat > /dev/null
1GiB 0:01:12 [14.2MiB/s]  

Let's swap yes for cat /dev/zero to see if this is any faster:

$ cat /dev/zero | head --bytes $((1024**3)) | pv | cat > /dev/null
1GiB 0:00:07 [ 133MiB/s]  

Wow. What if we remove the initial cat and have head read from /dev/zero directly?

$ head --bytes $((1024**3)) < /dev/zero | pv | cat > /dev/null
1GiB 0:00:04 [ 247MiB/s]  

And if we remove the final cat and have pv write to /dev/null directly?

$ head --bytes $((1024**3)) < /dev/zero | pv > /dev/null
1GiB 0:00:03 [ 287MiB/s]  

We've gone from 133MiB/s to 287MiB/s, a ~2x improvement, just by removing the leading and trailing cat calls!
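
To put numbers on each step, here is a small sketch of the arithmetic using awk. The throughput figures are copied from the pv output above; the variable names and the speedup helper are made up for the example.

```shell
# Throughput figures (MiB/s) copied from the pv output above.
with_both_cats=133
without_first_cat=247
without_any_cat=287

# speedup NEW OLD: print NEW/OLD rounded to two decimal places.
speedup() {
    awk -v new="$1" -v old="$2" 'BEGIN { printf "%.2f\n", new / old }'
}

step_one=$(speedup "$without_first_cat" "$with_both_cats")   # dropping the first cat
step_two=$(speedup "$without_any_cat" "$without_first_cat")  # dropping the last cat
overall=$(speedup "$without_any_cat" "$with_both_cats")      # both combined

echo "first cat removed: ${step_one}x"
echo "last cat removed:  ${step_two}x"
echo "overall:           ${overall}x"
```

Most of the gain comes from dropping the first cat.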

Ultimately, using pv without any other processes will yield the best performance. However, without head there is nothing capping the amount of data read, so I will just manually end it after 10 seconds:

$ pv < /dev/zero > /dev/null
8.32GiB 0:00:10 [ 861MiB/s]  
^C
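
As an aside, newer pv builds appear to be able to stop on their own: with -S (--stop-at-size), pv exits once the size given with -s has been transferred. This is only a sketch under the assumption that the installed pv supports -S, so it checks the help text first. The 64KiB size is just an example value, and none of the numbers above were measured this way.

```shell
# Check that pv exists and that its help text mentions --stop-at-size.
if command -v pv >/dev/null 2>&1 &&
   pv --help 2>&1 | grep -q -e '--stop-at-size'; then
    # -s sets the expected size, -S makes pv stop once that much has been
    # written, and -q hides the progress display so only the count is printed.
    stopped_bytes=$(pv -q -S -s 65536 < /dev/zero | wc -c | tr -d ' ')
else
    stopped_bytes=skipped
fi
echo "bytes before pv stopped: $stopped_bytes"
```

For the 1GiB run, the equivalent would presumably be pv -S -s 1g < /dev/zero > /dev/null.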

(I suppose I could have used a RAM disk, but with the Raspberry Pi limited to 1GB of RAM, that is not really an option.)


So I should really start having tools read from and write to their destinations directly, rather than reaching for pipes so liberally.
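
In practice that can be as simple as handing the file name to the tool or redirecting into it. A minimal sketch, using a made-up sample file; grep, wc, and sort are just examples of tools that can open their input (and, for sort -o, their output) themselves.

```shell
# Hypothetical input and output files, created only for this demonstration.
input=$(mktemp)
output=$(mktemp)
printf 'pear\nfoo\napple\n' > "$input"

# grep opens the file itself when given a file name argument.
matches=$(grep -c foo "$input")

# wc reads the file through a redirect instead of cat and a pipe.
lines=$(wc -l < "$input" | tr -d ' ')

# sort reads the input file and writes the result itself via -o.
sort -o "$output" "$input"
first=$(head -n 1 "$output")

echo "matches=$matches lines=$lines first=$first"
rm -f "$input" "$output"
```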