Parallelism
Apr. 28th, 2012 07:46 pm
Both my work computer and my own personal computer have dual-core processors.
This means that in order to get the best out of them when doing large processing jobs, I need to run more than one thread at a time. And I run up against problems exploiting this parallelism, which mirror the problems that exist in general-purpose code in adapting to the new multi-threaded multi-core paradigm.
The sort of jobs I ask my machines to do typically contain a *lot* of parallelism. They're things like "Compile all of these files", or "Run all of these simulations". There's tons of parallelism in that sort of job that could be exploited.
So, how easy is it to exploit that parallelism? Surprisingly difficult in the general case, as it happens.
In the specific case of compiling a whole bunch of source files, it happens to be easy, because the tool I use for that, GNU make, has specific support for running jobs in parallel. But that support is built into one special-purpose tool. So, exploiting parallelism for compiling things is more or less a solved problem.
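For the record, that support is just make's -j flag; on a dual-core box something like

make -j2    # build as usual, but keep at most two jobs running at once

keeps both cores busy without swamping the machine.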
But special-purpose support in a special-purpose tool is not the UNIX Way[tm]. make(1) is great for compiling things, but not much else. Its language is very heavily geared to its task. In general, for scripting large numbers of tasks like simulations, the go-to tool is a Bourne (or, these days, perhaps bash) shell script.
The shell language happens to have a neat built-in feature which sounds like it *should* be exactly what you need.
The '&' character, added to the end of a command line, will spawn a new process, and allow that process to proceed in parallel with the main thread of execution. Sounds good, right? I can do
for j in $executables; do
    simulate $j &
done
and kick off all my jobs in parallel. In theory, that's exactly what I want. In practice, it's far from it.
Say I have 100 simulation jobs. The above script kicks off all 100 at the same time. I only have 2 processors and a limited amount of RAM, so each of those jobs competes for the same 2 processors and for residency in RAM, leading to memory thrashing and potentially to jobs dying for lack of memory.
The right way to manage parallel jobs is actually the way (more or less) that make does it: if you have two processors, there's not much point in having more than two jobs running simultaneously. So if you ask make(1) to compile 100 files using only 2 processors, it will start up two jobs and wait for one of them to complete before it starts the next. This is great because it means you only need enough RAM to run as many simultaneous jobs as it takes to keep your processors busy.
And that's what I wish sh/bash would give me. An operator whose semantics are not "start a new process and continue executing this one", but "start a new process and *possibly* continue executing if resources allow."
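The closest I can fake up in plain sh today is something crude like running the jobs in fixed-size batches:

n=0
for j in $executables; do
    simulate "$j" &
    n=$((n+1))
    # after every two jobs, wait for the whole batch to finish
    if [ "$n" -ge 2 ]; then
        wait
        n=0
    fi
done
wait    # pick up any leftover job from an odd-length list

But that's not what make -j does: if one job in a batch takes an hour and the other takes a minute, a processor sits idle for the rest of that hour, because plain wait can only wait for everything at once (or for one specific PID); there's no way to say "wait for whichever job finishes first".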
What do you think the chances of me getting such a modification into the mainline bash are?
These semantics are one of the reasons I really like the Cilk language: its fundamental parallelism operation is a non-blocking function call, whose semantics say that the function call *may* execute in parallel, but it doesn't necessarily do so. The non-blocking function call isn't much more heavyweight than a regular function call, so it's easy for the programmer to just say "there's some parallelism here that the runtime can exploit if it wants", without swamping the system.
For my own purposes, I have a tool which acts as a cross between 'xargs' and 'make -j', taking standard input and combining it with command line arguments to produce multiple command lines, and managing how many of those run in parallel at a time. *That* is the UNIX way: a single small tool that does one specific thing.
So I can apply it to shell scripts with only a minimum of fuss:
for j in $executables; do
    echo simulate $j
done | forkargs -j2
(Yeah, I could have just done 'echo $executables | forkargs -j2', but the first form is more useful in a shell script with more complex setup.)
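For what it's worth, GNU xargs can be pressed into similar service with its -P flag, which caps how many commands it runs at once:

for j in $executables; do
    echo "$j"
done | xargs -n 1 -P 2 simulate

That keeps at most two simulate processes going at a time, with the usual xargs caveats about whitespace in the input.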
I still wish it were built into the shell, though. Ho hum.
END RANT.