Parallelism
Apr. 28th, 2012 07:46 pm
Both my work computer and my own personal computer have dual-core processors.
This means that in order to get the best out of them when doing large processing jobs, I need to run more than one thread at a time. And I run up against problems exploiting this parallelism, which mirror the problems that exist in general-purpose code in adapting to the new multi-threaded multi-core paradigm.
The sort of jobs I ask my machines to do typically contain a *lot* of parallelism. They're things like "Compile all of these files", or "Run all of these simulations". There's tons of parallelism in that sort of job that could be exploited.
So, how easy is it to exploit that parallelism? Surprisingly difficult in the general case, as it happens.
In the specific case of compiling a whole bunch of source files, it happens to be easy, because the tool I use for that, GNU make, has specific support for running jobs in parallel. But that support is built into one special-purpose tool. So, exploiting parallelism for compiling things is more or less a solved problem.
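For the record, that support is just make's -j flag; on a dual-core box something like

make -j2    # build as usual, but keep at most two jobs running at once

keeps both cores busy without swamping the machine.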
But special-purpose support in a special-purpose tool is not the UNIX Way[tm]. make(1) is great for compiling things, but not much else. Its language is very heavily geared to its task. In general, for scripting large numbers of tasks like simulations, the go-to tool is a Bourne (or, these days, perhaps bash) shell script.
The shell language happens to have a neat built-in feature which sounds like it *should* be exactly what you need.
The '&' character, added to the end of a command line, will spawn a new process, and allow that process to proceed in parallel with the main thread of execution. Sounds good, right? I can do
for j in $executables; do
    simulate $j &
done
and kick off all my jobs in parallel. In theory, that's exactly what I want. In practice, it's far from it.
Say I have 100 simulation jobs. The above script kicks off all 100 at the same time. I only have 2 processors and a limited amount of RAM, so each of those jobs competes for the same 2 processors and for residency in RAM, leading to memory thrashing and potentially to jobs dying for lack of memory.
The right way to manage parallel jobs is actually the way (more or less) that make does it: if you have two processors, there's not much point in having more than two jobs running simultaneously. So if you ask make(1) to compile 100 files using only 2 processors, it will start up two jobs and wait for one of them to complete before it starts the next. This is great because it means you only need enough RAM to run as many simultaneous jobs as it takes to keep your processors busy.
And that's what I wish sh/bash would give me. An operator whose semantics are not "start a new process and continue executing this one", but "start a new process and *possibly* continue executing if resources allow."
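The closest I can fake up in plain sh today is something crude like running the jobs in fixed-size batches:

n=0
for j in $executables; do
    simulate "$j" &
    n=$((n+1))
    # after every two jobs, wait for the whole batch to finish
    if [ "$n" -ge 2 ]; then
        wait
        n=0
    fi
done
wait    # pick up any leftover job from an odd-length list

But that's not what make -j does: if one job in a batch takes an hour and the other takes a minute, a processor sits idle for the rest of that hour, because plain wait can only wait for everything at once (or for one specific PID); there's no way to say "wait for whichever job finishes first".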
What do you think the chances of me getting such a modification into the mainline bash are?
These semantics are one of the reasons I really like the Cilk language: its fundamental parallelism operation is a non-blocking function call, whose semantics say that the function call *may* execute in parallel, but it doesn't necessarily do so. The non-blocking function call isn't much more heavyweight than a regular function call, so it's easy for the programmer to just say "there's some parallelism here that the runtime can exploit if it wants", without swamping the system.
For my own purposes, I have a tool which acts as a cross between 'xargs' and 'make -j', taking standard input and combining it with command line arguments to produce multiple command lines, and managing how many of those run in parallel at a time. *That* is the UNIX way: a single small tool that does one specific thing.
So I can apply it to shell scripts with only a minimum of fuss:
for j in $executables; do
    echo simulate $j
done | forkargs -j2
(Yeah, I could have just done 'echo $executables | forkargs -j2', but the first form is more useful in a shell script with more complex setup.)
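For what it's worth, GNU xargs can be pressed into similar service with its -P flag, which caps how many commands it runs at once:

for j in $executables; do
    echo "$j"
done | xargs -n 1 -P 2 simulate

That keeps at most two simulate processes going at a time, with the usual xargs caveats about whitespace in the input.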
I still wish it were built into the shell, though. Ho hum.
END RANT.