Language-level Immutablity and Shared Memory Concurrency
I've been a fan of Bartosz Milewski’s Programming Café ( or possibly Bartosz Milewski’s Programming Caff; I'm not sure if lack of an accent implies greasy chips and tea outside of the UK ). He's exploring the options of applying the leverage of type checking to concurrency problems within shared-memory systems. In a different camp, there's the Erlang crowd, typified by Steve Vinoski which says that language-level immutability and message passing is the way to go. A third perspective, probably closer to what I would like to see, are CSP based approaches such as Limbo and stackless Python, which are shared nothing and use channels to pass messages by value. The channels and pass-by-value approach I'm experimenting with in kin is close to the Limbo model, except that channels are objects with support different messages. Apart from the D extensions by Bartosz Milewski, these techniques require the language integrates with a scheduler, usually a green threads system rather than the OS scheduler.
One thing which bothers me about Erlang and pure-functional languages in general is the assumption that language level immutability has some bearing on concurrency.
It's fairly obvious that any message shared by machines on a network will be unlikely to include memory addresses in one of the processes, other than possibly as opaque values. In other words, if it's on the net, it's pass by value. Similarly most inter-process communication is pass by value, and the sharing of messages containing values and names of channels is well understood using CSP and pi-calculus. The net is unconcerned about what paradigm the languages which implement the processes communicating by message passing; it only cares that the messages are values and names. It doesn't follow at all that the implementation language for a process which operates on messages needs to present immutable variables to the programmer.
The other area functional languages have been touted has been for problems where there are shared memory concurrency within a process.
A trivial counterexample is given by lazy evaluation - in trivial implementations, expressions are passed around as thunks, consisting of a place to store the value of the expression, and a function to evaluate the value. The expression is evaluated lazily by calling the function the 'first' time the value is needed. Being a pure function, it doesn't matter as far as the value is concerned whether the function is called multiple times in different cores, but if the value returned is a non-primitive there are still write order issues with sharing the value. So you end up with similar barrier ordering problems that you have in non-pure languages, and barriers cost performance.
For strict functional languages, you still need the barriers, and other synchronisation points, if you are going to communicate values between cores. If you are passing messages by value, then it is necessary that the sender or receiver of a message cannot change the value of the message. This isn't necessarily the same as language level immutability - in a shared memory space, values sent on a channel should not change, but that can be achieved by copy-on-send or copy-on-write approaches as well as by general immutability. Having a shared memory space between threads requires synchronisation with the garbage collector - otherwise memory can be reclaimed and rewritten, breaking the immutability. None of this is particularly surprising, and the implementation of any SMP shared-memory language with automatic memory allocation will have to deal with it.
The presence of immutability at the language level allows the optimisation that a message can be passed as an immutable reference rather than by value. This introduces the pessimisation that any garbage collection of the arena that the messages' storage is allocated from has to be coordinated between all threads. If the optimisation of mostly-concurrent memory management is made, where each thread has its own arena, memory for messages is either allocated from a special shared arena when constructed, requiring some synchronisation, or messages are copied when sent.
It's fairly obvious that if you are allocating from a shared arena, then you're having to serialise some of the requests, so are losing concurrency ( as well as having to do escape analysis so you know what objects constitute the message and so require to be allocated from the shared arena ), and that if you're copying messages into a shared arena when sent then you have lost the pass-by-immutable-reference optimisation. Either way, I've yet to see a truly convincing case that immutability in itself actually gives any performance gain for concurrent code and typical message sizes ( if you have very large messages which are not referenced by the sender, then any copying scheme will be inefficient, but that seems to be a case for escape analysis rather than immutability; copy-on-send would be at a disadvantage there, but Erlang style encourages small messages anyway ).
So the problems in terms of performance are acquiring lock on the memory bus, which takes around 40 cycles for a lock cmpxchg, barriers to order access, and copying. Languages with channels as first-class constructs hide the details of locking and barriers, and can trade various copying or shared memory techniques.
I've not seen anything in Erlang that would suggest that its shared nothing messaging is less prone to programmer error than Limbo's channels. There are other reasons to use pure-functional languages in terms of proving theorems about the program, and below the language level some optimisations are facilitated, and some are harder to perform. But there seems to be a confusion between which bits of a particular combination of facilities in a language are actually supplying the gains, and my opinion is that message passing combining values and channel names ( however implemented or enforced ) is both safe, and open to analysis at the systems level by pi-calculus.
Labels: concurrency, kin, pi-calculus, programming
5 Comments:
I find that language level immutability has consequences beyond the technical aspects you focus your analysis on.
Languages and their paradigms change how we think about problems, and with immutable values you have to program in a certain way.
I think practical usage of Erlang show that it has strengths in creating scalable and robust systems.
This said, immutable values in Erlang are used for one technical advantage: an old generation will never contain pointers to value to a newer generation.
If one wants to analyze the aspect of sharing storage between processes this memory model has also beeen implemented in Erlang. As you say it has not really been benefitial since independent GC of processes is to great advantage in scaling.
Greasy chips? Tea? I beg your pardon! Bartosz's Programming Cafe is in Seattle, the capital of espresso. (Yes, I know, the Italians may think differently. Just as they do with pizza--an obviously American invention).
We are having the discussion about message passing vs. shared memory in the D community all the time. Since you can build message passing on top of shared (mutable) memory, it seems like restricting the language to message passing alone eliminates a lot of potentially useful paradigms. Single-paradigm languages tend to become niche languages (like Smalltalk and its infatuation with object-oriented programming).
> I think practical usage of Erlang show that it has strengths in creating scalable and robust systems.
True, but they seems to follow more from other aspects of OTP than language-level immutability. And it's not uncommon for people to argue that because you can build systems in Erlang, and Erlang is pure, then other pure languages such as Haskell and functional languages such Scala must also be advantageous for distributed systems. Such arguments miss the value of the patterns used in OTP, such as the supervisor tree, which I've seen again and again in successful high availability software implemented in other languages.
> This said, immutable values in Erlang are used for one technical advantage: an old generation will never contain pointers to value to a newer generation.
I agree that it does simplify things somewhat. But given the success of gc in languages with mutable values, it would be hard to argue that it's necessary.
I'm not sure anyone is arguing that Erlang is a pure functional language.
And, I think you still focus too much on the technical aspects, and not enough on the cognitive effects of having a programming model that constantly remind you of avoiding side-effects.
I'd gladly see a language with cheap concurrency WITH mutability, and see them shoot Erlang out of the water in creating killer applications.
> And, I think you still focus too much on the technical aspects, and not enough on the cognitive effects of having a programming model that constantly remind you of avoiding side-effects
I've not seen any evidence working with high reliability software that side effects are an issue. Most embedded systems interact with the scheduler to strongly guarentee that shared memory areas are handed over between processes; the systems on general processors which I've seen which are designed for high availability ( mainly nuclear and petro-chem monitoring and control ) have used separate processes each with a single main thread of control and message passing between processes. Within a single process' memory space there is no technical reason to avoid side-effects. Most high reliability systems are successfully written in subsets C and ADA, where side-effects are the main means of program execution.
My hunch is that if you have to restrict yourself then those successes are more despite C and ADA not because of them; those languages are chosen for reasons other than having concurrency models which reduce the scope of what a programmer has to think about. In terms of high reliability systems which use shared memory, the sort of sharing analysis that Bartosz Milewski is talking about adding into the language is similar to some of what the annotations and restrictions enable in separate tools ( Spark ADA, and to a lesser extent splint ). For other systems, CSP style message passing appears to have similar cognitive gains - if you can't mutate anything outside your process, you have significantly reduced the scope of the effects you have to consider.
On the other hand, I've enjoyed working extensively with Prolog and XSLT, and like pattern matching enough that am playing with implementing a mostly side-effect free OO language ( being side-effect free does help a little in the implementation of pattern matching and a lot with back-tracking search, but that would be a different post ).
But aim of this post was not to compare benefits or disadvantages of the paradigm as a whole, but about whether presenting immutability to the programmer gives a compelling technical gain in messaging, so it is intentionally fairly narrow.
Post a Comment
<< Home