QMH’s Hidden Secret
Queued Message Handlers (QMH) are an extremely common design pattern in LabVIEW and sit at the heart of many of the different frameworks available for use.
At CSLUG (my local user group) we had something of a framework smackdown with Chris Roebuck and James Powell discussing a couple of frameworks and looking at some of the weaknesses of common patterns.
James’ argument highlighted one of the most common flaws with this pattern, which is clearly present in the shipping example in LabVIEW: when using a QMH you cannot guarantee that execution will happen in the order that you expect, on the data you expect.
The concept seems to work for many, though. A QMH-style structure sits at the heart of most actor-oriented programming and drives some of the largest LabVIEW applications around. So what is the difference between success and failure?
A Thought Experiment
During James’ talk I had a bit of a personal epiphany about the QMH which involves a slightly different thought process.
This thought process starts by thinking about the QMH as a virtual machine or execution engine, not as part of your application. If this is the case, what are the parts?
- The Instruction Set: The different cases of the case structure define the instruction set. This is all of the possible functions that the QMH can execute.
- The Program: This is the queue. It defines which instructions are executed and in what order.
- The Function Parameters: The data that is enqueued with the instruction.
- Global Memory: Any local or global variables used AND any shift registers on the loop (we will come back to this).
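LabVIEW is graphical, so the four parts above are sketched here in Python purely as an illustration. Every name in it (`run_qmh`, `instruction_set`, `set_value`, `double_value`) is hypothetical and not taken from any LabVIEW template: the dictionary stands in for the case structure, the deque for the queue, the second tuple element for the enqueued data, and `state` for the shift register.

```python
from collections import deque

def run_qmh(program, instruction_set, state):
    """Execute queued (instruction, parameters) pairs until the queue is empty."""
    while program:
        name, params = program.popleft()          # next instruction and its parameters
        handler = instruction_set[name]           # pick the matching "case"
        state = handler(state, params, program)   # a handler may also enqueue more work
    return state

# The instruction set: one function per case of the case structure.
def set_value(state, params, program):
    return {**state, "value": params}

def double_value(state, params, program):
    return {**state, "value": state["value"] * 2}

instruction_set = {"set": set_value, "double": double_value}

# The program: the queue contents define what runs and in which order.
program = deque([("set", 21), ("double", None)])

# Global memory: the state carried around the loop (the shift register).
final_state = run_qmh(program, instruction_set, {"value": 0})
print(final_state)  # {'value': 42}
```

Note that `double_value` takes no parameters of its own and relies entirely on whatever the previous instruction left in `state`.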
It’s All About Scope
Scope is important; we all know that when it comes to things like global variables. Scope, however, is all about context and control, and there are two scoping concerns at the centre of many issues with the QMH pattern.
The Program: In the typical pattern any code with the queue reference can at any time decide to enqueue instructions.
Global Memory: The shift registers on the loop, in particular, act as global state. They are a big part of the dirty little secret. Common sense says anything on a wire is locally scoped and cannot be modified outside of the wire, but this depends on context. To the QMH as a whole this is true: the shift register data is locally scoped. To a function/instruction inside the QMH, however, it is not. In the context of a function this data is global, because other functions can modify it, i.e. you cannot guarantee the state is the same as you left it.
So how do you use the QMH safely? You should reduce the scope of at least one of these to ensure safety.
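A minimal Python sketch of how the two scoping problems combine, assuming a hypothetical "set the file name, then save" scenario (the instruction names and file names are invented for illustration). Two producers hold the same queue reference, and the `save` instruction reads the shared state only when it finally executes.

```python
from collections import deque

queue = deque()                    # the program: anyone holding this can enqueue
state = {"filename": "a.csv"}      # the shift register: shared by every instruction
saved_to = []

def handle(name, params):
    if name == "set_filename":
        state["filename"] = params
    elif name == "save":
        # Reads shared state at execution time, not at enqueue time.
        saved_to.append(state["filename"])

# Producer 1 sets a file name and intends to save to it next.
queue.append(("set_filename", "run1.csv"))
# Producer 2 holds the same queue reference and slips in between.
queue.append(("set_filename", "run2.csv"))
# Producer 1 now enqueues its save.
queue.append(("save", None))

while queue:
    handle(*queue.popleft())

print(saved_to)  # ['run2.csv'], not the 'run1.csv' that producer 1 expected
```

Removing either ingredient makes the surprise go away: if only one party can enqueue, nothing can slip in between, and if `save` carried its own file name, the shared state would not matter.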
Reducing the Scope of the Queue
This is something that is beginning to emerge in a major way.
I first saw this pattern a couple of years ago in a framework called TLB’ that Norm Kirchner proposed. I have since seen at least two alternatives that follow a similar pattern (I’m not sure they are published, but you know who you are, thanks!)
The gist of the pattern is that we separate two structural elements in the QMH:
- An event handler that can take external events and determine what work needs to be done in reaction to that event.
- A work queue, which is something like a more traditional QMH, except that only the event handler can add work items.
This could look something like this in LabVIEW:
(If you look at tlb’ it has the same elements but reversed on the screen).
This has some distinct advantages, along with a couple of costs:
- As long as we don’t share the original queue reference only the event structure or the QMH itself can queue work items. This gives better control over race conditions in terms of order of execution.
- This overcomes another distinct drawback of the shipping QMH example: data can easily be shared between the event handler and the QMH on the wire using the same shift register structure as before, removing the need for the various hacky ways of enabling this normally (again credit to James Powell on this observation).
- On the other hand, our event handling response time is now limited by the time taken to complete the work backlog; we have made our program serial again. I suspect that, for the simplicity, this is a cost most applications can bear.
- This doesn’t really deal naturally with time based systems like DAQ, but does QMH really?
I really like this structure. Parallel programming is hard, and this removes many of the complexities it introduces for event-response type applications in LabVIEW. I expect we may see more and more of these come out over the next couple of years.
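The shape of this structure can be sketched in Python. This is only an illustration of the idea, not TLB’ or any published framework, and the class, event and work item names are all hypothetical. The work queue is private to the object, so only the event handler can add work, and each event's backlog is drained before the next event is looked at.

```python
from collections import deque

class IsolatedWorkQueue:
    """Event handler plus a work queue that is never shared outside the object."""

    def __init__(self):
        self._work = deque()          # private: the reference is never handed out
        self.state = {"count": 0}     # shared between handler and work items
        self.trace = []               # records execution order for inspection

    def handle_event(self, event):
        # The event handler decides which work items an event translates into.
        if event == "start":
            self._work.extend(["init_data", "init_panel"])
        elif event == "tick":
            self._work.append("increment")
        self._drain()                 # finish the backlog before the next event

    def _drain(self):
        while self._work:
            item = self._work.popleft()
            self.trace.append(item)
            if item == "init_data":
                self.state["count"] = 0
            elif item == "increment":
                self.state["count"] += 1
            # "init_panel" does nothing in this sketch

engine = IsolatedWorkQueue()
for event in ["start", "tick", "tick"]:
    engine.handle_event(event)

print(engine.trace)   # ['init_data', 'init_panel', 'increment', 'increment']
print(engine.state)   # {'count': 2}
```

The serial cost is visible here too: `handle_event` does not return until the backlog is empty, so a slow work item delays the response to the next event.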
Reducing the Scope of Instruction Data
The above is a nice solution to the issue of controlling execution order for QMH and I believe a distinct improvement that I’ve been hoping to write about for a while. However I feel that this solves a symptom of a deeper root cause.
A robust implementation shouldn’t care about execution order. The fact that it does points to a more fundamental flaw of many QMH examples/implementations.
We should be used to this as a fundamental problem of parallel programming (the QMH execution engine model really is a concurrent programming model). If you have a function or, in this case, a QMH instruction, how do you ensure it is safe to run in parallel without race conditions?
You never use data that can be modified outside of that function.
Global variables, local variables (in some instances), Get/Set FGVs could all be modified at any time by another item making them susceptible to race conditions.
This is all still true of a QMH function, but now we add to our race condition risks the cluster on the shift register, which could be modified by any instruction called between our instruction being queued and actually executed.
I see two major solutions to avoid this:
- Pass all relevant data with the instruction (i.e. in the data part of the cluster). This ensures the integrity of the execution data.
- Don’t use it as a replacement for subVIs. This is common and you can see it in the shipping example below.
I think this is a common source of problems. Sure a subVI encapsulates functionality and so does a case of a QMH. However the QMH is effectively an asynchronous call which introduces so much more complexity.
This example with Initialize Data and Initialize Panel is a typical example. This functionality could easily be encapsulated into a subVI, allowing greater control over the data and over when the functions are executed. Instead we queue them for later and can’t know what else might have been queued before them, or between them, creating a clear risk of race conditions.
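Both solutions can be sketched together in Python, again with hypothetical names and an invented file name scenario. Initialisation is a plain function call, playing the role of a subVI, so it runs immediately and in a known order. The `save` instruction carries its own file name in its parameters, so nothing queued ahead of it can change what it acts on.

```python
from collections import deque

def initialize(state):
    # Plays the role of a subVI: runs now, in a known order, and is not
    # queued as separate "Initialize Data" / "Initialize Panel" instructions.
    state = {**state, "data": []}
    state = {**state, "panel_ready": True}
    return state

def run(queue, state):
    saved_to = []
    while queue:
        name, params = queue.popleft()
        if name == "set_filename":
            state = {**state, "filename": params}
        elif name == "save":
            # Uses only the data carried with the instruction, never shared state.
            saved_to.append(params["filename"])
    return state, saved_to

state = initialize({"filename": "default.csv"})

queue = deque([
    ("set_filename", "run2.csv"),          # another producer got in first
    ("save", {"filename": "run1.csv"}),    # still saves where it was asked to
])

final_state, saved_to = run(queue, state)
print(saved_to)  # ['run1.csv']
```

The shared file name still changes to `run2.csv`, but the save no longer depends on it.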
This was a bit of a meaty post which was heavily inspired by others. I’ve tried to highlight their ideas throughout the post, but just to say thanks:
- The CLA Summit – A couple of presentations and lots of discussion inspired the start of this thought process. It was great; if you’re a CLA and want to improve, I cannot recommend it highly enough.
- Central South User Group (CSLUG) – A local user group which triggered my epiphany with great presentations and discussions. See above about improving!
- Dr James Powell – Whose talk triggered said epiphany and highlighted some interesting flaws in the standard template.
- Norm Kirchner – Who I’m going to credit as the first person I saw put in the isolated work queue model. If someone showed it to him, all credit to them!