By Value vs By Reference In LabVIEW
After my previous post about Learning LabVIEW OOP there were some comments on by reference vs. by value which often come up when talking about OOP. I think there are two reasons that these are tightly linked to conversations about OOP.
- In “classical” OOP languages everything is by reference but in LabVIEW OOP is by value. This causes a clash when people have learned OOP from these languages.
- We do more by reference work in non-OOP LabVIEW than we sometimes like to admit.
I have been thinking about the techniques and analogs to these lately anyway so this is a bit of a meaty article covering the options that I see for implementing these and some thoughts on how they fit into teaching OOP.
By-Reference vs. By-Value
Lets first define my interpretation of these items. I like to think of them at this level as how data behaves rather than definitions of implementation.
- By-Value: If you take a wire/data in LabVIEW and change it then it changes only for that piece of code and the code that is dataflow dependent on it. There is also no way another piece of code can change the data on the wire. Branching the wire risks creating a copy.
- By-Reference: The data of is stored in one memory location. When you make changes they might affect other components that don’t have a dataflow dependency and if you read it twice in a row another piece of code could have changed it.
(This may differ from a classic computer science definition but my main concern is the behavior I see, perhaps a different term is required but stay with me!)
These strongly overlap with data communication. Another way to think of this is that by-value is communication on a data wire where as by-reference is a tag based communication method (in some situations).
So Why Use By Reference In LabVIEW?
My preference is to always lead with by-value where it works. I think this is key to what makes LabVIEW a powerful language. The data on the wire is yours to do what you want with and you don’t have to worry about side effects when you are programming. I suspect it is one of the keys that makes LabVIEW much easier for people without a software background to pick it up.
There are perfectly valid reasons reasons to use by reference though either in spite of, or because of, these side effects. The following items are cases where I look to by-ref:
- Shared Application Resources: This is not a great term but what I mean by this are resources such as an error handler or a system configuration where you want every piece of code singing from the same hymn sheet.
- Hardware Resources: This is one of the most common – DAQmx and VISA already all run on a by reference API. If I have to add to the API (attach additional data) I will use a by-ref scheme so it continues to work as you expect.
- Huge Data: If a data structure is a big proportion of your application memory you need to make sure you don’t copy it often. You see this in the IMAQ library where images are handled by reference (and the confusion it can cause!). This is the in spite of case – you don’t get any programming benefit but you have to use it due to the constraint of the system.
How To Do By Reference In LabVIEW
There are multiple techniques to get the behaviour I mentioned above. I have put down the key ones below; My favourite will be obvious!
The simplest and most dangerous technique due to race conditions. This isn’t a problem in single process languages but in LabVIEW it can get you into lots of trouble!
There is one case that I may use them which is for WORM (write once, read many) globals which can be useful for configuration data but I never use them with OOP.
Data Value References (DVRs) allow you create your own reference wire to any data type. You access the data through the in-place memory structure which protects the data – no other code can access it until you are finished with it. This is very important for preventing race conditions.
This is my preferred method for by-ref objects. I would create the class normally but change the standard methods to use DVRs of the class rather than the class itself. What I have found is:
- Good – 100% scalable. Want 1/5/20/1000? Not a problem.
- Good – The call is synchronous. Once the subVI completes the function it performs is complete.
- Good – I have heard criticisms that this can lead to having more wires on the diagram. I think this is a good thing! Wires make it obvious what is coupled to what. Variables and AEs don’t make that obvious.
- Good – The property nodes for objects support this with no extra code.
- Bad – The boilerplate is tedious! Creating the references and the in place element structures with their weird error handling
- Bad – References can be invalid at run time which is avoided with the FGV/AEs.
I’m not going to get into a debate on what term to use for these. What I mean is a non-reentrant VI which uses an uninitialised shift register to store data. Normally developers will have an enum input which defines what function is performed and/or the core FGV/AE is wrapped in another API to allow for an easier connector pane.
These are not traditionally referred to as a by-ref programming method but I would argue they are. The “reference” is which VI you are calling. That is what defines which data is modified or accessed and any changes can be seen anywhere else in the program.
When I started my company I started down the DVR route already, having experimented with both so I haven’t used this as extensively in big applications but the reason I decided not to start with them were:
- No wires so coupling is somewhat hidden (you would have to view the VI hierarchy).
- Bloated connector panes (though wrapping it in another API does help with this).
- No scalability – you either have to write an addressing scheme into the FGV or maintain multiple versions of the VI.
That said these are the preferred methods of many. It is well understood by different developers and is simpler to create than DVRs and shares the benefits of being synchronous.
It is rare that I use these but one exception is actually when this is the exact behavior I want! If I want a singleton object (for my error handler for example) I create an FGV that stores an DVR/Queue Ref and on the first call it will initialise it. That way I get the same reference everywhere in my application.
Queued Message Handlers/Actors
This is a bit more of an unusual item to add here but it can be and I believe is used for the same cases as above.
Increasingly these are being used in systems like a module. You have an actor for each instrument you want to talk to for example and you enqueue commands for it to complete.
This mode of operation is very similar to a traditional by-ref model. The “reference” is the queue reference or actor reference which ties you back to the “data” stored in the shift registers of the QMH. The QMH loop protects access to the shared resource by processing messages one by one, protecting you from race conditions.
It isn’t exactly like the other options, the key difference is that it can operate independently as well as responding to other requests which can make them hugely more powerful.
There is a major added complexity with this though which is that they are asynchronous. This means two-way communications are difficult for example, what happens if there is an error in the QMH? Also you can’t understand the time from the block diagram. I find I have to create sequence diagrams in order to understand the program flow.
Also you can’t understand the time from the block diagram. I find I have to create sequence diagrams in order to understand the program flow.
This is the reason one of the core tenets of an actor based system is that it is a request and you can’t care when or how it gets done. This rule means you must design your system in order to avoid these complexities, but I think this is a hard rule to follow consistently!
For these reasons, I avoid using these unless I need it to run independently or there is multiple classes interacting within it. I tend to use these for “processes”. For example, a DAQ system where the data just gets published onto an event when it is ready.
So When Do You Teach It?
So coming back to the previous article, when do you teach someone about this?
I would argue that this doesn’t need to be intertwined with OOP. In many cases, if they aren’t new to LabVIEW they will already be using one of these techniques. Sticking with that is the simplest route.
If they are new I believe that the decision between these is probably an architecture decision as each has pros and cons in different scenarios. It is hard to teach them all at once so I would look at your typical architectures. Does one of these tend to form the backbone or default option? (In my case it is the DVRs). In which case I would start with that. You can teach the other methods as they are required. If you suddenly need all four at once you could even hide one method behind another to get a new developer on board quickly.
Let Me Have It!
This is bit of a work in progress in my thought process which the question above prompted. I’m pretty sure my terminology isn’t great but I feel the idea is solid so please comment below or find me on twitter or the NI forums. I find it very helpful to help me understand this better.