Improving MRDCL productivity by using functions and subroutines
How to automate things that are boring to script …Read more
One of the topics where we receive most support requests for MRDCL is in the handling of numeric fields or the usage of arithmetic. In this article, I will try to cover most of the issues raised and explain how and why MRDCL works the way it does. I will deal with reading numeric values, processing values, rounding, applying values from weighting procedures and using numerics in table arithmetic (manip). I will also cover the difference between integer (ivars) and real number variables (wvars).
The first thing to explain is the concept of the undefined in MRDCL. Unlike single or multi-response variables where you specify a bit (or code) for undefined/not answered/no response, with numeric variables an undefined value can be applied. For example, if you ask someone ‘how many cans of soft drink they consumed in a week’, they may answer with a value of zero or more, or they may not know. In MRDCL, therefore, you can assign an undefined value to a variable where the answer is unknown, but it is not necessarily zero.
As far as MRDCL is concerned, undefined is valid value for a variable. It is denoted by the character u. Therefore, just as $cans/10 is a valid test in MRDCL, $cans/u is a valid test as is $cans/nu. You can’t use u in a range test like $cans/1..10, but it can be used in variable definitions or filters just like any other MRDCL syntax.
As an easy rule to remember, any arithmetic you perform where one or more of the values is undefined, the result will be undefined. For example, if a respondent tells you that they ate two bananas, two pineapples and two melons, you can add the numbers together and say that they ate six fruits. If, however, the respondent tells you that they ate two bananas, two pineapples but undefined (not known) how many melons, the sum of two plus two plus undefined will return an undefined value. This is true of MRDCL syntax and is logical. Therefore, the following MRDCL syntax would work as described:
Di $fruits=$bananas + $pineapples + $melons,
Controlling values set to undefined when reading data
Most format options in MRDCL apply to the presentation of tables. However, there are small set of formats that control how MRDCL reads numeric values from data. Like all format options, they have default settings that can be changed as often as required. The relevant format options are as follows:
The default setting for this format is zero (IFB0). This means that if a field is blank, MRDCL will read the value as zero. This can be useful when a questionnaire contains a long list of numeric values. For example, you might ask a respondent how many cans of 20 different soft drinks that they have consumed in the last week. It is possible that rather than entering zero for each of the 20 soft drinks that are not consumed, each blank represents zero. In this case, you might well want to treat blank as zero. In many cases, though, a blank might mean ‘not known’. For such cases, you would want to set a blank field to undefined. This is where you would set format IFBU and replace the default setting IFB0. You can switch this setting back and forth as much as you wish so that some fields read blanks as zero and others as undefined. This is true of all the following formats.
Less common is setting what happens where a field is invalid. Let’s say someone has entered XXX for a numeric field; you would probably want to treat that as undefined unless XXX has some special meaning. XXX is not a valid numeric value, MRDCL would consequently take the value of the IFF format. Its default is undefined (IFFU), but there may be some occasions where you want to set it to zero using IFF0.
By default, MRDCL does not accept decimal places in fields. Therefore, a value of 2.5 will be treated as an invalid field and return the value set by format IFF. If you want to accept decimal places in fields, you will need to turn on format ADF. This logical format has the default setting NADF, so you will need to specify that you want F=ADF.
Like ADF, MRDCL does not, by default, accept negative numbers in fields. It will treat the value as invalid and return the value of IFF. To accept negative numbers, you will need to use F=ANF to replace the default F=NANF.
By default, MRDCL ignores leading blanks, such that 00025 is the same as 25 preceded by three blanks. However, MRDCL does not, by default, accept trailing blanks, so 25 followed by three blanks would return the value set in format IFF. To accept left-justified numbers in fields and, hence, trailing blanks, you will need to set format ATF and replace the default value of NATF.
Real number variables that allow you to store up to 13 decimal places are referred to as wvars in MRDCL. Integer variables (ivars) cannot store any decimal places and must be whole numbers. Where a result is stored in an integer variable, it will round to the nearest whole number. Where the value ends in .5, it will round up. These examples hopefully illustrate what MRDCL will do:
Di $litres=10.4 + 10.4, (result will be 21)
Di $litres=10.2 + 10.2 (result will be 20)
Di $litres=10.2 + 10.3 (result will be 21)
One important format in MRDCL that is sometimes needed is format DBL. This format activates double-precision arithmetic. It is advisable to use this format where you are dealing with big sample sizes, big numbers or numbers with several decimal places. It is particularly important to use where you are using MRDCL to process target or rim weighting. It will slow down runs, but it will be necessary for the circumstances described. Some users prefer to set this as a default.
One of the secrets of MRDCL is that you can define variables in the manip stage. This will allow you to perform tests on tables. As a simple example, let’s say you want to remove a table where the base is less than 5. You could test for the value in the total cell (top-left cell) of the table, set the table to zero if necessary so that format BTS removes the table. You could do the same thing with individual rows or columns in conjunction with BRS and BCS rather than BTS. This syntax would work well:
This would assign the total cell value to the variable $base, so that you can set the whole table to zero if the base is less than five.
If you have any further unresolved issues with numerics, please let us know. We will update this article accordingly. If you need any further help, please contact us.