Writing tests in Smoketest is intended to enable a test developer to write tests that describe themselves, without requiring the developer to add this “narrative” separately. To see this in action, I thought I would compare some simple DUnit tests with their equivalents using the Smoketest framework.
For this exercise we shall consider the test for a function for splitting a string based on some delimiting character. The prototype for the function is:
class function WIDE.Split(const aString: UnicodeString; const aChar: WideChar; var aParts: TWideStringArray): Boolean;
In any successful call to this function there are a number of things that need to be checked to ensure correct behaviour.
First, the function should return TRUE only if one or more instances of aChar are found in the string. Second, where TRUE has been returned, the number of entries in the aParts array needs to be correct. And finally, each item in that aParts array needs to be what we would expect.
A DUnit test for this might look something like this:
var
  ReturnValue: Boolean;
  aString: UnicodeString;
  aChar: WideChar;
  aParts: TWideStringArray;
begin
  aString := 'left*mid-left*middle*mid-right*right';
  aChar   := '*';

  ReturnValue := WIDE.Split(aString, aChar, aParts);

  CheckEquals(ReturnValue, TRUE);
  CheckEquals(Length(aParts), 5);
  CheckEquals(aParts[0], 'left');
  CheckEquals(aParts[1], 'mid-left');
  CheckEquals(aParts[2], 'middle');
  CheckEquals(aParts[3], 'mid-right');
  CheckEquals(aParts[4], 'right');
end;
First of all, it’s worth mentioning here that the IDE support for DUnit was actually counter-productive in this case. The wizard fails to recognize a class function and creates swathes of boiler-plate test code for setting up and tearing down tests, instantiating the class in order to (incorrectly) call the function via an instance.
But apart from that, once all the extraneous and erroneous code has been cleaned out, we can get on with writing the test itself. As written above, this all looks fine, right?
Wrong.
What’s worse is that as long as all the tests pass there is no reason to suspect that this test is actually completely and utterly wrong. The CheckEquals method places particular significance on the order of the parameters identifying the two values that are supposed to be equal. Without inspecting the parameter list for the CheckEquals() method this significance is not immediately obvious.
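To see why the order matters, suppose Split() actually produced only 4 parts. With the arguments reversed as above, the failure report comes out backwards (the wording here is representative of DUnit output, not verbatim):

CheckEquals(Length(aParts), 5);
// reports: "expected: <4> but was: <5>" - the genuinely expected
// value (5) is presented as though it were the actual result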
When the two values are equal (and the test passes) this doesn’t matter, but in the event of a failure, getting the order wrong makes the test report itself misleading, presenting the actual value as the expected value and vice versa. The correct DUnit test should be written:
var
  ReturnValue: Boolean;
  aString: UnicodeString;
  aChar: WideChar;
  aParts: TWideStringArray;
begin
  aString := 'left*mid-left*middle*mid-right*right';
  aChar   := '*';

  ReturnValue := WIDE.Split(aString, aChar, aParts);

  CheckEquals(TRUE, ReturnValue);
  CheckEquals(5, Length(aParts));
  CheckEquals('left', aParts[0]);
  CheckEquals('mid-left', aParts[1]);
  CheckEquals('middle', aParts[2]);
  CheckEquals('mid-right', aParts[3]);
  CheckEquals('right', aParts[4]);
end;
And still, the only information coming from DUnit about these tests will be that an actual value either did or did not equal its expected value. If the test developer wishes to add some descriptive information they must supply it as a msg parameter to each check:
CheckEquals(TRUE, ReturnValue, 'Split()');
CheckEquals(5, Length(aParts), 'No. of parts');
CheckEquals('left', aParts[0], 'aParts[0]');
CheckEquals('mid-left', aParts[1], 'aParts[1]');
etc.
This always felt backwards to me with DUnit and was one of the primary reasons for creating Smoketest and taking an entirely different approach which more closely resembles the language we would use when describing our expected test outcomes.
Let’s look at the equivalent tests in Smoketest:
Test('Split()').Expect(ReturnValue).IsTRUE;
Test.Expect(Length(aParts)).Equals(5);
Test('aParts')[0].Expect(aParts[0]).Equals('left');
Test('aParts')[1].Expect(aParts[1]).Equals('mid-left');
Test('aParts')[2].Expect(aParts[2]).Equals('middle');
Test('aParts')[3].Expect(aParts[3]).Equals('mid-right');
Test('aParts')[4].Expect(aParts[4]).Equals('right');
This is perhaps a little more verbose but to my mind reads far more naturally as an expression of our test expectations. We say what it is we are testing (if it needs spelling out), identify where the value we are testing comes from, and then say how it should meet our expectations.
NOTE: The use of indexing syntax on the Test()[] expression is optional, but it facilitates labelling tests where they are applied iteratively from some collection of test vectors. For example, if we had declared our expected resulting parts in a VECTORS array:
const
  VECTORS: array[0..4] of String = ('left', 'mid-left', 'middle', 'mid-right', 'right');
..
for i := 0 to High(VECTORS) do
  Test('aParts')[i].Expect(aParts[i]).Equals(VECTORS[i]);
With the context provided by the interfaces returned at each step along the way, the test developer is guided toward writing tests that are appropriate to the values being tested.
But there is another advantage to the way that tests work in Smoketest as compared with DUnit.
Fail Early. But Not Too Early
In DUnit, each CheckEquals() must pass if the following checks are to be performed. Sometimes this is desirable. If you are testing that you have an object reference before you then go on to check other properties of that object then there is little point in proceeding since those tests are simply going to fail.
This might be described as “Failing Early“.
But in other cases, the manner in which subsequent tests fail could provide useful diagnostic information to explain the initial failure.
Consider a hypothetical situation where a developer has identified a potential optimisation in the Split() function. They make their change and run the tests, but as a result of their change the Split() function creates the wrong number of items in the aParts array.
As a result, in DUnit, this part of the test will fail and cause the test method to halt:
CheckEquals(5, Length(aParts));
If – say – the Split() function is creating only 4 items in the aParts array, then what those 4 parts contain could provide useful information that will help the developer realise their mistake. With DUnit they won’t get this information.
With Smoketest – by default – the test of the number of items in aParts will fail but the test method will continue to apply the further tests, and will output garbage, crash, or halt with an ERangeCheck exception (if the tests are compiled with range checking enabled) only on the test of the fifth, non-existent item in the aParts array. As a result we might see the following in our test results:
No. of parts - FAILED   Expected: 5            Actual: 4
aParts[0]    - FAILED   Expected: 'left'       Actual: 'left*m'
aParts[1]    - FAILED   Expected: 'mid-left'   Actual: 'id-left*mi'
aParts[2]    - FAILED   Expected: 'middle'     Actual: 'ddle*mid'
aParts[3]    - FAILED   Expected: 'mid-right'  Actual: 'dle*mid-'
ERANGECHECK EXCEPTION
This is not the actual output, just a representation of it. And the data is of course entirely hypothetical, not intended to indicate any particular type of error that might exist in a function such as Split(), but only to demonstrate that “fail early” is not always the most helpful strategy, especially when testing.
We should always of course “Fix the first problem“, but sometimes the consequential problems help us identify what that first problem is.
With Smoketest you can get DUnit-like fail-early behaviour if you want it. And more.
To make this test halt if the number of items in the aParts array is not what we expect, we simply add a qualification to the test itself, to the effect that this result is a required outcome:
Test.Expect(Length(aParts)).Equals(5).IsRequired;
As explained in an earlier post, an IsRequired result will halt the current test method if the test fails. IsCritical can be used to halt an entire test case, and IsShowStopper will halt the entire test run.
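A quick sketch of how the three levels might look in practice (assuming, as with the Equals() example above, that any expectation can be qualified this way; the tests themselves are purely illustrative):

// A failure here halts only the current test method
Test.Expect(Length(aParts)).Equals(5).IsRequired;

// A failure here halts the entire test case
Test('Split()').Expect(ReturnValue).IsTRUE.IsCritical;

// A failure here abandons the whole test run
Test('aParts')[0].Expect(aParts[0]).Equals('left').IsShowStopper;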
But so far all we have really seen is how Smoketest does what DUnit also does, just differently. Now for something completely different.
Greater Expressiveness
With DUnit the number and type of tests you can perform is fairly limited. The CheckEquals() method is called upon to carry a great deal of the burden of testing, often hiding the detail of a test in the expression used to calculate a result passed to that CheckEquals() method as a boolean.
Imagine a scenario where a test was interested only in whether or not some value exceeded some threshold amount but was not concerned with the precise value. In other words, that some value was greater than some other value.
In DUnit you would write this test as follows:
CheckEquals(TRUE, value > limit);
And in the event of a test failure you get the not very helpful report that TRUE was expected, not FALSE. So you are forced to add some narrative to describe the test, to endow it with a meaning that is not apparent from the test itself:
CheckEquals(TRUE, value > limit, 'Value is greater than Threshold');
In Smoketest, because test expectations are specific and appropriate to the type of value being tested, there is far greater diversity and richness of expression in the tests available, enabling tests to be written in a way that describe themselves. This limit test for example would be written in Smoketest using an Integer expectation:
Test('value').Expect(value).IsGreaterThan(limit);
Not only does Smoketest guide us toward writing a more appropriate test, but since the test describes itself the result is actually more compact than in DUnit, where the test has to be described separately from, and in addition to, the test itself (and then only if the test developer could be bothered to add that description in the first place).
As I mentioned, in this particular case the DUnit IDE wizard created a whole lot of boilerplate code for setting up and tearing down this test case that was wholly inappropriate on this occasion. But there are times when you need such housekeeping and this is what I shall cover in my next Smoketest post.
Your last example may actually be a counter-example: the custom message on test failure is something I find more important than the actual test itself. For your
Test('value').Expect(value).IsGreaterThan(limit);
all that a failure will give is that value wasn’t greater than IntToStr(limit) or FloatToStr(limit) since the explicit name “limit” is lost.
Also it’s not that obvious what the test is doing: you introduce a custom reverse-ordered method/operator named “IsGreaterThan” rather than use the common “>”.
In DUnit, it’s just the IMHO more immediately readable
Check(value > Limit, 'Value is greater than limit');
which won’t give you a magic number in case of failure, and what is checked is completely unambiguous.
And you can make even more expressive messages like
Check(value > Limit, 'Value is greater than limit in sub-case ' + subCaseDescription);
This is something I’m often relying on whenever a test case is exhaustive and thus needs to involve loops. Also having the description at the end of the expression helps to visually align the code.
The other thing I’m not very fond of is the fluent style, which in Delphi is problematic: can’t breakpoint in intermediate steps, stepping through is problematic/cumbersome, and call stacks can become ambiguous.
I don’t understand your complaint about “IsGreaterThan” vs the symbol > which means exactly the same thing. There is no reversal of operator order involved, just words replacing a symbol (which, in doing so, enables information capture).
But you are right that my DUnit example was actually too well written to make a good case. I wrote a better DUnit test than I did with Smoketest. 🙂
You can of course capture the significance of “limit” in the message with Smoketest just as easily:

Test('Value > limit').Expect(value).IsGreaterThan(limit);
The difference is that the minimum information that you will get, if the test writer has not deigned to provide any narrative, will be more than the minimum you will get from DUnit. All of the extra work you have to put in with DUnit you can still choose to put in with Smoketest if you wish or need to in some cases, but in all cases you get far more without having to put that extra work in.
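To illustrate the point about minimum information (the failure wording on both sides here is representative only, not verbatim output from either framework):

// DUnit with no message: a failure reports only that a boolean was FALSE
Check(value > limit);

// Smoketest with no label: the failure still records the relationship and
// both values involved, e.g. "FAILED - expected a value > 2, got 1"
Test.Expect(value).IsGreaterThan(limit);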
The counter example of a poorly written test from DUnit is not what you offer but rather:

Check(value > limit);
The test may be more readable but the results are utterly meaningless without any additional commentary. The directly comparable test in Smoketest would be:

Test.Expect(value).IsGreaterThan(limit);
Which at the very least captures the two values involved in the test and the relationship between them in a way that a simple boolean TRUE/FALSE utterly fails to do. Although of course if a developer were determined to be obtuse they could equally write:

Test.Expect(value > limit).IsTRUE;
But you have the choice. 🙂
As for fluent style, I too am no fan, which itself should say something about what I at least perceive to be the benefits in this case.
But your objections don’t (or shouldn’t) apply in a testing framework where the framework should be utterly transparent to the point of being invisible. You really shouldn’t need to step through any intermediate steps here. Similarly, call stacks should be of no interest above the point of entry into your code from the test itself and will be perfectly clear from that point down to any particular point of interest in any code you are debugging. Having the testing framework scaffolding cluttering up the call stack could be as much a hindrance as a boon.
And in fact there is a system of debug-info control in my includes which deals with this problem directly.
To be able to step through the Smoketest framework code at all you have to add a conditional define to your project even if compiling your own code with debug info enabled. This itself might be seen as irksome to some but just goes to show that you can’t please everybody all the time. 🙂
> You can of course capture the significance of “limit” in the message
> with Smoketest just as easily:
>
> Test('Value > limit').Expect(value).IsGreaterThan(limit);
I still fail to see why that’s better than just
Check(value > limit, 'Value > limit');
Not only is the DUnit form shorter, easier to align, it also benefits from compile-time type checks and basic syntax coherency checks.
Your framework on the other hand has to recreate that through many typed interfaces, which will have side effects. For instance, if I want to test that an Integer is greater than a floating point value, there will be a need for either a Round() or an explicit Integer-to-Float conversion where none would exist in regular code (and it could have precision effects in some maths code; an edge case, sure, but why the trouble?).
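For instance, something along these lines (hypothetical, since it assumes an Integer value yields an expectation offering only Integer comparisons; here i is an Integer and dLimit a Double):

// Either the floating point limit has to be rounded to suit an Integer expectation..
Test.Expect(i).IsGreaterThan(Round(dLimit));

// ..or the Integer value converted to obtain a floating point expectation
Test.Expect(i * 1.0).IsGreaterThan(dLimit);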
> The test may be more readable but the results are utterly meaningless
> without any additional commentary. The directly comparable test in
> Smoketest would be:
> Test.Expect(value).IsGreaterThan(limit);
That’s not very useful, as the only extra detail the framework gets is the comparison, but it loses the variable names, so you’ll get a message like “test failed 1 > 2”… still not very meaningful… So you’ll still be needing a proper description string.
For DUnit-like code, it’s simple to use JCLDebug to grab the call-stack and obtain the code line from the source file and display “Check(value>limit);” if no description is provided, and that’ll be better than just “1>2”.
> the framework should be utterly transparent to the point of being invisible.
Indeed.
> You really shouldn’t need to step through any intermediate steps here.
Not when you have thousands of tests.
Going through the UI to check and uncheck tests only works when you have few tests; there comes a point when it just becomes more cumbersome than looking at the code of the test cases and setting a break-point directly there.
When tests involve loops or arrays (as in your example), manually checking a particular checkbox in the UI isn’t convenient: if the test hierarchy is too deep, you’ll be wasting time expanding and collapsing nodes; if it’s too shallow, you’ll be scrolling across long lists.
This is especially true when people didn’t write complete and explicit description for all the individual tests.
For instance, what caption do you have in the UI for your sample test?
Test.Expect(value).IsGreaterThan(limit);
Before having run the test, at best you could have “IsGreaterThan”… not very meaningful.
Placing a break-point at the Check() call is far more straightforward and simple than placing one in the tested function and then skipping through false positives, or writing a more complex conditional breakpoint.
You aren’t making a huge amount of sense I have to say.
“Test failed 1 > 2” is already more meaningful than “Test failed FALSE”. The latter does not carry any information at all. What is FALSE? The moon is waxing? Oranges are lemons? The colours of my socks are matching? What?
The chances are that in the context of a particular test the semantics of the test are known and the specific labels on the variables are not particularly relevant to discerning the meaning. The simple fact that a “greater than” test was performed may be sufficient. Or it may not. It’s always difficult to draw concrete conclusions from entirely invented, hypothetical examples. 🙂
But nowhere do I suggest that Smoketest removes completely the need to provide additional commentary. Far from it. Smoketest specifically provides these facilities, including the ability to emit non-test commentary into test output.
As for conversions and roundings, yes if there are type differences involved that may influence a test outcome you of course need to bear this in mind when writing the test. But remember always that this is an invented example.
Arguably if your test is comparing two values of different types in a way that your application code would not then your test is meaningless anyway, even if it is more reliable. i.e. if you have arrived at value as an Integer to compare against a limit which is a Double, then your application code should have some mechanism for reliably comparing one against the other and it is that mechanism that should be tested. Testing any valid arbitrary expression in a test may provide warm fuzzy test results but it’s not really meaningful testing.
Does your application truncate the double before comparing? Does it round? Toward zero? Away from zero? Bankers’ rounding?
Whatever transformation you apply in your test (whether DUnit or Smoketest based) had better match.
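For example, if the application truncates the double before comparing, the test must do the same (a sketch only; Trunc() here stands in for whatever conversion the application actually applies):

// The test mirrors the application's own comparison strategy
Check(value > Trunc(limit), 'value exceeds truncated limit');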
Smoketest arguably enforces type checking far more stringently, by encouraging tests to be expressed against a value of a particular type using conditions of that same type. It cannot strictly enforce this of course, but the fluent style of presenting tests appropriate to the type under test at least reminds the developer of their test conditions, rather than allowing – or actively encouraging/requiring – them to ignore such differences and rely on the compiler to intervene, creating working but not necessarily meaningful test code.
As for debug concerns, I don’t see why JCLDebug couldn’t grab the exact same stack information for Smoketest code that it does for DUnit. Unless you have tried it and discovered that it does not for some reason. Compiling a test project with debug info on will emit debug info for the test project itself and any code being tested. It will simply not (in the absence of a deltics_smoketest define) emit debug info for the Smoketest framework code itself. But your test cases are not part of that framework.
Equally, you can set a breakpoint on Test.Expect(…), just as you would set a breakpoint on CheckEquals(…). I don’t know what you are talking about when you suggest that this wouldn’t be possible, which is what you appear to be saying. But when you “step-in” to that call the only code you will step into is your own. You will not have to step through the Test(), Expect() or other framework code (unless you specifically enable this with the deltics_smoketest $define).
Does Smoketest work as well under FireMonkey?
Unknown but unlikely. The Smoketest GUI is VCL only with some direct calls to GDI, and the RTL it relies on assumes Windows is the platform for various things such as thread synchronization (Windows messaging) and Unicode support.
You are welcome to try it with FireMonkey and I’d be curious to learn how you get on but Smoketest pre-dates FireMonkey by some years and since I have no intention of ever using FireMonkey myself there is no benefit to be gained from re-working the framework to support it (quite the opposite – it would likely make supporting older versions of Delphi that much harder).
OK, so what you are saying is that we should not use functions to test, but objects, and that the idea of using functions is wrong.

In the OOP world I agree.

If you look at Java you see a lot of places where there are garbage objects, like the Math object.

Java is too complex for objects to let you start programming fast.

In Delphi the story is a bit different.

The second point you bring up is: should we stop after the first failure?

You have no idea what will happen after the first failure; the hard drive could be formatted (if it’s software that formats stuff), and you don’t want to continue unless you are absolutely sure you know what is going to happen.

If your tests have side effects, you want to stop at the first failure.
Yes, you are right – sometimes you want to stop on the first failure. But not always. Smoketest offers that flexibility, if you choose to use it. 🙂
But I have been thinking some more about this and it would be a simple matter to introduce a facility to make Smoketest behave the same way as DUnit, either globally or on a test case by test case basis, so I may do that at some point.
I must say that I do find this sort of assertion syntax almost unreadable. The best framework that I have found in terms of assertion syntax is here: https://github.com/philsquared/Catch
Which suffers the exact same problem as DUnit – the details of the tests are obscured behind the fact that the check methods work solely on boolean results, requiring you to a) additionally describe the test to make it meaningful and b) be especially careful that what you are testing is actually a valid test in the context of your code.
As per the issue in the hypothetical example discussed elsewhere when two inputs into a test are of different types, e.g. integer and double.
This may be valid, but the chances are that if the possibility arises of arriving at an integer and a double that need to be compared, then your application code will have strategies for performing these comparisons. Having an assertion syntax that encourages you to ignore such type issues and allows any legal compiler expression risks enticing you into writing tests that are not actually testing your code meaningfully.
In this case, perhaps the application code in this context always rounds double values away from zero, in which case you could easily get a false positive from a naive
i > d
assertion.

An assertion framework that encourages type-safe comparisons (but does not and cannot strictly enforce them – since it can perform simple boolean tests it can be abused in the same way) might seem more inconvenient, but it helps mitigate this sort of flawed testing. imho.
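To make that concrete (hypothetical values, with Check() used in the DUnit sense):

var
  i: Integer;
  d: Double;
begin
  i := 3;
  d := 2.5;

  // The naive assertion passes: 3 > 2.5 is TRUE..
  Check(i > d);

  // ..but the application rounds away from zero before comparing, so at
  // runtime it evaluates 3 > 3, which is FALSE. The test has reported a
  // pass for application behaviour that in fact fails.
end;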