FAGZAL's Personal Blog
csongor.fagyal.com
Introduction to unit testing (in web applications (with PHP examples))
If you don’t know what unit testing is, and you are a programmer, you should probably read this post. If you do know what unit testing is, and yet you think it does not apply to you, then you are probably wrong, or you do NOT really know what unit testing is. (If after reading this post you still think it is a waste of time, then you are probably a bad programmer. Sorry!)
Bugs and development
When you finish a development cycle, and release (a part of) your code, you always try to make sure it does not contain any bugs. However, it most certainly does. Bugs are an essential part of software development, and the more complex your system is, the trickier the bugs can be.
With that said, you do not want bugs. No one wants bugs (except maybe anteaters, but they are almost never involved in software development). If you send a probe to Mars for 200 million dollars and there is an uncaught exception in your code, that is a problem. You just don’t do that. But how do you make sure you do not have any bugs?
First, there are techniques, paradigms, tools, coding conventions and whatnot to help you to code with fewer bugs. For example you can “use strict” in Perl. You can introduce Scrum into your development process. You can memorize “Design Patterns” – and so on. However, as a programmer, you already know that no matter what you do, bugs cannot be fully eliminated by just careful coding and design. Thus, you need testing. Testing helps to find and eliminate bugs – the emphasis put on the find part. (“The first step in avoiding a trap is knowing of its existence.”)
If you can find a bug, fixing it is usually a trivial task. Today’s immensely complex software, however, which is often poisoned by feature bloat, presents a very tough task for testers: there is a huge number of input vectors, corner cases, exceptions and states to check. To address this issue, one of our tools is unit testing.
What is unit testing anyway?
I am going to simplify this for the sake of clarity.
Unit testing is an (automated, even algorithmic) way of testing individual parts (“units”) of a (software) system (or component). Units are usually the smallest components that can be tested – for example a method of a class. Testing basically means that we provide some input (e.g. parameters to a method), and check if the output (e.g. the return value) is what we expect. We can also call this process validation, as with the tests we make sure that the units work according to specification. In a way, the code that tests your units can be viewed as the actual specification (of the components tested).
But that’s just too much talk and too little code.
Let me give a basic, simplified example!
We have a method called “add”, which adds two positive numbers. Testing this might look something like this:
```php
# testing the add function (add() is assumed to be defined elsewhere)
if (add(1, 2) != 3) echo "Basic addition not working!\n";
else echo "Basic addition works!\n";
if (add(0, 1) != 1) echo "Cannot add zero to one!\n";
else echo "Zero addition working!\n";
if (add(1, 2) != add(2, 1)) echo "Parameter order problem\n";
else echo "Parameter order irrelevant.\n";
```
This is the basic idea. Your unit here is the add function. You cannot provide all possible numbers with all possible combinations, so you just provide a few input vectors that you suspect would test the system “well enough”. Of course this is a rather useless example, but it certainly shows the basic point.
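The same checks can be collected into a tiny helper, so that failures are gathered in one place instead of being printed one by one. This is only a sketch: the `check()` helper and the trivial `add()` implementation are made up for this example, and in a real project you would more likely reach for an existing framework such as PHPUnit.

```php
<?php
// Hypothetical unit under test: adds two numbers.
function add(int $a, int $b): int
{
    return $a + $b;
}

// Hypothetical helper: records a failure message when a condition is false.
function check(bool $condition, string $message, array &$failures): void
{
    if (!$condition) {
        $failures[] = $message;
    }
}

$failures = [];
check(add(1, 2) === 3, 'Basic addition not working!', $failures);
check(add(0, 1) === 1, 'Cannot add zero to one!', $failures);
check(add(1, 2) === add(2, 1), 'Parameter order problem', $failures);

// Print every failure, or a single success line when the list is empty.
echo $failures ? implode("\n", $failures) . "\n" : "All tests passed.\n";
```

The point of the helper is that adding a new test is a one-liner, which makes it more likely that you will actually write it.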
It DOES apply to you!
Before going into this topic a little further, to all the web devs out there, who at this point think unit testing just “does not apply” to them, let me say this: you are wrong. It does.
I know what you are thinking. You are thinking “but hey, I am not writing a calculator, I am creating an HTML table from user input while querying a database, how do you test that? Do you create a test database, and parse the HTML?? That’s just stupid and time consuming!”
Actually I very often hear this argument, and I am telling you: you are right in what you are saying, but that’s only because You Are Doing It Wrong. If you think that way, you are a coder, and not an architect – the difference probably shows not only in your way of thinking, but in your monthly paycheck, too.
Please read on!
Testability
The first rule of unit testing is: you must have units in order to be able to test them.
About the example above: that is just not a unit. You are not testing a feature or a page in unit testing, you are testing a unit – nomen est omen. You can, in theory (and actually in practice, too) emulate user actions on a site – submitting forms, clicking on links – with certain testing tools, like WWW::Mechanize, but that is not unit testing: a page is just too big to be considered a unit. Instead, you should write your code so that it contains smaller parts that can be tested. For example, you can write a module that can generate HTML from an array. That is easy to test: you just throw in various parameters, and check the resulting HTML. You can also create a module that fetches values from an SQL table according to certain parameters. That’s harder to test, but still doable. Now if both of these modules work without an error – they have been unit tested – then most likely they will work when you connect them, too.
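To make the “HTML from an array” idea concrete, here is a minimal sketch. The function name `renderTable()` and the exact markup it produces are assumptions made for this example; your own module will certainly look different.

```php
<?php
// Hypothetical unit: renders a list of rows as an HTML table.
function renderTable(array $rows): string
{
    $html = '<table>';
    foreach ($rows as $row) {
        $html .= '<tr>';
        foreach ($row as $cell) {
            // Escape cell content so user input cannot inject markup.
            $html .= '<td>' . htmlspecialchars((string) $cell) . '</td>';
        }
        $html .= '</tr>';
    }
    return $html . '</table>';
}

// Tests: plain input, empty input, and input that needs escaping.
$tests = [
    'one row'  => [[['a', 'b']], '<table><tr><td>a</td><td>b</td></tr></table>'],
    'no rows'  => [[], '<table></table>'],
    'escaping' => [[['<b>']], '<table><tr><td>&lt;b&gt;</td></tr></table>'],
];
foreach ($tests as $name => [$input, $expected]) {
    echo $name . ': ' . (renderTable($input) === $expected ? 'ok' : 'FAILED') . "\n";
}
```

Notice that no database and no browser is involved: the unit takes an array and returns a string, and that is all the test needs to know.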
Change your mind
Of course many programmers out there do not create such readable structures – they just write code “as is”, without much planning, and so forth. Now, that just doesn’t work. Don’t do that. If you really want to (have to?) do that, forget unit testing (and forget becoming a professional programmer).
Regression testing
During the lifetime of a software piece, you add code, you change code, you refactor. The size of your unit tests grows with your code. That is normal, that is expected. It is also expected that sometimes you introduce bugs that your existing unit tests could not find: your tests run without errors, yet there is an error that you find.
If during development something that used to work suddenly becomes broken, that is called a regression. It is a good practice that whenever you encounter a regression, you write a test case to find it, because if something was broken once, it is likely to be broken once again. As a rule of thumb, regression tests increase the quality of your code.
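A regression test is usually nothing more than the input that once triggered the bug, pinned down with the output you expect. The scenario below is invented for illustration: assume a hypothetical `formatPrice()` function that at some point printed 5 cents as “0.5” instead of “0.05”.

```php
<?php
// Hypothetical unit: formats a non-negative amount given in cents.
function formatPrice(int $cents): string
{
    // %02d pads the cents part to two digits; the imagined bug was
    // a version that forgot this padding.
    return sprintf('%d.%02d', intdiv($cents, 100), $cents % 100);
}

// Ordinary tests.
$results = [
    'typical price' => formatPrice(1999) === '19.99',
    'round amount'  => formatPrice(100) === '1.00',
    // Regression test: this exact input was broken once, so it stays
    // in the test suite for good.
    'single-digit cents (regression)' => formatPrice(5) === '0.05',
];

foreach ($results as $name => $passed) {
    echo $name . ': ' . ($passed ? 'ok' : 'FAILED') . "\n";
}
```

The comment next to the regression test is worth the extra line: a year from now it tells you why such an oddly specific test exists.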
Mock objects
In (web) programming, your code often depends on the environment – that is, on external data and services. For example: your code sends out an e-mail during the registration process. Or you access a database to fill up a form.
These things are hard to test, since it’s hard – sometimes even impossible – to emulate the environment your code is running in. Thus come mock objects.
A mock object is an object that “emulates” the behaviour of a real, more complex object during testing. Let’s see a trivial example, where you use mock objects to emulate the sending of an e-mail and the storing of some data in a database!
Here, your class RegisterUser registers a user (stores its data in SQL) and sends out a confirmation e-mail. The data store and the e-mail sender objects are both passed to your class:
```php
RegisterUser::register($userdata, new UserStore(), new EmailSender());
```
If everything goes as planned, you get a TRUE return value, otherwise an error message. Your class looks something like this:
```php
class RegisterUser {
    # The "object" type hint needs PHP 7.2 or later.
    public static function register(array $userdata, object $store, object $sender) {
        # … do something here …
        $saved = $store->save($userdata);
        if (!$saved) return "Could not save user data";
        $sent = $sender->send($userdata);
        if (!$sent) return "Could not send confirm e-mail";
        # Callers must compare with === TRUE: an error string is truthy, too.
        return TRUE;
    }
}
```
Your “regular” EmailSender would look something like this:
```php
class EmailSender implements EmailSenderInterface {
    public function send(array $userdata) {
        # send e-mail here
    }
}
```
Obviously your EmailSenderInterface is something like this:
```php
interface EmailSenderInterface {
    public function send(array $userdata);
}
```
Now your mock object is something like this:
```php
class MockEmailSender implements EmailSenderInterface {
    public function send(array $userdata) {
        return TRUE; # always return true
    }
}
```
(The UserStore presumably follows the same pattern, with a save() method instead of send().)
When you are running your unit tests, you use the mock object(s). Otherwise, you use the actual objects. When running your unit tests, you can disregard the actual sending of the e-mail (and worry about testing that later).
(Please note: this is a rather awkward example, just to explain the basic idea behind mock objects.)
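Putting the pieces together, a complete test with mock objects could look like the sketch below. It is self-contained so that it can be run as is; the interfaces, the two mock classes and their bookkeeping properties (`$saved`, `$sentCount`) are assumptions made for this example, chosen to match the `save()` and `send()` calls that `register()` makes.

```php
<?php
interface UserStoreInterface
{
    public function save(array $userdata);
}

interface EmailSenderInterface
{
    public function send(array $userdata);
}

// Mock store: remembers what it was asked to save instead of touching SQL.
class MockUserStore implements UserStoreInterface
{
    public $saved = [];

    public function save(array $userdata)
    {
        $this->saved[] = $userdata;
        return true;
    }
}

// Mock sender: counts calls instead of sending real e-mail.
class MockEmailSender implements EmailSenderInterface
{
    public $sentCount = 0;

    public function send(array $userdata)
    {
        $this->sentCount++;
        return true;
    }
}

class RegisterUser
{
    public static function register(array $userdata, UserStoreInterface $store, EmailSenderInterface $sender)
    {
        if (!$store->save($userdata)) {
            return 'Could not save user data';
        }
        if (!$sender->send($userdata)) {
            return 'Could not send confirm e-mail';
        }
        return true;
    }
}

// The test: register a user against the mocks and inspect what happened.
$store = new MockUserStore();
$sender = new MockEmailSender();
$result = RegisterUser::register(['email' => 'test@example.com'], $store, $sender);

echo ($result === true && count($store->saved) === 1 && $sender->sentCount === 1)
    ? "Registration test passed.\n"
    : "Registration test FAILED.\n";
```

Because the mocks record what they were asked to do, the test can check not only the return value but also that the store and the sender were each called exactly once.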
Mock objects also help you to test certain, otherwise hard-to-test behaviour. For example your website should work properly even when you lose your SQL connection. But how do you test that? Do you take down your SQL server? How do you test when your SQL server dies between two queries? Well, with mock objects you can just make that mock object emulate an error. In general, you can easily create various states during testing, which might be very hard to achieve with the real system.
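Here is a rough sketch of the “SQL server dies between two queries” case. Everything in it is hypothetical: `ConnectionInterface`, the `FailingConnection` mock and the `loadDashboard()` function exist only to show the technique, and a real database layer would have a richer interface.

```php
<?php
// Hypothetical minimal database interface used only for this sketch.
interface ConnectionInterface
{
    public function query(string $sql): array;
}

// Mock connection that works for a given number of queries, then "dies".
class FailingConnection implements ConnectionInterface
{
    private $remaining;

    public function __construct(int $queriesBeforeFailure)
    {
        $this->remaining = $queriesBeforeFailure;
    }

    public function query(string $sql): array
    {
        if ($this->remaining-- <= 0) {
            throw new RuntimeException('Lost connection to SQL server');
        }
        return [['id' => 1]]; // canned result row
    }
}

// Hypothetical unit under test: runs two queries and degrades gracefully.
function loadDashboard(ConnectionInterface $db): string
{
    try {
        $db->query('SELECT id FROM users');
        $db->query('SELECT id FROM orders');
        return 'ok';
    } catch (RuntimeException $e) {
        return 'Service temporarily unavailable';
    }
}

// Healthy connection, connection dying between the queries, dead connection.
foreach ([2, 1, 0] as $queriesBeforeFailure) {
    echo $queriesBeforeFailure . ' working queries: '
        . loadDashboard(new FailingConnection($queriesBeforeFailure)) . "\n";
}
```

No server had to be taken down, and the failure happens at exactly the point you choose, every single time you run the test.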
White-box testing
One interesting thing to note about unit testing is that the one who creates the tests usually knows many things about the code being tested. Most of the time the one who writes the code writes the unit tests, too – so he or she can create tests that really stress-test the given code. For example if your code checks an e-mail address for validity, and you know that it uses a regular expression to achieve that functionality, you know that your tests should include various wrong e-mail addresses.
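As a sketch of that e-mail example: the `isValidEmail()` function and its deliberately simple pattern below are assumptions made for illustration, not a complete validator. Knowing that the check is a regular expression suggests some specific inputs, such as an address with a trailing newline, which a pattern ending in `$` instead of `\z` would let through.

```php
<?php
// Hypothetical unit: a deliberately simple regex check, not a full validator.
function isValidEmail(string $email): bool
{
    // \z anchors at the very end of the string; a plain $ would also
    // match just before a trailing newline.
    return preg_match('/^[^@\s]+@[^@\s]+\.[^@\s]+\z/', $email) === 1;
}

$valid = ['user@example.com', 'first.last@mail.example.org'];
$invalid = [
    '',                    // empty string
    'userexample.com',     // no @
    'user@',               // nothing after @
    '@example.com',        // nothing before @
    'user@example',        // no dot in the domain part
    'user@@example.com',   // two @ signs
    'user @example.com',   // whitespace inside
    "user@example.com\n",  // trailing newline
];

$failures = [];
foreach ($valid as $email) {
    if (!isValidEmail($email)) {
        $failures[] = 'rejected valid address: ' . json_encode($email);
    }
}
foreach ($invalid as $email) {
    if (isValidEmail($email)) {
        $failures[] = 'accepted invalid address: ' . json_encode($email);
    }
}

echo $failures ? implode("\n", $failures) . "\n" : "All e-mail tests passed.\n";
```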
Since you know many things about the code being tested, unit testing is white-box testing: you know what’s inside. Compare that to black-box testing, when, for example, you have a bunch of testers sitting in front of their computers and trying to break your site.
Unit testing is just one tool
Unit testing will NOT solve all of your problems: it’s just one tool, amongst many, to increase the quality of your code, and to make the development process less hectic. It will not write better code for you. It will not find bugs which you have not written tests for. It will not check how you connect your units (that is integration testing), or how your code works as a complete unit (system testing).
Also, sometimes unit testing cannot be used. For example:
- If you are one of a hundred programmers working on the same project, and no one else uses unit testing, it is likely that you will not be able to, either.
- For very small tasks, it is often overkill to write any kind of tests. When all you do is send out an e-mail, writing the unit tests might take more time than writing the original code.
- If you have a huge, legacy system, with legacy code, and you are the only programmer to work on it… well, refactoring the whole codebase could be so expensive that you should just resort to your regular, scheduled hacking.
- If your employer does not pay you to write unit tests. Freelancers who bid for projects can easily run into this one.
- If you are halfway into a project, and you are on a tight schedule, probably it is not the time to introduce unit testing. It’s just going to slow you down. Writing unit tests takes some time to get used to. If you are a very good programmer, your code will probably be OK without unit tests – on the other hand, if you are a bad programmer, you will screw up your unit tests, too, and you will be doing cargo culting.
- If you are a beginner, first you should work on your programming skills. Writing unit tests will make you a better programmer if you are already a good programmer, but it will not make you a good programmer per se, if you are not one.
- You cannot unit test subjective properties, e.g. “does this widget look nice”.
- It might be really hard or impossible to test non-deterministic or heuristic algorithms.
- User interface items (and the coupled logic) can also be hard to unit test, if at all possible.
- There are algorithms which must be proven first, before they can be tested. For example if you find a new way to test whether a number is a prime, you cannot conclusively unit test it by throwing a few random primes and non-primes at it.
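To illustrate the last point in the list: what a unit test can do for such an algorithm is compare it with a slow but obviously correct reference on a finite range. In the sketch below both functions are stand-ins invented for this example (the “new” algorithm is just the well-known 6k±1 trial division). If the two agree on every number up to 1000, that is useful evidence, but it proves nothing about 1001 and beyond.

```php
<?php
// Slow but obviously correct reference: plain trial division.
function isPrimeReference(int $n): bool
{
    if ($n < 2) {
        return false;
    }
    for ($d = 2; $d * $d <= $n; $d++) {
        if ($n % $d === 0) {
            return false;
        }
    }
    return true;
}

// Stand-in for the "new" algorithm: only tries divisors of the form 6k±1.
function isPrimeCandidate(int $n): bool
{
    if ($n < 2) {
        return false;
    }
    if ($n < 4) {
        return true; // 2 and 3
    }
    if ($n % 2 === 0 || $n % 3 === 0) {
        return false;
    }
    for ($d = 5; $d * $d <= $n; $d += 6) {
        if ($n % $d === 0 || $n % ($d + 2) === 0) {
            return false;
        }
    }
    return true;
}

// Compare the two on a finite range and collect any disagreement.
$mismatches = [];
for ($n = 0; $n <= 1000; $n++) {
    if (isPrimeCandidate($n) !== isPrimeReference($n)) {
        $mismatches[] = $n;
    }
}

echo $mismatches
    ? 'Disagreement at: ' . implode(', ', $mismatches) . "\n"
    : "Candidate agrees with the reference for 0..1000.\n";
```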
And as always: YMMV.
Conclusion
Unit testing is a widely used method for increasing the quality of written code, but it must be understood properly to be used efficiently. It catches software bugs very early in the development process, reducing cost, and it also helps you to write better code. If you have never used it, though, or you have a system not really built for testability, it might be hard to introduce into your routine – however, in most of the cases, it can be a fine addition to your existing tools.
(Please note: this post is only an introduction. There are omissions and simplifications, so study the topic in more detail before you start using unit testing.)
