By David Greenlees

In a recent testing coaching session with Anne-Marie Charrett we touched on the above three subjects.  Obviously an hour and a half is nowhere near long enough to go into any great level of detail, so Anne-Marie left me with a challenge to write about oracles in software testing.  I will be doing that, but will also be including the preceding actions of observation and inference.  Why?  They are part of a logical flow which I have begun to knowingly use in my testing.  I highlight knowingly because I think I have always done this, but without applying critical thinking to it at the time.

During the coaching session we played an online game which involved many observations, inferences, and oracles.  At the time, this simply seemed to be a game which I was playing until Anne-Marie reminded me to slow down my thinking and focus on each action separately.  Where I initially fell over was jumping directly to my inference without thinking about my observation.  To cut a long story short, once I was critically thinking about each of these actions, I was able to win the game!  I may have won it eventually anyway, but it would have taken much longer.

A definition for each from the World Wide Web:

  • Observation – Detailed examination of phenomena prior to analysis, diagnosis, or interpretation.
  • Inference – The process of arriving at some conclusion that, though it is not logically derivable from the assumed premises, possesses some degree of probability relative to the premises.
  • Oracle (in software testing) – Heuristic (useful, fast, inexpensive, and fallible) principles or mechanisms by which we recognize problems.

How I define them:

  • Observation – What you see.
  • Inference – What your observation tells you.
  • Oracle (in software testing) – How you know if your inference is correct.

After the coaching session I decided I would try this approach out again.  Now, what was a simple application that I had not used before… Google Calendar!

Fig. 1 – Entry screen.

So what was my initial observation?  The calendar view defaults to ‘Week’ view.  Then I thought about it a bit more: is that truly an observation or an inference?  The observation is that the calendar was presented in ‘Week’ view, and from that I had already inferred that it was a default.  Maybe the correct observation is that my calendar had defaulted to ‘Week’ view.  I cannot be 100% certain that this would be the case for all users.  I could make the assumption, but we all know how dangerous assumptions can be in testing.

So some more observations (the numbers for each will remain consistent throughout the article):

  1. The time zone was displayed;
  2. There was a red line on the calendar indicating the time;
  3. The Google search bar was displayed at the top of the page;
  4. The Google logo was displayed at the top left of the page; &
  5. A series of buttons was displayed above the calendar view on the right-hand side.

Obviously there are many more, however I’ll move on for the purpose of this article.

So I then decided to work backwards and think about what inferences I would have made if I had looked at the application the way I used to test:

  1. The time zone was displayed as my current time zone;
  2. The red line on the calendar matched the time zone and the correct time;
  3. I can search the web from the Google search bar;
  4. I can go to google.com via the Google logo; &
  5. I can change the calendar view via the buttons above it.

Now it was time to do some testing and see if my inferences would have been correct had I done it the old non-critical thinking way:

  1. Correct;
  2. Correct;
  3. Incorrect;
  4. Incorrect; &
  5. Correct.

Hey, 3 out of 5 isn’t so bad, right?  In this case maybe not.  However, what if I was observing some sort of mission-critical product where lives were at stake?  Then 3 out of 5 could be very bad.  This is assuming that I had not gone on to test the product, which I would have of course!

So why were the 2 incorrect?

  3. This particular search bar was for the calendar application only.  When you searched for something, it looked for results only in your calendar; &
  4. The Google logo was not actually a link; it was simply an image of the logo.
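The observe, infer, then test loop above can be written down as a small record-keeping sketch.  This is only an illustration: the `Check` class and its field names are hypothetical, and the wording of each entry is paraphrased from the lists in this article.

```python
from dataclasses import dataclass


@dataclass
class Check:
    """One observation, the inference drawn from it, and the test result."""
    number: int          # matches the numbering used in the article
    observation: str     # what was seen
    inference: str       # what the observation was taken to mean
    held_up: bool        # did the inference survive actual testing?
    note: str = ""       # why it failed, if it did


# Entries paraphrased from the article's lists (hypothetical encoding).
checks = [
    Check(1, "The time zone was displayed",
          "It is my current time zone", True),
    Check(2, "A red line indicated the time",
          "It matches the time zone and the correct time", True),
    Check(3, "The Google search bar was displayed",
          "I can search the web from it", False,
          "It searches the calendar only"),
    Check(4, "The Google logo was displayed",
          "It links to google.com", False,
          "It is an image, not a link"),
    Check(5, "Buttons were displayed above the calendar",
          "They change the calendar view", True),
]


def summarise(items):
    """Split check numbers into those that held up and those that did not."""
    correct = [c.number for c in items if c.held_up]
    incorrect = [c.number for c in items if not c.held_up]
    return correct, incorrect


correct, incorrect = summarise(checks)
print(f"{len(correct)} of {len(checks)} inferences held up; "
      f"incorrect: {incorrect}")
```

Keeping the observation and the inference in separate fields is the point of the exercise: it forces you to write down what you saw before you write down what you think it means.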

I think they were fairly strong inferences to make!  Why though?  Why was I so confident in making those inferences?  That’s right Anne-Marie, I haven’t forgotten about you… oracles!

One reference which I like to use when looking at oracles is the HICCUPPS(F) mnemonic from James Bach (I believe the (F) came from Michael Bolton):

  • History: The present version of the system is consistent with past versions of itself.
  • Image: The system is consistent with an image that the organization wants to project.
  • Comparable Products: The system is consistent with comparable systems.
  • Claims: The system is consistent with what important people say it’s supposed to be.
  • Users’ Expectations: The system is consistent with what users want.
  • Product: Each element of the system is consistent with comparable elements in the same system.
  • Purpose: The system is consistent with its purposes, both explicit and implicit.
  • Statutes: The system is consistent with applicable laws.

That’s the HICCUPPS part.  What’s with the (F)?  “F” stands for “Familiar problems”:

Familiarity: The system is not consistent with the pattern of any familiar problem.

So back to the question of why I made those inferences that turned out to be incorrect:

  3. For the search function the Product heuristic would be relevant.  For many other Google applications the search bar can be used to search the web, not just that particular application; &
  4. For the Google logo the Users’ Expectations and Comparable Products heuristics would be relevant.  My expectation as a user would be that this image would in fact be a link back to google.com, and when observing the behaviour of a comparable product this is further confirmed.

Now what were my oracles?  How did I recognise that these were in fact potential problems?

  3. Other Google applications (namely News and Gmail) were my oracles for the search function; &
  4. Bing.com (namely the Bing logo) was my oracle for the Google logo function (or lack thereof).
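One way to picture an oracle is as a comparison between the behaviour you observed and the behaviour of whatever you are using as a reference.  The sketch below assumes that simplification; the `OracleCheck` class, its fields, and the short behaviour descriptions are all hypothetical and are paraphrased from the two examples above.

```python
from dataclasses import dataclass


@dataclass
class OracleCheck:
    """Compare observed behaviour with a reference chosen as the oracle."""
    subject: str              # the feature under test
    heuristic: str            # HICCUPPS(F) heuristic behind the inference
    oracle: str               # what the behaviour was compared against
    reference_behaviour: str  # what the oracle does
    observed_behaviour: str   # what the product under test does

    def potential_problem(self) -> bool:
        # Oracles are fallible, so a mismatch only flags a *potential*
        # problem that deserves a closer look; it does not prove a bug.
        return self.reference_behaviour != self.observed_behaviour


# Hypothetical encoding of the two incorrect inferences from the article.
oracle_checks = [
    OracleCheck("search bar", "Product",
                "Google News and Gmail",
                "searches the web",
                "searches the calendar only"),
    OracleCheck("Google logo", "Users' Expectations / Comparable Products",
                "Bing logo on Bing.com",
                "logo is a link",
                "logo is a plain image"),
]

for check in oracle_checks:
    if check.potential_problem():
        print(f"{check.subject}: differs from {check.oracle} "
              f"({check.heuristic} heuristic)")
```

Comparing plain strings is deliberately crude.  In real testing the comparison is a human judgement, and the value of the record is that it names the oracle you relied on.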

A simple, yet extremely valuable exercise.  These principles and practices can be applied to your everyday testing with ease.  It’s important to slow down and apply your critical thinking skills throughout.  Making inferences and assumptions can save a bucket load of time, but if they are incorrect for whatever reason you may find you’re in a bucket load of…

Having said that, can we actually test all assumptions?  Even if we could, would it be worth it?  I would argue that you would very quickly end up testing assumptions that have a minimal impact, and the type of quality information you’re obtaining by doing so would not be value for money.  A method I’ve found valuable in these situations is to apply a negative likelihood to the assumption.

So let’s take one of my inferences:

  3. I can search the web from the Google search bar.

This directly relates to an assumption; in fact it is an assumption.  Now in reality it would be fairly quick to test this assumption, but for the purpose of my point let’s pretend that it would take up to one day to test.  I’ll use an average contractor rate of $800 per day.  So to test this assumption we’re looking at approx. $800 (that doesn’t take into account lost project time, etc.).

If there was a high likelihood that users are going to make the same assumption, and therefore try to search the web from this search bar, spending $800 may be a wise move.  However, is it the type of function that users will complain about if they cannot search the web?  Or are they more likely to just say, “Oh, you can only search the calendar from this search box,” and simply move on, effectively making it a non-issue?

Let’s say we don’t test it to save $800 because we believe it to be such a trivial matter.  Then, when it’s released the users go mad with frustration and call/email Google with complaints and upgrade suggestions.  This could cost a lot more than $800 when considering the time it would take for staff to handle these requests, and potentially update the function to now include a web search.
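The trade-off can be roughed out as simple arithmetic.  The sketch below is one possible reading of "applying a negative likelihood": multiply the likelihood that skipping the test goes badly by what it would cost if it did, and compare that with the cost of testing.  Only the $800 figure comes from this article; the likelihoods and the $5,000 clean-up cost are made-up numbers for illustration, and the function names are hypothetical.

```python
COST_TO_TEST = 800  # from the article: one day at an $800 contractor rate


def expected_cost_of_skipping(likelihood_bad, cost_if_bad):
    """Likelihood of a bad outcome multiplied by what that outcome costs."""
    return likelihood_bad * cost_if_bad


def worth_testing(likelihood_bad, cost_if_bad, cost_to_test=COST_TO_TEST):
    """True when skipping the test is expected to cost more than testing."""
    return expected_cost_of_skipping(likelihood_bad, cost_if_bad) > cost_to_test


# Assumed figure: handling complaints and reworking the feature costs $5,000.
ASSUMED_CLEANUP_COST = 5000

# Assumed likelihoods that users are frustrated enough to complain.
for likelihood in (0.1, 0.5):
    decision = "test it" if worth_testing(likelihood, ASSUMED_CLEANUP_COST) \
        else "accept the risk"
    print(f"likelihood {likelihood:.0%}: {decision}")
```

The numbers will always be guesses, so treat the result as a prompt for discussion rather than an answer.  What the calculation does is make the guess explicit.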

It’s important to recognise your assumptions, and even more importantly to determine which of those assumptions you will spend time and money on testing.  You need to consider the context of your environment and how that relates to the product you’re testing.

So now here is a challenge for you.  Over the coming weeks while you’re testing, I’d like you to think about observations, inferences, and oracles.  By think, I mean really think.  Stop and reflect on each of these tasks one by one (similar to the process I have used above).  Once you have done that, I’d love to hear how you went and what difference you think it made.  Email me at xtremedmgATgmailDOTcom

I’ll gather the responses and include them in a post on my Blog (with your permission of course).

References: