Episode Transcript
Welcome back to the Deep Dive.
Today we're moving past the usual suspects, you know, unit and integration tests.
We're going deeper into some other really crucial testing types in software development.
Think of it as getting the inside scoop.
Our guide for this is Software Engineering: A Modern Approach by Marco Tulio Valente.
Great book packed with useful stuff.
And our goal is simple: quickly pull out the key ideas about these less common tests so you're up to speed fast.
OK, let's unpack this.
So when people talk testing techniques, black box and white box usually come up first.
Seems basic, but what is the real difference?
And why should we, you know, care?
Yeah, good starting point.
The core difference, really, it just comes down to what you know about the code's insides when you're testing it.
So black box testing, you're intentionally only looking at the outside, the interface: what's the function's name, what parameters does it take?
What's it supposed to give back? What errors might it throw?
You don't see the actual code lines.
It's often called functional testing too because you're just checking if it does what it's supposed to do functionally.
Think about testing a car black box style.
You just use the steering wheel, the pedals, the lights.
You wouldn't look under the hood at all.
Just does it drive?
Does it stop?
OK, so exterior view only, what it does, not how. So white box pops open the hood?
Exactly.
Yeah, white box testing is the opposite.
Your tests use information about the internal code structure.
You are looking under the hood.
You're looking at the source code, maybe thinking about the different paths through the logic, the decision points, the loops, how things connect inside.
Sometimes called structural tests.
Here you might write tests specifically to make sure, say, every single line of code gets run at least once.
Or maybe every if statement gets tested for both true and false conditions.
Using the car analogy, you'd be like the mechanic checking the engine itself, the wiring, making sure each part works right.
It's an internal inspection.
Got it.
Black is outside, white is inside.
Seems clear enough, but what about unit tests?
I feel like I hear conflicting things.
Are they black or white or neither?
That's a really sharp question because unit tests are kind of slippery.
They don't have to be one or the other.
It actually depends entirely on how you write them.
OK, yeah.
If you write a unit test just based on the public interface, the method signature, what it promises to do, you're treating that unit like a black box, purely external view, right?
Based on the spec.
Exactly.
But let's say you write a test.
Then you run a code coverage tool and you see, oh, this else branch wasn't executed.
If you then write another test specifically to hit that internal branch, now you're using knowledge of the internal structure.
So that makes it white box even though it's still a unit test?
Precisely. Your motivation for writing that second test came from looking inside.
So the same unit can be tested with both black box and white box approaches, even within unit testing.
That clears things up.
It's the approach, not the test type itself.
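To make that concrete, here's a minimal sketch in Java with JUnit 5. Shipping.fee(weightKg) is a hypothetical unit under test, and the fee values are made up for illustration; the point is only the difference in motivation behind the two tests.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// Hypothetical unit under test: Shipping.fee(weightKg), as described by its spec.
class ShippingFeeTest {

    // Black box style: relies only on the public interface and the spec
    // ("parcels up to 10 kg cost a flat 5.00"), never on the source code.
    @Test
    void lightParcelCostsBaseFee() {
        assertEquals(5.00, Shipping.fee(2.0), 0.001);
    }

    // White box style: imagine a coverage report showed the surcharge branch for
    // heavy parcels was never executed; this test exists to exercise that branch.
    @Test
    void heavyParcelHitsTheSurchargeBranch() {
        assertEquals(12.50, Shipping.fee(25.0), 0.001);
    }
}
```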
And this whole blurring lines thing, it applies to TDD too, right?
Test Driven Development because you write tests first there.
Absolutely.
TDD is a fascinating case study for this.
Kent Beck himself said something really insightful about it.
He basically said, look, TDD tests are written before the code exists, so by definition they have to be black box tests.
Initially you're testing against an interface that isn't implemented yet.
Makes sense.
But then, he added, he often gets the inspiration for the next test by looking at the code he just wrote to make the previous test pass.
Right, which is white box thinking.
Exactly.
That's the hallmark of white box testing.
So what's fascinating here is that TDD isn't strictly one or the other.
It's this cycle.
You start black box: what should it do?
Write minimal code.
Then use white box insight, how does it work now, to figure out the next black box test: what else should it do?
It really shows these categories aren't rigid boxes in practice, they're more like different ways of looking at the problem.
It's a really practical way to see it.
Not strict rules, but useful perspectives.
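As a rough sketch of that cycle, again in Java with JUnit 5 and a hypothetical ShoppingCart class that doesn't exist yet when the first test is written:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class ShoppingCartTest {

    // TDD step 1, black box: written before ShoppingCart exists, purely against
    // the interface we wish we had.
    @Test
    void totalSumsItemPrices() {
        ShoppingCart cart = new ShoppingCart();
        cart.add("book", 30.0);
        cart.add("pen", 5.0);
        assertEquals(35.0, cart.total(), 0.001);
    }

    // TDD step 2, white box insight: after writing just enough code to pass the
    // test above, reading that code suggests an untested path (an empty cart),
    // which inspires this next black box test.
    @Test
    void totalOfEmptyCartIsZero() {
        assertEquals(0.0, new ShoppingCart().total(), 0.001);
    }
}
```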
OK, so we know how we might look at the code, but what do we actually test it with?
What data?
Because like you said, testing everything.
Forget it.
Even for simple functions, say a function taking two integers, testing every combo could take forever.
Literally.
And just picking random inputs that sounds hit or miss.
You might test the same kind of thing over and over and miss something crucial.
Here's where it gets really interesting.
You've hit on a huge challenge.
Since exhaustive testing is out, especially for black box, we need smarter ways to choose our test inputs, and a key technique here is equivalence classes.
The idea is simple but powerful.
You look at all the possible inputs and divide them into groups or classes.
Within each group, all the inputs are considered equivalent in the sense that if one of them finds a bug, any other input in that same group probably would too.
OK.
Grouping similar inputs.
Exactly.
And the beauty is you only need to test one representative value from each distinct class.
This massively cuts down the number of tests you need.
Let's use that income tax example from the book.
Remember the salary ranges? Range one, $1,903.99 to $2,826.65, gets 7.5% tax.
Range two, $2,826.66 to $3,751.05, gets 15%.
Range three, $3,751.06 to $4,664.68, gets 22.5%.
Range four, anything above $4,664.68, gets 27.5%, right?
Four different tax brackets.
So using equivalence classes, you wouldn't test thousands of salaries. You'd identify these four ranges as your equivalence classes for valid inputs, and you test just one salary from each range.
Maybe $2,000, $3,000, $4,000, and $5,000.
Just four tests to cover the core logic for all valid positive salaries.
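A minimal sketch of those four tests using JUnit 5 parameterized tests; TaxCalculator.rate(salary) is a hypothetical method returning the bracket's percentage:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

class TaxEquivalenceClassTest {

    // One representative salary per equivalence class (tax bracket).
    @ParameterizedTest
    @CsvSource({
        "2000.00,  7.5",   // class 1: 1903.99 .. 2826.65
        "3000.00, 15.0",   // class 2: 2826.66 .. 3751.05
        "4000.00, 22.5",   // class 3: 3751.06 .. 4664.68
        "5000.00, 27.5"    // class 4: above 4664.68
    })
    void oneRepresentativePerBracket(double salary, double expectedRate) {
        assertEquals(expectedRate, TaxCalculator.rate(salary), 0.001);
    }
}
```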
That's incredibly efficient compared to trying everything.
Just pick one from each zone.
What about the edges, the exact points where the tax rate changes?
Aren't those risky spots?
Absolutely spot on.
Those edges or boundaries are notorious bug hot spots, and that's why equivalence classes often go hand in hand with another technique, boundary value analysis.
Equivalence classes tell you which groups to test.
Boundary value analysis tells you which specific values, especially around the edges of those groups, are most critical.
The thinking is programmers often make off by 1 errors or handle comparisons incorrectly.
Right at the limits.
Yeah, I can see that, like using a less-than where it should be a less-than-or-equal.
Exactly that kind of thing.
So boundary value analysis says for each boundary of an equivalence class, test the boundary value itself, the value just before it, and the value just after it.
Let's take that first salary range again, $1,903.99 to $2,826.65.
Boundary value analysis would suggest testing these specific values.
For that lower boundary: $1,903.98, just below the boundary, and $1,903.99, right on the boundary.
For the upper boundary: $2,826.65, right on the boundary, and $2,826.66, just above the boundary.
OK, so you're testing the exact point and also making sure you don't accidentally include values just outside the range or exclude values right on the edge.
Precisely.
You're probing those transition points very carefully.
Combining equivalence classes with boundary value analysis gives you pretty good coverage without needing an impossible number of tests.
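Continuing the same sketch, boundary value analysis for the first bracket would add tests like these. The 0% expectation below the bracket is an assumption (salaries under the first range being exempt), and the 15% above comes from the next bracket:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

class TaxBoundaryValueTest {

    // Probe both edges of the first bracket (1903.99 .. 2826.65, 7.5%).
    @ParameterizedTest
    @CsvSource({
        "1903.98,  0.0",   // just below the lower boundary (assumed exempt)
        "1903.99,  7.5",   // exactly on the lower boundary
        "2826.65,  7.5",   // exactly on the upper boundary
        "2826.66, 15.0"    // just above the upper boundary: next bracket's rate
    })
    void probesTheBracketEdges(double salary, double expectedRate) {
        assertEquals(expectedRate, TaxCalculator.rate(salary), 0.001);
    }
}
```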
It makes total sense, though I guess defining those classes and boundaries isn't always as neat as salary ranges, right?
Like for a text field or something more complex.
That's true, it requires more thought sometimes. For text, boundaries might be the empty string, the minimum length, the maximum length, strings with special characters, etcetera.
But the core idea of partitioning inputs and checking edges still applies.
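For instance, a sketch for a hypothetical Username.isValid(name) rule, assuming names must be 3 to 20 characters long:

```java
import static org.junit.jupiter.api.Assertions.assertFalse;
import static org.junit.jupiter.api.Assertions.assertTrue;

import org.junit.jupiter.api.Test;

class UsernameBoundaryTest {

    @Test void emptyStringIsRejected()       { assertFalse(Username.isValid("")); }
    @Test void twoCharsIsJustTooShort()      { assertFalse(Username.isValid("ab")); }
    @Test void threeCharsIsMinimumAllowed()  { assertTrue(Username.isValid("abc")); }
    @Test void twentyCharsIsMaximumAllowed() { assertTrue(Username.isValid("a".repeat(20))); }
    @Test void twentyOneCharsIsJustTooLong() { assertFalse(Username.isValid("a".repeat(21))); }
}
```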
Right.
OK, so we've covered testing from the inside, testing from the outside, picking smart inputs, but ultimately software has to work for the customer.
Let's talk about acceptance tests.
These are done by users, right?
Exactly.
This is where the rubber really meets the road.
Acceptance tests are all about the customer deciding if the software is, well, acceptable. Does it meet their needs? Will they sign off on it?
This decision determines if it ships to production or needs more work.
In Agile, you often hear that a user story isn't truly done until it passes its acceptance tests, which are usually defined and run by the product owner or a customer proxy.
So what makes these tests different besides who runs them?
They sound like they might be manual.
Yes, that's one key difference.
Acceptance tests are typically manual tests, real people interacting with the system, customers or their representatives, not usually automated scripts written by devs.
The second and maybe even more important difference connects back to something we mentioned earlier, verification versus validation.
Right, verification is: did we build it right?
Validation is: did we build the right thing?
You got it.
Most of the tests we discussed earlier, unit, integration, even system tests based on specs, are primarily verification.
They check if the software matches the blueprint.
Acceptance tests are fundamentally about validation.
They check if the software actually solves the customer's real problem and meets their actual needs, which might not have been perfectly captured in the initial specs.
It's the ultimate reality check.
That's a crucial distinction.
Building the wrong thing perfectly is still building the wrong thing.
Are there different types of these acceptance tests?
Yes, there's usually a sequence.
First comes alpha testing.
This happens with real customers, but it's in a controlled environment.
Think maybe at the developer's office or on a specific testing setup they control.
It's a small group, close interaction, easy to observe and get quick feedback.
OK, like a preview screening.
Kind of, yeah.
If the alpha tests go well, then you might move to beta testing.
Beta tests involve a much larger group of customers using the software in their own environments, on their own hardware, doing their own real tasks.
It's no longer controlled.
This is where you find issues related to diverse setups, unexpected workflows, and just general real world chaos that you can't easily simulate.
Got it.
Alpha is controlled, beta is out in the wild, more or less.
OK, OK, so far it feels like most of these tests, black box, white box, equivalence classes, even acceptance to some extent, are focused on finding functional bugs.
Does the software do what it's supposed to?
But what about everything else?
Like how fast is it?
How easy is it to use?
What happens if something breaks?
So what does this all mean for a complete testing strategy?
We need to test that other stuff too, right?
Absolutely critical point.
Focusing only on functional requirements is a huge mistake.
The non-functional stuff, performance, usability, reliability, is often just as important for success.
So yes, a complete strategy must include tests for these non-functional requirements.
Take performance tests.
These measure how the system behaves under load.
Think about that e-commerce site getting ready for Black Friday.
They don't just need the buy button to work, they need it to work fast for millions of simultaneous users.
Otherwise, nobody buys anything.
Exactly.
So they'll run load tests, stress tests, pushing it until it breaks to see where the limit is, spike tests, sudden bursts of traffic, all to ensure it performs well under pressure.
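Real load testing is usually done with dedicated tools like JMeter, Gatling, or k6, but as a rough sketch of the idea in plain Java, here's a toy harness firing concurrent requests at a hypothetical checkout URL and counting how many finish inside a latency budget:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

// Minimal load-test sketch: simulate N concurrent users hitting one endpoint
// and report how many requests complete within the latency budget.
public class CheckoutLoadTest {
    public static void main(String[] args) throws Exception {
        int users = 200;                               // simulated concurrent users
        Duration budget = Duration.ofMillis(500);      // acceptable response time
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("https://shop.example.com/checkout"))   // hypothetical URL
                .timeout(budget)
                .build();

        ExecutorService pool = Executors.newFixedThreadPool(users);
        List<Future<Long>> results = new ArrayList<>();
        for (int i = 0; i < users; i++) {
            results.add(pool.submit(() -> {
                long start = System.nanoTime();
                client.send(request, HttpResponse.BodyHandlers.discarding());
                return (System.nanoTime() - start) / 1_000_000;    // latency in ms
            }));
        }

        long withinBudget = 0;
        for (Future<Long> f : results) {
            try {
                if (f.get() <= budget.toMillis()) withinBudget++;
            } catch (ExecutionException failedOrTimedOut) {
                // request failed or exceeded the timeout: counts against the budget
            }
        }
        pool.shutdown();
        System.out.printf("%d/%d requests finished within %d ms%n",
                withinBudget, users, budget.toMillis());
    }
}
```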
OK.
Performance is key.
What else?
You mentioned usability.
Right, usability tests.
These focus purely on the user interface and the user experience.
Is it easy to learn?
Efficient to use?
Are the buttons clear?
Is the workflow logical?
This often involves watching real users try to accomplish tasks with the software and seeing where they get stuck or confused.
It's less about code bugs and more about design flaws.
Makes sense.
A functional but unusable system isn't much good.
And the last one, failure.
We test for failure?
We do. Failure tests, sometimes called resilience or chaos testing.
The idea isn't to prevent all failures, that's impossible in complex systems.
It's about simulating failures to see how the system reacts.
What happens if a key database goes down, or a network connection fails, or an entire server rack loses power?
Yikes, you actually simulate that?
You do, in controlled ways.
Hopefully.
The goal is to ensure the system degrades gracefully, maybe switches to a backup, recovers properly, and doesn't, you know, corrupt all its data or crash entirely.
It's about building resilient systems that can handle the inevitable bumps in the road.
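Chaos testing proper happens at the system level, with tooling that kills processes or cuts network links, but the core idea of injecting a failure and asserting graceful degradation can be sketched in miniature. All the types below are illustrative, not from the source:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.util.HashMap;
import java.util.Map;
import org.junit.jupiter.api.Test;

// Failure-injection sketch: simulate the primary data store being down and
// check that a hypothetical catalog service serves a cached copy instead of crashing.
class CatalogFailureTest {

    interface Repository { String findNameById(String id); }

    // Service under test: try the repository first, fall back to the cache on failure.
    static class CatalogService {
        private final Repository repo;
        private final Map<String, String> cache;
        CatalogService(Repository repo, Map<String, String> cache) {
            this.repo = repo;
            this.cache = cache;
        }
        String findNameById(String id) {
            try {
                return repo.findNameById(id);
            } catch (RuntimeException databaseDown) {
                return cache.get(id);   // graceful degradation
            }
        }
    }

    @Test
    void fallsBackToCacheWhenDatabaseIsDown() {
        // Stub repository that behaves like a database whose connection is down.
        Repository downRepo = id -> { throw new RuntimeException("connection refused"); };
        Map<String, String> cache = new HashMap<>();
        cache.put("42", "Keyboard");

        CatalogService service = new CatalogService(downRepo, cache);

        // Expected behavior under failure: serve the cached copy, do not throw.
        assertEquals("Keyboard", service.findNameById("42"));
    }
}
```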
Wow.
OK, so black box, white box, equivalence classes, boundary values, alpha, beta, performance, usability, failure tests.
It's quite a landscape beyond just unit and integration.
This has been a really insightful deep dive.
Thanks for joining us today and exploring this wider world of software testing.
