8.9 - End-to-End Tests - Transcript

Episode Transcript

Welcome.

Welcome back to the Deep Dive, your go-to shortcut for becoming incredibly well informed.

Yeah, yeah, without drowning in research papers.

That's the goal.

Today we're pulling back the curtain on something pretty crucial, but maybe a bit misunderstood in software development.

End to end tests.

These aren't just some techie detail, they're a big reason why the apps you use daily actually work like they're supposed to.

Yeah, absolutely.

And understanding them well, it gives you a peek into why some software feels solid and other stuff feels, well, flaky.

It really does.

And when we say end to end, we really mean simulating like a complete user journey start to finish through the whole system.

Not just tiny pieces then.

No, not at all.

These tests are designed to give you that big picture view, making sure everything hooks together and works seamlessly from the moment you tap that icon.

Really.

OK.

Right.

So let's unpack this.

Our mission for this deep dive?

Figure out what these tests are, why they're so, so important, and then maybe get into some of the surprising complexities.

The trade-offs too.

Exactly the trade-offs.

We'll use some real world examples to make it stick.

Get ready for a few aha moments hopefully.

Sounds good.

So first things first, what are end to end tests?

You might hear other names right, like system tests.

System tests, yeah.

Or sometimes interface tests.

The key thing is where they sit in that, you know, testing pyramid people talk about.

Right at the top.

Yeah, the very tip.

You've got tons of tiny unit tests at the bottom checking little bits of code.

Super fast, those ones.

Then fewer integration tests, checking how pieces work together, right?

And then way fewer of these end to end tests.

And that position tells you something.

It's about scope, but also about how resource intensive they are.

Meaning you just don't run as many of them compared to the others.

OK.

And their core purpose you said simulating a user.

Exactly.

Think about how you use an app.

You log in, maybe click around, fill out a form, hit submit.

You expect something to happen.

An end to end test tries to automate that entire sequence, the whole flow.

So it's not just, "Does this button look right?"

No, it's, "Does clicking this button actually lead to the right data being saved and the right screen showing up?"

It checks the whole chain reaction through all the system's layers.

It's the ultimate "does this thing actually work for a real person?" check.

You got it.

That's the essence of it.

And here's that critical point you mentioned, the resource thing.

These are the most resource intensive automated tests.

By far.

And that's not just a technicality, right?

It has real consequences.

Absolutely.

It means they take way more effort to write.

You often need special tools, complex setups.

And they take the longest to actually run.

Oh yeah, much longer.

Think about that factory analogy.

Checking 1 bolt?

That's a unit test.

Quick. Checking if the whole car drives off the assembly line perfectly after visiting every single station? That's your end to end test.

Takes way more time.

More.

OK, that makes sense.

Let's make it even more concrete case study one testing a web system.

People often use something called Selenium.

Selenium is a big one.

Yeah, very common for web testing.

So if you're building, say, an online store or social media site, you need to know the whole process works.

Signing up, finding a product, buying it.

All those critical user paths. And Selenium lets you build tests that act like, well, like robots using the website.

Like a very precise, maybe slightly literal robot user.

Pretty much these robots, these automated scripts do what a human would.

They open the browser.

Navigate Pages.

Fill in forms, click the buttons, wait for the page to load and then check if the result is what they expected.

So it's like scripting out.

OK computer, pretend you're a user, go here, do this, click that.

Now tell me if it worked.

That's exactly it.

It's how you programmatically verify that whole journey.

We saw an example, right?

How about a Google search?

Yeah, a simple but good one.

The test simulates someone using Firefox, going to Google, typing "software" into the search bar, hitting enter.

Standard stuff.

Right, but then the crucial part, the test checks if the title of the search results page is exactly what it should be.

So even that simple action tests a lot: the browser, the website's search function, getting the results back.

The whole chain for that specific task.
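That journey can be sketched in code. Here's a minimal illustration in the style of Selenium's Python bindings; the function name is invented for this sketch, and with the real library `driver` would be something like `selenium.webdriver.Firefox()`:

```python
# A sketch of the "robot user" script described above. `driver` is assumed
# to expose a Selenium-style WebDriver API (get, find_element, title).
# "q" is the name Google gives its search input field.
def google_search_test(driver):
    driver.get("https://www.google.com")    # open the page in the browser
    box = driver.find_element("name", "q")  # locate the search box by name
    box.send_keys("software")               # type the query, like a user would
    box.submit()                            # submit the form, load the results
    # The end-to-end check: did the whole chain produce the right page?
    assert "software" in driver.title, "unexpected results page title"
```

One script, but it exercises the browser, the network, the site's form handling, and the results page all at once.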

It sounds straightforward, but you mentioned challenges.

Yes, it's fascinating because while the idea is intuitive, actually building and maintaining these tests is significantly harder than unit tests, Even harder than integration tests sometimes.

Why is that?

Well, think about it.

A unit test runs in a nice, clean, predictable environment.

Just code talking to code.

Right, isolated.

An end to end test for a web app.

It's interacting with a real browser over the real Internet.

With a complex dynamic web page.

It's messier.

Unpredictable.

Can be. The source material highlights a few things.

First, the tool itself, like the Selenium API, the commands you use to control the browser.

It's just more complex.

OK, you're telling it how to deal with visual layouts, things loading at different speeds, mouse hovers.

It's not just simple commands.

It's like giving instructions in a busy, constantly changing room versus a quiet library.

Gotcha.

What else?

Second, they have to handle what are called interface events, like maybe a button only appears after a three second animation finishes.

Right, a human would just wait.

Exactly.

But the test script, if you don't explicitly tell it to wait for that button, it might try to click too early, fail, and you get a false alarm, a test failure that isn't a real bug.

So you have to program in patience, basically.

You have to anticipate those delays, those dynamic changes, time outs, elements appearing slowly.

It needs to handle that.
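That "programmed-in patience" is usually an explicit wait. Selenium ships this idea as WebDriverWait; the helper below is a library-free sketch of the same pattern, invented here for illustration:

```python
import time

def wait_for(condition, timeout=10.0, poll=0.25):
    """Poll `condition` until it returns a truthy value, or raise
    TimeoutError once `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = condition()
        if result:
            return result
        time.sleep(poll)  # don't hammer the page; retry shortly
    raise TimeoutError(f"condition not met within {timeout:.1f}s")
```

Instead of clicking immediately and failing, the test writes something like `button = wait_for(lambda: find_button())`, so a three second animation no longer produces a false alarm.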

OK.

And the third challenge you mentioned fragility.

Yes, this is a big one.

They are much more susceptible to breaking because of tiny, sometimes purely cosmetic changes in the user interface.

Like what? The example given was Google changing the internal name of its search input field, maybe from "q" to "search query".

And that breaks the test.

Instantly, because the test script is looking for an element named "q" and it's not there anymore. The whole test fails, even though the search functionality itself might be perfectly fine.

Wow, so even moving a button slightly it could?

Break it. Or changing its ID, or its color if the test was checking that for some reason.

This makes them brittle.

You make a small UI tweak and suddenly you have failing end to end tests to fix.

That's a maintenance headache.

I can imagine I heard about a team chasing a failure that was just caused by a random pop up ad blocking a button sometimes.

Exactly that kind of real world messiness.

It's those unpredictable quirks that make E2E web testing tough.

OK, so with all that, the complexity, the fragility, the maintenance, are they actually worth the hassle?

That's the million dollar question, isn't it?

And the answer is generally yes, but with caveats.

It comes down to the alternative, which is doing all that testing manually.

Imagine a person having to click through every single possible user journey on multiple browsers after every single tiny code change.

That sounds awful, yeah.

Impossible to keep up with.

It's incredibly slow, error prone, and just doesn't scale.

So compared to that, automated end to end tests, despite their flaws, are still hugely valuable.

So it's a trade off.

They're expensive and brittle, but the alternative is worse for core functionality checks.

Precisely.

The insight here is you use them strategically.

You don't try to test every single obscure corner case with an E2E test.

That would cripple you.

You focus them.

You focus them on the most critical user paths.

The login flow, the checkout process, the main "post a message" feature.

Let them be your final high level check that the most important stuff hangs together.

That final sanity check before you release.

Exactly. When you successfully use an app online, chances are an E2E test ran that exact same journey automatically to make sure it worked before it got to you.

That's their real impact, providing that confidence in the core experience.

OK, but what about systems without a visual front end?

Like behind the scenes stuff.

Great question.

Yes, E2E tests are still very relevant and it actually highlights different aspects.

Let's take our second case study, testing a compiler.

A compiler like the software that turns programming code into something a computer can run.

How does that have an end to end journey?

It does, and interestingly, the E2E tests here tend to be conceptually simpler than for web systems.

Simpler.

Why?

Because a compiler's interface isn't a graphical web page with buttons and animations, it's typically much simpler.

It takes an input file.

The source code.

Right, and it produces an output file.

The runnable program, or maybe error messages.

Exactly.

It's a file in, file out process.

Much more predictable, less prone to those visual shifts that break web tests.

So how do you test that end to end?

Well, you create a collection of test programs, little pieces of code written in the language the compiler is supposed to understand.

OK, and these test programs are specifically designed to use different features of the language.

Loops, functions, weird edge cases.

To really exercise the compiler.

Precisely, and for each test program you define exactly what input it should receive when run, and crucially, what output you expect it to produce.

And the output format should be simple.

Ideally, yeah, like a list of strings or numbers.

Something easy for the test script to automatically check against the actual output.

Got it.

So what are the steps in the test itself?

Pretty straightforward mirroring what the compiler does.

First, the test calls the compiler to compile the test program P.

Does it build correctly?

That's step one.

Step 2, if it compiles, the test runs the resulting program using that predefined input data.

And finally, Step 3.

The test captures the actual output from the running program and compares it against the expected output you defined earlier.

And if they match, the test passes.

Correct.

And that whole sequence compile, run, verify output is an end to end test for the compiler because it uses all its major parts, parsing the code, optimizing it, generating the machine code and ensuring that code runs correctly.
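That compile, run, verify sequence can be sketched as a small harness. Everything here is illustrative, assuming only that the compiler and the compiled program are ordinary command-line executables; the function and its interface are not any real tool's API:

```python
import subprocess

def run_e2e_test(compile_cmd, run_cmd, stdin_data, expected_output):
    """Compile a test program, run it, and compare its output.
    Returns (passed, message)."""
    # Step 1: compile the test program; a non-zero exit code fails the test.
    build = subprocess.run(compile_cmd, capture_output=True, text=True)
    if build.returncode != 0:
        return False, f"compile failed: {build.stderr}"

    # Step 2: run the compiled program with the predefined input data.
    run = subprocess.run(run_cmd, input=stdin_data,
                         capture_output=True, text=True)

    # Step 3: compare actual output (a simple list of lines) against
    # the expected output defined alongside the test program.
    actual = run.stdout.splitlines()
    if actual != expected_output:
        return False, f"expected {expected_output!r}, got {actual!r}"
    return True, "ok"
```

For a hypothetical compiler, `compile_cmd` might be something like `["mycc", "test1.src"]` and `run_cmd` `["./a.out"]`. Notice the harness only sees files in and lines out, which is exactly why these tests are conceptually simpler than web ones.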

It touches everything.

The whole pipeline.

But there had to be challenges here too, right?

It sounds simpler, but maybe not easy.

Oh, definitely challenges.

The big one here compared to unit tests is pinpointing why a test failed.

OK, so the test fails.

Right, it tells you say program X did not produce the expected output.

You know something is wrong in the compiler, but where?

Because the test covered the whole process.

Exactly.

Was the bug in the parser, the optimizer, the code generator?

The end to end test itself often doesn't tell you, it just says the final result is wrong.

So debugging is harder.

It's like knowing the car failed inspection but not knowing if it was the brakes, emissions or headlights without further checks.

That's a great analogy.

You have to do more investigation, maybe run more focused tests to trace that high level failure back to the specific buggy function within the compiler's code.

Unit tests are much better at pinpointing the exact location of a bug.

That makes sense.

Any other hidden challenges for compilers?

Well, actually creating all those test programs can be a huge task in itself.

Think about a complex programming language.

You need test cases that cover every single feature, every weird interaction between features, every possible syntax error it should catch.

That requires deep language expertise and a lot of careful work just to create the inputs and expected outputs.

Just building the test suite is a major project.

It really can be.

Meticulously defining the correct output for hundreds or thousands of intricate test programs is non trivial.

So wrapping this up then, yeah, end to end tests, whether for a flashy website or a background compiler, are about simulating that complete journey.

That complete user interaction or system process, yeah.

They're powerful, they're comprehensive, and they're really vital for making sure the whole system, not just the parts, actually works as intended.

They connect the dots.

Absolutely.

And yes, they are resource intensive.

They take more effort, they run slower, they can be brittle, especially with UIs, but their ability to validate that entire system flow makes them pretty much indispensable in modern software development.

They provide that final crucial layer of confidence.

The yes, this actually works for a user check.

Couldn't say it better.

They really are essential for the reliable software we often take for granted.

Well, thanks for joining us on this deep dive into end to end testing.

We hope this gives you a clearer picture of why these tests matter so much.
