The expense of an integration test can be extremely high. Consider the contentment app. This app makes drawings 1) that unfold across time, as if they were being drawn live in front of you, 2) that are generated stochastically, and 3) that are rendered with a "pixel-inaccessible" framework.
Now, it’s important to understand that none of these problems are insurmountable. Before you tell me how you’d surmount them, let me tell you how I could. 1) Screw time: rig it so it draws as fast as possible, and wait for the end to assert. 2) Screw stochastic: rig your prng so that you control the number sequence. 3) Screw pixel-inaccessible: treat the medium like hardware, put a tee in front of it, and assert against the tee.
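Here is a minimal sketch of those three rigs, in Python rather than whatever the contentment app is really written in. Every name in it (`RecordingSurface`, `InstantClock`, `draw_wobbly_line`, `run_rigged`) is hypothetical, invented for illustration; the app's actual drawing code is not shown in this piece.

```python
import random
from dataclasses import dataclass, field


@dataclass
class RecordingSurface:
    """Hypothetical tee: records each draw command instead of rendering it."""
    segments: list = field(default_factory=list)

    def line(self, x0, y0, x1, y1):
        self.segments.append((x0, y0, x1, y1))


class InstantClock:
    """Hypothetical clock whose sleep() returns at once, so drawing runs flat out."""

    def sleep(self, seconds):
        pass


def draw_wobbly_line(surface, clock, rng, start, end, steps=10, jitter=2.0):
    """Stand-in for the app: a 'hand-drawn' line made of jittered segments over time."""
    prev = start
    for i in range(1, steps + 1):
        t = i / steps
        # Each intermediate point is nudged by the prng to look hand-drawn.
        x = start[0] + (end[0] - start[0]) * t + rng.uniform(-jitter, jitter)
        y = start[1] + (end[1] - start[1]) * t + rng.uniform(-jitter, jitter)
        surface.line(prev[0], prev[1], x, y)
        clock.sleep(0.05)  # a real clock would pace the drawing here
        prev = (x, y)


def run_rigged(seed=42):
    """Apply all three rigs: instant time, seeded prng, recording tee."""
    surface = RecordingSurface()
    draw_wobbly_line(surface, InstantClock(), random.Random(seed), (0, 0), (100, 100))
    return surface.segments
```

Calling `run_rigged()` twice with the same seed returns the same list of ten segments, which is the whole point of the rig: the output becomes something you can assert against at the end.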
All eminently doable, at some expense. So should it be done?
I prattle on about the money premise, and I want to be sure you understand how it applies here. Please do attend, as this is absolutely central to learning how to make TDD work for you.
The money premise says we’re in this to "ship more value faster".
Reminder: "value" could mean any number of things, including fewer bugs, better runtime, or more cases or features. All that matters about the definition of "value" for this purpose is that it is dependent on us changing even modestly complex logic in code.
Suppose you surmounted the three difficulties above. What would you have to assert against at the end? Depending on your technique for trapping the output, you’d have either an ASCII log with a bunch of labelled doubles in it, or a literal image snapshot. You could either eyeball a given run and say "that will do, don’t change this caught behavior," which we call "bless", or you could figure out approximate values for those doubles in advance and assert against them, which we call "predict".
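To make the two styles concrete, here is a rough Python sketch, assuming the trapped output is a list of `(x0, y0, x1, y1)` line segments. The helper names `assert_predicted` and `assert_blessed`, and the JSON golden file, are my own illustrative assumptions, not anything the contentment app ships with.

```python
import json
import math
import os


def assert_predicted(segments, start, end, tolerance):
    """Predict: work out approximate values in advance and assert against them."""
    first, last = segments[0], segments[-1]
    assert math.dist(first[:2], start) <= tolerance, "line starts in the wrong place"
    assert math.dist(last[2:], end) <= tolerance, "line ends in the wrong place"


def assert_blessed(segments, golden_path):
    """Bless: the first run is saved as the golden copy; later runs must match it."""
    current = [list(s) for s in segments]  # lists, so it compares equal after JSON
    if not os.path.exists(golden_path):
        # In real life a human eyeballs this output before accepting it.
        with open(golden_path, "w") as f:
            json.dump(current, f)
        return "blessed"
    with open(golden_path) as f:
        golden = json.load(f)
    assert current == golden, "output differs from the blessed run"
    return "matched"
```

Note that the blessed version compares every double exactly, so it only works at all if the time and prng rigs have already made the run repeatable.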
Either way, you will have spent a great deal of money getting those assertions, yes? The three surmounted challenges are not cheap, and tho blessing is cheaper than predicting — we’re talking about a lot of data here — neither of those is very cheap either.
What do the literal assertions you write actually assert conceptually about the contentment app?
That, say, given a blank screen, a specific sequence from the prng, and instructions to draw, human-style and over time, a line from one corner to the other, at the end there is indeed a line from approximately one corner to approximately the other.
Wow. You haven’t proven very much. A real script involves dozens of such lines, as well as text. A real script uses a real prng. A real script is inherently stochastic beyond your control because it depends on the timing of the not-owned-by-you multi-threading in the rendering.
(An aside: the odds are also good that your test is quite fragile, and will break when you change things in the code that do not actually matter to the user.)
I could prove all the things you proved without any rig at all. I could write the script that does that and run it and look, in a fraction of the time, and know just as much as that automated integrated test tells me.
In order to get very high confidence that my code could be safely changed, I would need thousands of these extremely expensive tests, and thousands of hours of predicting or blessing work besides.
Now, your app may not be graphical like mine, it may not be performing its main function distributed across time like mine, and it may not be stochastic like mine. Or. Well. Maybe it is like mine. If you’re working database to browser in a huge sprawling database with thousands of users and hundreds of screens and complex workflow and the kind of bizarre business logic enterprise apps demand, maybe it is like mine.
Writing an integration test for it would mean investing a very great deal in proving a fraction of a fraction of a fraction of the program’s viability. You would be up against Selenium, a test server, the slow runtimes of the intranet, the golden master db, and most particularly the effectively infinite cyclomatic complexity of the app as seen entirely from the outside.
Don’t do that.
Put simply, the money premise says that we do TDD because we want more value faster. Integration tests in most complex apps do not provide more value faster. As a direct result, in TDD we write very few integration tests, and suggest them very rarely.