Like every website that deals with traffic spikes, the one I’m working on these days does a lot of caching. This past week I’ve spent a lot of time reviewing the caching code as well as tuning the database, to get the site working efficiently on a newly upgraded virtual private server.
The following occurred to me: as wonderful and necessary as caching is, it’s fundamentally a workaround. The core problem is having insufficient resources. Given enough CPU and memory, you wouldn’t ever need to cache. It’s when those resources are insufficient for a particular traffic load that caching becomes immensely helpful. That’s why it’s a workaround: it practically addresses the problem, but it doesn’t really solve it. And it’s not a perfect solution: simple caching mechanisms usually introduce a delay in how current the content is.
Why does this matter? Because caching shouldn’t substitute for efficient code. That is, uncached operations should still make the best possible use of resources. Otherwise, caching turns into a panacea, luring you into a false sense of security about how well the guts of the application really perform. Ideally, caching should always be added as an afterthought on top of already well-abstracted code.
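To make that concrete, here’s a minimal sketch (not the site’s actual code) of what “caching layered on top of well-abstracted code” can look like in Python. The underlying function stays untouched and efficient; the cache is just a decorator wrapped around it. The `get_popular_posts` query and the 60-second TTL are hypothetical.

```python
import functools
import time

def timed_cache(ttl_seconds):
    """Cache a function's results for a limited time, layered on top of the
    original logic without changing it."""
    def decorator(func):
        cache = {}

        @functools.wraps(func)
        def wrapper(*args):
            now = time.time()
            if args in cache:
                value, stored_at = cache[args]
                if now - stored_at < ttl_seconds:
                    return value
            value = func(*args)          # the uncached path still runs the real, efficient code
            cache[args] = (value, now)
            return value
        return wrapper
    return decorator

@timed_cache(ttl_seconds=60)
def get_popular_posts(limit):
    # Hypothetical expensive operation; in a real app this would hit the database.
    return [f"post-{i}" for i in range(limit)]
```

The point of the sketch is that removing the decorator should still leave you with code you’d be happy to run uncached.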
Caching, and its purely software equivalent, memoization, can sometimes reduce an algorithm’s order of magnitude dramatically, so sometimes caching IS the efficient code, e.g., Fibonacci sequences, dynamic programming.
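A quick sketch of that point in Python: the naive recursive Fibonacci is exponential, while the memoized version of the same code is linear, because each subproblem is computed once and then served from the cache.

```python
import functools

def fib_naive(n):
    # Exponential time: the same subproblems are recomputed over and over.
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

@functools.lru_cache(maxsize=None)
def fib_memoized(n):
    # Linear time: each fib(k) is computed once, then read from the cache.
    if n < 2:
        return n
    return fib_memoized(n - 1) + fib_memoized(n - 2)

print(fib_memoized(200))  # returns instantly; fib_naive(200) would effectively never finish
```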
Your article is still on target, though; let me emphasize that. Nothing is ever a panacea that cures everything.
That’s exactly why I don’t call StaticGenerator a “caching” solution. It generates the output so the code doesn’t have to run the same thing over and over, ad nauseam.
Good post. :)
Thanks guys.
I should have been more clear: the specific scenario I had in mind was using cache as a means to avoid hitting a database. This may become clear in my next post. =)
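For what it’s worth, that database-avoidance scenario usually looks something like the cache-aside pattern below. This is only a sketch: the in-memory dict stands in for something like memcached, and `fetch_article_from_db` and the 300-second TTL are hypothetical.

```python
import time

CACHE_TTL = 300  # seconds; hypothetical freshness window
_cache = {}      # in-memory stand-in for memcached/Redis in this sketch

def fetch_article_from_db(article_id):
    # Hypothetical expensive query; the whole point is to call this rarely.
    return {"id": article_id, "title": f"Article {article_id}"}

def get_article(article_id):
    """Cache-aside: only fall through to the database on a cache miss."""
    key = f"article:{article_id}"
    entry = _cache.get(key)
    if entry is not None:
        value, stored_at = entry
        if time.time() - stored_at < CACHE_TTL:
            return value                 # served without touching the database
    value = fetch_article_from_db(article_id)
    _cache[key] = (value, time.time())
    return value
```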