This is a little old, but there is a good article on optimization in Queue this month.
I do want to raise a small issue with a couple of points here, though...
Software Layering
Many software developers become fond of using layering to provide various levels of abstraction in their software. While layering is useful to some extent, its incautious use significantly increases the stack data cache footprint, TLB (translation look-aside buffer) misses, and function call overhead. Furthermore, the data hiding often forces either the addition of too many arguments to function calls or the creation of new structures to hold sets of arguments. Once there are multiple users of a particular layer, modifications become more difficult and the performance trade-offs accumulate over time. A classic example of this problem is a portable application such as Mozilla using various window system toolkits; the various abstraction layers in both the application and the toolkits lead to rather spectacularly deep call stacks with even minor exercising of functionality. While this does produce a portable application, the performance implications are significant; this tension between abstraction and implementation efficiencies forces us to reevaluate our implementations periodically. In general, layers are for cakes, not for software.
vs...
Algorithmic Antipathy
For many software developers, algorithms are something they studied back during their college days, and thankfully not something with a lot of relevance to their day jobs. During Solaris 10 development, Solaris engineers fixed a long list of performance problems across the kernel and user libraries. Toward the end of the release, we spent some time reviewing just what had been improved and by how much—and what was the underlying cause of the performance problem. Interestingly enough, all the really big improvements (above, say, 200 percent) resulted from changes in algorithms. Over and over again, all the other performance fixes—using specialized SIMD processor instructions such as SSE2 or VIS, inserting memory prefetch instructions, cycle shaving—paled in significance compared with simply going back and rethinking the locking algorithms and/or data structures.
A key part of algorithm selection is having a realistic benchmark or workload in hand to support making decisions based on actual results rather than intuition or folklore. This means the most effective time to do performance and scalability work is in the earlier phases of the project, perhaps the exact opposite of what usually happens. All the clever compilation options are pretty useless when dealing with O(n²) algorithms for large values of n. Poor algorithms are the number 1 (and probably numbers 2 and 3 as well) cause of poor software system performance.
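To put a number on the quadratic case: checking every pair in a list of n = 10,000 items takes up to n(n-1)/2 = 49,995,000 comparisons, while a single pass with a hash set does about 10,000 lookups. Here is a minimal sketch of that difference in Python. It is my own illustration, not code from the article, and both function names are made up for the example.

```python
def has_duplicates_quadratic(items):
    """Compare every pair of items: up to n*(n-1)/2 comparisons, O(n^2)."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """Remember what has been seen in a set: one pass, O(n) on average."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

Both functions return the same answer for any list of hashable items; only the amount of work differs, and no compiler flag closes that gap once n gets large.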
I guess this goes to one of those "unexpressables" in software design. Layers are good as long as they are replaceable. Abstraction needs to be as close to total as possible in each layer of the software, so that after-the-fact algorithm changes don't require a complete restructuring of the application. Granted, the nature of the software here (SunOS) is vastly different from what "we" typically work on. However, if the abstraction is so flawed that you end up passing around too much seemingly unrelated data and replicating structures, then maybe the layer boundaries were drawn in the wrong places.
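A rough sketch of what I mean by a replaceable layer, in Python. Everything here (MembershipIndex, ListIndex, HashIndex, count_known) is a hypothetical name invented for the illustration, not anything from the article or from Solaris. The caller only knows the narrow interface, so the slow implementation can be swapped for a faster one without touching the caller.

```python
from typing import Iterable, Protocol


class MembershipIndex(Protocol):
    """Hypothetical layer boundary: callers see only these two methods."""

    def add(self, key: str) -> None: ...

    def contains(self, key: str) -> bool: ...


class ListIndex:
    """First cut: a plain list, so every lookup is a linear scan."""

    def __init__(self):
        self._keys = []

    def add(self, key):
        if key not in self._keys:
            self._keys.append(key)

    def contains(self, key):
        return key in self._keys


class HashIndex:
    """Later replacement: a set, so lookups are O(1) on average."""

    def __init__(self):
        self._keys = set()

    def add(self, key):
        self._keys.add(key)

    def contains(self, key):
        return key in self._keys


def count_known(index: MembershipIndex, queries: Iterable[str]) -> int:
    """Code in the layer above: depends on the interface, not the algorithm."""
    return sum(1 for query in queries if index.contains(query))
```

Swapping ListIndex for HashIndex changes the algorithm behind the boundary while count_known stays exactly as it was, which is the kind of after-the-fact change I'd want a layer to allow.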