Comments on "Simple Architectures for Complex Enterprises: Why Big IT Systems Fail" by Roger Sessions

Roger Sessions (2015-10-26):

XTRAN guru,
I assume that by decoupling, you mean moving from a call-based system to a message-based system. Is that right? If so, I would disagree that this reduces complexity. A message between systems counts as a dependency, which increases the dependency-related complexity. The problem is that when we break a large, tightly coupled system into a number of small, loosely coupled systems, we go from a system with high functional complexity (lots of functionality) to a system with high dependency complexity (lots of dependencies). This is why so many SOA projects fail.

I don't think you are right that most small projects fail. At least, that is not what the research I have seen shows. I do agree that once small projects grow into large projects, they fail, but that is because they are now large projects, not because they were once small projects.

As for decoupling being an approach that reduces the exponential growth: if that were true, we would expect to see all SOAs delivered successfully, but that doesn't happen. They seem to follow the same failure rates as any other architecture. Now of course *most* SOA projects succeed, but that is because most are small.

I completely agree with you on the need for good coding practice. But I don't think that is the answer to the problem I am discussing. Large systems fail not because of bad code, but because of bad architecture. Good architecture can mitigate the problems of bad code; good code can't mitigate the problems of bad architecture.

As for your 500K LOC system being stable: that is great, and certainly to your credit as an excellent coder and capable architect. I'm not sure it qualifies as a big system, though. I usually consider a big system a $10M+ system; a 500K LOC system is probably more like a $5M system.
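The tradeoff described above, functional complexity traded for dependency complexity, can be made concrete with a back-of-envelope count. This is my own sketch, not either commenter's model; the module counts and the partitioning are invented for illustration.

```python
# Toy illustration of the monolith-vs-services dependency tradeoff.
# Worst-case pairwise coupling inside one big system, versus internal
# coupling plus inter-service message links after partitioning.
# (Every message link counts as a dependency, per the argument above.)

def pairwise(n: int) -> int:
    """Maximum pairwise dependencies among n tightly coupled modules."""
    return n * (n - 1) // 2

monolith_modules = 24
monolith_deps = pairwise(monolith_modules)        # 276 potential dependencies

services = 6                                      # hypothetical: 6 services of 4 modules
per_service = monolith_modules // services
internal_deps = services * pairwise(per_service)  # 6 * 6 = 36
messages = pairwise(services)                     # worst case: full service mesh = 15
total = internal_deps + messages                  # 36 + 15 = 51

print(monolith_deps, internal_deps, messages, total)  # 276 36 15 51
```

The worst-case total drops, but only while the message topology stays sparse; a dense service mesh adds the dependency complexity right back, which is consistent with the claim that message links are dependencies too.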
So it may be that we agree, but we are just looking at different scales.

I agree with you that the exponential rise in complexity is due to "bad architecture." I would phrase this as architecture that has not incorporated a complexity management strategy. I think the problem is that we don't differentiate between small systems and large systems. We try to take practices that work on small systems and apply them to big systems, and that is when we find the limitations of those practices.

-- Roger Sessions

XTRAN guru (2015-10-25):

Roger -- actually, we have known for decades how to prevent exponential increase in the complexity of large IT systems. It's called by various names, but the essence is modularity and decoupling.

"Small IT systems usually deliver successfully" -- I wish that were true, but it isn't. The state of our "discipline" is so sad that even small projects are mostly badly done. And many do fail over time, although it may take longer than for larger projects. Of course, small projects turn into large (and messy) projects through accretion, and since decoupling isn't usually practiced in that process, they fall victim to the same ills as projects that started out larger.

"Large IT systems...are hard to change" -- that's because they are usually tightly coupled and rife with cloned code, sloppy coding practices, and badly designed and undocumented interfaces (as with small projects).

"The complexity of an IT system increases at a rate we describe as exponential", "the exponential increase is driven by the number of dependencies in the system" -- that can only happen due to bad design and implementation practices.
I recommend what I call the "cocktail shaker" approach -- a combination of concurrent top-down functional decomposition and bottom-up primitives identification, meeting in the middle, with the functional decomposition implemented using the primitives. Some primitives will be domain-specific, while others are aspect-specific (cross-cutting). Some may have existing implementations, in the form of in-house or commercially available/freeware run-time libraries and/or classes/methods, while others will need to be implemented (and added to the shop's stock).

The result of the modularity and decoupling that approach provides is that complexity (including dependencies) increases much less than exponentially. In fact, what sometimes happens is that, as the problem being solved becomes better understood, the solution's complexity actually decreases, as it undergoes what I call the "collapse into elegance" -- the underlying structure of the problem becomes more apparent, and a more elegant solution can be achieved through refactoring.

An added advantage of good coding practice, and of the decoupling it provides, is that small teams can be given pieces of the work to do, secure in the knowledge that what they do won't break anyone else's code. So the complexity of managing the development process also increases less than exponentially, and may actually decrease.

"We can't prevent the IT system from getting more complex (that is impossible)" -- in terms of overall inherent complexity, yes; in terms of difficulty of maintenance and enhancement, no. Good architecture and the resulting decoupling will prevent that.

I agree that the problem you have described exists. But the solution has been known for many years, and always works when practiced properly.
So the problem is caused not by any inherent property of IT systems, but by an appallingly low standard of craftsmanship and professionalism in the software development industry.

An example -- I created (originally in 1984), and currently babysit, a system comprising about half a million net LOC, in thousands of modules, with thousands of included files. I spend about 1% of my development time on maintenance, compared to 60%-80% in the industry generally. Why? Because (a) I maintain extremely high standards of code and documentation quality, (b) I never "cut corners", and (c) I refactor as soon as it's needed. And the system, while experiencing major growth in functionality, actually gets stronger and more robust over time. So of course its overall inherent complexity increases, but from a software development point of view that complexity is bounded by decoupling, so it can be treated piecewise, with essentially no increase in difficulty of maintenance and enhancement.

-- XTRAN guru
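The two positions in this thread differ mainly on whether dependency growth must become exponential with scale. A toy model, entirely my own invention (the cluster size and the linear inter-cluster topology are assumptions for illustration, not either commenter's numbers), contrasts worst-case unconstrained coupling with coupling confined to fixed-size decoupled clusters:

```python
# Contrast two dependency-growth regimes as a system adds modules:
# unconstrained coupling (anything may depend on anything) versus
# coupling confined within fixed-size clusters joined by a thin
# interface layer (assumed here to be a simple chain of clusters).
import math

def unconstrained(n_modules: int) -> int:
    # Worst case: every module may depend on every other module.
    return n_modules * (n_modules - 1) // 2

def decoupled(n_modules: int, module_cap: int = 8) -> int:
    # Dependencies confined to clusters of at most `module_cap` modules,
    # plus one inter-cluster link per adjacent pair: roughly linear growth.
    clusters = math.ceil(n_modules / module_cap)
    intra = clusters * (module_cap * (module_cap - 1) // 2)
    inter = max(clusters - 1, 0)
    return intra + inter

for n in (10, 100, 1000):
    print(n, unconstrained(n), decoupled(n))
# 10   45      57
# 100  4950    376
# 1000 499500  3624
```

Note that at small scale the modular version actually carries *more* bookkeeping (57 vs. 45), and only pays off as the system grows, which fits the observation above that the disagreement may come down to looking at different scales.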