20090529

Why Windows Suffers From Bloat

Yesterday, I was trying to install a popular Pocket PC weather application on my 'smart' mobile phone, which runs Windows CE 5.0. Linux almost runs on it without problems, so hopefully I won't be using WinCE much longer, but rather Android. In any event, the application required something in the range of 5 to 10 megabytes of program storage space, which was not available on my device because most of the space was already filled with other proprietary software. Never mind the fact that my device has 128 MB for program storage and 64 MB of RAM, which is more than enough in my humble opinion; 5 to 10 megabytes for a simple program to read the weather? I would say that is just slightly excessive.

Let me explain a little bit about resource utilization in computer systems. I feel that I can generalize to 'computers' here because a mobile device is basically a small computer anyway, with relatively less storage and working memory. Being a long-time member of the Linux ecosystem, I feel that I have become accustomed to (spoiled by?) the always-comfortable feeling that my permanent storage as well as working memory are 'very large' when compared with the amount I actually need to run any application. I never have to worry about running out of hard-disk space when I install a new program, nor do I have to worry about running out of RAM or experiencing large-scale system slow-down if I have many programs running simultaneously. Both of these problems plague most of the Windows users I know.

Why does that happen? The answer is technically a bit involved, but it can be summed up with a very simple observation: Open Source Software (OSS) shares code and closed source software (CSS) doesn't.

OSS developers are free to use, modify, and redistribute source (and binary) code. One of the benefits of this philosophy is that several applications (however unrelated they seem) can share the same code for common tasks. For example, a media player would need code (some algorithm) to sort and list all of your favourite tracks in alphabetical order. Similarly, a spreadsheet application would need similar code to sort a list of names in alphabetical order. In the OSS world, both of these applications have the potential to use the same code to sort a list alphabetically (as a general example). The developer of the media player, the developer of the spreadsheet, as well as the developer of the alphabetical sort code are all able to help each other and improve the alphabetical sort algorithm. They exist in and contribute to a common ecosystem where everyone benefits.
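To make the media player / spreadsheet example a bit more concrete, here is a minimal sketch in Python. All of the names (sort_alphabetically, list_tracks, sort_name_column) are made up for illustration, and the case-insensitive ordering is just one reasonable assumption about what 'alphabetical' means. The point is only that both 'applications' call the same piece of sorting code instead of each carrying its own.

```python
# Hypothetical shared helper: one copy of the sorting code, used by everyone.
def sort_alphabetically(names):
    """Return a new list sorted alphabetically, ignoring letter case."""
    return sorted(names, key=str.casefold)


# Hypothetical "media player" that reuses the shared helper for its track list.
def list_tracks(tracks):
    return sort_alphabetically(tracks)


# Hypothetical "spreadsheet" that reuses the same helper for a column of names.
def sort_name_column(cells):
    return sort_alphabetically(cells)


if __name__ == "__main__":
    print(list_tracks(["yesterday", "Angie", "Blackbird"]))
    print(sort_name_column(["Zoe", "adam", "Mia"]))
```

If someone fixes a bug in the shared helper, both callers pick up the fix without changing a line of their own code.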

On the other hand, in the Windows world, similar programs developed by different companies are in a state of economic competition. For example, two different tax programs compete for customers, and (usually) the 'better' product wins. However, the problem exists even between companies that develop completely different applications, for the simple reason that many programs require the same generic algorithms for sorting lists, etc. Therefore, the closed-source software (CSS) ecosystem breeds an inherent distrust between its members, for fear that a competitor might 'steal' the algorithm and thus the potential revenue which that algorithm could generate.

Ok, fine, but how does this relate to computer memory and storage space?

In the Windows world, every program (each written by a different company) would naturally have a secret place where it stores its code for alphabetical sorting. When the media player program is installed on your Windows computer, there is a special file, or library, that stores the sorting algorithm. For every program that uses similar code, the storage space is duplicated, and we're only considering an algorithm to sort names alphabetically! When one considers the many thousands of algorithms that are stored, the storage utilization starts to look very inefficient. Even worse - it's not just the storage (hard disk) space that's affected, but also the working memory (RAM) of the computer!!

In the Open Source Software world, this code resides in one place for the whole world to use and modify. Similarly, the code only needs to be installed in one place on a Linux computer, in a single file that serves any program requiring an alphabetical sorting algorithm. Essentially the same thing happens while the programs are running in memory; regardless of the number of programs that reference the code, it only exists once in RAM (each program's context is saved elsewhere). The same algorithm (code) requires a fraction of the working memory on a Linux computer compared with a Windows computer, for the same number of programs. Sharing is good!!
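The difference between the two scenarios can be put into rough numbers. The following is a back-of-the-envelope model, not a measurement: the function names and the figures (10 programs, 50 KB of sorting code) are invented for illustration, and it deliberately ignores per-program data, loader overhead, and everything else a real system adds on top.

```python
def private_copies_cost(num_programs, code_size_kb):
    """Every program ships (and loads) its own copy of the code."""
    return num_programs * code_size_kb


def shared_copy_cost(num_programs, code_size_kb):
    """One copy serves every program that needs the code."""
    return code_size_kb if num_programs > 0 else 0


def savings(num_programs, code_size_kb):
    """Space saved by sharing instead of duplicating."""
    return (private_copies_cost(num_programs, code_size_kb)
            - shared_copy_cost(num_programs, code_size_kb))


if __name__ == "__main__":
    # Assumed figures: 10 programs, each needing 50 KB of sorting code.
    print(private_copies_cost(10, 50), "KB with private copies")
    print(shared_copy_cost(10, 50), "KB with one shared copy")
    print(savings(10, 50), "KB saved")
```

In this simplified model the duplicated cost grows with the number of programs while the shared cost stays flat, so the savings grow linearly as more programs use the same code.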

The benefit of using dynamically-linked libraries (shared objects) vs. statically-linked libraries is old news for most of the world, including Microsoft. Developers benefit from code reuse, common bug / fix propagation, and of course reduced memory usage, among many other things. Ironically, Windows has supported DLLs for a very long time. However, in spite of the many benefits, most 3rd party commercial application developers will likely continue to use their own stacks instead of a communal one, so that their 'intellectual property' is not sacrificed. Microsoft has partially rectified this problem with the introduction of .NET, C#, and managed code, but there are still plenty of legacy applications out there using VC++, MFC, and the Win32 API that will never be migrated.

I would assume that resource utilization efficiency is at least linearly proportional (in some useful range) to the amount of code sharing. Dynamic sharing is much more predominant in an Open Source Software environment. Therefore, Open Source Software environments exhibit dramatically more efficient resource utilization.