For much of the last 25 years, computer scientists have looked for ways to make use of the unused resources of workstation machines. During this time the capacity of these machines has greatly increased, as has the number that sit available and under-utilized. But while the focus of this work gradually shifted from local area networks to web-scale systems, the last few years have seen a shift back to the small scale. Here’s why.
The Workstation Revolution
In the late eighties, a group of researchers from the University of Wisconsin were trying to parallelize computation over workstation machines.
They observed large collections of machines, connected but under-utilized, and many users whose demand for computation far exceeded their current capacity. If the unused capacity could be harnessed, it would provide a greater degree of parallelism than was previously available, using only existing resources.
Their approach worked, and the resulting system, Condor, is still used today. Indeed, the two fundamental problems of distributed desktop computing are the same now as they were 25 years ago:
- The desktop is run for the benefit of its interactive user, so background processes shouldn’t affect that user’s activities.
- The desktop is unreliable, and can become unavailable at any time.
Condor was designed so that computations could be paused, minimizing their impact on user activity during busy periods, and restarted from a previous checkpoint, ensuring that the failure of a machine didn’t critically impact results.
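To make the idea concrete, here’s a minimal sketch of checkpoint-and-resume in Python. It’s illustrative only: Condor checkpoints whole processes transparently, whereas this toy computation saves its own state explicitly, and the file name and functions here are my own invention.

```python
# A minimal sketch of checkpoint/restart in the spirit of Condor's design.
# Illustrative only: the checkpoint file and functions are hypothetical.
import os
import pickle

CHECKPOINT = "sum_job.ckpt"  # hypothetical checkpoint file

def load_checkpoint():
    # Resume from the last saved state if one exists, otherwise start fresh.
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT, "rb") as f:
            return pickle.load(f)
    return {"i": 0, "total": 0}

def save_checkpoint(state):
    with open(CHECKPOINT, "wb") as f:
        pickle.dump(state, f)

def run(n=10_000_000, checkpoint_every=1_000_000):
    state = load_checkpoint()
    while state["i"] < n:
        state["total"] += state["i"]
        state["i"] += 1
        if state["i"] % checkpoint_every == 0:
            save_checkpoint(state)  # a safe point to pause, kill, or migrate the job
    return state["total"]

if __name__ == "__main__":
    print(run())
```

If the machine is reclaimed by its user mid-run, the process can simply be killed; the next invocation picks up from the most recent checkpoint rather than starting over.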
It created a pool of computational capacity that users could tap into. This was perfect for local-area networks: long-running computations could be farmed out to remote machines and paused whenever those machines were in use. Since the machines typically resided within a single organization, administrators could ensure that they rarely became unavailable for reasons other than user activity.
The Internet Generation
The nineties saw an explosion in the number of workstations that were available, connected, and unused. With the advent of the World Wide Web, the problem of resource utilization had gone global.
The characteristics of this network were vastly different to the LANs used by Condor. Most importantly, with machines outwith the control of any one organization, they were less reliable than ever. Worse still, results could not be trusted. For example, a user with a home workstation may have had a lot of unused capacity, but unlike the lab machines typically used by Condor, there was no guarantee that the machine would be available for any length of time. Even when it was available, the machine sat at a remote site, so we now had to consider that results could be faked or corrupted.
More positively, there were now so many of these machines available that any system able to use a tiny fraction of their resources would be able to harness great computational power. In fact, there were so many that systems such as Distributed.net and Seti@Home could take advantage of redundant computation.
Both systems broke computations down into relatively small problems, and sent duplicates of these problems to many machines. The problems were small enough that even a short period of activity would allow a remote machine to complete one, and the number of duplicate computations was great enough to ensure that (a) some of the machines would eventually finish the computation, and (b) enough of them would finish that their results could be compared, allowing corrupted results to be detected and ignored.
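Here’s a toy sketch of that redundant-computation idea in Python. The “workers” are plain functions standing in for volunteer machines, and the names and agreement threshold are my own invention, not anything taken from Distributed.net or Seti@Home, which ship work units over the network and validate returned results server-side.

```python
# A toy illustration of redundant computation with result comparison.
# Workers are local functions standing in for remote volunteer machines.
from collections import Counter
import random

def honest_worker(work_unit):
    return sum(work_unit)  # the correct computation

def flaky_worker(work_unit):
    # Occasionally returns a corrupted result.
    return sum(work_unit) + random.choice([0, 0, 0, 7])

def run_redundantly(work_unit, workers, min_agreement=3):
    results = Counter(w(work_unit) for w in workers)
    answer, votes = results.most_common(1)[0]
    if votes >= min_agreement:
        return answer  # enough independent copies agree
    raise RuntimeError("no consensus; re-issue the work unit")

work_unit = list(range(100))
workers = [honest_worker, honest_worker, flaky_worker, honest_worker, flaky_worker]
print(run_redundantly(work_unit, workers))  # corrupted answers are outvoted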
Forging beyond computation, P2P systems such as Napster and BitTorrent allowed people to use the storage capacity of their workstations to share and distribute data. As Seti@Home did for computation, the key to these systems was that redundant copies of data were stored across many machines, meaning the system’s operation did not depend on the continued availability of any small set of workstations.
Until recently, this was as far as these systems went. But the ubiquity of multi-machine home networks and an ever-increasing demand for storage have created a new focus: storage on local area networks.
Napster and BitTorrent work well on a large scale, where the sheer number of users makes it unlikely that popular items will become inaccessible, but poorly on a small scale, where they provide no guarantee that all data will be backed up.
Workstations and LANs now have the capacity to be used for storage without affecting the activities of a machine’s interactive user (problem 1). New storage systems can ensure that there are always enough copies of data to prevent data loss (problem 2).
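As a rough sketch of the kind of replication logic such systems rely on, the Python below keeps at least three reachable copies of every block and re-replicates when a machine drops out. The names and structure are hypothetical, not drawn from any particular product.

```python
# A rough sketch of replica maintenance on a small LAN.
# All names and data structures here are hypothetical.
REPLICAS = 3  # assumed target copy count

def rebalance(blocks, machines):
    """blocks: {block_id: set of machine names holding a copy}
       machines: set of machine names currently online"""
    for block_id, holders in blocks.items():
        live = holders & machines               # copies we can still reach
        needed = REPLICAS - len(live)
        candidates = list(machines - live)      # online machines without a copy
        for target in candidates[:max(needed, 0)]:
            live.add(target)                    # stand-in for actually copying the data
        blocks[block_id] = live
    return blocks

blocks = {"blk-1": {"alice-pc", "bob-pc"}, "blk-2": {"carol-pc"}}
online = {"alice-pc", "carol-pc", "dave-pc", "erin-pc"}
print(rebalance(blocks, online))
```

The point is simply that as long as enough workstations are online at any one time, no single machine going to sleep or leaving the office takes the data with it.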
Companies such as AetherWorks (disclaimer: this is who I work for), AeroFS, and BitTorrent Sync are working to make use of this under-utilized storage space.
Why do we do this? Many small companies have storage needs but no desire for a server (and regulatory compliance issues with the cloud), while many others have one or two servers but no redundancy. The ability to provide backup without additional capital expenditure makes that extra backup or server a much more realistic prospect.
Resource utilization has gone local again. This time the focus is on storage.
This is a repost of a blog I wrote over on the AetherWorks Blog earlier this year.