Unsolved Developer Mysteries
By stern on May 28, 2009
What are the top developer problems we haven't run into yet? I gave an answer in three parts.
1. Unstructured data management and non-POSIX semantics. Increasingly, data reliability is taking the shape of replication handled by a data management layer, using RESTful syntax to store, update, and delete items with explicit redundancy control. If you're thinking of moving an application into a storage cloud, you're going to run into this. Applications thriving on POSIX syntax are wonderful when you have a highly reliable POSIX environment to run them against. And no, don't quote me as saying POSIX filesystem clusters are dead - the Sun Storage 7310C is an existence proof to the contrary. The filesystems we loved as kids are going to be around as adults, probably with the longevity of the mainframe and COBOL: they'll either engineer or survive the heat death of the universe. There is an increasing trend, however, toward WebDAV, MogileFS, SimpleDB, HDFS and other data management systems that mediate between the block level and the application. New platforms, not at the expense of old ones.
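The contrast can be sketched in a few lines: instead of POSIX open/read/write, you get store/fetch/delete verbs with redundancy as an explicit parameter. The `ObjectStore` class and its `replicas` argument below are invented for illustration - an in-memory stand-in for what a networked data management layer would do, not any particular product's API.

```python
class ObjectStore:
    """In-memory stand-in for a replicated, REST-style object store."""

    def __init__(self, node_count=3):
        # Each "node" is just a dict; a real store would be networked.
        self.nodes = [{} for _ in range(node_count)]

    def put(self, key, value, replicas=2):
        """Store value under key on `replicas` nodes (explicit redundancy)."""
        if replicas > len(self.nodes):
            raise ValueError("not enough nodes for requested redundancy")
        for node in self.nodes[:replicas]:
            node[key] = value

    def get(self, key):
        """Return the first replica found; a real store would pick by node health."""
        for node in self.nodes:
            if key in node:
                return node[key]
        raise KeyError(key)

    def delete(self, key):
        """Remove every replica of key."""
        for node in self.nodes:
            node.pop(key, None)

store = ObjectStore()
store.put("reports/q1", b"...data...", replicas=3)
```

Note that redundancy is a property of the request, not of the underlying volume - the application, not the RAID controller, decides how durable each item needs to be.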
2. Software reliability trumps hardware replacement. An application analog to the first point. Historically, we've used high availability clusters, RAID disk configurations and redundant networks to remove single points of failure, and relied on an active/active or active/passive application cluster to fail users over from one node to a healthier one. But what if the applications are highly distributed, recognize failure, and simply restart a task or request as needed, routing around failure? IP networks work (quite well) in that sense. It requires writing applications that package up their state, so that the recovery phase doesn't involve recreating, copying or otherwise reconstituting state information on the new application target system. There's a reason REST is popular - the ST stands for "state transfer". And yes, this worked really well for NFS for a long time. Can I get an "idempotent" from the crowd?
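The pattern above can be shown in a few lines: a request that carries all of its own state can be replayed on any node, so recovery is just "try the next one". The node functions and the failure simulation here are made up for illustration; a real client would be talking to servers over the network.

```python
def flaky_node(state):
    # Simulates a failed server.
    raise ConnectionError("node down")

def healthy_node(state):
    # An idempotent PUT: applying it twice yields the same result,
    # so a blind retry after a timeout is always safe.
    return {"stored": state["key"], "value": state["value"]}

def submit(request_state, nodes):
    """Try each node in turn. Because the request packages up its full
    state, retrying on a new node needs no recovery phase - no state to
    recreate or copy onto the new target."""
    last_error = None
    for node in nodes:
        try:
            return node(request_state)
        except ConnectionError as err:
            last_error = err  # route around the failure and retry
    raise last_error

result = submit({"key": "k1", "value": 42}, [flaky_node, healthy_node])
```

This is exactly why idempotency matters: if the first node had actually processed the request before dying, replaying it on the second node does no harm.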
3. Parallelism. If you weren't bound by a single thread, what would you waste, pre-compute, or do in parallel? This isn't about parallelizing loops or using multi-threaded libraries; it's about analyzing large-scale compute tasks to determine what tasks could be partitioned and done in parallel. I call this "lemma computing" -- in pure mathematics, a lemma is a partial result that you assume to be true; someone spent a lot of time figuring out the lemma so that you can leverage the intermediate proof point. When you have a surfeit of threads in a single processor, you need to consider what sidebar computation can be done with those threads that will speed up the eventual result bound by single-thread performance. This isn't the way we "think" computer science; we either think single threaded or multiple copies of the same single thread.
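A minimal sketch of the idea, using Python's standard thread pool: spare threads compute the "lemmas" (intermediate results) ahead of time, while the single-threaded critical path does its own work and then picks up each lemma as it needs it. The workload here is invented; only the pattern matters.

```python
from concurrent.futures import ThreadPoolExecutor

def lemma(n):
    # An "expensive" partial result we can compute ahead of time.
    return sum(i * i for i in range(n))

def main_computation(inputs):
    # Kick off the lemmas on the surfeit of spare threads...
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {n: pool.submit(lemma, n) for n in inputs}
        # ...while the single-threaded path proceeds sequentially.
        total = 0
        for n in inputs:
            # By the time we need each lemma, it is (likely) already done;
            # if not, .result() blocks until it is.
            total += futures[n].result()
    return total

print(main_computation([10, 100, 1000]))  # → 333162135
```

The key shift is that the speculative work is allowed to be wasted: if the critical path never asks for a lemma, the spare threads were idle anyway.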