Elegant distributed applications

2009-02-02

"Elegance is the attribute of being unusually effective and simple"

Heard this first applied to theories in Physics. Any not-over-complicated theory that effectively explained the facts was - as I understood - considered an "elegant" theory. (My original interest - and college degree - was in Physics. I wanted to build starships.)

Came up, long ago, with rules for "elegant" distributed applications. The first significant distributed application I spent time with was the early FileNet system. This was well over twenty years ago. A strong interest in scaling to large (by the standards of the time) deployments forced careful thought about performance. Pretty much everything I have worked on before and since has had a network in the middle, so I've had time and reason to think about the subject. Changing technology does not change the inherent nature of distributed systems, so the relevant set of notions will pretty much always apply.

(What continues to surprise me is that individuals and outfits still get these same bits wrong!)

In the interest of hitting the usual points once....

When building distributed applications, there are some basic principles you should always keep in mind.

Minimize the amount of data crossing the network. : The capacity of the network is always limited. Capacity is large in the usual development setup, when both server and client are on the same segment. In real use the available capacity is almost always less, and sometimes much less.

Minimize the number of round-trips across the network. : Networks always have latency. This has nothing to do with the speed of a network (in terms of bits/second). I have written about this before. For an interactive application the ideal is one round-trip per user action.

Shift computation from the server to the clients, where practical : There are almost always more clients than servers. If you can shift computation to the clients, you will get better overall throughput.
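
The first two principles can be put into rough numbers. The sketch below is a simple cost model, not a measurement: the function `transferTimeMs` and every figure in it (80 ms round-trip latency, a 1 MB/s link, the payload sizes) are assumptions chosen only for illustration.

```javascript
// Rough cost model (an assumption, not a measurement): every round trip pays
// the full latency, and every byte pays for its share of the bandwidth.
function transferTimeMs(roundTrips, latencyMs, totalBytes, bytesPerSecond) {
  const latencyCost = roundTrips * latencyMs;
  const transmitCost = (totalBytes * 1000) / bytesPerSecond;
  return latencyCost + transmitCost;
}

// Assumed link characteristics, for illustration only.
const LATENCY_MS = 80;
const BYTES_PER_SECOND = 1000000;

// Same 200 KB of data, fetched in 20 small requests or in one request.
const chatty = transferTimeMs(20, LATENCY_MS, 200000, BYTES_PER_SECOND);
const batched = transferTimeMs(1, LATENCY_MS, 200000, BYTES_PER_SECOND);

// One request, with the payload cut down to the 20 KB actually needed.
const trimmed = transferTimeMs(1, LATENCY_MS, 20000, BYTES_PER_SECOND);

console.log(chatty, batched, trimmed); // 1800 280 100
```

In this model a faster link shrinks only the transmit term. The latency term falls only when round trips are removed.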

From the above principles, for building web applications you can derive further guidelines.

Code complexity belongs on the server. Code in the client should be simple. : Code to be executed on the client must be shipped across the network. The less you have to ship across the network, the better. If you have an application that calls for code-complexity on the client, you want to ship the code once, and should be looking at solutions like Java WebStart. Otherwise you want to ship as little code as possible across the network. This fits perfectly with the use of compact scripts in the client using Javascript.

The fact that Javascript is interpreted on each and every load (and not compiled) serves only to reinforce this guideline.

Large iterations belong mainly on the server, not the client. : Compiled code is usually much more efficient than interpreted code. The server can (or should) use compiled code. In the case of web applications, the client code is interpreted code.

Large data belongs on the server, not the client. : The (sometimes) narrow network channel and the relative efficiency of compiled code both argue for keeping large data on the server.

Use the strengths of the web browser. : Native code is faster than interpreted code. The web browser is smart, and incorporates a large set of behaviors in native code. Use of built-in behaviors can mean smaller Javascript and faster execution.
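
As a minimal sketch of "large iterations and large data belong on the server": the server walks the whole table and ships only a summary plus one small page. The `orders` table, the `queryOrders` function, and the field names are all hypothetical, and the example is written in Javascript only to keep one language throughout; the guideline itself would favor compiled code for this loop.

```javascript
// Stand-in for a large server-side table (hypothetical data).
const orders = [];
for (let i = 0; i < 100000; i++) {
  orders.push({ id: i, region: i % 4 === 0 ? "west" : "east", total: i % 100 });
}

// Server side: do the large iteration here, and return only a summary
// and the first few matching rows, not the whole table.
function queryOrders(region, pageSize) {
  let count = 0;
  let sum = 0;
  const page = [];
  for (const order of orders) {
    if (order.region !== region) continue;
    count += 1;
    sum += order.total;
    if (page.length < pageSize) {
      page.push({ id: order.id, total: order.total });
    }
  }
  return { count, sum, page };
}

// What actually crosses the network is this short string.
const reply = queryOrders("west", 3);
const wire = JSON.stringify(reply);
console.log(wire.length, "characters sent for", reply.count, "matching rows");
```

The client receives three rows and two totals, and needs only a few lines of script to display them.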

In the present, Javascript offers an elegant solution to the need for a scripting language in both client and server.

Made the mistake(?) of responding to a recent post. Seems that each time something like this comes up, we have an almost fixed set of notions coming back. After a few iterations - just not very interesting.

Some quotes, with names omitted to protect the guilty.

"Arrays have no semantics. They are not first-class collections. Do not use them in any public API, regardless of the language you use. Wrap arrays with a public type that exposes semantics."

The semantics of arrays as collections and iterations are simple and perfectly suited for small scripts. For the most part, you do not need anything more. The domain for scripting is small, concise, and hideously flexible code. More elaborate solutions might make sense in large server-side code. Client-side script or structures shipped between client and server should only be as elaborate as is needed - and no more.
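
A minimal sketch of the point, assuming a hypothetical JSON payload (the `sku` and `qty` fields are invented for illustration): a plain array arrives from the server and the built-in array methods are all the "semantics" a small script needs.

```javascript
// Hypothetical payload, as it might arrive from a server.
const payload = '[{"sku":"A1","qty":2},{"sku":"B7","qty":0},{"sku":"C3","qty":5}]';

// A plain array of plain objects: no wrapper type, nothing extra to ship.
const items = JSON.parse(payload);

// Built-in array methods cover the usual small-script needs.
const inStock = items.filter((item) => item.qty > 0).map((item) => item.sku);
const totalQty = items.reduce((sum, item) => sum + item.qty, 0);

console.log(inStock, totalQty); // [ 'A1', 'C3' ] 7
```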

"You’re right in the sense that JS is “good enough” for most of basic usages, but almost useless for writing bigger software. It’s the reason why there’s been recently a lot of higher level languages that generates JS code. Either Java (GWT) or haXe (http://haxe.org)"

You should not be writing large code in Javascript. You must use Javascript in the client (the web browser), but in-browser script should not be large. You should consider Javascript as "glue" code on the server-side - small code, few iterations, with huge flexibility - to allow special-case customization without re-coding. Javascript (as with ELisp in Emacs and AutoLisp in AutoCAD) is a scripting language. You do not want to write the bulk of a large application in Javascript. Large code is a non-goal for a scripting language.

"I still dislike JavaScript, and likely always will. It has some pretty fundamental flaws."

Javascript evolved almost as a hack (if not quite). Early versions were less capable. Early examples were uninspired (or worse). The present iteration preserves past mistakes. Ignore the mistakes, and use the good parts. The good parts are ... very good, as suits a scripting language.

"JavaScript has no place on the server."

Quite the opposite - as the scripting language known to the largest group, and fated to be well-known over a long period - Javascript can serve exceptionally in the role that scripting languages have long filled in large, successful applications. Not for large use, but with a valued place. When adapting a large application to the needs of a specific customer or site, there are always cases where a simple list of options is not enough (and rarely-used options serve only to make the application obscure to all customers). There is always a part of the problem space best met by a scripting language deeply integrated with your application.
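
One way this can look, as a sketch only: the host application exposes a few hook points, and a short site-specific script fills one in. The `createHost` function, the `adjustPrice` hook, and the pricing rule are all hypothetical; a real application might well be compiled code embedding a Javascript interpreter, which this plain-Javascript sketch only imitates.

```javascript
// Host application: exposes a small, fixed set of hook points.
// (All names here are hypothetical.)
function createHost() {
  const hooks = {};
  return {
    // Called by a site-specific script to register a customization.
    on(name, fn) {
      hooks[name] = fn;
    },
    // Core logic, with a default behavior that a site script may adjust.
    priceOrder(order) {
      const base = order.qty * order.unitPrice;
      const adjust = hooks.adjustPrice;
      return adjust ? adjust(base, order) : base;
    },
  };
}

const host = createHost();
const order = { qty: 10, unitPrice: 4, customer: "acme" };

// Without any customization the default rule applies.
const standard = host.priceOrder(order);

// Site-specific script: a few lines of glue, no change to the core code.
host.on("adjustPrice", (base, o) => (o.qty >= 10 ? base - 5 : base));
const customized = host.priceOrder(order);

console.log(standard, customized); // 40 35
```

The special case lives in the site's script rather than in yet another rarely-used option in the core application.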

Historically we have always had a zoo of suitable scripting languages, with insufficient reason to choose between them. In an odd way, the rise of Javascript in the web browser does us a favor, as Javascript is sufficient, and is now the most widely known of all scripting languages. There are times when a single logical solution is best for all involved.

Javascript is good enough.