Preston L. Bannister { random memes }

2009.01.30

Scripting inspired by Monad – for Unix

Filed under: Software — Preston @ 5:47 pm

Scripting on Windows has always been pretty lame compared to Unix. The usual command.com or cmd.exe shells on Windows are pretty pathetic compared to the Bourne Shell (which was released back in 1977!). Lacking a good shell, and with a population of GUI developers less familiar with the command line, Microsoft never really got a clue. The Cygnus ports (and the like) of GNU command line utilities and bash likely lessened the demand somewhat.

A few years back a group at Microsoft came up with Monad … which was a pretty cool idea, if a little too fat. Took years before they eventually shipped (renamed as PowerShell).

Unix has tended to be about simple ideas with a lot of mileage. While the notion of building something like Monad on Unix is pretty interesting (I’d use Javascript via Rhino on the JVM), what I really wanted was a simpler notion that better “fit” the Unix-tools mindset.

One of the really cool bits about Monad was the ability to stream structured data – typically XML. Unix tools typically work on character streams – and only that. The character streams could quite naturally include structured data … but there seemed to be something missing.

Turns out there is a simple/elegant solution that could be retrofitted to existing shells and tools, and fits very well within the usual Unix-ish way of doing things. The notion is so simple, it is a bit funny. This makes a round-trip of sorts….

The first generation of web servers were almost entirely running on Unix. The first common model for dynamic content was CGI, which was simply (and logically) a slightly warmed-over version of Unix shell scripting. Running a shell script for each incoming web request was not especially efficient, so we got a whole zoo of alternatives (ASP, JSP, mod_perl, etc.) but the programming model is still based on CGI – which in turn is derived from Unix shell scripting.

When a web browser makes a request of a web server, the request includes a “Content-Type” header to indicate the MIME type of the request data, and a set of “Accept” headers to indicate acceptable data-types for the response.

When a shell script is used for CGI, the HTTP headers turn into environment variables – for example:

HTTP_ACCEPT=text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
HTTP_ACCEPT_LANGUAGE=en-us,en;q=0.5
HTTP_ACCEPT_ENCODING=gzip,deflate
HTTP_ACCEPT_CHARSET=ISO-8859-1,utf-8;q=0.7,*;q=0.7

Unix tools that got used for CGI applications (Perl, Python, PHP, etc.) were all taught to ingest HTTP requests and generate HTTP responses. The code exists and is generally quite mature.

How could we pipe structured data between processes, automatically negotiating the data-type where possible?
In exactly the same way a web browser negotiates the content-type with the web server!

There are a relatively small number of missing bits needed to make a shell run scripts/tools in a similar fashion to CGI.

  1. The shell needs to read HTTP response headers from the output of a prior process, and translate the headers into environment variables, before invoking the next process in a pipeline.
  2. Some convention so tools know whether to generate a plain character stream, or structured data. The presence of the HTTP_ACCEPT environment variable is probably sufficient.
  3. … er, maybe that’s all.
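The convention in item 2 could look something like this inside a tool. This is only a rough sketch: the function name emit_procs and the "pid name" input format are made up for illustration, the Accept value is matched as a plain substring (no q-value handling), and no XML escaping is done.

```sh
# Hypothetical tool: reads "pid name" lines on stdin and writes them back
# out as XML when the caller has asked for it via HTTP_ACCEPT.
emit_procs() {
    case "${HTTP_ACCEPT:-}" in
        *text/xml*|*application/xml*)
            # Structured output: a CGI-style header, a blank line, the body.
            printf 'Content-Type: text/xml\n\n'
            printf '<processes>\n'
            while read -r pid name; do
                printf '  <process pid="%s" name="%s"/>\n' "$pid" "$name"
            done
            printf '</processes>\n'
            ;;
        *)
            # No (matching) HTTP_ACCEPT: behave like a classic Unix filter.
            cat
            ;;
    esac
}
```

Without HTTP_ACCEPT in the environment the tool stays a plain character-stream filter, so existing pipelines would not notice the difference.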

There are many ways we could enable this behavior, but it could be as simple as:

HTTP_ACCEPT=text/xml ps -e | fold | spindle | whatever

… to invoke use of XML in a single pipeline, or:

export HTTP_ACCEPT=text/xml
ps -e | fold | spindle | whatever

… to invoke use of XML over an entire script.

Could use a shim without changing the shell. Something like:

HTTP_ACCEPT=text/xml ps -e | cgi perl a.pl | cgi php -f b.php

An exercise for another day….
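As a starting point for that exercise, here is a minimal sketch of what such a shim might do. It assumes the upstream process always writes HTTP-style headers, then a blank line, then the body. The helper name header_to_var is hypothetical, and the name mapping simply borrows the CGI convention (Content-Type becomes CONTENT_TYPE, everything else becomes HTTP_*).

```sh
#!/bin/sh
# Hypothetical "cgi" shim, usage: ... | cgi COMMAND [ARGS...]

# Map a header name to a CGI-style environment variable name.
header_to_var() {
    name=$(printf '%s' "$1" | tr 'a-z-' 'A-Z_')
    case "$name" in
        CONTENT_TYPE|CONTENT_LENGTH) printf '%s\n' "$name" ;;
        *) printf 'HTTP_%s\n' "$name" ;;
    esac
}

cgi() {
    cr=$(printf '\r')
    # Read header lines from stdin until the first blank line.
    while IFS= read -r line; do
        line=${line%"$cr"}                 # tolerate CRLF line endings
        if [ -z "$line" ]; then
            break                          # blank line ends the headers
        fi
        key=${line%%:*}                    # text before the first colon
        value=${line#*:}                   # text after the first colon
        value=${value# }                   # drop one leading space
        export "$(header_to_var "$key")=$value"
    done
    # The body is still unread on stdin; hand it to the real command.
    "$@"
}
```

One obvious gap: if the upstream tool writes no headers at all, this version eats its output up to the first blank line, so a real shim would need some way to tell the two cases apart.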

2009.01.29

Fun baby factory

Filed under: Humor — Preston @ 11:58 pm

Octuplets’ mother already has twins, four other children
The woman who gave birth to octuplets this week already has six young children and never expected that the fertility treatment she received would result in eight more babies, her mother said Thursday.

The woman, who has not been publicly identified, had embryos implanted last year, and “they all happened to take,” Angela Suleman said, leading to the eight births Monday. “I looked at those babies. They are so tiny and so beautiful.”

She acknowledged that raising 14 children is a daunting prospect.

Neighbors said she and her six children — ages 7, 6, 5, 3 and 2-year-old twins — live there with her mother. Her marital status is unknown. Family members did not answer the door, but when a reporter called the home asking for Suleman, she spoke briefly.

Um, hello – six kids in six years, then she went to a fertility clinic? Why??

Have to wonder if this woman’s head is in a very strange place.

2009.01.26

RSS-Atom

Filed under: Software, Web — Preston @ 12:35 pm

A small proposal – the Atom Syndication Format should be named and referred to as “RSS-Atom”.

Why? Because when you subscribe to a feed from a website, you are often offered a list of choices.

RSS-0.92
RSS-2.0
Atom

This is bad user interface design. The end user – who has no deep notion of what these things are and how they differ – has no means of knowing which to pick. A well-designed user interface will hide the redundant choices, and select the most appropriate (probably Atom, going forward). Over the entire universe of applications, we can pretty much count on the fact that not all user interfaces are well-designed. If we can help the end user, we should.

The naming used should result in the list of choices:

RSS-0.92
RSS-2.0
RSS-Atom

Now even with indifferently-done user interfaces the user is given a sufficient clue. Clearly the choices are all of the same “kind”. Sorted alphabetically, “RSS-Atom” will appear after any “RSS-version-number”, which is a clue to the user that Atom is a better choice.

A small aid to hundreds of millions (soon billions?) of users is a big deal.

2009.01.17

Sam Ruby: Contributions Welcome

Filed under: html@w3c — Preston @ 4:44 pm

Sam Ruby: Contributions Welcome.

Yep. I think Sam in the HTML5 working group is a very good thing.

2009.01.16

Malfunctioning mindset – the HTML5 working group

Filed under: html@w3c — Preston @ 1:27 am

For the record, I think that making Sam Ruby a chair of the HTML5 working group is a good sign, in a dismal process.

Credit Sam with pointing out bits that are just wrong. The DOCTYPE section is a good/bad example.

A DOCTYPE is a mostly useless, but required, header.

DOCTYPEs are required for legacy reasons. When omitted, browsers tend to use a different rendering mode that is incompatible with some specifications. Including the DOCTYPE in a document ensures that the browser makes a best-effort attempt at following the relevant specifications.

To a web developer the DOCTYPE header is essential to ensure a specific interpretation of HTML. If you think that HTML4 is better than HTML3 (or earlier), then the DOCTYPE is pretty damn important. In some ideal (non-existent) world, the first specification of HTML was perfect, and all browser implementations perfectly implemented the HTML exactly as in that same specification. In that world the DOCTYPE header would not be needed. In the real world, the DOCTYPE header is extremely useful, because it allows the web developer to invoke the better behaviors offered by later browser implementations, when those later implementations improve on what came before.

Seems I wrote about the DOCTYPE a year ago. My opinion has not changed.

Somehow the W3C HTML working group has walked through the looking glass. The difference between compatible and incompatible is “mostly useless”. Good is bad, and the trivial is vitally important.

Have doubts about that last? The HTML working group is still arguing over the alt attribute – a stupid waste of time. (If it would do any good, I’d be far more diplomatic.)

The constant references to XML and SGML in the HTML5 spec need to be removed. HTML is not XML. HTML is not SGML. HTML is not and never will be either XML or SGML. An appendix describing a mapping to XML would be useful. A history section describing the relationship to SGML and XML would be informative. Pretty much every other reference is a waste of time.

The massively verbose sections on parsing are a waste of time. The HTML standard is defined in terms of ASCII, and ASCII only, which makes much of the following needlessly verbose.

A DOCTYPE must consist of the following characters, in this order:

1. A U+003C LESS-THAN SIGN (<) character.
2. A U+0021 EXCLAMATION MARK (!) character.
3. A string that is an ASCII case-insensitive match for the string “DOCTYPE”.
4. One or more space characters.
5. A string that is an ASCII case-insensitive match for the string “HTML”.
6. Optionally, a DOCTYPE legacy string (defined below).
7. Zero or more space characters.
8. A U+003E GREATER-THAN SIGN (>) character.

In other words, <!DOCTYPE HTML>, case-insensitively.
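For comparison, the eight quoted steps fit in one pattern. A rough sketch only: the function name is_html5_doctype is made up, [[:space:]] is used as an approximation of the spec’s “space characters”, and the optional legacy string from step 6 is deliberately left out.

```sh
# Succeeds when the argument is "<!DOCTYPE HTML>", matched
# case-insensitively, with one or more spaces before HTML and
# optional spaces before the closing ">".
is_html5_doctype() {
    printf '%s\n' "$1" | grep -Eiq '^<!DOCTYPE[[:space:]]+HTML[[:space:]]*>$'
}
```

So is_html5_doctype '<!doctype html>' succeeds, while a string with no space between DOCTYPE and HTML fails.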

Then there is the response from Ian Hickson. To be clear – I give Ian a lot of credit for tackling a large and difficult job (though Google’s apparent sponsorship dilutes that just a bit), and for trying to be moderate in dealing with the HTML5 working group. Credit aside, it seems that Ian does not have the right mindset (in which he is not at all alone).

For example:

Regarding the alt="" attribute: we don’t want to say that alt="" should be optional, because that would be an accessibility nightmare. It isn’t optional, it shouldn’t be optional.

In reality the alt attribute is optional. You cannot force web developers to use the attribute, or to use the attribute well. Web browsers cannot reject HTML documents that lack alt attributes. Few web applications will be written for accessibility, and the treatment of the alt attribute in the HTML5 standard cannot change that. Better to fit the spec to reality, than insist on a fantasy.

Also:

Regarding the extensibility mechanisms — HTML5 already has literally over half a dozen extensibility mechanisms: [link]

The existing support for extensibility in HTML is lame. What we need is something simpler and more fundamental. Here I have to admit weakness in that I have no particular skill at teaching. If you do not already deeply get why Lisp and Scheme are remarkable, then I cannot offer an adequate explanation. (Who does?)

I wish Sam luck. Boy does he ever have a big job in front of him.

2009.01.04

Vacation 2008.12

Filed under: Personal — Preston @ 6:05 pm

Final scores…

  • Twice 850-odd miles driving, not counting side trips. The trip out was nice.
  • Two kids who slept most of the trip, both ways.
  • Three days my father spent skiing in Telluride with my 2 kids, and one sister. My father has since recovered.
  • One day of falling snow, around dawn, on the drive back only. Driving in the dark in falling snow on icy roads is novel and thus fun. I grew up in southern California.
  • Two spin-outs behind, watched in the rear-view mirror, one of which may have been Colorado highway patrol. Never saw a spin-out on an icy road before.
  • Two cars passed when off the road, not in control (them not me), west of Grand Junction in Colorado.
  • One Utah highway patrol who turned on his lights briefly when we topped a rise from opposite directions. Might have had something to do with speed on a clean, fast section of road.
  • Two more cars off the road – coming from the other direction on the I-70 – in the Fishlake mountains in Utah.
  • Two very large snow-plows facing my tiny minivan in a snow-covered parking lot, when I came out of the “Panoramaland” rest stop. The driver was friendly and polite. I removed the minivan from their path.
  • Seven(!) snow plows working on the I-70 in Utah. These folk do not mess around – the road was in good shape.
  • Many hours peacefully driving down the I-70 and I-15 in gradually thickening traffic.
  • One Utah highway patrol car who swooped in from behind, and pulled over the car I was following.
  • Three Nevada highway patrol cars in the median (two facing toward, one away) who offered no objection as we flew by.
  • One “cancel” button on the cruise control that works very well.
  • One hour lost grinding through slow and rude traffic on the twisted and warped portion of the Interstate through Las Vegas.
  • I hate Las Vegas – an ugly place.
  • Another hour lost getting to and past Primm, Nevada. No obvious reason for the slowdown, but it makes money in Primm. Might be intentional, and not likely the California Highway Patrol would catch them (or try).
  • I hate Nevada, or at least parts.
  • Hours of thick traffic and poor drivers on the I-15 in California. Not nice.
  • One SUV that very nearly had a spectacular accident, right in front of me, on the fast/last downhill of the Cajon Pass. Another foot to the left would have taken his/her car off the shoulder. At that speed my guess is a spin followed by a tumble, and maybe a bounce or two. Bummer.
  • The two mostly-sleeping kids missed all the witnessed mishaps.
  • One trouble-free and more than usually eventful trip.

2009.01.02

Oops? Tripping over a math problem.

Filed under: General — Preston @ 9:25 am

Just finished reading Variable Star, a book written by Spider Robinson from a partial story outline found recently in Robert Heinlein’s papers.

Have to admit to a bit of confusion here. Both Heinlein and Robinson are among my favorite authors, but I would never have tagged Robinson as “the new Heinlein”. (Then again, my interest in “science fiction” has much diminished over the years.) The book works out to a very decent Spider Robinson story with overtones of Heinlein.

There is one part of the story where the math strikes me as badly wrong. Given the list of credits at the end, I do not know how this could have been missed by other reviewers. Maybe I’m dead wrong. Will try to set up the problem without giving away too much of the plot. (Might be a silly concern. At this point in time, is there anyone likely to read this article before reading the book?)

Given two ships: One large, traveling at near the speed of light, and unable to slow or stop. One small and capable of traveling at ~20 times light-speed. The larger ship is on-course and provisioned for a destination many light-years away. The larger ship is designed for and carries ~50 times as many folk as the smaller ship.

How long would it take the smaller ship to off-load passengers from the larger? The answer in the book is many years of continuous shuttling, and I am pretty sure that is wrong.

To take a first-order guess at the answer, assume that minimum round trip time for the smaller ship is one day – when the larger ship is passing the destination at minimum distance. A ship designed for 10 passengers on long trips can probably handle twice as many for short trips (a very rough guess), so figure 20 passengers per trip (for short trips). After 20 days the larger ship will be about 20 light-days from the destination, so the smaller ship will add about a day to its travel time. We can (very roughly) average the travel time, and say that over that 20 day period round trips by the smaller ship between the larger ship and the destination take about 1.5 days. Over 21 days you could make about 14 round trips. Assuming the same logic applies when the larger ship is both approaching and just past the destination (another massive assumption) then in 42 days the smaller ship could make 28 round trips, and off-load 560 passengers. If the smaller ship is built for 10, the larger ship carries roughly 50 times that, or about 500, so 28 trips would cover everyone.
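The arithmetic is easy to redo with different guesses. A small sketch, where the function name offload_estimate and its three parameters are made up here, and every input is one of the rough guesses above rather than a figure from the book:

```sh
# $1 = days in each half-window (approaching, then departing)
# $2 = average round-trip time in days over that window
# $3 = passengers carried per trip
offload_estimate() {
    awk -v days="$1" -v trip="$2" -v pax="$3" 'BEGIN {
        trips = 2 * int(days / trip)    # whole trips, both half-windows
        printf "%d trips, %d passengers\n", trips, trips * pax
    }'
}
```

Running offload_estimate 21 1.5 20 prints "28 trips, 560 passengers", matching the numbers above. Doubling the average round trip to 3 days still gives 14 trips and 280 passengers in the same 42 days, so the conclusion is not very sensitive to the guesses.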

Would be pretty tedious for the pilot(s), but it looks as though you could off-load the larger ship in a few months (with room for large error in either direction, of course) – and almost certainly in less than a year. The trick is you do most of the shuttling when the larger ship is close to the destination.

I can understand why this problem was not looked at too closely – but without giving away plot, I cannot say more here. :)