Preston L. Bannister { random memes }

2008.12.30

Mystery houses decrypted!

Filed under: General, Humor, Images — Preston @ 8:22 am

About a year ago work started on a new housing tract in a once-empty field across the road from my father’s house in Colorado. Walking around the newly built houses, I could not figure out the placement of rooms and entrances. There seemed to be a pattern, but I could not figure out what the pattern meant. Documented my puzzlement in the faceless set of photos.

This was the house that cracked the code.

For a house on a corner, the scarcity of windows and a main entrance that looks more like a side entrance look rather odd. Last night we took a walk that wandered through the same tract of houses – when suddenly the odd patterns made sense. The clue is to look at that same house when facing the garage, and imagine the house in a dense southern California development, with houses packed tightly on both sides.

This follows the usual pattern for recent dense-suburban southern California homes. The garage is in front, occupying nearly the entire street-facing side of the house. The house (viewed from the street) is scarcely wider than the garage, which allows crowding the greatest number of houses on one street. The front entrance is around the side of the garage, and scarcely visible from the street as the next house (typically) is only a few feet away.

My guess is the builder took existing plans meant for dense-suburban homes – probably plans they had from prior projects – and stretched the plans to add more square footage (to sell into a market for bigger houses). The development is named Stone Ridge (though on flat ground), and the above pictures most closely match the Iron Ridge floor plan.

Once I had figured out the pattern, deciphering the other houses was easy.


Take this example. Note the narrow/useless “front porch”? The front of the house in the original plan was what is now the right side (in the photo). The room on the right was the garage. The useless front porch was originally the walkway from the front of the house to the “front” door. Push in a couple stretched-out rooms, and you have a house suitable for dropping into a dense-suburban development.

Note that the pictures are almost a year old, and none of the houses have sold since. Don’t know if it’s due to the economy, or due to the random floorplans – but my guess is that even ordinary folk notice the odd layouts.

2008.12.29

GPS in an Android phone

Filed under: General — Preston @ 7:47 pm

Got a T-Mobile G1 phone a bit over a week ago. I can write software for my phone, so this counts as a new toy. :)

Drove out to south-western Colorado from southern California, yesterday. Was curious how the GPS in the phone would perform. The answer: not very well. Could be the GPS hardware in the phone is not very good. Something about the heuristic in the GPS software is almost definitely not right. There were long periods where the GPS seemed unable to determine the current location. There were times when the GPS would seem to be at the right location – then would suddenly jump to a few miles from home (hundreds of miles from my then-present position). Could be that the GPS has trouble in a moving car.

Another downside is that T-Mobile has poor coverage on much of the route (the I-15 to the I-70, then south in western Colorado). When the GPS could find my position, often Google Maps could not get a connection to download map tiles.

There is a need to download and cache map tiles over a pre-planned route.

If the GPS heuristic for dealing with moving vehicles and spotty satellite reception is poor (assuming that was the problem), then there is some hope a later software release from Google will offer improvements.

2008.12.21

Performance parsing CSV data

Filed under: Software — Preston @ 5:34 pm

Not sure how exactly, but I ran across an article that claimed parsing of CSV data was necessarily CPU-bound. I was pretty sure that with reasonably efficient code, there was no reason this had to be true. Still, proof is better than opinion, so I took the feed-readers code from a prior exercise, and adapted the code to parse CSV files.

You can grab the sources for the test CSV-parser-1 program from Subversion. The test program does a full CSV parse as described in RFC 4180 (including handling quoted fields with embedded line breaks) and a bit more, but does nothing with the parsed data.
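For readers who want a feel for what a "full CSV parse" involves, here is a minimal sketch in Java. It is not the C++ code from the Subversion repository. The class name `CsvSketch` and the method `parse` are made up for illustration, and the parser works on an in-memory string rather than streaming from disk. It handles the RFC 4180 cases mentioned above: quoted fields, doubled quotes inside quoted fields, and line breaks embedded in quoted fields. It is deliberately lenient (a bare CR or LF also ends a record).

```java
import java.util.ArrayList;
import java.util.List;

public final class CsvSketch {
    private CsvSketch() {}

    /** Parses CSV text into records of fields (illustrative, not tuned for speed). */
    public static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        boolean pending = false; // true once the current record has any content
        int i = 0;
        int n = text.length();
        while (i < n) {
            char c = text.charAt(i++);
            if (inQuotes) {
                if (c != '"') {
                    field.append(c);            // commas and line breaks are literal here
                } else if (i < n && text.charAt(i) == '"') {
                    field.append('"');          // doubled quote is an escaped quote
                    i++;
                } else {
                    inQuotes = false;           // closing quote
                }
            } else if (c == '"') {
                inQuotes = true;
                pending = true;
            } else if (c == ',') {
                record.add(field.toString());
                field.setLength(0);
                pending = true;
            } else if (c == '\n' || c == '\r') {
                if (c == '\r' && i < n && text.charAt(i) == '\n') {
                    i++;                        // treat CRLF as one line break
                }
                record.add(field.toString());
                field.setLength(0);
                records.add(record);
                record = new ArrayList<>();
                pending = false;
            } else {
                field.append(c);
                pending = true;
            }
        }
        if (pending) {                          // last record without a trailing line break
            record.add(field.toString());
            records.add(record);
        }
        return records;
    }
}
```

The whole thing is one pass over the characters with a couple of flags, which is why there is little reason for parsing alone to be expensive.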

Results from test runs – on my HP laptop (Intel T9300 Core 2 Duo CPU @ 2.5GHz with 4GB memory):

preston@mercury:~/workspace/CSV-parser-1$ time Release/CSV-parser-1 -n 0 in/1g.txt
TIME Sun Dec 21 16:45:43 2008
Scanning: in/1g.txt
Done with: in/1g.txt
TIME Sun Dec 21 16:45:47 2008
Elapsed (ms): 4249, total (MB): 981
Scanned 230 MB/s

real	0m4.254s
user	0m3.500s
sys	0m0.656s

The above is for a ~1GB file, fully cached in memory (do repeated test runs until the times stabilize).

preston@mercury:~/workspace/CSV-parser-1$ time Release/CSV-parser-1 -n 0 in/4g.txt
TIME Sun Dec 21 16:49:31 2008
Scanning: in/4g.txt
Done with: in/4g.txt
TIME Sun Dec 21 16:51:02 2008
Elapsed (ms): 91449, total (MB): 3944
Scanned 43 MB/s

real	1m31.455s
user	0m41.271s
sys	0m10.549s

The above is for a ~4GB file, not cached in memory. The result is very clear – an efficient CSV file parser can ingest data much faster than the data can be read off ordinary disks (a bit over five times faster). Even a fast RAID would be hard-pressed to deliver data faster than it could be parsed.

Of course, in “real” applications, any processing performed on the parsed CSV data will likely dominate the runtime. Application-specific processing could easily saturate more than one CPU. The problem partitions into most-efficient read-and-parse of CSV data from disk (which is what this example does), and distribution of application-specific processing across multiple CPUs (which this example can do in the same manner as feed-workers … and which may or may not suit your application).

Insert the usual caveats here. The example program has seen only basic testing. There were other applications (minimally) active. The C++ code was written for reuse, and has not been run through a profiler. You could tweak the code to get slightly better performance, but probably not any large improvements.

The same test run on a desktop (slower CPUs, faster disk):

preston@brutus:~/workspace/CSV-parser-1$ time Release/CSV-parser-1 -n 0 in/1g.txt
TIME Sun Dec 21 17:21:19 2008
Scanning: in/1g.txt
Done with: in/1g.txt
TIME Sun Dec 21 17:21:27 2008
Elapsed (ms): 7530, total (MB): 981
Scanned 130 MB/s

real	0m7.535s
user	0m6.136s
sys	0m1.388s

The above times are for a file cached in memory.

preston@brutus:~/workspace/CSV-parser-1$ time Release/CSV-parser-1 -n 0 in/4g.txt
TIME Sun Dec 21 17:23:31 2008
Scanning: in/4g.txt
Done with: in/4g.txt
TIME Sun Dec 21 17:25:00 2008
Elapsed (ms): 89020, total (MB): 3944
Scanned 44 MB/s

real	1m29.084s
user	0m26.994s
sys	0m7.004s

The above times are for a file not cached in memory. The results are entirely consistent with the first set of runs.

Clearly, an efficient CSV file parser can process data faster than a single disk can deliver. The SSDs (solid-state disks) currently on the market seem to manage sustained read rates in the range of 40-100MB/s, so a single-process parser should be able to fully saturate the disk.

If you are doing large-scale processing of CSV data, your most-efficient approach is most likely to use a single (efficient!) reader-parser thread, and then roughly as many application-specific processing threads (or processes) as you have CPUs.
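The shape of that arrangement can be sketched in a few lines of Java. This is an illustration under assumptions, not the test program: `ReaderWorkersSketch`, `countFields`, the queue size of 1024, and the "count the fields" stand-in for application work are all hypothetical. The calling thread plays the single reader-parser and hands records to worker threads through a bounded queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public final class ReaderWorkersSketch {
    // Sentinel compared by identity; tells a worker there is no more input.
    private static final List<String> POISON = new ArrayList<>();

    private ReaderWorkersSketch() {}

    /** Feeds already-parsed records to workerCount threads; returns total fields seen. */
    public static long countFields(Iterable<List<String>> records, int workerCount) {
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(1024);
        AtomicLong total = new AtomicLong();
        Thread[] workers = new Thread[workerCount];
        for (int w = 0; w < workerCount; w++) {
            workers[w] = new Thread(() -> {
                try {
                    for (List<String> rec = queue.take(); rec != POISON; rec = queue.take()) {
                        total.addAndGet(rec.size()); // placeholder for application work
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers[w].start();
        }
        try {
            for (List<String> rec : records) {
                queue.put(rec);                      // single producer: the reader-parser
            }
            for (int w = 0; w < workerCount; w++) {
                queue.put(POISON);                   // one sentinel per worker
            }
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while feeding workers", e);
        }
        return total.get();
    }
}
```

A natural choice for `workerCount` is `Runtime.getRuntime().availableProcessors()`. The bounded queue keeps the reader from racing far ahead of the workers when the application-specific processing is the slow part.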

2008.12.09

Metaphors and reality

Filed under: General — Preston @ 7:23 am

Physics is taught and understood through a series of metaphors. Events and processes beyond the range of human perception and experience are described and understood via metaphors. If the metaphors are close enough to reality, then we are able to derive useful results. When the wrong metaphors are in use (and this has happened before) the science ends up in a blind alley.

There is always a chance that some of our present metaphors are wrong (or at least a poor choice). There is a distinct chance that at some point Physics will run up against an aspect of reality that cannot be expressed – even via metaphor – in terms digestible by the human mind. Some of our science may already be up against just such a limit. At some point the human mind will have to evolve to something greater, before further progress can be made.

A few more metaphors that may or may not prove useful….

The prior speculation leads to some derivatives.

If when looking far out into the universe we can see all distant present variants that do not change our present sum, then the further out we look the “fuzzier” the image would appear. If this were true (a very big if) then the distance to which we could see clearly would represent the distance to which some sort of fairly-immediate interaction is possible. (Hello warp-drive?)

If the weave-of-variants metaphor is in fact a good match to reality, then the diffraction pattern observed in the classic double-slit experiment may represent an interaction between photons across the weave of variants. In effect a single photon takes every possible path, each in its own variant, and interaction across variants establishes the diffraction pattern. (Though interaction-between-photons is itself a metaphor of which I am wary.) Are there other interactions across variants? Is this metaphor in any way testable or useful?

Again, without some sort of test, the above are no more than speculations.

2008.12.06

The Other Tiger

Filed under: General — Preston @ 5:40 pm

Been re-reading some old Arthur C. Clarke stories (largely written before I was born). I did not care very much for most of Clarke’s later stories. Whether the fault was his or mine I cannot say. The notions in Science Fiction writings do not affect me nearly so much now as when I first “discovered” science fiction in the early 1970’s. Clarke was one of my first most-favorite authors (only partly because I went alphabetically through the science fiction section of the local public library). I remember as a teenager finding Clarke’s stories quite exciting.

I find the collection of Clarke’s short stories exactly to my taste, if no longer exciting. Did get a bit of a jolt when I re-read “The Other Tiger”. The notions in that story had struck me as “right”, and become so deeply embedded in my thought, I no longer remembered the source.

Of late I have tended to return to an extension to that same line of thought. The starting notion is that if the universe is infinite, all possible combinations of events must occur. (Yes, I have heard of the Big Bang Theory. The theory may yet be proven wrong. It may be that the universe is effectively infinite … but we will get to that later.)

If all possible combinations occur, there are infinitely many Earths with an infinite number of variations. You might think there could be an infinite number of identical Earths as well … but Nature seems to tend to favor simple solutions, so I suspect that each identical variation occurs exactly once. You could imagine a sort of dimension-of-variations with each possible combination strung out along the dimension as a sort of standing wave. (It would of course not be anything like a single-measure dimension … but again, something to return to later.)

You could imagine that dimension viewed over time as an enormous tree, with branchings each time an event or combination could vary. The tree is quite nearly infinite (or a near-infinite count of near-infinite numbers).

Or is it?

Does it really matter when an isotope decays a hundred light years away? Sometimes yes, but mostly not (a most overwhelming “mostly”). Perhaps that tree is more like a weave. Many past combinations lead to our present, so viewed over time the massive branching out of variations is matched by a massive branching in of past variants that make no difference to our present state or future.

You could view this as computation, where many different combinations of past-values could compute to the present-sum. In fact, when viewed as a computation, the size of an infinite universe – when filtered to all meaningful, unique combinations, suddenly becomes finite! So starting with the assumption of an infinite universe, you end up with the conclusion that the universe is finite. (Perhaps there is hope for the Big Bang Theory after all.)

Does the entire universe branch for each quantum variation? Seems a bit like overkill. If the variations make no difference to our local sum, might they all exist in our “present”? The further away, the larger variations could be without affecting our local sum.

There is an outside chance we might already have proof. If an astronomer took two pictures at the extreme edge of the visible universe, and the pictures came out different, a good scientist would (quite reasonably!) assume a small error in the aim of the instrument. Those two differing pictures could be views into distinct distant “present” variants. Or perhaps the further out we look, the “fuzzier” images become, as we can see all “present” distant variants that do not change our local sum?

Looked at that way, the “weave” of variants might vary over the dimensions of space that we can perceive.

Of late I have been bothered by the question of granularity. It is easy to assume that a quark popping into existence a hundred light years away does not cause a local branch – but where exactly (a poor word in this context) does the fork occur? How much can the past variants differ, without changing the present sum?

I would prefer to believe the variants could only be very small, or very far away. But … I could be wrong. Could the past variants be nearby and macroscopic?

When you and I remember a past event differently, and it makes no difference to our present, or to our future actions, could it be that we are both right? Discounting the unreliable nature of human memory, could it be that some portion of the time our memories of past events differ because we did each experience (slightly) differing events?

Lacking a test, this is no more than an entertaining speculation. But … the question of granularity bugs me. A lot.

Not long after this notion had occurred to me, I started to notice that some of the music (played from my collection stored on my iPod) sounded different than I remembered. Now I think it far more likely that this is due to a faulty bit in my memory, or the not-very-good sound system in my car, but … what if my memory is right? Could it be that the proof is all around us, but we have gotten used to discarding those bits that did not fit into the metaphor-of-the-world to which we are accustomed?

There is of course no proof for any of this, so the above is just an entertaining speculation. But … could it be true?

2008.12.03

Isolating the UI Thread

Filed under: General — Preston @ 7:43 pm

I got stuck overhauling a badly implemented Swing application, not so long ago. Seems I end up doing desktop GUI applications every several years. Been doing this since pre-Windows days, so long ago learned the rule that long-running tasks should never be performed on the UI thread (or the equivalent). Have also seen (many times!) that most programmers are not aware of the rule and the underlying reason. The Swing application (to my complete lack of surprise) performed pretty much everything on the UI thread.

Refactoring an existing application to move long-running tasks off to a background queue is not easy. The flow of control that leads to disk, network, or database operations can be rather indirect. You could have long-running operations still performed on the UI thread (via an indirect path) of which you are not aware. Even for an application written from scratch with full knowledge of the relevant principles, there is some chance an indirect control path might unexpectedly place a potentially long-running task on the UI thread.

You could put tests in various places to check that (say) disk and network operations are not performed on the UI thread, but that adds complexity and expense to existing code, is prone to error, and only catches problems after the fact. Not exactly an elegant or minimal solution.
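For concreteness, the kind of after-the-fact test described above might look like this in Swing. `SwingUtilities.isEventDispatchThread()` is the standard Swing call. The class `UiThreadGuard` and the method `requireOffUiThread` are hypothetical names for this sketch.

```java
import javax.swing.SwingUtilities;

public final class UiThreadGuard {
    private UiThreadGuard() {}

    /** Throws if called on the Swing event dispatch thread (the UI thread). */
    public static void requireOffUiThread(String operation) {
        if (SwingUtilities.isEventDispatchThread()) {
            throw new IllegalStateException(operation + " must not run on the UI thread");
        }
    }
}
```

You would call `UiThreadGuard.requireOffUiThread("file read")` at the top of every method that touches disk, network, or database. That is exactly the drawback: every such method has to remember to make the call, and a mistake only shows up when that path is actually exercised.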

What we really want is an efficient solution that makes the usual errors impossible. You want a file, network, or database operation on the UI thread to fail to get past the compiler – or at the very least to fail on the first invocation.

The world of server-side Java web applications offers a solution. Java web application servers (like Jetty or Tomcat) use custom class loaders to isolate distinct web applications. Classes loaded into one web application are completely unknown to (and unshared with) other web applications hosted on the same application server.

The same notion – with a twist – could be applied to desktop GUI applications. The UI thread could use a class loader that knew about Swing (or the equivalent), and knew nothing about classes that did file, network, or database operations. The non-UI threads would not have access to any of the GUI (Swing or the like) related classes.

The project setup would be a little more complicated, but any errors could be caught at compile-time, without adding any additional complexity or runtime expense to existing classes.
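A rough sketch of the class-loader half of the idea, under stated assumptions: `FilteringClassLoader`, `canSee`, and the choice of hidden package prefixes are all hypothetical, and this only shows the runtime side. The loader refuses to resolve any class whose name starts with a hidden prefix and delegates everything else to its parent.

```java
public final class FilteringClassLoader extends ClassLoader {
    private final String[] hiddenPrefixes;

    /** Delegates to parent, except for class names starting with a hidden prefix. */
    public FilteringClassLoader(ClassLoader parent, String... hiddenPrefixes) {
        super(parent);
        this.hiddenPrefixes = hiddenPrefixes.clone();
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        for (String prefix : hiddenPrefixes) {
            if (name.startsWith(prefix)) {
                // Pretend the class does not exist for code behind this loader.
                throw new ClassNotFoundException(name + " is hidden from this loader");
            }
        }
        return super.loadClass(name, resolve);
    }

    /** Convenience for experiments: true if this loader can resolve the class name. */
    public boolean canSee(String className) {
        try {
            loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

With something like `new FilteringClassLoader(ClassLoader.getSystemClassLoader(), "java.net.", "java.sql.")` as the parent of the loader that defines the UI classes, a UI class that touches `java.net.Socket` should fail to link on first use, rather than quietly blocking the UI thread. Catching the error at compile time would additionally need the UI code built against a class path that omits those packages, which is an assumption about the build setup. Note that a blanket `java.io.` prefix would also hide types such as `java.io.Serializable` from UI code.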

This seems to imply a rather significant reorganization of the stock Java classes.

Ran across this article, that takes a less interesting run at a similar problem.

Should Java Assert that Network I/O Can’t Occur on the UI Thread?
Doing network I/O on the user interface (UI) thread is bad. Most developers know that and can tell you why; unfortunately, it’s still done. At this year’s JavaOne, one of the keynote JavaFX demos bombed because the network was slow, something that would be forgivable had the entire application’s UI not frozen, which required it to be restarted, only to trip up again a few minutes later.

I believe the notion of using class loaders to completely isolate the class name-space is a more efficient solution.