Preston L. Bannister { random memes }

2005.01.31

Building a better string class

Filed under: Software — Preston @ 10:38 pm

To sum up – the aim of the preceding exercise was to check my past assumptions. Rather than use the standard C++ string classes, I have for a number of years chosen to use a very thin wrapper class, with buffer allocation off a free-list. At the end the measurements show performance is indeed better than the standard C++ string classes, with one minor surprise.

The minor surprise was that the x86 string instructions are no longer optimal (at least on an AMD Athlon CPU). There are two alternatives – either write inlined C++ code to perform the equivalent operation (better with small strings), or use the C library string functions (better with large strings). To force the use of string functions rather than string intrinsics in optimized code, you need to give the Microsoft C++ compiler the following pragma.

#pragma function(strcpy,strcat)

Other than the use of the pragma, it looks as though the existing string classes in my applications do not need to be changed.
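A minimal sketch of where the pragma might sit so that the same source still builds with other compilers. The `_MSC_VER` guard and the helper name `join_strings` are my assumptions for illustration and are not part of the original code; `#pragma function` itself is specific to the Microsoft compiler.

```cpp
#include <assert.h>
#include <string.h>

// Ask the Microsoft compiler to emit real calls to the C library
// routines instead of expanding them as inline intrinsics.
// Other compilers do not know this pragma, so it is guarded
// (assumption: _MSC_VER identifies the Microsoft compiler).
#if defined(_MSC_VER)
#pragma function(strcpy, strcat)
#endif

// Hypothetical helper: concatenates a and b into out.
// The caller must supply a buffer large enough for both strings plus the NUL.
inline char* join_strings(char* out, const char* a, const char* b)
{
    assert(out && a && b);
    ::strcpy(out, a);
    ::strcat(out, b);
    return out;
}
```

The pragma has to appear at file scope, before the first function that uses the named routines.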

Building a better string class – prior exercises

Notions that go into choosing a string class in this form include…

  • The string class is meant to be a thin, efficient wrapper around conventional NUL-terminated C strings.
  • The string class is safer than C strings as the common operations that can lead to overflow (strcpy() and strcat()) upsize the string buffer as needed.
  • This is not an attempt to hide or completely encapsulate C strings.
  • This is not a re-invention of C strings.
  • Constant strings are passed as “const char*” – to which the string class readily converts.
  • The string class is not meant to be so general as to suit every application in existence, and is not meant for inclusion in a library. Rather you would simply take the base version and add any functions needed.
  • The string class is optimal for strings of “usual” size (less than 256 characters).

The end result is small, efficient, and easy to adapt as needed. Suits my purpose at least :) .

2005.01.30

Building a better string class – sentinels and wrap up

Filed under: Software — Preston @ 10:46 pm

[Summary]

One final addition to the ZString class – sentinels. One common programming error is buffer over-run, writing past the end of an allocated array, and possibly overwriting a following item. We can catch this common programming error by allocating an extra byte on both ends of the string buffer and placing known values (sentinels) in each byte. The sentinels are checked when the ZString instance is freed. This is especially important when using a free list as any normal heap validity tests on deallocation are not exercised.

Only the “debug” version of the code makes use of sentinels.
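The idea can be sketched on its own, apart from the string class. This is a simplified illustration under my own assumptions: the function names `guarded_alloc`, `guards_intact` and `guarded_free` are hypothetical, and only the two sentinel values are taken from the listing that follows.

```cpp
#include <assert.h>
#include <string.h>

enum { BOB_GUARD = 0x66, EOB_GUARD = 0x99 };   // same values as the listing uses

// Allocate n usable bytes with one guard byte before and one after.
inline char* guarded_alloc(int n)
{
    assert(n > 0);
    char* raw = new char[n + 2];
    raw[0]     = (char) BOB_GUARD;
    raw[n + 1] = (char) EOB_GUARD;
    raw[1]     = 0;          // start out as an empty C string
    return raw + 1;          // the caller sees only the usable region
}

// True when both guard bytes still hold their known values.
inline bool guards_intact(const char* p, int n)
{
    return BOB_GUARD == (255 & p[-1]) && EOB_GUARD == (255 & p[n]);
}

// Release a buffer from guarded_alloc, checking the guards first.
// Note the pointer adjustment back to the start of the real allocation.
inline void guarded_free(char* p, int n)
{
    assert(guards_intact(p, n));
    delete[] (p - 1);
}
```

An overrun by even one byte changes the trailing guard, so the check at free time fails close to the code that caused the damage.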

ZString.cpp

#include <stdlib.h>
#include <string.h>
#include "ZString.h"

#ifndef ASSERT
#include <assert.h>
#define ASSERT(X) assert(X)
#endif

//
//  Out-of-line string methods and free list maintenance.
//

enum {
    STRING_BUFFER_SIZE  = 256,
    BOB_SENTINEL        = 0x66,
    EOB_SENTINEL        = 0x99
};

static void* g_pFreeList;

void ZString::recycle()
{
#ifdef _DEBUG
    ASSERT(BOB_SENTINEL == (255&*(sBuffer-1)));
    ASSERT(EOB_SENTINEL == (255&*(sBuffer+cbBufferMax)));
#endif
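    if (STRING_BUFFER_SIZE != cbBufferMax) {
        // Debug allocations start one byte earlier, at the leading sentinel.
#ifdef _DEBUG
        delete[] (sBuffer-1);
#else
        delete[] sBuffer;
#endif
        sBuffer = 0;
        return;
    }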
    *((void**)sBuffer) = g_pFreeList;
    g_pFreeList = sBuffer;
    sBuffer = 0;
    cbBufferMax = 0;
}

void ZString::upsizeTo(int n)
{
    // Allocate oversize strings.
    if (STRING_BUFFER_SIZE <= n) {
        // round up to a quantum
        n = ((n + STRING_BUFFER_SIZE + 1) / STRING_BUFFER_SIZE) * STRING_BUFFER_SIZE;
#ifdef _DEBUG
        char* p = new char[n+2];
        p[0]    = (char) BOB_SENTINEL;
        p[1+n]  = (char) EOB_SENTINEL;
        ++p;
#else
        char* p = new char[n];
#endif
        ::strcpy(p,sBuffer);
        recycle();
        sBuffer = p;
        cbBufferMax = n;
        return;
    }
    cbBufferMax = STRING_BUFFER_SIZE;
    // Grab a buffer from the free list if present (the usual case).
    if (g_pFreeList) {
        sBuffer = (char*) g_pFreeList;
        g_pFreeList = *((void**)g_pFreeList);
    } else {
        // Allocate a stock-sized buffer.
#ifdef _DEBUG
        sBuffer = new char[2+STRING_BUFFER_SIZE];
        *sBuffer++                      = (char) BOB_SENTINEL;
        *(sBuffer+STRING_BUFFER_SIZE)   = (char) EOB_SENTINEL;
#else
        sBuffer = new char[STRING_BUFFER_SIZE];
#endif
    }
    *sBuffer = 0;
}

Building a better string class – what advantage in copy-on-write for std::string?

Filed under: Software — Preston @ 5:27 pm

[Summary] [Continued]

The prior set of measurements (in “Building a better string class – comparing class implementations”) did not really measure the advantage (if any) of copy-on-write in the std::string implementation. Did some measurements to see whether this helps improve the relative standing of std::string.

First, the same “mixed” measurement as in the prior tests, but using the test strings as classes (rather than const char*).
Second, let’s do everything we can to give std::string the advantage (assuming copy-on-write is an advantage) by measuring only assignments.

RATE ZString string mixed calls 126.9 MB/second (1000.0 MB in 7.9 seconds)
RATE C++ string mixed calls 39.7 MB/second (1000.0 MB in 25.2 seconds)
RATE ZString string assign calls 144.9 MB/second (1000.0 MB in 6.9 seconds)
RATE C++ string assign calls 92.3 MB/second (1000.0 MB in 10.8 seconds)

The results seem to match up with my expectations. With mixed operations the std::string class performs relatively poorly. With an assignment-only test std::string does quite a lot better, but is still slower.

This matches my expectation: the code executed to implement the copy-on-write “optimization” is in fact slower than simply copying the buffer. At least this is true for small strings (what I see as the more usual case). Tried increasing the test strings from 25 to 60 characters to see if std::string might fare better with longer strings.

RATE ZString string mixed calls 229.6 MB/second (1200.0 MB in 5.2 seconds)
RATE C++ string mixed calls 88.3 MB/second (1200.0 MB in 13.6 seconds)
RATE ZString string assign calls 268.7 MB/second (1200.0 MB in 4.5 seconds)
RATE C++ string assign calls 221.9 MB/second (1200.0 MB in 5.4 seconds)

Apparently not :) .

Repeating the same test, but this time compiled with GNU C++ on Linux.

RATE ZString string mixed calls 192.3 MB/second (1200.0 MB in 6.2 seconds)
RATE C++ string mixed calls 108.8 MB/second (1200.0 MB in 11.0 seconds)
RATE ZString string assign calls 225.1 MB/second (1200.0 MB in 5.3 seconds)
RATE C++ string assign calls 324.3 MB/second (1200.0 MB in 3.7 seconds)

Interesting! The GNU std::string implementation is apparently better than Microsoft’s, and does indeed yield an advantage – at least when doing only assignments.

Repeating the same test on Linux, but this time with the (original) smaller strings.

RATE ZString string mixed calls 134.2 MB/second (1000.0 MB in 7.5 seconds)
RATE C++ string mixed calls 59.6 MB/second (1000.0 MB in 16.8 seconds)
RATE ZString string assign calls 155.5 MB/second (1000.0 MB in 6.4 seconds)
RATE C++ string assign calls 152.0 MB/second (1000.0 MB in 6.6 seconds)

For smaller strings, even with the better GNU implementation, a simple buffer copy edges out copy-on-write in assignment.
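To make the bookkeeping concrete, here is a minimal sketch of a reference-counted copy-on-write string. It is an illustration only and not the Microsoft or GNU implementation; the class `CowString` and all of its members are hypothetical, and it is single-threaded (a thread-safe version would need more work on every counter update).

```cpp
#include <assert.h>
#include <string.h>

// Hypothetical copy-on-write string, for illustration only.
class CowString
{
    struct Rep {
        int   refs;   // number of CowString objects sharing this text
        char* text;
    };
    Rep* rep;

    static Rep* make(const char* s) {
        Rep* r  = new Rep;
        r->refs = 1;
        r->text = new char[::strlen(s) + 1];
        ::strcpy(r->text, s);
        return r;
    }
    void release() {
        if (--rep->refs == 0) { delete[] rep->text; delete rep; }
    }

public:
    CowString(const char* s = "") : rep(make(s)) {}
    CowString(const CowString& o) : rep(o.rep) { ++rep->refs; }
    ~CowString() { release(); }

    // Assignment only adjusts counters; no characters are copied.
    CowString& operator=(const CowString& o) {
        ++o.rep->refs;      // increment first so self-assignment is safe
        release();
        rep = o.rep;
        return *this;
    }

    // Any write must first make sure this object owns a private copy.
    void set_char(int i, char c) {
        if (rep->refs > 1) {
            Rep* mine = make(rep->text);
            release();
            rep = mine;
        }
        rep->text[i] = c;
    }

    const char* c_str() const { return rep->text; }
    bool shares_with(const CowString& o) const { return rep == o.rep; }
};
```

Every assignment pays for two counter updates and a possible free, and every write pays for a sharing test. For a string of a few dozen characters, that can cost as much as the copy it avoids.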

The test program and ZString class sources follow.

string.cpp

#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <string.h>

// Optimization that seems to work best (at least with an Athlon).
//#pragma function(strlen,strcpy,strcat)

#include "ZString.h"
#include <string>

// test strings

//const int nLoop = 10000000;
//static const char sOut1[] = "abcdefghjklmnopqrstuvwxyz";
//static const char sOut2[] = "ABCDEFGHJKLMNOPQRSTUVWXYZ";

const int nLoop = 5000000;
static const char sOut1[] = "abcdefghjklmnopqrstuvwxyz0123456789ABCDEFGHJKLMNOPQRSTUVWXYZ";
static const char sOut2[] = "ABCDEFGHJKLMNOPQRSTUVWXYZ0123456789abcdefghjklmnopqrstuvwxyz";

int nLength = ::strlen(sOut1);

void report_times(const char* s,int dt,int cb)
{
    //printf("TIME %s %d ticks for %d characters written\n",s,dt,cb);
    double ts = ((double)dt) / CLOCKS_PER_SEC;
    double mb = ((double)cb) / 1000000;
    double rate = mb / ts;
    printf("RATE %s %0.1f MB/second (%0.1f MB in %0.1f seconds)\n",s,rate,mb,ts);
}

//
//  Perform a function for a fixed number of iterations and return time.
//

int dtLoop = 0;

typedef void (*doit)(const char*,const char*);

int time_function(doit fn)
{
    clock_t t0 = ::clock();
    for (int i=0; i<nLoop; ++i) {
        const char* s1 = sOut1 + (15 & i);
        const char* s2 = sOut2 + nLength - (15 & i);
        (*fn)(s1,s2);
    }
    return (int)(::clock() - t0) - dtLoop;
}

//
//  Time C string operations.
//

int nTotal = 0;

void do_total(const char* s1,const char* s2)
{
    nTotal += 4 * nLength;
}

void do_z_string_mixed_calls(const char* _s1,const char* _s2)
{
    ZString s1 = _s1;
    ZString s2 = _s2;
    ZString sWork;
    sWork.strcpy(s1);
    sWork.strcat(s2);
    sWork.strcpy(s2);
    sWork.strcat(s1);
    sWork.strcpy(s1);
    sWork.strcat(s1);
    sWork.strcpy(s2);
    sWork.strcat(s2);
}

void do_cpp_string_mixed_calls(const char* _s1,const char* _s2)
{
    std::string s1 = _s1;
    std::string s2 = _s2;
    std::string sWork;
    sWork.assign(s1);
    sWork.append(s2);
    sWork.assign(s2);
    sWork.append(s1);
    sWork.assign(s1);
    sWork.append(s1);
    sWork.assign(s2);
    sWork.append(s2);
}

void do_z_string_assign_calls(const char* _s1,const char* _s2)
{
    ZString s1 = _s1;
    ZString s2 = _s2;
    ZString sWork;
    sWork.strcpy(s1);
    sWork.strcpy(s2);
    sWork.strcpy(s2);
    sWork.strcpy(s1);
    sWork.strcpy(s1);
    sWork.strcpy(s1);
    sWork.strcpy(s2);
    sWork.strcpy(s2);
}

void do_cpp_string_assign_calls(const char* _s1,const char* _s2)
{
    std::string s1 = _s1;
    std::string s2 = _s2;
    std::string sWork;
    sWork.assign(s1);
    sWork.assign(s2);
    sWork.assign(s2);
    sWork.assign(s1);
    sWork.assign(s1);
    sWork.assign(s1);
    sWork.assign(s2);
    sWork.assign(s2);
}

int main(int ac,char** av)
{
    dtLoop = time_function(do_total);
    report_times("ZString string mixed calls",time_function(do_z_string_mixed_calls),nTotal);
    report_times("C++ string mixed calls",time_function(do_cpp_string_mixed_calls),nTotal);
    report_times("ZString string assign calls",time_function(do_z_string_assign_calls),nTotal);
    report_times("C++ string assign calls",time_function(do_cpp_string_assign_calls),nTotal);
    return 0;
}

ZString.h

#ifndef __ZSTRING_H__
#define __ZSTRING_H__

//
//  Simple string class.
//

class ZString
{

protected:
    char* sBuffer;
    int cbBufferMax;

public:
    ZString() { upsizeTo(1); }
    // Start from a stock buffer so upsizeTo() never sees uninitialized members, then copy (growing if needed).
    ZString(const char* s) { upsizeTo(1); strcpy(s); }
    ~ZString() { recycle(); }

protected:
    void recycle();
    void upsizeTo(int);
    void sizeTo(int n) {
        if (cbBufferMax <= n) {
            upsizeTo(n);
        }
    }

public:
    operator const char*()          { return sBuffer; }
    char* getBuffer()               { return sBuffer; }
    char* getBuffer(int n)          { sizeTo(n); return sBuffer; }
    int getBufferSize()             { return cbBufferMax; }

public:
    void operator=(const char* s) {
        strcpy(s);
    }

public:
    void strcpy(const char* s) {
        sizeTo(::strlen(s));
        ::strcpy(sBuffer,s);
    }
    void strcpy(const char* s,int n) {
        sizeTo(n);
        ::strncpy(sBuffer,s,n);
        sBuffer[n] = 0;
    }
    void strcat(const char* s) {
        int n1 = ::strlen(sBuffer);
        int n2 = ::strlen(s);
        sizeTo(n1 + n2);
        ::strcpy(sBuffer+n1,s);
    }
    void strcat(const char* s,int n) {
        int n1 = ::strlen(sBuffer);
        sizeTo(n1 + n);
        ::strncpy(sBuffer+n1,s,n);
        sBuffer[n1+n] = 0;
    }
    void strlwr() {
        ::strlwr(sBuffer);
    }
    void strupr() {
        ::strupr(sBuffer);
    }

};

#endif

ZString.cpp

#include <stdlib.h>
#include <string.h>
#include "ZString.h"

//
//  Out-of-line string methods and free list maintenance.
//

enum { STRING_BUFFER_SIZE = 256 };
static void* g_pFreeList;

void ZString::recycle()
{
    if (STRING_BUFFER_SIZE != cbBufferMax) {
        delete[] sBuffer;
        sBuffer = 0;
        return;
    }
    *((void**)sBuffer) = g_pFreeList;
    g_pFreeList = sBuffer;
    sBuffer = 0;
    cbBufferMax = 0;
}

void ZString::upsizeTo(int n)
{
    // Allocate oversize strings.
    if (STRING_BUFFER_SIZE <= n) {
        // round up to a quantum
        n = ((n + STRING_BUFFER_SIZE + 1) / STRING_BUFFER_SIZE) * STRING_BUFFER_SIZE;
        char* p = new char[n];
        ::strcpy(p,sBuffer);
        recycle();
        sBuffer = p;
        cbBufferMax = n;
        return;
    }
    cbBufferMax = STRING_BUFFER_SIZE;
    // Grab a buffer from the free list if present (the usual case).
    if (g_pFreeList) {
        sBuffer = (char*) g_pFreeList;
        g_pFreeList = *((void**)g_pFreeList);
    } else {
        // Allocate a stock-sized buffer.
        sBuffer = new char[STRING_BUFFER_SIZE];
    }
    *sBuffer = 0;
}

Building a better string class – comparing class implementations

Filed under: Software — Preston @ 3:13 am

[Summary] [Continued]

Having checked my assumptions earlier (in “Continued – building a better string class – assumptions revised”), it is time to compare C++ string implementations.

The ZString1 class (below) represents the string class used in most of the C++ code I have written in the past decade. Nothing really exotic here – the class is simply a thin wrapper around the standard C string functions (made safe), with the string buffer allocated off a free list.

The ZString2 class simply replaces the use of string functions with inlined C++ code. From the earlier exercise, we gained the expectation that this version might be slightly faster than the calls to standard C string functions.

The micro-benchmark numbers tell pretty much the expected story.

RATE C string calls 262.1 MB/second (1000.0 MB in 3.8 seconds)
RATE ZString1 string calls 151.7 MB/second (1000.0 MB in 6.6 seconds)
RATE ZString2 string calls 158.0 MB/second (1000.0 MB in 6.3 seconds)
RATE C++ string calls 67.4 MB/second (1000.0 MB in 14.8 seconds)

The C string calls are fastest – at least when we are allocating the string buffer off the stack, and do not have to check for buffer overflow. This is traditional (and somewhat unsafe) practice in C programs, but not always practical.

The thin wrapper classes do markedly better than the standard C++ string class. To determine how much of the difference is due to the use of a free list, made the ZString classes always-allocate instead of using a free list, and re-ran the test.

RATE C string calls 266.3 MB/second (1000.0 MB in 3.8 seconds)
RATE ZString1 string calls 84.3 MB/second (1000.0 MB in 11.9 seconds)
RATE ZString2 string calls 88.8 MB/second (1000.0 MB in 11.3 seconds)
RATE C++ string calls 67.4 MB/second (1000.0 MB in 14.8 seconds)

Even without the use of a free list, the ZString classes do better than the standard C++ strings (roughly 25 to 30 percent faster). The use of the free list adds a substantial further boost, nearly doubling the performance.

There is another set of reasons to consider using a free list for frequently allocated structures. Occasionally you can get patterns of allocation and deallocation that cause heap fragmentation, and this can become a problem in long-running programs. By greatly decreasing the number of heap allocations, you likely improve the stability of your program. This is not just theory – I have seen this more than once in actual customer use.
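The free-list technique, stripped of the string class, can be sketched as below. The names `block_alloc`, `block_free` and `g_heap_calls` are hypothetical and exist only for this illustration; the block size matches the stock ZString buffer.

```cpp
#include <assert.h>
#include <stddef.h>

enum { BLOCK_SIZE = 256 };        // same stock size as the ZString buffers

static void* g_free_blocks = 0;   // head of the singly linked free list
static int   g_heap_calls  = 0;   // counts real heap allocations (demo only)

// Hand out a block: reuse a recycled one if possible, else go to the heap.
static char* block_alloc()
{
    if (g_free_blocks) {
        char* p = (char*) g_free_blocks;
        g_free_blocks = *((void**) p);   // the next pointer lives in the block itself
        return p;
    }
    ++g_heap_calls;
    return new char[BLOCK_SIZE];
}

// Return a block: push it on the list instead of calling delete[].
static void block_free(char* p)
{
    *((void**) p) = g_free_blocks;
    g_free_blocks = p;
}
```

Once the program reaches a steady state, allocation and release are each a couple of pointer moves and the heap is not touched at all. That is where both the speed and the reduced fragmentation come from. The trade-off is that recycled blocks are never returned to the heap.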

Adding the pragma so that the compiler generates string function calls instead of inline string intrinsics yields only slightly different numbers.

RATE C string calls 289.4 MB/second (1000.0 MB in 3.5 seconds)
RATE ZString1 string calls 145.8 MB/second (1000.0 MB in 6.9 seconds)
RATE ZString2 string calls 158.0 MB/second (1000.0 MB in 6.3 seconds)
RATE C++ string calls 67.4 MB/second (1000.0 MB in 14.8 seconds)

Tried running the same code through GNU C++ on Linux.

RATE C string calls 304.0 MB/second (1000.0 MB in 3.3 seconds)
RATE ZString1 string calls 170.1 MB/second (1000.0 MB in 5.9 seconds)
RATE ZString2 string calls 187.6 MB/second (1000.0 MB in 5.3 seconds)
RATE C++ string calls 81.7 MB/second (1000.0 MB in 12.2 seconds)

The relative results are pretty much in line with the Windows results (note the Linux box has a slightly faster CPU).

Tried the same code with Cygwin GNU C++ under Windows.

RATE C string calls 240.0 MB/second (1000.0 MB in 4.2 seconds)
RATE ZString1 string calls 72.0 MB/second (1000.0 MB in 13.9 seconds)
RATE ZString2 string calls 61.0 MB/second (1000.0 MB in 16.4 seconds)
RATE C++ string calls 4.9 MB/second (1000.0 MB in 202.9 seconds)

Yikes! Something is radically less efficient! The relative performance story is still the same.

Note that I am not claiming this micro-benchmark tells the performance story for every possible application using strings. Rather the intent here is to give a starting point. Also there is no intent here to handle anything other than single-byte character strings.

Note also that I am not trying to reproduce here every function offered by the standard C++ string classes. Rather I start with something like the ZString class as a base and add functions as-needed for the individual application. (I have a preference for thin/minimalistic classes).

The conclusion that I draw from all this is that my original string class does indeed still greatly out-perform the standard C++ string class (at least for the sort of usage I see most often). The string class can be slightly improved by disabling the string intrinsics, or by instead using equivalent inlined C++ code. In fact this micro-benchmark likely understates the performance advantage, if your application performs relatively more string allocations and relatively fewer string copy operations.
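For anyone checking the figures, the arithmetic behind each RATE line can be sketched as follows. The function names are hypothetical; the formulas mirror do_total and report_times in the test program below, where each iteration writes strings totalling 4 × nLength characters and the test strings are 25 characters long.

```cpp
#include <assert.h>
#include <time.h>

// Megabytes written by one timed run: nLoop iterations, each building
// four strings whose combined length is 4 * nLength characters.
inline double total_megabytes(int nLoop, int nLength)
{
    return ((double) nLoop * 4 * nLength) / 1000000;
}

// MB/second from elapsed clock() ticks, computed the way report_times does.
inline double rate_mb_per_second(double megabytes, long ticks)
{
    double seconds = (double) ticks / CLOCKS_PER_SEC;
    return megabytes / seconds;
}
```

With 10,000,000 iterations and 25-character strings this gives the 1000.0 MB shown in every line, so the rate is simply 1000 divided by the elapsed seconds.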

ZString.h

#ifndef __ZSTRING_H__
#define __ZSTRING_H__

//
//  Simple string class.
//

class ZString
{

protected:
    char* sBuffer;
    int cbBufferMax;

public:
    ZString() { upsizeTo(1); }
    ~ZString() { recycle(); }

protected:
    void recycle();
    void upsizeTo(int);
    void sizeTo(int n) {
        if (cbBufferMax <= n) {
            upsizeTo(n);
        }
    }

public:
    operator const char*()          { return sBuffer; }
    char* getBuffer()               { return sBuffer; }
    char* getBuffer(int n)          { sizeTo(n); return sBuffer; }
    int getBufferSize()             { return cbBufferMax; }

};

#endif

ZString1.h

#ifndef __ZSTRING1_H__
#define __ZSTRING1_H__

#include "ZString.h"

//
//  Simple string class.
//

class ZString1 : public ZString
{

public:
    void operator=(const char* s) {
        strcpy(s);
    }

public:
    static int strlen(const char* s) {
        return ::strlen(s);
    }
    int strlen() {
        return strlen(sBuffer);
    }
    void strcpy(const char* s) {
        sizeTo(strlen(s));
        ::strcpy(sBuffer,s);
    }
    void strcpy(const char* s,int n) {
        sizeTo(n);
        ::strncpy(sBuffer,s,n);
        sBuffer[n] = 0;
    }
    void strcat(const char* s) {
        int n1 = strlen();
        int n2 = strlen(s);
        sizeTo(n1 + n2);
        ::strcpy(sBuffer+n1,s);
    }
    void strcat(const char* s,int n) {
        int n1 = strlen();
        sizeTo(n1 + n);
        ::strncpy(sBuffer+n1,s,n);
        sBuffer[n1+n] = 0;
    }
    void strlwr() {
        ::strlwr(sBuffer);
    }
    void strupr() {
        ::strupr(sBuffer);
    }

};

#endif

ZString2.h

#ifndef __ZSTRING2_H__
#define __ZSTRING2_H__

#include "ZString.h"

//
//  Simple string class.
//

class ZString2 : public ZString
{

public:
    void operator=(const char* s) {
        strcpy(s);
    }

public:
    static int strlen(const char* s) {
        int n = 0;
        while (*s++) ++n;
        return n;
    }
    int strlen() {
        return strlen(sBuffer);
    }
    void strcpy(const char* s) {
        sizeTo(strlen(s));
        char* p = sBuffer;
        while (*p++ = *s++);
    }
    void strcpy(const char* s,int n) {
        int n2 = strlen(s);
        if (n2 < n) n = n2;
        sizeTo(n);
        char* p = sBuffer;
        int i = 0;
        while (i < n) { p[i] = s[i]; ++i; }
        p[i] = 0;
    }
    void strcat(const char* s) {
        int n1 = strlen();
        int n2 = strlen(s);
        sizeTo(n1 + n2);
        char* p = sBuffer + n1;
        while (*p++ = *s++);
    }
    void strcat(const char* s,int n) {
        int n1 = strlen();
        int n2 = strlen(s);
        if (n < n2) n2 = n;
        sizeTo(n1 + n2);
        char* p = sBuffer + n1;
        int i = 0;
        while (i < n2) { p[i] = s[i]; ++i; }
        p[i] = 0;   // terminate; the loop above does not copy the NUL
    }
    void strlwr() {
        ::strlwr(sBuffer);
    }
    void strupr() {
        ::strupr(sBuffer);
    }

};

#endif

ZString.cpp

#include <stdlib.h>
#include <string.h>
#include "ZString.h"

//
//  Out-of-line string methods and free list maintenance.
//

enum { STRING_BUFFER_SIZE = 256 };
static void* g_pFreeList;

void ZString::recycle()
{
    if (STRING_BUFFER_SIZE != cbBufferMax) {
        delete[] sBuffer;
        sBuffer = 0;
        return;
    }
    *((void**)sBuffer) = g_pFreeList;
    g_pFreeList = sBuffer;
    sBuffer = 0;
    cbBufferMax = 0;
}

void ZString::upsizeTo(int n)
{
    // Allocate oversize strings.
    if (STRING_BUFFER_SIZE <= n) {
        // round up to a quantum
        n = ((n + STRING_BUFFER_SIZE + 1) / STRING_BUFFER_SIZE) * STRING_BUFFER_SIZE;
        char* p = new char[n];
        ::strcpy(p,sBuffer);
        recycle();
        sBuffer = p;
        cbBufferMax = n;
        return;
    }
    cbBufferMax = STRING_BUFFER_SIZE;
    // Grab a buffer from the free list if present (the usual case).
    if (g_pFreeList) {
        sBuffer = (char*) g_pFreeList;
        g_pFreeList = *((void**)g_pFreeList);
    } else {
        // Allocate a stock-sized buffer.
        sBuffer = new char[STRING_BUFFER_SIZE];
    }
    *sBuffer = 0;
}

string.cpp

#include <stdlib.h>
#include <stdio.h>
#include <time.h>
#include <string.h>

// Optimization that seems to work best (at least with an Athlon).
//#pragma function(strlen,strcpy,strcat)

#include "ZString1.h"
#include "ZString2.h"
#include <string>

// test strings
static const char sOut1[] = "abcdefghjklmnopqrstuvwxyz";
static const char sOut2[] = "ABCDEFGHJKLMNOPQRSTUVWXYZ";
int nLength = ::strlen(sOut1);

void report_times(const char* s,int dt,int cb)
{
    //printf("TIME %s %d ticks for %d characters written\n",s,dt,cb);
    double ts = (double)dt / CLOCKS_PER_SEC;
    double mb = (double)cb / 1000000;
    double rate = mb / ts;
    printf("RATE %s %0.1f MB/second (%0.1f MB in %0.1f seconds)\n",s,rate,mb,ts);
}

//
//  Perform a function for a fixed number of iterations and return time.
//

const int nLoop = 10000000;
int dtLoop = 0;

typedef void (*doit)(const char*,const char*);

int time_function(doit fn)
{
    clock_t t0 = ::clock();
    for (int i=0; i<nLoop; ++i) {
        const char* s1 = sOut1 + (15 & i);
        const char* s2 = sOut2 + nLength - (15 & i);
        (*fn)(s1,s2);
    }
    return (int)(::clock() - t0) - dtLoop;
}

//
//  Time C string operations.
//

int nTotal = 0;

void do_total(const char* s1,const char* s2)
{
    nTotal += 4 * nLength;
}

void do_c_string_calls(const char* s1,const char* s2)
{
    char sWork[256];
    ::strcpy(sWork,s1);
    ::strcat(sWork,s2);
    ::strcpy(sWork,s2);
    ::strcat(sWork,s1);
    ::strcpy(sWork,s1);
    ::strcat(sWork,s1);
    ::strcpy(sWork,s2);
    ::strcat(sWork,s2);
}

void do_z1_string_calls(const char* s1,const char* s2)
{
    ZString1 sWork;
    sWork.strcpy(s1);
    sWork.strcat(s2);
    sWork.strcpy(s2);
    sWork.strcat(s1);
    sWork.strcpy(s1);
    sWork.strcat(s1);
    sWork.strcpy(s2);
    sWork.strcat(s2);
}

void do_z2_string_calls(const char* s1,const char* s2)
{
    ZString2 sWork;
    sWork.strcpy(s1);
    sWork.strcat(s2);
    sWork.strcpy(s2);
    sWork.strcat(s1);
    sWork.strcpy(s1);
    sWork.strcat(s1);
    sWork.strcpy(s2);
    sWork.strcat(s2);
}

void do_cpp_string_calls(const char* s1,const char* s2)
{
    std::string sWork;
    sWork.assign(s1);
    sWork.append(s2);
    sWork.assign(s2);
    sWork.append(s1);
    sWork.assign(s1);
    sWork.append(s1);
    sWork.assign(s2);
    sWork.append(s2);
}

int main(int ac,char** av)
{
    dtLoop = time_function(do_total);
    report_times("C string calls",time_function(do_c_string_calls),nTotal);
    report_times("ZString1 string calls",time_function(do_z1_string_calls),nTotal);
    report_times("ZString2 string calls",time_function(do_z2_string_calls),nTotal);
    report_times("C++ string calls",time_function(do_cpp_string_calls),nTotal);
    return 0;
}

2005.01.28

Tax-time and Staples “Easy” Rebates

Filed under: Web — Preston @ 1:45 pm

I am impressed – Staples Easy Rebates is a web application from a retail outfit that works and is not overly flashy. Not to mention a really good idea.

It is tax-time again and I needed to buy tax software. Pretty much everyone is offering mail-in rebates.

I hate mail-in rebates. The one time I tried claiming a mail-in rebate, it was too much of a hassle to be worth the partial amount that came back. Staples makes rebates really “easy”. Where you see their rebate button on the site, you can skip the “mail-in” part and claim your rebate over the web.

Not only that – the web application they use to process the rebates is quite well done. The web interface is easy to use. They send an immediate follow-up email, and a progress notice the next day. None of this is exotic – just seldom done as well.

I will bet the suppliers are a lot less enthused. They make money when rebates are seldom claimed, so “easy” rebate processing means more money going back to customers.

Update 2/21/2005 — the rebates showed up in today’s mail. Guess this really works :) .


Once I’d sent in my taxes, Quicken made it easy to take a look at where my money has gone in the past year. Pretty straightforward story – I pay taxes, own a house (in over-priced southern California), and am divorced with three kids. That accounts for pretty much everything :( .

2005.01.25

No Bugs?

Filed under: Personal — Preston @ 7:56 pm


One of the critters starting to make a comeback after the rains.

Spent the day (and likely the remainder of the night) getting a prototype web application working. Well, it worked already in the same sense as a car works when its motor is running in the garage. Want to get this application out on the road…

( Speaking of which – time to run out to Trader Joe’s for supplies. Have to have cream for the coffee :) )

2005.01.24

The American SS?

Filed under: Politics — Preston @ 1:40 am

From McCain expects inquiry on new espionage unit

Defense Secretary Donald H. Rumsfeld has created a new espionage unit called the Strategic Support Branch, according to the news report, but McCain, speaking on CBS’s “Face the Nation” said he doubts Rumsfeld has broken any laws.

The “Strategic Support” branch? So our government now has a semi-secret unit with the initials “SS”. Ouch.

2005.01.23

Speed versus complexity in User Interface interaction

Filed under: Software — Preston @ 9:53 pm

Reflecting on user interface design and programming, there is a cluster of notions I would like to get across.

When writing a GUI framework to run on the original 4.77 MHz Intel 8088 based PCs, I found performance unsatisfactory after writing the initial set of primitives. I am a big believer in making the lower level framework very fast (then and now), so that the higher level code can be less concerned about efficiency. This was not an attainable goal with generalized graphics operations on the 8088 (too many clock counts for shift operations).

So I flipped the problem around. Instead of generalized graphics I came at the problem differently. Graphics operations are at base moving bits from one location to another. The 8088 string instructions were the fastest means of moving bits around. By choosing the right constraints (mainly byte-alignment in source and destination) I could write graphic operations sufficient for a GUI framework (at the time), and deliver very high performance levels.
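A rough sketch of what the byte-alignment constraint buys. This is not the original framework code: the function name, the one-bit-per-pixel layout and the parameters are all assumptions for illustration, and memcpy stands in for the 8088 string move instructions.

```cpp
#include <assert.h>
#include <string.h>

// Copy a rectangle between two bitmaps (assumed 1 bit per pixel).
// Constraint: x positions and width are given in whole bytes (8 pixels),
// so each row is a plain byte copy and no bit shifting is needed.
inline void blit_byte_aligned(unsigned char* dst, int dstStride, int dstXByte, int dstY,
                              const unsigned char* src, int srcStride, int srcXByte, int srcY,
                              int widthBytes, int height)
{
    for (int row = 0; row < height; ++row) {
        const unsigned char* s = src + (srcY + row) * srcStride + srcXByte;
        unsigned char*       d = dst + (dstY + row) * dstStride + dstXByte;
        ::memcpy(d, s, widthBytes);   // stands in for the string move instruction
    }
}
```

A general blit would have to shift and mask every byte whenever source and destination fall at different bit offsets. Restricting positions to byte boundaries removes that work entirely.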

It was nearly ten years later before Windows applications could count on similar performance levels (though of course the operations supported were far more general).

At about the same time I found an early shareware editor called Dewar Advanced Screen Editor (DAED). The interesting thing about using this editor was not the features offered (basic even for the time) but rather the fluid, fast, predictable response time. I had turned my key-repeat rate up to much higher than standard levels (to something like 100 repeats/second through a third-party TSR), and because of the fast response of the editor, could simply zip from one location to another using simple cursor movement commands. Because the simple movement commands were (very!) fast and predictable, there was much less need for the more complex movement commands. When moving vertically you could judge where you were in a file simply by watching the text slide smoothly past.

In contrast, when later programming on Unix I became accustomed to using the complex movement commands available in Emacs, as this was the fastest way to move around in a file when the screen was painted by characters squirted across a serial line. Complex movement commands make more sense when display and response time is relatively slow. Note also on time-shared, multi-tasked systems the response time to simple movement commands is unpredictable. On a single-task unshared system you can learn to judge the distance per second moved through a file by holding down a movement key. This sort of learning can only take place when the response is immediate, predictable, and you can watch the shape of the text flowing past (fluid movement with no flicker).

We have an entire generation of programmers and designers who – outside of video games – are accustomed to relatively slow (certainly not fluid) response from their user interfaces. This lack of experience means the notion does not occur to most of the folks as even a possibility.

There is something fundamentally different about a user interface with fast, fluid response.

Note that Windows does not let you set key repeat rates above the standard 31 repeats/second. You cannot explore notions that are impossible within the common framework (Windows).

Not sure the interrelationships in the above are at all clear…
