A minimal library for handling C-style strings.


§1. Programs using Foundation store text in text_stream structures almost all of the time, but old-style, null-terminated char * array strings are still occasionally needed.

We need to handle C strings long enough to contain any plausible filename, and any run of a dozen or so lines of code; but we have no real need to handle strings of unlimited length, nor to be parsimonious with memory.

The following defines a type for a string long enough for our purposes. It should be at least as long as the constant sometimes called PATH_MAX, the maximum length of a pathname, which is 1024 on Mac OS X.

define MAX_STRING_LENGTH 8*1024
typedef char string[MAX_STRING_LENGTH+1];

§2. Occasionally we need access to the real, unbounded strlen:

int CStrings::strlen_unbounded(const char *p) {
    return (int) strlen(p);
}

§3. Any out-of-range access immediately halts the program; this is drastic, but an attempt to continue execution after a string overflow might conceivably result in a malformatted shell command being passed to the operating system, which we cannot risk.

int CStrings::check_len(int n) {
    if ((n > MAX_STRING_LENGTH) || (n < 0)) Errors::fatal("String overflow\n");
    return n;
}

§4. The following is then protected from reading out of range if given a non-terminated string, though this should never actually happen.

int CStrings::len(char *str) {
    for (int i=0; i<=MAX_STRING_LENGTH; i++)
        if (str[i] == 0) return i;
    str[MAX_STRING_LENGTH] = 0;
    return MAX_STRING_LENGTH;
}

§5. We then have a replacement for strcpy, identical except that it's bounds-checked:

void CStrings::copy(char *to, char *from) {
    CStrings::check_len(CStrings::len(from));
    int i;
    for (i=0; ((from[i]) && (i < MAX_STRING_LENGTH)); i++) to[i] = from[i];
    to[i] = 0;
}

§6. String comparisons will be done with the following, not strcmp directly:

int CStrings::eq(char *A, char *B) {
    return (CStrings::cmp(A, B) == 0)?TRUE:FALSE;
}

int CStrings::ne(char *A, char *B) {
    return (CStrings::cmp(A, B) == 0)?FALSE:TRUE;
}

§7. On the rare occasions when we need to sort alphabetically we'll also call:

int CStrings::cmp(char *A, char *B) {
    if ((A == NULL) || (A[0] == 0)) {
        if ((B == NULL) || (B[0] == 0)) return 0;
        return -1;
    }
    if ((B == NULL) || (B[0] == 0)) return 1;
    return strcmp(A, B);
}

§8. And the following is needed to deal with extension filenames on platforms whose locale is encoded as UTF-8.

void CStrings::transcode_ISO_string_to_UTF8(char *p, char *dest) {
    int i, j;
    for (i=0, j=0; p[i]; i++) {
        int charcode = (int) (((unsigned char *)p)[i]);
        if (charcode >= 128) {
            dest[j++] = (char) (0xC0 + (charcode >> 6));
            dest[j++] = (char) (0x80 + (charcode & 0x3f));
        } else {
            dest[j++] = p[i];
        }
    }
    dest[j] = 0;
}

§9. I dislike to use strncpy because, and for some reason this surprises me every time, it truncates but fails to write a null termination character if the string to be copied is larger than the buffer to write to: the result is therefore not a well-formed string and we have to fix matters by hand. This I think makes for opaque code. So:

void CStrings::truncated_strcpy(char *to, char *from, int max) {
    int i;
    for (i=0; ((from[i]) && (i<max-1)); i++) to[i] = from[i];
    to[i] = 0;
}

§10. Text storage. The following is convenient for parking a read-only string of text somewhere safe in memory. Since the length can't be extended, it's usually unsafe to write to the result. (Inform tools very seldom use this, because C strings are almost always best avoided.)

typedef struct string_storage_area {
    char *storage_at;
    int capacity;
    CLASS_DEFINITION
} string_storage_area;

§11.

char *CStrings::park_string(char *from) {
    string_storage_area *ssa = CREATE(string_storage_area);
    ssa->capacity = (int) CStrings::strlen_unbounded(from) + 1;
    ssa->storage_at = Memory::malloc(ssa->capacity, STRING_STORAGE_MREASON);
    strcpy(ssa->storage_at, from);
    return ssa->storage_at;
}

§12. And here we free any SSAs needed in the course of the run.

void CStrings::free_ssas(void) {
    string_storage_area *ssa;
    LOOP_OVER(ssa, string_storage_area)
        Memory::I7_free(ssa->storage_at, STRING_STORAGE_MREASON, ssa->capacity);
}