To characterise the relevant differences in behaviour between the various programming languages supported.


§1. Introduction. The conventions for writing, weaving and tangling a web are really quite independent of the programming language being written, woven or tangled; Knuth began literate programming with Pascal, but now uses C, and the original Pascal webs were mechanically translated into C ones with remarkably little fuss or bother. Modern LP tools, such as noweb, aim to be language-agnostic. But of course if you act the same on all languages, you give up the benefits which might follow from knowing something about the languages you actually write in.

The idea, then, is that Chapters 1 to 3 of the Inweb code treat all material the same, and Chapter 4 contains all of the funny little exceptions and special cases for particular programming languages. (This means Chapter 4 can't be understood without having at least browsed Chapters 1 to 3 first.)

Really all of the functionality of languages is provided through method calls, all of them made from this section. That means a lot of simple wrapper routines which don't do very much. This section may still be useful to read, since it documents what amounts to an API.
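
The wrapper pattern used throughout this section can be sketched in plain C. This is only an illustration of the idea, assuming a simple table of function pointers: the real VOID_METHOD_CALL and INT_METHOD_CALL macros are defined elsewhere and are not reproduced here, and every name beginning sketch_ or SKETCH_ is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical method IDs, standing in for the *_MTID enumerations. */
enum { SKETCH_SUPPRESS_DISCLAIMER, SKETCH_NO_OF_METHODS };

/* A hypothetical language: a name plus a table of optional methods. */
typedef struct sketch_language sketch_language;
typedef int (*sketch_int_method)(sketch_language *pl);
struct sketch_language {
    const char *name;
    sketch_int_method methods[SKETCH_NO_OF_METHODS]; /* NULL means "not provided" */
};

/* Call the method if the language provides it; otherwise keep the default. */
int sketch_int_call(sketch_language *pl, int id, int default_rv) {
    assert(pl != NULL && id >= 0 && id < SKETCH_NO_OF_METHODS);
    if (pl->methods[id] == NULL) return default_rv;
    return pl->methods[id](pl);
}

/* A wrapper in the style of this section: the default answer is 0 (FALSE). */
int sketch_suppress_disclaimer(sketch_language *pl) {
    return sketch_int_call(pl, SKETCH_SUPPRESS_DISCLAIMER, 0);
}

/* One language provides the method, the other does not. */
static int always_true(sketch_language *pl) { (void) pl; return 1; }
sketch_language sketch_plain = { "Plain", { NULL } };
sketch_language sketch_quiet = { "Quiet", { always_true } };
```

Calling sketch_suppress_disclaimer on sketch_plain falls back to the default, while sketch_quiet answers for itself; that is the sense in which a language which provides no methods gets the default behaviour.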

§2. Parsing methods. We begin with parsing extensions. When these are used, we have already read the web into chapters, sections and paragraphs, but for some languages we will need a more detailed picture.

PARSE_TYPES_PAR_MTID gives a language the opportunity to look for type declarations.

enum PARSE_TYPES_PAR_MTID
VOID_METHOD_TYPE(PARSE_TYPES_PAR_MTID, programming_language *pl, web *W)
void LanguageMethods::parse_types(web *W, programming_language *pl) {
    VOID_METHOD_CALL(pl, PARSE_TYPES_PAR_MTID, W);
}

§3. PARSE_FUNCTIONS_PAR_MTID is, similarly, for function declarations.

enum PARSE_FUNCTIONS_PAR_MTID
VOID_METHOD_TYPE(PARSE_FUNCTIONS_PAR_MTID, programming_language *pl, web *W)
void LanguageMethods::parse_functions(web *W, programming_language *pl) {
    VOID_METHOD_CALL(pl, PARSE_FUNCTIONS_PAR_MTID, W);
}

§4. FURTHER_PARSING_PAR_MTID is "further" in that it is called when the main parser has finished work; it typically looks over the whole web for something of interest.

enum FURTHER_PARSING_PAR_MTID
VOID_METHOD_TYPE(FURTHER_PARSING_PAR_MTID, programming_language *pl, web *W)
void LanguageMethods::further_parsing(web *W, programming_language *pl) {
    VOID_METHOD_CALL(pl, FURTHER_PARSING_PAR_MTID, W);
}

§5. SUBCATEGORISE_LINE_PAR_MTID looks at a single line, after the main parser has given it a category. The idea is not so much to second-guess the parser (although we can) but to change to a more exotic category which it would otherwise never produce.

enum SUBCATEGORISE_LINE_PAR_MTID
VOID_METHOD_TYPE(SUBCATEGORISE_LINE_PAR_MTID, programming_language *pl, source_line *L)
void LanguageMethods::subcategorise_line(programming_language *pl, source_line *L) {
    VOID_METHOD_CALL(pl, SUBCATEGORISE_LINE_PAR_MTID, L);
}

§6. Comments have different syntax in different languages. The method here is expected to look for a comment on the line and, if it finds one, to return TRUE, but not before copying the part of the line before the comment, and the text within the comment, into the supplied strings.

enum PARSE_COMMENT_TAN_MTID
INT_METHOD_TYPE(PARSE_COMMENT_TAN_MTID, programming_language *pl, text_stream *line, text_stream *before, text_stream *within)

int LanguageMethods::parse_comment(programming_language *pl,
    text_stream *line, text_stream *before, text_stream *within) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, PARSE_COMMENT_TAN_MTID, line, before, within);
    return rv;
}
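
As a rough illustration of what such a method might do for C-style comments, here is a sketch using plain C strings rather than text_stream. The function name and its buffer-size argument are hypothetical, and a real implementation may well trim white space or handle other comment forms.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a PARSE_COMMENT_TAN_MTID method. Looks for a
   comment opened by slash-star and closed by star-slash. On success, copies
   the text preceding the comment into before, copies the comment's contents
   into within, and returns 1; otherwise returns 0 and writes nothing. */
int sketch_parse_c_comment(const char *line, char *before, char *within, size_t cap) {
    assert(line != NULL && before != NULL && within != NULL && cap > 0);
    const char *open = strstr(line, "/*");
    if (open == NULL) return 0;
    const char *close = strstr(open + 2, "*/");
    if (close == NULL) return 0;          /* unterminated: treat as no comment */
    size_t b = (size_t) (open - line);
    size_t w = (size_t) (close - (open + 2));
    if (b >= cap || w >= cap) return 0;   /* would not fit: decline */
    memcpy(before, line, b); before[b] = '\0';
    memcpy(within, open + 2, w); within[w] = '\0';
    return 1;
}
```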

§7. Tangling methods. We take these roughly in order of their effects on the tangled output, from the top to the bottom of the file.

The top of the tangled file is a header called the "shebang". By default, there's nothing there, but SHEBANG_TAN_MTID allows the language to add one. For example, Perl prints #!/usr/bin/perl here.

enum SHEBANG_TAN_MTID
VOID_METHOD_TYPE(SHEBANG_TAN_MTID, programming_language *pl, text_stream *OUT, web *W, tangle_target *target)
void LanguageMethods::shebang(OUTPUT_STREAM, programming_language *pl, web *W, tangle_target *target) {
    VOID_METHOD_CALL(pl, SHEBANG_TAN_MTID, OUT, W, target);
}

§8. Next is the disclaimer, text warning the human reader that she is looking at tangled (therefore not original) material.

enum SUPPRESS_DISCLAIMER_TAN_MTID
INT_METHOD_TYPE(SUPPRESS_DISCLAIMER_TAN_MTID, programming_language *pl)
void LanguageMethods::disclaimer(text_stream *OUT, programming_language *pl, web *W, tangle_target *target) {
    int rv = FALSE;
    INT_METHOD_CALL_WITHOUT_ARGUMENTS(rv, pl, SUPPRESS_DISCLAIMER_TAN_MTID);
    if (rv == FALSE)
        LanguageMethods::comment(OUT, pl, I"Tangled output generated by inweb: do not edit");
}

§9. Next is any additional early matter which the language wants to place near the top of the tangled file, after the disclaimer.

enum ADDITIONAL_EARLY_MATTER_TAN_MTID
VOID_METHOD_TYPE(ADDITIONAL_EARLY_MATTER_TAN_MTID, programming_language *pl, text_stream *OUT, web *W, tangle_target *target)
void LanguageMethods::additional_early_matter(text_stream *OUT, programming_language *pl, web *W, tangle_target *target) {
    VOID_METHOD_CALL(pl, ADDITIONAL_EARLY_MATTER_TAN_MTID, OUT, W, target);
}

§10. A tangled file then normally declares "definitions". The following write a definition of the constant named term as the value given. If the value spans multiple lines, the first-line part is supplied to START_DEFN_TAN_MTID and then subsequent lines are fed in order to PROLONG_DEFN_TAN_MTID. At the end, END_DEFN_TAN_MTID is called.

enum START_DEFN_TAN_MTID
enum PROLONG_DEFN_TAN_MTID
enum END_DEFN_TAN_MTID
INT_METHOD_TYPE(START_DEFN_TAN_MTID, programming_language *pl, text_stream *OUT, text_stream *term, text_stream *start, section *S, source_line *L)
INT_METHOD_TYPE(PROLONG_DEFN_TAN_MTID, programming_language *pl, text_stream *OUT, text_stream *more, section *S, source_line *L)
INT_METHOD_TYPE(END_DEFN_TAN_MTID, programming_language *pl, text_stream *OUT, section *S, source_line *L)

void LanguageMethods::start_definition(OUTPUT_STREAM, programming_language *pl,
    text_stream *term, text_stream *start, section *S, source_line *L) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, START_DEFN_TAN_MTID, OUT, term, start, S, L);
    if (rv == FALSE)
        Main::error_in_web(I"this programming language does not support @d", L);
}

void LanguageMethods::prolong_definition(OUTPUT_STREAM, programming_language *pl,
    text_stream *more, section *S, source_line *L) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, PROLONG_DEFN_TAN_MTID, OUT, more, S, L);
    if (rv == FALSE)
        Main::error_in_web(I"this programming language does not support multiline @d", L);
}

void LanguageMethods::end_definition(OUTPUT_STREAM, programming_language *pl,
    section *S, source_line *L) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, END_DEFN_TAN_MTID, OUT, S, L);
}
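
To make the start, prolong and end sequence concrete, here is a sketch of how a C-like language might respond, joining the lines of a multiline definition with backslash continuations. It assumes a small fixed-size buffer in place of text_stream; the sketch_ names are hypothetical, and the exact spacing a real language emits may differ.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical output buffer, standing in for a text_stream. */
typedef struct { char text[256]; size_t len; } sketch_out;

static void sketch_write(sketch_out *out, const char *s) {
    size_t n = strlen(s);
    assert(out->len + n < sizeof out->text);
    memcpy(out->text + out->len, s, n + 1);   /* n + 1 copies the terminator too */
    out->len += n;
}

/* Each function returns 1 to say "handled", as the INT methods above do. */
int sketch_start_definition(sketch_out *out, const char *term, const char *start) {
    sketch_write(out, "#define ");
    sketch_write(out, term);
    sketch_write(out, " ");
    sketch_write(out, start);
    return 1;
}
int sketch_prolong_definition(sketch_out *out, const char *more) {
    sketch_write(out, " \\\n    ");   /* a backslash, then a newline and indent */
    sketch_write(out, more);
    return 1;
}
int sketch_end_definition(sketch_out *out) {
    sketch_write(out, "\n");
    return 1;
}
```

Starting a definition of MAX(a, b) with the text ((a) > (b) ?, prolonging it with (a) : (b)), and then ending it leaves a two-line #define in the buffer, the first line ending in a backslash.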

§11. Then we have some "predeclarations"; for example, for C-like languages we automatically predeclare all functions, obviating the need for header files.

enum ADDITIONAL_PREDECLARATIONS_TAN_MTID
INT_METHOD_TYPE(ADDITIONAL_PREDECLARATIONS_TAN_MTID, programming_language *pl, text_stream *OUT, web *W)
void LanguageMethods::additional_predeclarations(OUTPUT_STREAM, programming_language *pl, web *W) {
    VOID_METHOD_CALL(pl, ADDITIONAL_PREDECLARATIONS_TAN_MTID, OUT, W);
}

§12. So much for the special material at the top of a tangle: now we're into the more routine matter, tangling ordinary paragraphs into code.

Languages have the ability to suppress paragraph macro expansion:

enum SUPPRESS_EXPANSION_TAN_MTID
INT_METHOD_TYPE(SUPPRESS_EXPANSION_TAN_MTID, programming_language *pl, text_stream *material)
int LanguageMethods::allow_expansion(programming_language *pl, text_stream *material) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, SUPPRESS_EXPANSION_TAN_MTID, material);
    return (rv)?FALSE:TRUE;
}

§13. Inweb supports very few "tangle commands", that is, instructions written inside double squares [[Thus]]. These can be handled by attaching methods as follows, which return TRUE if they recognised and acted on the command.

enum TANGLE_COMMAND_TAN_MTID
INT_METHOD_TYPE(TANGLE_COMMAND_TAN_MTID, programming_language *pl, text_stream *OUT, text_stream *data)

int LanguageMethods::special_tangle_command(OUTPUT_STREAM, programming_language *pl, text_stream *data) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, TANGLE_COMMAND_TAN_MTID, OUT, data);
    return rv;
}

§14. The following methods make it possible for languages to tangle unorthodox lines into code. Ordinarily, only CODE_BODY_LCAT lines are tangled, but we can intervene to say that we want to tangle a different line; and if we do so, we should then act on that basis.

enum WILL_TANGLE_EXTRA_LINE_TAN_MTID
enum TANGLE_EXTRA_LINE_TAN_MTID
INT_METHOD_TYPE(WILL_TANGLE_EXTRA_LINE_TAN_MTID, programming_language *pl, source_line *L)
VOID_METHOD_TYPE(TANGLE_EXTRA_LINE_TAN_MTID, programming_language *pl, text_stream *OUT, source_line *L)
int LanguageMethods::will_insert_in_tangle(programming_language *pl, source_line *L) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, WILL_TANGLE_EXTRA_LINE_TAN_MTID, L);
    return rv;
}
void LanguageMethods::insert_in_tangle(OUTPUT_STREAM, programming_language *pl, source_line *L) {
    VOID_METHOD_CALL(pl, TANGLE_EXTRA_LINE_TAN_MTID, OUT, L);
}

§15. In order for C compilers to report C syntax errors on the correct line, despite rearranging by automatic tools, C conventionally recognises the preprocessor directive #line to tell it that a contiguous extract follows from the given file; we generate this automatically.

enum INSERT_LINE_MARKER_TAN_MTID
VOID_METHOD_TYPE(INSERT_LINE_MARKER_TAN_MTID, programming_language *pl, text_stream *OUT, source_line *L)
void LanguageMethods::insert_line_marker(OUTPUT_STREAM, programming_language *pl, source_line *L) {
    VOID_METHOD_CALL(pl, INSERT_LINE_MARKER_TAN_MTID, OUT, L);
}
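
A minimal sketch of such a line marker follows, assuming the line number and filename have already been extracted from the source line; the function name and its arguments are hypothetical. Filenames containing quotation marks or backslashes would need escaping, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an INSERT_LINE_MARKER_TAN_MTID method: writes a
   #line directive into buf and returns the number of characters written
   (as snprintf reports it). */
int sketch_line_marker(char *buf, size_t cap, int line_number, const char *filename) {
    assert(buf != NULL && filename != NULL && line_number > 0);
    return snprintf(buf, cap, "#line %d \"%s\"\n", line_number, filename);
}
```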

§16. The following hooks are provided so that we can top and/or tail the expansion of paragraph macros in the code. For example, C-like languages use this to splice { and } around the expanded matter.

enum BEFORE_MACRO_EXPANSION_TAN_MTID
enum AFTER_MACRO_EXPANSION_TAN_MTID
VOID_METHOD_TYPE(BEFORE_MACRO_EXPANSION_TAN_MTID, programming_language *pl, text_stream *OUT, para_macro *pmac)
VOID_METHOD_TYPE(AFTER_MACRO_EXPANSION_TAN_MTID, programming_language *pl, text_stream *OUT, para_macro *pmac)
void LanguageMethods::before_macro_expansion(OUTPUT_STREAM, programming_language *pl, para_macro *pmac) {
    VOID_METHOD_CALL(pl, BEFORE_MACRO_EXPANSION_TAN_MTID, OUT, pmac);
}
void LanguageMethods::after_macro_expansion(OUTPUT_STREAM, programming_language *pl, para_macro *pmac) {
    VOID_METHOD_CALL(pl, AFTER_MACRO_EXPANSION_TAN_MTID, OUT, pmac);
}

§17. It's a sad necessity, but sometimes we have to unconditionally tangle code for a preprocessor to conditionally read: that is, to tangle code which contains #ifdef or a similar preprocessor directive.

enum OPEN_IFDEF_TAN_MTID
enum CLOSE_IFDEF_TAN_MTID
VOID_METHOD_TYPE(OPEN_IFDEF_TAN_MTID, programming_language *pl, text_stream *OUT, text_stream *symbol, int sense)
VOID_METHOD_TYPE(CLOSE_IFDEF_TAN_MTID, programming_language *pl, text_stream *OUT, text_stream *symbol, int sense)
void LanguageMethods::open_ifdef(OUTPUT_STREAM, programming_language *pl, text_stream *symbol, int sense) {
    VOID_METHOD_CALL(pl, OPEN_IFDEF_TAN_MTID, OUT, symbol, sense);
}
void LanguageMethods::close_ifdef(OUTPUT_STREAM, programming_language *pl, text_stream *symbol, int sense) {
    VOID_METHOD_CALL(pl, CLOSE_IFDEF_TAN_MTID, OUT, symbol, sense);
}
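
The following sketch shows one plausible reading of these two methods for a C-like language. It assumes, since the text above does not say, that a true sense means "if the symbol is defined" and a false sense means "if it is not"; the sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for OPEN_IFDEF_TAN_MTID. Assumption: sense nonzero
   selects #ifdef, and sense zero selects #ifndef. */
int sketch_open_ifdef(char *buf, size_t cap, const char *symbol, int sense) {
    assert(buf != NULL && symbol != NULL);
    return snprintf(buf, cap, "%s %s\n", sense ? "#ifdef" : "#ifndef", symbol);
}

/* Hypothetical stand-in for CLOSE_IFDEF_TAN_MTID: either sense closes the
   same way, so only the symbol is needed, and only for the reader's benefit. */
int sketch_close_ifdef(char *buf, size_t cap, const char *symbol) {
    assert(buf != NULL && symbol != NULL);
    return snprintf(buf, cap, "#endif /* %s */\n", symbol);
}
```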

§18. Now a routine to tangle a comment. Languages without comments should write nothing.

enum COMMENT_TAN_MTID
VOID_METHOD_TYPE(COMMENT_TAN_MTID, programming_language *pl, text_stream *OUT, text_stream *comm)
void LanguageMethods::comment(OUTPUT_STREAM, programming_language *pl, text_stream *comm) {
    VOID_METHOD_CALL(pl, COMMENT_TAN_MTID, OUT, comm);
}

§19. The inner code tangler now acts on all code known not to contain CWEB macros or double-square substitutions. In almost every language this simply passes the code straight through, printing original to OUT.

enum TANGLE_LINE_UNUSUALLY_TAN_MTID
INT_METHOD_TYPE(TANGLE_LINE_UNUSUALLY_TAN_MTID, programming_language *pl, text_stream *OUT, text_stream *original)
void LanguageMethods::tangle_line(OUTPUT_STREAM, programming_language *pl, text_stream *original) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, TANGLE_LINE_UNUSUALLY_TAN_MTID, OUT, original);
    if (rv == FALSE) WRITE("%S", original);
}

§20. We finally reach the bottom of the tangled file, a footer called the "gnabehs":

enum GNABEHS_TAN_MTID
VOID_METHOD_TYPE(GNABEHS_TAN_MTID, programming_language *pl, text_stream *OUT, web *W)
void LanguageMethods::gnabehs(OUTPUT_STREAM, programming_language *pl, web *W) {
    VOID_METHOD_CALL(pl, GNABEHS_TAN_MTID, OUT, W);
}

§21. But we still aren't quite done, because some languages need to produce sidekick files alongside the main tangle file. This method exists to give them the opportunity.

enum ADDITIONAL_TANGLING_TAN_MTID
VOID_METHOD_TYPE(ADDITIONAL_TANGLING_TAN_MTID, programming_language *pl, web *W, tangle_target *target)
void LanguageMethods::additional_tangling(programming_language *pl, web *W, tangle_target *target) {
    VOID_METHOD_CALL(pl, ADDITIONAL_TANGLING_TAN_MTID, W, target);
}

§22. Weaving methods. This method shouldn't do any actual weaving: it should simply initialise anything that the language in question might need later.

enum BEGIN_WEAVE_WEA_MTID
VOID_METHOD_TYPE(BEGIN_WEAVE_WEA_MTID, programming_language *pl, section *S, weave_order *wv)
void LanguageMethods::begin_weave(section *S, weave_order *wv) {
    VOID_METHOD_CALL(S->sect_language, BEGIN_WEAVE_WEA_MTID, S, wv);
}

§23. This method allows languages to tell the weaver to ignore certain lines.

enum SKIP_IN_WEAVING_WEA_MTID
INT_METHOD_TYPE(SKIP_IN_WEAVING_WEA_MTID, programming_language *pl, weave_order *wv, source_line *L)
int LanguageMethods::skip_in_weaving(programming_language *pl, weave_order *wv, source_line *L) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, SKIP_IN_WEAVING_WEA_MTID, wv, L);
    return rv;
}

§24. Most languages do syntax colouring by having a "state" (this is now inside a comment, inside quoted text, and so on); the following method is provided to reset that state, if so. Inweb runs it once per paragraph for safety's sake, which minimises the knock-on effect of any colouring mistakes.

enum RESET_SYNTAX_COLOURING_WEA_MTID
VOID_METHOD_TYPE(RESET_SYNTAX_COLOURING_WEA_MTID, programming_language *pl)
void LanguageMethods::reset_syntax_colouring(programming_language *pl) {
    VOID_METHOD_CALL_WITHOUT_ARGUMENTS(pl, RESET_SYNTAX_COLOURING_WEA_MTID);
}
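
The knock-on effect mentioned above can be seen in a toy colourer whose only state is whether it is inside double-quoted text. This is a sketch under stated assumptions, not Inweb's colouring algorithm: the colour codes and sketch_ names are hypothetical, and escaped quotation marks are not handled.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical colour codes, one character per character of matter. */
#define SKETCH_PLAIN  'p'
#define SKETCH_STRING 's'

/* The colourer's state: are we currently inside double-quoted text? */
static int sketch_in_string = 0;

void sketch_reset_colouring(void) { sketch_in_string = 0; }

/* Write one colour code per character of matter into colouring, which must
   have room for strlen(matter) + 1 characters. The state persists between
   calls until sketch_reset_colouring is called. */
void sketch_colour(const char *matter, char *colouring) {
    assert(matter != NULL && colouring != NULL);
    size_t i, n = strlen(matter);
    for (i = 0; i < n; i++) {
        if (matter[i] == '"') {
            colouring[i] = SKETCH_STRING;   /* both quote marks count as string */
            sketch_in_string = !sketch_in_string;
        } else {
            colouring[i] = sketch_in_string ? SKETCH_STRING : SKETCH_PLAIN;
        }
    }
    colouring[n] = '\0';
}
```

If a line leaves a quotation unclosed, every later line is coloured as string matter until the state is reset, which is why resetting once per paragraph limits the damage.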

§25. And this is where colouring is done.

enum SYNTAX_COLOUR_WEA_MTID
INT_METHOD_TYPE(SYNTAX_COLOUR_WEA_MTID, programming_language *pl,
    weave_order *wv, source_line *L, text_stream *matter, text_stream *colouring)
int LanguageMethods::syntax_colour(programming_language *pl,
    weave_order *wv, source_line *L, text_stream *matter, text_stream *colouring) {
    for (int i=0; i < Str::len(matter); i++) Str::put_at(colouring, i, PLAIN_COLOUR);
    int rv = FALSE;
    programming_language *colour_as = pl;
    if (L->category == TEXT_EXTRACT_LCAT) colour_as = L->colour_as;
    theme_tag *T = Tags::find_by_name(I"Preform", FALSE);
    if ((T) && (Tags::tagged_with(L->owning_paragraph, T))) {
        programming_language *prepl = Analyser::find_by_name(I"Preform", wv->weave_web, FALSE);
        if ((L->category == PREFORM_LCAT) || (L->category == PREFORM_GRAMMAR_LCAT))
            if (prepl) colour_as = prepl;
    }
    if (colour_as)
        INT_METHOD_CALL(rv, colour_as, SYNTAX_COLOUR_WEA_MTID, wv, L,
            matter, colouring);
    return rv;
}

§26. This method is called for each code line to be woven. If it returns FALSE, the weaver carries on in the normal way. If not, it does nothing, assuming that the method has already woven something more attractive.

enum WEAVE_CODE_LINE_WEA_MTID
INT_METHOD_TYPE(WEAVE_CODE_LINE_WEA_MTID, programming_language *pl, text_stream *OUT, weave_order *wv, web *W,
    chapter *C, section *S, source_line *L, text_stream *matter, text_stream *concluding_comment)
int LanguageMethods::weave_code_line(OUTPUT_STREAM, programming_language *pl, weave_order *wv,
    web *W, chapter *C, section *S, source_line *L, text_stream *matter, text_stream *concluding_comment) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, WEAVE_CODE_LINE_WEA_MTID, OUT, wv, W, C, S, L, matter, concluding_comment);
    return rv;
}

§27. When Inweb creates a new tag, it lets everybody know about that.

enum NOTIFY_NEW_TAG_WEA_MTID
VOID_METHOD_TYPE(NOTIFY_NEW_TAG_WEA_MTID, programming_language *pl, theme_tag *tag)
void LanguageMethods::new_tag_declared(theme_tag *tag) {
    programming_language *pl;
    LOOP_OVER(pl, programming_language)
        VOID_METHOD_CALL(pl, NOTIFY_NEW_TAG_WEA_MTID, tag);
}

§28. Analysis methods. These are really a little miscellaneous, but they all have to do with looking at the code in a web and working out what's going on, rather than producing any weave or tangle output.

The "preweave analysis" is an opportunity to look through the code before any weaving of it occurs. It's never called on a tangle run. These methods are called first and last in the process, respectively. (What happens in between is essentially that Inweb looks for identifiers, for later syntax colouring purposes.)

enum ANALYSIS_ANA_MTID
enum POST_ANALYSIS_ANA_MTID
VOID_METHOD_TYPE(ANALYSIS_ANA_MTID, programming_language *pl, web *W)
VOID_METHOD_TYPE(POST_ANALYSIS_ANA_MTID, programming_language *pl, web *W)
void LanguageMethods::early_preweave_analysis(programming_language *pl, web *W) {
    VOID_METHOD_CALL(pl, ANALYSIS_ANA_MTID, W);
}
void LanguageMethods::late_preweave_analysis(programming_language *pl, web *W) {
    VOID_METHOD_CALL(pl, POST_ANALYSIS_ANA_MTID, W);
}

§29. And finally: in InC only, a few structure element names are given very slightly special treatment, and this method decides which.

enum SHARE_ELEMENT_ANA_MTID
INT_METHOD_TYPE(SHARE_ELEMENT_ANA_MTID, programming_language *pl, text_stream *element_name)
int LanguageMethods::share_element(programming_language *pl, text_stream *element_name) {
    int rv = FALSE;
    INT_METHOD_CALL(rv, pl, SHARE_ELEMENT_ANA_MTID, element_name);
    return rv;
}

§30. What we support.

int LanguageMethods::supports_definitions(programming_language *pl) {
    if (Str::len(pl->start_definition) > 0) return TRUE;
    if (Str::len(pl->prolong_definition) > 0) return TRUE;
    if (Str::len(pl->end_definition) > 0) return TRUE;
    return FALSE;
}