How executable functions are expressed in textual Inter programs.
- §1. Code packages
- §4. Contexts
- §8. Function calls
- §10. Val, ref, lab and cast
- §12. Labels and assembly language
§1. Code packages. To recap from Textual Inter: an Inter program is a nested hierarchy of packages. Some of those are special _code packages which define functions, special in several ways:
- Their names can be used as values: that's how functions are called. See inv below.
- Their names can optionally have types: see Data Packages in Textual Inter for details.
- They cannot have subpackages. Conceptually, a code package is a single function body. Packages are not used for "code blocks", and there are no nested functions.
- They cannot contain constant, variable, and similar instructions found in data packages. Instead they can only contain the set of instructions which are the subject of this section (and which are allowed only in _code packages).
§2. The basic structure of a function body like this is that it begins with some local variable declarations, and then has its actual content inside a code block, like so:
package double _code
    local x
    code
        inv !return
            inv !plus
                val x
                val x
As with its global analogue, variable, a local instruction can optionally specify a type:
local (int32) x
There can be at most one code instruction at the top level. This is incorrect:
package fails _code
    code
        inv !enableprinting
    code
        inv !print
            val "I am dismal.\n"
and should instead be:
package succeeds _code
    code
        inv !enableprinting
        inv !print
            val "I am glorious.\n"
§3. Surprisingly, perhaps, it's legal not to have a code block at all. This function works:
package succeeds _code
But of course it does nothing. If the return value of such a function is used, it will be 0.
§4. Contexts. At any point inside a function body (except at the very top level), the instruction used is expected to have a given "category", decided by the "context" at that point. These categories have names:
- code context. This means an instruction is expected to do something, but not produce a resulting value.
- val context. This means an instruction is expected to produce a value.
- ref context. This means an instruction is expected to provide a "reference" to some storage in the program. For example, it could indicate a global variable, or a particular property of some instance.
- lab context. This means an instruction is expected to indicate a label marking a position in that same function.
In a code block, the context is initially code. For example:
package double _code
    local x                              # top level has no context
    code                                 # top level has no context
        inv !jump                        # context is code
            lab .SkipWarning             # context is lab
        inv !print                       # context is code
            val "It'll get bigger!\n"    # context is val
        .SkipWarning                     # context is code
        inv !store                       # context is code
            ref x                        # context is ref
            inv !plus                    # context is val
                val x                    # context is val
                val x                    # context is val
        inv !return                      # context is code
            val x                        # context is val
In this function, the code block contains five instructions, each of which is read in a code context. Each of those then has its own expectations which set the context for its child instructions, and so on. For example, inv !store expects to see two child instructions, the first in ref context and the second in val context.
Those uses of inv !something are called "primitive invocations". They are like function calls, but where the function is built in to Inter and is not itself defined in Inter. Each such has a "signature". For example, the internal declaration of !store is:
primitive !store ref val -> val
So its signature is ref val -> val. This expresses that its two children should be read in ref and val context, and that its result is a val. (As in most C-like languages, stores are values in Inter, though in practice those values are often thrown away.)
The standard built-in stock of primitive invocations is described in the next section, on Inform Primitives.
§5. How is all this policed? Whereas typechecking of data is often weak in Inter, signature checking is taken much more seriously. If the context is code, then the only legal primitives to invoke are those where the return part of the signature is either void (no value) or val (a value, but which is thrown away and ignored, as in most C-like languages). Otherwise, ref context requires a ref result, and similarly for val and lab.
For example, !return has the signature val -> void, which makes it legal to use in a code context as in the above example. But these two attempts to use it would both be incorrect:
inv !return
inv !printnumber
    inv !return
        val 10
The first fails because it tries to use !return as if it were void -> void, i.e., with no supplied value; the second fails because it tries to use it as if it were val -> val.
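The policing rule can be sketched as a small checker. This is only an illustrative model in Python, not the code Inter itself uses: the tuple-based tree format and the `check` function are hypothetical, the signatures of !plus and !printnumber are assumed rather than quoted, and the rule that val, ref and lab must match their context exactly is also an assumption.

```python
# Signatures: (contexts of the children, category of the result).
# !store, !return and !ifelse are quoted from the text above.
SIGNATURES = {
    "!store": (["ref", "val"], "val"),
    "!return": (["val"], "void"),
    "!ifelse": (["val", "code", "code"], "void"),
    "!plus": (["val", "val"], "val"),       # assumed for illustration
    "!printnumber": (["val"], "void"),      # assumed for illustration
}


def result_fits(context, result):
    """code context accepts void or val results; other contexts need an exact match."""
    if context == "code":
        return result in ("void", "val")
    return result == context


def check(node, context):
    """Return a list of error messages for one instruction read in `context`.

    A node is ("inv", name, children), ("code", children),
    or a leaf ("val", x), ("ref", x), ("lab", x).
    """
    kind = node[0]
    if kind == "inv":
        _, name, children = node
        expected, result = SIGNATURES[name]
        errors = []
        if not result_fits(context, result):
            errors.append(f"{name} gives {result} in {context} context")
        if len(children) != len(expected):
            errors.append(
                f"{name} expects {len(expected)} children, got {len(children)}"
            )
        # Each child is read in the context its position in the signature demands.
        for child, child_context in zip(children, expected):
            errors.extend(check(child, child_context))
        return errors
    if kind == "code":
        if context != "code":
            return [f"code block in {context} context"]
        errors = []
        for child in node[1]:
            errors.extend(check(child, "code"))
        return errors
    # Assumption: val, ref and lab leaves must match the context exactly.
    return [] if kind == context else [f"{kind} in {context} context"]
```

Under this model, `check(("inv", "!return", []), "code")` reports a wrong child count, and placing !return beneath !printnumber reports a void result in val context, mirroring the two failures described above.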
§6. Some primitives have code as one or more of their arguments. For example:
primitive !ifelse val code code -> void
This evaluates the first argument (a value), then executes the second argument (a code block) if the value is non-zero, or alternatively the third if it is zero. There is no result. For example:
inv !ifelse
    val x
    code
        inv !printnumber
            val x
    code
        inv !print
            val "I refuse to print zeroes on principle."
§7. Rather like code, which executes a run of instructions as if they were a single instruction, evaluation performs a run of evaluations as if they were a single one, and produces the value of the last. Thus:
inv !printnumber
    evaluation
        val 23
        val -1
        val 12
prints just "12". The point of this is that there may be side-effects in the earlier evaluations, of course, though there weren't in this example.
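The behaviour of evaluation can be modelled in a few lines of Python. This is a sketch only: the `evaluation` and `noisy` helpers are hypothetical names, and each child is represented as a zero-argument callable so that its side-effects can be observed.

```python
def evaluation(*children):
    """Evaluate every child in order and return the value of the last one."""
    result = None
    for child in children:
        result = child()
    return result


log = []


def noisy(value):
    """Build a child whose evaluation has a visible side-effect (appending to log)."""
    def child():
        log.append(value)
        return value
    return child


printed = evaluation(noisy(23), noisy(-1), noisy(12))
print(printed)
```

All three children are evaluated, as the log shows, but only the last value reaches the print.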
Another converter, so to speak, is reference, but this is much more limited in what it is allowed to do.
inv !store
    reference
        val x
    val 5
is exactly equivalent to:
inv !store
    ref x
    val 5
This is not a very useful example: but consider —
inv !store
    reference
        inv !propertyvalue
            val Odessa
            val area
    val 5000
which changes the property area for Odessa to 5000. The signature of !propertyvalue is val val -> val, and ordinarily it evaluates the property. But placed under a reference, it becomes a reference to where that property is stored, and thus allows the value to be changed with !store. This:
inv !store
    inv !propertyvalue
        val Odessa
        val area
    val 5000
would by contrast be rejected with an error, as trying to use a val in a ref context.
reference cannot be applied to anything other than storage (a local or global variable, a memory location or a property value), so for example:
reference
    val 5
is meaningless and will be rejected. There is in general no way to make, say, a pointer to a function or instance using reference. It is much more circumscribed than the & operator in C.
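One way to picture what reference does is as a conversion from "something which yields a value" to "the place where that value is kept". The Python below is a loose model under stated assumptions: the `Storage` class, the `reference` function and the tuple node format are all hypothetical, variables and properties are held in plain dictionaries, and memory locations are left out.

```python
class Storage:
    """A writable location, modelled as one key of a dictionary."""

    def __init__(self, table, key):
        self.table = table
        self.key = key

    def read(self):
        return self.table[self.key]

    def write(self, value):
        self.table[self.key] = value


def reference(node, variables, properties):
    """Convert a value-producing node into its Storage, or reject it."""
    kind = node[0]
    if kind == "val" and isinstance(node[1], str) and node[1] in variables:
        return Storage(variables, node[1])
    if kind == "propertyvalue":
        _, owner, prop = node
        return Storage(properties[owner], prop)
    # Anything else, such as a literal, is not storage.
    raise ValueError(f"cannot take a reference to {node!r}")


variables = {"x": 0}
properties = {"Odessa": {"area": 0}}

# Models: inv !store / reference / val x / val 5
reference(("val", "x"), variables, properties).write(5)
# Models: inv !store / reference / inv !propertyvalue ... / val 5000
reference(("propertyvalue", "Odessa", "area"), variables, properties).write(5000)
```

Calling `reference(("val", 5), variables, properties)` raises an error in this model, which corresponds to the rejected example above.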
§8. Function calls. This seems a good point to say how to make function calls, since it's almost exactly the same. This:
inv !printnumber
    inv double
        val 10
prints "20". Note the lack of a ! in front of the function name: this means it is a regular function, not a primitive.
§9. Function calls work in a rather assembly-language-like way, and Inter makes much less effort to type-check these for any kind of safety: so beware. It allows them to have any of the signatures void -> val, val -> val, val val -> val, ... and so on: in other words, they can be called with any number of arguments.
In particular, even if a function is declared with a type it is still legal to call it with any number of arguments. Again: beware.
Those arguments become the initial values of the local variables. So for example, if:
package example _code
    local x
    local y
then:
- a call with no arguments results in x and y equal to 0 and 0;
- a call with argument 7 results in x and y equal to 7 and 0;
- a call with arguments 7 and 81 results in x and y equal to 7 and 81;
- a call with three or more arguments has undefined results and may crash the program altogether.
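The four cases above amount to a simple binding rule, sketched here in Python. The function name `bind_arguments` is hypothetical, and the sketch raises an error for too many arguments purely as a stand-in: in Inter itself that case is undefined, not a guaranteed error.

```python
def bind_arguments(local_names, arguments):
    """Pair call arguments with locals in declaration order, zero-filling the rest."""
    if len(arguments) > len(local_names):
        # Stand-in for behaviour which is undefined in Inter.
        raise ValueError("more arguments than local variables")
    missing = len(local_names) - len(arguments)
    padded = list(arguments) + [0] * missing
    return dict(zip(local_names, padded))


print(bind_arguments(["x", "y"], []))
print(bind_arguments(["x", "y"], [7]))
print(bind_arguments(["x", "y"], [7, 81]))
```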
§10. Val, ref, lab and cast. We have seen many examples already, but:
- val V allows us to use any simple value V in any val context. For what is meant by a "simple" value, see Data Packages in Textual Inter.
- ref R allows us to refer to any variable, local or global, in a ref context.
- lab L allows us to refer to any label declared somewhere in the current function body, in a lab context.
§11. The val and ref instructions both allow optional type markers to be placed, so for example:
val (int32) x
ref (text) y
Where no type marker is given, the type is always considered unchecked.
Types of val or ref tend not to be checked or looked at anyway, so this feature is currently little used. For many primitives, some of which are quite polymorphic, it would be difficult to impose a typechecking regime anyway. But the ability to mark val and ref with types is preserved as a hedge against potential future developments, when Inter might conceivably be tightened up to typecheck explicitly typed values.
Similarly unuseful for the moment is cast. This instruction allows us to say "consider this value as if it had a different type". For example, if we are using an enumerated type city, we could read the enumeration values as numbers like so:
cast int32 <- city
    val (city) Odessa
Right now this is no different from:
val (int32) Odessa
but we keep cast around as a hedge against future developments, in case we ever want to typecheck strictly enough that val (int32) Odessa is rejected as a contradiction in terms.
§12. Labels and assembly language. Like labels in C, these are named reference points in the code. They are written .NAME, where the name must begin with a full stop. Labels are not values; they cannot be stored, or computed with, or cast. They can only be used in a lab instruction.
§13. Two uses of inv have already been covered: to call an Inter function, and to invoke a primitive operation. The third is to execute an "assembly-language opcode". What we mean by that is the direct use of the instruction set on the target virtual machine we are expecting our program to run on.
This has always been a feature of Inform 6 code. For example, some real-number arithmetic functions in BasicInformKit are written to use heavy amounts of Glulx assembly language, in order to access functionality not present in the Inform language itself. Here is a sample:
@fdiv sp $40135D8E log10val;  ! $40135D8E is log(10)
@floor log10val fexpo;
@ftonumn fexpo expo;
Those "opcodes" beginning @ are part of the instruction set for the Glulx virtual machine: real number arithmetic is impossible on the smaller Z-machine, so we couldn't meaningfully compile this code to that platform, and that is just as well because it has a completely different instruction set of opcodes from Glulx anyway. Still, there's no denying that Inter code using assembly immediately becomes less portable. This is why it is always better to use Inter primitives if possible.
Still, BasicInformKit must be compiled to Inter code, so we clearly need to deal with those opcodes somehow. The standard Inform-provided kits use two different sets of opcodes, as noted: the Z-machine and Glulx instruction sets. One conceivable way to deal with this would have been to provide primitives equivalent to every opcode in either set (or at least every opcode used in the standard Inform kits). But that would hugely increase the set of primitives, and also incur a certain amount of awkward repetition.
Instead, the Inter specification goes to the opposite extreme. It makes no assumptions about what assembly opcodes do, or do not, exist. Inter allows absolutely anything, and would be quite happy to accept, say, inv @flytothemoon, even though this opcode does not exist in any known system of assembly language. [1]
And so the above is in fact compiled to:
inv @fdiv
    assembly stack
    val 0x40135D8E
    val log10val
inv @floor
    val log10val
    val fexpo
inv @ftonumn
    val fexpo
    val expo
And when the building module performed that compilation, it knew nothing about @fdiv and the rest: it just took on trust that this is meaningful.
[1] This can actually be useful, since it means people experimenting with new hybrid forms of Inform can devise extra opcodes of their own.
§14. So, how apparently generous: the Inter specification allows us to invoke opcodes with arbitrary names. But that does not, of course, mean that those opcodes can be compiled to code which does anything useful. The final code-generation module probably won't know what to do with our hypothetical @flytothemoon opcode.
In practice, therefore, final knows how to deal with the Z-machine instruction set when compiling for Z via Inform 6, and how to deal with the Glulx instruction set when compiling either for Glulx via Inform 6 or a native executable via a C compiler like clang. Any further code-generators are also likely to follow Glulx conventions. So: if you really must use assembly language in your Inter code, good advice would be —
- (1) Use the Glulx instruction set, for better chances of portability.
- (2) Only use those opcodes which are also used in the standard Inform kits somewhere, since those will probably be implemented.
§15. If we look at this example in more detail:
inv @fdiv
    assembly stack
    val 0x40135D8E
    val log10val
we see some general features of assembly language. Inter allows any number of child instructions to be supplied — here, there are three. Since Inter knows nothing about the meaning of @fdiv, it has no way to know how many are expected. They should all be usages of val, lab, or assembly.
val and lab we have seen already. assembly is a sort of punctuation instruction which allows various oddball syntaxes of Z-machine or Glulx assembly to be imitated in Inter. There are only seven possible assembly instructions. Two are very common:
- assembly stack is probably the most common, either reading or writing to the top of the virtual machine's stack.
- assembly store_to indicates that a storage location follows (either assembly stack or a local or global variable). This is only used in Z-machine assembly language; Glulx assembly doesn't have this marker.
The other five apply only to "branch instructions", which perform some test and then either return from the current function or make a jump to a label (a "branch"), depending on the outcome of the test. By default the instruction branches on a successful test. The alternatives are marked with one of:
- assembly branch_if_false
- assembly return_true_if_true
- assembly return_false_if_true
- assembly return_true_if_false
- assembly return_false_if_false
So for example the Z-machine instruction @random sp -> i; compiles to Inter as:
inv @random
    assembly stack
    assembly store_to
    val i
And note the use of val i, not ref i, even though the variable is being written to here. Even Inter's normal rules of category checking do not apply to assembly language, the lowest of the low.
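The correspondence between Inform 6 assembly statements and their Inter form can be sketched mechanically. The Python below is an illustrative approximation and not the building module's actual algorithm: the function name `assembly_to_inter` is hypothetical, four spaces stand for one level of indentation, only the operand forms seen in this section are handled (sp, ->, $-prefixed hexadecimal, plain names), and branch operands are ignored entirely.

```python
INDENT = "    "  # one level of indentation in the displayed Inter (an assumption)


def assembly_to_inter(statement):
    """Translate one Inform 6 assembly statement into lines of textual Inter."""
    code = statement.split("!", 1)[0]        # drop a trailing Inform 6 comment
    tokens = code.replace(";", " ").split()
    if not tokens or not tokens[0].startswith("@"):
        raise ValueError("not an assembly statement: " + statement)
    opcode, operands = tokens[0], tokens[1:]
    lines = ["inv " + opcode]                # the opcode name is taken on trust
    for operand in operands:
        if operand == "sp":
            lines.append(INDENT + "assembly stack")
        elif operand == "->":
            lines.append(INDENT + "assembly store_to")
        elif operand.startswith("$"):
            lines.append(INDENT + "val 0x" + operand[1:])
        else:
            # Variables appear as val even when written to, as noted above.
            lines.append(INDENT + "val " + operand)
    return lines


for line in assembly_to_inter("@random sp -> i;"):
    print(line)
```

Because the opcode name is never looked up, `assembly_to_inter("@flytothemoon sp;")` is translated just as readily, which reflects how little Inter assumes about assembly language.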