To construct noun-phrase subtrees for assertion sentences.


§1. Hierarchy of noun phrases. Noun phrase nodes are built at four levels of elaboration, which we take in turn: raw (NP1), articled (NP2), list-divided (NP3) and full (NP4).

§2. Raw nounphrases (NP1). A raw noun phrase is always a single UNPARSED_NOUN_NT. The following always matches any non-empty text:

<np-unparsed> ::=
    ...                 ==> { 0, Diagrams::new_UNPARSED_NOUN(W) }

§3. This "balanced" version, however, requires any brackets and braces to be used in a balanced way: thus "frogs ( and toads )" would match, but "frogs ( and" would not. It therefore does not always match.

<np-unparsed-bal> ::=
    ^<balanced-text> |  ==> { fail }
    <np-unparsed>       ==> { pass 1 }
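
As a rough illustration of what a balance test involves, here is a minimal sketch in plain C. It is not Inform's implementation of <balanced-text>: the function name `sketch_is_balanced` and the word-array representation are hypothetical, and it assumes "balanced" means that round brackets and braces are properly nested and all closed.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of Inform: returns true if the round
   brackets and braces among words[0..n-1] are properly nested and closed. */
bool sketch_is_balanced(const char *const *words, int n) {
    assert(words != NULL || n == 0);
    char stack[64];                 /* openers seen but not yet closed */
    int depth = 0;
    for (int i = 0; i < n; i++) {
        const char *w = words[i];
        if (strcmp(w, "(") == 0 || strcmp(w, "{") == 0) {
            if (depth == 64) return false;   /* too deep for this sketch */
            stack[depth++] = w[0];
        } else if (strcmp(w, ")") == 0 || strcmp(w, "}") == 0) {
            char opener = (w[0] == ')') ? '(' : '{';
            /* a closer must match the most recent unclosed opener */
            if (depth == 0 || stack[depth - 1] != opener) return false;
            depth--;
        }
    }
    return depth == 0;              /* nothing may be left open */
}
```

Under these assumptions the five words of "frogs ( and toads )" pass the test, while the three words of "frogs ( and" do not.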

§4. The noun phrase of an existential sentence is recognised thus:

<np-existential> ::=
    there               ==> { 0, Diagrams::new_DEFECTIVE(W) }

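§5. Articled nounphrases (NP2). Now an initial article becomes an annotation and is removed from the text. Note that the <if-not-cap> guard means an article is removed only when it is not capitalised, so that the name at the end of the following sentence keeps its initial "A":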

On the table is a thing called A Town Called Alice.

<np-articled> ::=
    ... |                                              ==> { lookahead }
    <if-not-cap> <indefinite-article> <np-unparsed> |  ==> { 0, NounPhrases::add_art(RP[3], RP[2]) }
    <if-not-cap> <definite-article> <np-unparsed> |    ==> { 0, NounPhrases::add_art(RP[3], RP[2]) }
    <np-unparsed>                                      ==> { pass 1 }

<np-articled-bal> ::=
    ^<balanced-text> |                                 ==> { fail }
    <np-articled>                                      ==> { pass 1 }
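
The effect of the article productions can be sketched in plain C as follows. This is an illustration only: the type `sketch_articled`, the function `sketch_strip_article` and the four-word article list are assumptions made for the example, whereas Inform takes its articles from <indefinite-article> and <definite-article>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch only. */
typedef struct {
    const char *article;   /* the article removed, or NULL if none */
    int first;             /* index of the first word of the remaining text */
} sketch_articled;

sketch_articled sketch_strip_article(const char *const *words, int n) {
    /* Assumed article list; the comparison is case-sensitive, so a
       capitalised "A" or "The" is deliberately left in the text. */
    static const char *const articles[] = { "a", "an", "the", "some" };
    sketch_articled result = { NULL, 0 };
    assert(words != NULL && n > 0);
    if (n < 2) return result;   /* an article must be followed by some text */
    for (size_t i = 0; i < sizeof articles / sizeof articles[0]; i++) {
        if (strcmp(words[0], articles[i]) == 0) {
            result.article = articles[i];
            result.first = 1;
            break;
        }
    }
    return result;
}
```

With these assumptions, "the lion" yields the article "the" and the remaining text "lion", while "A Town Called Alice" is returned untouched.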

§6.

parse_node *NounPhrases::add_art(parse_node *p, article_usage *au) {
    Node::set_article(p, au);
    return p;
}

§7. The following function is only occasionally useful; it takes an existing raw node and retrospectively applies <np-articled> to it.

parse_node *NounPhrases::annotate_by_articles(parse_node *RAW_NP) {
    <np-articled>(Node::get_text(RAW_NP));
    parse_node *MODEL = <<rp>>;
    Node::set_text(RAW_NP, Node::get_text(MODEL));
    Node::set_article(RAW_NP, Node::get_article(MODEL));
    return RAW_NP;
}

§8. List-divided nounphrases (NP3). An "articled list" matches text like "the lion, a witch, and some wardrobes" as a list of articled noun phrases.

Note that the requirement that non-final terms in the list have to be balanced means that an "and" or a comma inside brackets can never be a divider. Thus "the horse (and its boy)" would be one item, not two.

<np-articled-list> ::=
    ... |                                   ==> { lookahead }
    <np-articled-bal> <np-articled-tail> |  ==> { 0, Diagrams::new_AND(R[2], RP[1], RP[2]) }
    <np-articled>                           ==> { pass 1 }

<np-articled-tail> ::=
    , {_and} <np-articled-list> |           ==> { Wordings::first_wn(W), RP[1] }
    {_,/and} <np-articled-list>             ==> { Wordings::first_wn(W), RP[1] }
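
To make the role of the balance requirement concrete, the following plain C sketch looks for the first point at which a list could be divided. It is not how Preform searches: the name `sketch_first_divider` and the word-array representation are hypothetical, and the sketch treats a comma or "and" as a divider only when the words before it are non-empty with every bracket closed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: index of the first "," or "and" that could divide
   a list, or -1 if there is none. */
int sketch_first_divider(const char *const *words, int n) {
    assert(words != NULL || n == 0);
    int depth = 0;                       /* brackets currently open */
    for (int i = 0; i < n; i++) {
        const char *w = words[i];
        if (strcmp(w, "(") == 0 || strcmp(w, "{") == 0) {
            depth++;
        } else if (strcmp(w, ")") == 0 || strcmp(w, "}") == 0) {
            if (depth == 0) return -1;   /* no balanced head is possible */
            depth--;
        } else if (depth == 0 && i > 0 && i < n - 1 &&
                   (strcmp(w, ",") == 0 || strcmp(w, "and") == 0)) {
            return i;                    /* text on both sides, head balanced */
        }
    }
    return -1;
}
```

On "the lion , a witch , and some wardrobes" the sketch reports the first comma, at index 2. On "the horse ( and its boy )" it reports no divider, because the only "and" lies inside brackets.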

§9. "Alternative lists" divide up at "or" rather than "and", thus matching text such as "voluminous, middling big or poky", and the individual entries are not articled.

<np-alternative-list> ::=
    ... |                                      ==> { lookahead }
    <np-unparsed-bal> <np-alternative-tail> |  ==> { 0, Diagrams::new_AND(R[2], RP[1], RP[2]) }
    <np-unparsed>                              ==> { pass 1 }

<np-alternative-tail> ::=
    , {_or} <np-alternative-list> |            ==> { Wordings::first_wn(W), RP[1] }
    {_,/or} <np-alternative-list>              ==> { Wordings::first_wn(W), RP[1] }

§10. Full nounphrases (NP4). When fully parsing the structure of a nounphrase, we have five different constructions in play, and need to work out their precedence over each other. Rather as * takes precedence over + in arithmetic expressions in C, so here we have:

    RELATIONSHIP_NT > CALLED_NT > WITH_NT > AND_NT > KIND_NT

That is, relative clauses take precedence over callings, and so on. The above hierarchy is arrived at thus:

See About Sentence Diagrams for numerous examples.

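§11. Full nounphrase parsing varies slightly according to the position of the phrase, i.e., whether it is in the subject or object position. Thus "X is Y" or "X is in Y" would lead to X being parsed by <np-as-subject>, Y by <np-as-object>. They are identical except that the subject position also accepts the existential "there", and allows only the limited form of relative phrase where the object position allows the unlimited form.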

<np-as-subject> ::=
    <np-existential> |                             ==> { pass 1 }
    <if-not-cap> <np-relative-phrase-limited> |    ==> { pass 2 }
    <np-nonrelative>                               ==> { pass 1 }

<np-as-object> ::=
    <if-not-cap> <np-relative-phrase-unlimited> |  ==> { pass 2 }
    <np-nonrelative>                               ==> { pass 1 }

§12. To explain the limitation here: relative phrases (RPs) only exist in the subject position due to subject-verb inversion in English. Thus, "In the Garden is a tortoise" is a legal inversion of "A tortoise is in the Garden". Following this logic we ought to accept Yoda-like inversions such as "Holding the light sabre is the young Jedi", but we don't want to do that, because then a sentence like "Holding Area is a room" might have to be read as saying that a nameless room is holding something called "Area". So in the subject position, an RP beginning with a probable participle is rejected.

<np-relative-phrase-limited> ::=
    <np-relative-phrase-implicit> |                ==> { pass 1 }
    <probable-participle> *** |                    ==> { fail }
    <np-relative-phrase-explicit>                  ==> { pass 1 }

<np-relative-phrase-unlimited> ::=
    <np-relative-phrase-implicit> |                ==> { pass 1 }
    <np-relative-phrase-explicit>                  ==> { pass 1 }

§13. Inform guesses above that most English words ending in "-ing" are present participles, like guessing, bluffing, cheating, and so on. But there are conspicuous exceptions, such as "thing" and "something", so any word found in <non-participles> is never treated as a participle.

<non-participles> ::=
    thing/something

<probable-participle> internal 1 {
    if (Vocabulary::test_flags(Wordings::first_wn(W), ING_MC)) {
        if (<non-participles>(W)) { ==> { fail nonterminal }; }
        return TRUE;
    }
    ==> { fail nonterminal };
}
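
The same heuristic can be sketched in standalone C. This is an approximation for illustration: the real code tests the ING_MC flag on a vocabulary entry, whereas the hypothetical `sketch_probable_participle` below simply assumes that flag corresponds to a word ending in "ing".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: guesses whether a single word is a present participle. */
bool sketch_probable_participle(const char *word) {
    /* the exceptions listed in <non-participles> */
    static const char *const non_participles[] = { "thing", "something" };
    assert(word != NULL);
    size_t len = strlen(word);
    /* assumption: the ING_MC flag amounts to "ends in -ing" */
    if (len < 3 || strcmp(word + len - 3, "ing") != 0) return false;
    for (size_t i = 0; i < sizeof non_participles / sizeof non_participles[0]; i++)
        if (strcmp(word, non_participles[i]) == 0) return false;
    return true;
}
```

So "Holding" is guessed to be a participle, while "thing" and "table" are not.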

§14. An implicit RP is a word like "carried", or "worn", on its own: this implies a relation to some unspecified noun. We represent that in the tree using the "implied noun" pronoun. For now, these are fixed.

<np-relative-phrase-implicit> ::=
    worn |              ==> ⟨Act on the implicit RP worn 14.1⟩
    carried |           ==> ⟨Act on the implicit RP carried 14.2⟩
    initially carried   ==> ⟨Act on the implicit RP initially carried 14.3⟩

§14.1. ⟨Act on the implicit RP worn 14.1⟩ =

    #ifndef IF_MODULE
    ==> { fail production }
    #endif
    #ifdef IF_MODULE
    ==> { 0, Diagrams::new_implied_RELATIONSHIP(W, R_wearing) }
    #endif

§14.2. ⟨Act on the implicit RP carried 14.2⟩ =

    #ifndef IF_MODULE
    ==> { fail production }
    #endif
    #ifdef IF_MODULE
    ==> { 0, Diagrams::new_implied_RELATIONSHIP(W, R_carrying) }
    #endif

§14.3. ⟨Act on the implicit RP initially carried 14.3⟩ =

    #ifndef IF_MODULE
    ==> { fail production }
    #endif
    #ifdef IF_MODULE
    ==> { 0, Diagrams::new_implied_RELATIONSHIP(W, R_carrying) }
    #endif

§15. An explicit RP is one which uses a preposition and then a noun phrase: for example, "on the table" is explicit.

Note that we throw out a relative phrase if the noun phrase within it would begin with "and" or a comma; this enables us to parse sentences concerning directions, in particular, a little better. But it means we do not recognise "of, by and for the people" as an RP.

<np-relative-phrase-explicit> ::=
    <permitted-preposition> _,/and ... |       ==> { fail }
    <permitted-preposition> _,/and |           ==> { fail }
    <permitted-preposition> <np-nonrelative>   ==> ⟨Work out a meaning 15.1⟩

§15.1. ⟨Work out a meaning 15.1⟩ =

    VERB_MEANING_LINGUISTICS_TYPE *R = VerbMeanings::get_regular_meaning_of_form(
        Verbs::find_form(permitted_verb, RP[1], NULL));
    if (R == NULL) return FALSE;
    ==> { -, Diagrams::new_RELATIONSHIP(W, VerbMeanings::reverse_VMT(R), RP[2]) };

§16. We have now disposed of RELATIONSHIP_NT and are left with the constructs:

    CALLED_NT > WITH_NT > AND_NT > KIND_NT

These are all handled by <np-nonrelative>. Two points to note:

<np-nonrelative> ::=
    ... |                                      ==> { lookahead }
    <np-operand> {called} <np-articled-bal> |  ==> { 0, Diagrams::new_CALLED(WR[1], RP[1], RP[2]) }
    <np-operand> <np-with-or-having-tail> |    ==> { 0, Diagrams::new_WITH(R[2], RP[1], RP[2]) }
    <np-operand> <np-and-tail> |               ==> { 0, Diagrams::new_AND(R[2], RP[1], RP[2]) }
    <np-kind-phrase> |                         ==> { pass 1 }
    <agent-pronoun> |                          ==> { 0, Diagrams::new_PRONOUN(W, RP[1]) }
    <here-pronoun> |                           ==> { 0, Diagrams::new_PRONOUN(W, RP[1]) }
    <np-articled>                              ==> { pass 1 }

<np-operand> ::=
    <if-not-cap> <np-relative-phrase-unlimited> |  ==> { pass 2 }
    ^<balanced-text> |                             ==> { fail }
    <np-nonrelative>                               ==> { pass 1 }

§17. The tail of with-or-having parses, for instance, "with carrying capacity 5" in the NP:

a container with carrying capacity 5

This makes use of a nifty feature of Preform: when Preform scans to see how to divide the text, it tries <np-with-or-having-tail> in each possible position. The reply can be yes, no, or no and move on a little. So if we spot "it with action", the answer is no, and move on three words: that jumps over a "with" which we don't want to recognise. (If we did recognise it, then "the locking it with action" would be parsed as a property list, "action", attaching to a bogus object called "locking it".)

<np-with-or-having-tail> ::=
    it with action *** |                       ==> { advance Wordings::delta(WR[1], W) }
    {with/having} (/) *** |                    ==> { advance Wordings::delta(WR[1], W) }
    {with/having} ... ( <response-letter> ) |  ==> { advance Wordings::delta(WR[1], W) }
    {with/having} <np-new-property-list>       ==> { Wordings::first_wn(WR[1]), RP[1] }

<np-new-property-list> ::=
    ... |                                      ==> { lookahead }
    <np-new-property> <np-new-property-tail> | ==> { 0, Diagrams::new_AND(R[2], RP[1], RP[2]) }
    <np-new-property>                          ==> { pass 1 }

<np-new-property-tail> ::=
    , {_and} <np-new-property-list> |          ==> { Wordings::first_wn(W), RP[1] }
    {_,/and} <np-new-property-list>            ==> { Wordings::first_wn(W), RP[1] }

<np-new-property> ::=
    ...                                        ==> { 0, Diagrams::new_PROPERTY_LIST(W) }
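
The three-way reply described above can be sketched in plain C. This is only a model of the idea, not of Preform itself: the names `sketch_try_tail` and `sketch_find_with` and the reply encoding are hypothetical, and the two bracketed productions are left out for brevity.

```c
#include <assert.h>
#include <string.h>

/* Reply encoding assumed for this sketch: any positive reply means
   "no, and skip that many words". */
enum { SKETCH_YES = 0, SKETCH_NO = -1 };

/* Hypothetical helper: can a with-or-having tail begin at position pos? */
static int sketch_try_tail(const char *const *words, int n, int pos) {
    if (pos + 2 < n && strcmp(words[pos], "it") == 0 &&
        strcmp(words[pos + 1], "with") == 0 &&
        strcmp(words[pos + 2], "action") == 0)
        return 3;                        /* jump over the unwanted "with" */
    if (pos + 1 < n && (strcmp(words[pos], "with") == 0 ||
                        strcmp(words[pos], "having") == 0))
        return SKETCH_YES;               /* "with" followed by some text */
    return SKETCH_NO;
}

/* Index of the word where the tail begins, or -1 if no position works. */
int sketch_find_with(const char *const *words, int n) {
    assert(words != NULL || n == 0);
    int pos = 1;                         /* the operand before the tail is non-empty */
    while (pos < n) {
        int reply = sketch_try_tail(words, n, pos);
        if (reply == SKETCH_YES) return pos;
        pos += (reply > 0) ? reply : 1;
    }
    return -1;
}
```

For "a container with carrying capacity 5" the sketch finds the tail at index 2. For "the locking it with action" the skip of three words carries the scan past the end of the text, so no tail is found.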

§18. The "and" tail is much easier:

<np-and-tail> ::=
    , {_and} <np-operand> |                    ==> { Wordings::first_wn(W), RP[1] }
    {_,/and} <np-operand>                      ==> { Wordings::first_wn(W), RP[1] }

§19. Kind phrases are easier still. For example:

A sedan chair is a kind of vehicle. A weather pattern is a kind.

Note that indefinite articles are permitted before the word "kind(s)", but definite articles are not.

<np-kind-phrase> ::=
    <indefinite-article> <np-kind-phrase-unarticled> |  ==> { pass 2 }
    <np-kind-phrase-unarticled>                         ==> { pass 1 }

<np-kind-phrase-unarticled> ::=
    kind/kinds |                                        ==> { 0, Diagrams::new_KIND(W, NULL) }
    kind/kinds of <np-operand>                          ==> { 0, Diagrams::new_KIND(W, RP[1]) }
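
As a rough model of the two kind-phrase nonterminals, here is a plain C sketch. The function name `sketch_kind_phrase`, its return convention and the two-word indefinite article list are assumptions made for the example, and the operand after "of" is simply taken to be any non-empty text rather than being parsed by <np-operand>.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper. Returns -1 if the words are not a kind phrase.
   Otherwise returns the index of the first word of the base kind, or n
   if the phrase is a bare "kind" with no base. */
int sketch_kind_phrase(const char *const *words, int n) {
    assert(words != NULL || n == 0);
    int i = 0;
    /* an indefinite article may come first; a definite one may not */
    if (n > 0 && (strcmp(words[0], "a") == 0 || strcmp(words[0], "an") == 0))
        i = 1;
    if (i >= n || (strcmp(words[i], "kind") != 0 && strcmp(words[i], "kinds") != 0))
        return -1;
    i++;
    if (i == n) return n;                         /* "a kind" */
    if (strcmp(words[i], "of") == 0 && i + 1 < n)
        return i + 1;                             /* "a kind of vehicle" */
    return -1;
}
```

Here "a kind of vehicle" gives 3, the position of "vehicle", "a kind" gives 2, and "the kind of vehicle" gives -1 because of the definite article.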