Conjunctions (new style) in the Link Grammar Parser


CONTENTS

Conjunctions
1.1. Conjunction overview
1.2. Miscellaneous Idiomatic Conjoining Expressions
1.3. Known problems

Conjunction overview

Earlier versions of the Link Grammar parser (those prior to version 4.7.0) used a different, special-purpose technique for parsing sentences containing coordinating conjunctions and other correlative structures. This technique, known as "fat links", proved to have a number of disadvantages; it was deprecated in version 4.7.0 (2010) and finally removed in version 5.2.0 (2014). What follows is a description of the current mechanism.

Conjunction refers to the joining together of different parts of a sentence using either coordinating conjunctions, correlative conjunctions, or subordinating conjunctions. The most common coordinating conjunctions are "and", "or", "nor" and "but"; they are used to join together two grammatically similar or identical words or phrases. Correlative conjunctions and subordinating conjunctions are set phrases enforcing more complex, long-range structure. Examples include "either ... or ...", "not only ... but also ...", "if ... then ...", and so on. Conjunctions of any of these types are (mostly) handled by the same technique within link grammar, although there is some variation. The strongest link grammar patterns occur for coordinating conjunctions, which are supported by a half-dozen distinct link types, roughly aligned with the part-of-speech or grammatical type of the words or phrases involved. These links include:

Many of these link types have several common, recurring themes in their structure and use. In general, these links only connect words with the same part-of-speech. Thus, for example: "The black and white cat sleeps" gives the following parse:

       +-------------Ds-------------+        
       |             +-------A------+        
       |     +--AJl--+--AJr--+      +---Ss--+
       |     |       |       |      |       |
      the black.a and.j-a white.a cat.n sleeps.v 
Here, the AJ link connects each adjective to the conjunction "and". The resulting conjunction behaves as if it were an adjective itself: so "and" will link to "cat" with the A link, just as an ordinary adjective would. This arrangement is sometimes referred to as a "Prague style" dependency. It imposes a fundamentally hierarchical ordering: "and" acts as a head-word, with two dependent words, "black" and "white"; the combined phrase "black and white" acts as a single adjective. This style for handling coordination is used for other parts of speech as well; so, for example, SJ links combine nouns into a noun phrase:
                +------Spx------+       +----Js----+
         +-SJls-+--SJrs-+       +--MVa--+    +--Ds-+
         |      |       |       |       |    |     |
      Jack.b and.j-n Jill.f fell.v-d down.r the hill.n 
Here, the conjoined "Jack and Jill" acts as a (plural) subject.
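
As a rough illustration of the head-word arrangement described above, the links in the "black and white cat" diagram can be written down as plain data and queried. This is only a minimal Python sketch, not part of Link Grammar itself: the triples are transcribed by hand from the diagram, and the names CAT_LINKS, conjuncts and outside_links are hypothetical helpers introduced for this example.

```python
# Links transcribed by hand from the "black and white cat" diagram,
# written as (left_word, right_word, link_label) triples.
CAT_LINKS = [
    ("the", "cat.n", "Ds"),
    ("and.j-a", "cat.n", "A"),
    ("black.a", "and.j-a", "AJl"),
    ("and.j-a", "white.a", "AJr"),
    ("cat.n", "sleeps.v", "Ss"),
]


def conjuncts(links, head, prefix="AJ"):
    """Words joined to `head` by <prefix>l / <prefix>r links, left side first."""
    left = [l for (l, r, label) in links
            if r == head and label.startswith(prefix + "l")]
    right = [r for (l, r, label) in links
             if l == head and label.startswith(prefix + "r")]
    return left + right


def outside_links(links, head, prefix="AJ"):
    """Links that touch `head` but are not conjunction links."""
    return [(l, r, label) for (l, r, label) in links
            if head in (l, r) and not label.startswith(prefix)]


print(conjuncts(CAT_LINKS, "and.j-a"))
print(outside_links(CAT_LINKS, "and.j-a"))
```

The first call lists the two adjectives that depend on "and"; the second shows that, seen from the rest of the sentence, "and" carries only the single A link that an ordinary adjective would carry.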

The subtypes AJl and AJr, standing for "left" and "right", are used to maintain proper sequential ordering; this is useful for properly managing comma-conjoined lists:

       +-------------------Ds------------------+        
       |          +-----AJl-----+-------A------+        
       |     +-AJl+-AJr-+       +--AJr--+      +---Ss--+
       |     |    |     |       |       |      |       |
      the black.a , orange.a and.j-a white.a cat.n sleeps.v 
The l,r subtype is also used for most of the other links (so, for example, VJl and VJr), with a few exceptions: the NI link uses NIf and NIt subtypes for numerical ranges, indicating the "from" and "to" ends of the range.
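
In the comma-conjoined diagram above, the comma acts as a head for "black" and "orange", and is itself the left-hand dependent of "and". The following Python sketch shows how the l/r subtypes allow the nested structure to be flattened back into the original word order. It is an illustration only: the triples are transcribed by hand from the diagram, and LIST_LINKS and flatten_conjuncts are hypothetical names, not part of the Link Grammar API.

```python
# Links transcribed by hand from the "black, orange and white cat" diagram,
# written as (left_word, right_word, link_label) triples.
LIST_LINKS = [
    ("the", "cat.n", "Ds"),
    ("black.a", ",", "AJl"),
    (",", "orange.a", "AJr"),
    (",", "and.j-a", "AJl"),
    ("and.j-a", "white.a", "AJr"),
    ("and.j-a", "cat.n", "A"),
    ("cat.n", "sleeps.v", "Ss"),
]


def flatten_conjuncts(links, head):
    """Recursively expand `head` into the ordered list of words it conjoins.

    A word with no AJl link arriving from its left and no AJr link leaving
    to its right is treated as a leaf and returned unchanged.
    """
    left = [l for (l, r, label) in links if r == head and label == "AJl"]
    right = [r for (l, r, label) in links if l == head and label == "AJr"]
    if not left and not right:
        return [head]
    words = []
    for word in left + right:
        words.extend(flatten_conjuncts(links, word))
    return words


print(flatten_conjuncts(LIST_LINKS, "and.j-a"))
```

Because AJl always arrives from the left and AJr always leaves to the right, visiting the left dependents before the right ones recovers the adjectives in sentence order.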

Other subtypes are used for enforcing agreement of various sorts, such as number and tense; so, for example: "cars and trucks are vehicles" but "*car and truck are vehicles", "*car and truck is vehicle". Agreement is discussed in greater detail in each individual link-type page.

The basic mechanism shown above can be extended to any sort of multi-word correlative conjunction, such as "neither ... nor ..." or "not only ... but also ...". Agreement between the various parts is enforced by means of the XJ link. So, for example:

        +------XJn------+------S*x-----+       
        |        +-SJls-+--SJrs-+      +---I--+
        |        |      |       |      |      |
    neither.r Jack.b nor.j-n Jill.f will.v come.v 
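
The role of the XJ link in the diagram above can be read off in the same data-oriented way. The Python sketch below is an illustration under stated assumptions, not Link Grammar code: the triples are transcribed by hand from the diagram, and NEITHER_NOR_LINKS and describe_correlative are hypothetical names.

```python
# Links transcribed by hand from the "neither Jack nor Jill" diagram,
# written as (left_word, right_word, link_label) triples.
NEITHER_NOR_LINKS = [
    ("neither.r", "nor.j-n", "XJn"),
    ("Jack.b", "nor.j-n", "SJls"),
    ("nor.j-n", "Jill.f", "SJrs"),
    ("nor.j-n", "will.v", "S*x"),
    ("will.v", "come.v", "I"),
]


def describe_correlative(links, conj):
    """Collect the opening word (via XJ) and the two conjuncts (via SJl/SJr)."""
    opener = [l for (l, r, label) in links
              if r == conj and label.startswith("XJ")]
    left = [l for (l, r, label) in links
            if r == conj and label.startswith("SJl")]
    right = [r for (l, r, label) in links
             if l == conj and label.startswith("SJr")]
    return {"opener": opener, "left": left, "right": right}


print(describe_correlative(NEITHER_NOR_LINKS, "nor.j-n"))
```

The conjunction "nor" is still the head of the noun phrase, exactly as "and" was for "Jack and Jill"; the XJ link simply adds one more dependent, tying the opening word "neither" to it.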

Miscellaneous Idiomatic Correlating Expressions

Although the words "and" and "or" are the conjunctions most commonly used to correlate words and phrases, there are also many more complex idiomatic expressions that are used for coordination. These include:

Few of the above are implemented; those that are, are handled with the XJ, RJ and V links.

Known problems

The implementation of the VC link is irregular, and overlaps significantly with the MVs link. It also behaves very differently from the AJ, SJ, and VJ links. Both VC and MVs need review and, possibly, redesign.

The V link deals with adpositions, but somewhat irregularly. Notice how the adpositions have a conjoining-like behaviour, but different ... The overall theory should be clarified and unified. See the README file that is shipped with the distribution for more information.

The problem of complex verbs

The "Prague style" dependency works well when the conjoined words or phrases have a simple relationship to the rest of the sentence, but is problematic for more complex verb phrase conjunctions. Consider the sentence "I taught these mice to jump, and those mice to freeze." The first part of the sentence

            +-------TOo-----+
            +---Op------+   |
       +-Sp-+     +-Dmc-+   +-I-+
       |    |     |     |   |   |
       I taught these mice to jump
shows that "taught" acts as a head-word, and has two links to the right: the object link Op and the TOo link. A similar linkage is desired for the second half of the sentence; but how should this be achieved? There are two possibilities; neither is implemented by the current version of link-grammar, for reasons explained below.

The first possibility is to imprint the verb-linking rules onto the conjunction, asking "and" to behave as if it were a verb:

            +-----------V*taught----+
            +------TOo--            +-------TOo----+
            +---Op--                +---Op-----+   |
       +-Sp-+                       |    +-Dmc-+   +-I-+
       |    |                       |    |     |   |   |
       I taught these mice to jump and those mice to freeze
In the above, the verb "taught" links as usual, but also sports a custom VJ link to "and" (labelled V*taught in the diagram) that makes the "and" behave just like the verb. The problem with this scheme is that there are many exceptional verb linkages, and this requires a custom VJ link for each exceptional case (so that the correct linkage for "and" can be forced).

The second possibility is to attach all of the links to the conjunction, as below:

            +-----V*super*and-------+
            |          +--Op -------+-------TOo----+
            |          |    +--TOo--+---Op-----+   |
       +-Sp-+               |       |    +-Dmc-+   +-I-+
       |    |                       |    |     |   |   |
       I taught these mice to jump and those mice to freeze
The above somewhat resembles the simple "Prague style" conjunction linkage, but is also unsatisfying: by forcing both left- and right-directed links onto the "and", it makes it difficult or even impossible to determine the verb-phrase structure of the sentence. Do all left-facing links form one verb-phrase, and all right-facing links form another? Or perhaps some of the left- and right-facing links should be grouped into one verb-phrase, and the remainder into another? This could be clarified by adding "phantom nodes" to indicate the desired grouping; however, the current link grammar has absolutely no concept of a "phantom node".

Given these two choices, the first possibility appears to be the more natural, as it offers the greater fidelity to verb-phrase structure. It does introduce the concept of "link transference onto the conjunction" in order to make it workable. Although this can be implemented within the current theory of link-grammar, it does force the "and" to be festooned with a large variety of linkage rules, one for each exceptional verb linkage. There is currently no way of instructing the parser to treat the "and" "just like the verb before it".
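
The cost of "link transference" can be made concrete with a toy model. The Python sketch below is not how link-grammar works internally and does not use dictionary syntax; it only counts rules. The (Op, TOo) frame for "taught" and the V*taught label come from the diagrams above; the other two verbs and the link names X1 and X2 are invented placeholders, and conjunction_rule is a hypothetical helper.

```python
def conjunction_rule(verb, right_links):
    """Toy description of the rule "and" would need to imitate one verb frame."""
    return {
        # Link back to the verb, labelled as in the first-possibility diagram.
        "left": ["V*" + verb],
        # Copies of the verb's own right-going links, transferred onto "and".
        "right": list(right_links),
    }


# Only the "taught" frame is taken from the text. "verb_a", "verb_b",
# "X1" and "X2" are made-up placeholders showing how rules multiply.
FRAMES = {
    "taught": ("Op", "TOo"),
    "verb_a": ("X1",),
    "verb_b": ("X1", "X2"),
}

RULES = {verb: conjunction_rule(verb, links) for verb, links in FRAMES.items()}

for verb, rule in RULES.items():
    print(verb, rule)
print(len(RULES), "separate rules needed for", len(FRAMES), "verb frames")
```

In this toy model every distinct verb frame forces one more rule onto "and", which is the proliferation the paragraph above describes.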


Linas Vepstas
Last modified: 8 September 2010