Xtext Cross References and Scoping – an Overview (Part 1)

Grammar Aspects

As a side note: There are not many naming conventions the grammar language actually enforces. However, almost all conventions that have an actual impact on the behaviour have to do with cross referencing (I am referring to features with the name name, importURI and importedNamespace).

References

Xtext distinguishes containment references (the referenced object is contained/declared/defined in the referencing object; in the model file reference and definition basically coincide) from cross references (the referenced object is defined somewhere else, the referencing object has a link to the referenced object). I assume the reader is aware that this series talks only about the second type.

In the grammar of your DSL you specify that there should be a cross reference to an object, you specify the type of the referenced object and you specify the syntax of the name used for establishing the reference. Note that within the grammar you can only restrict the type of the referenced object. Further restrictions on the semantic validity of a cross reference have to be enforced elsewhere. Consider the following well-known grammar snippet

Entity: ... 'extends' extends=[Entity] ...;

The square brackets indicate that there will be a cross reference. The word “Entity” within the brackets indicates that the referenced object is of type Entity. It is very important to understand, that this Entity does not refer to the Grammar rule with name Entity. Further this rule states that the name that is to be establish the reference has the ID-syntax. This is because [Type] is a shorthand notation for [Type|ID] which means “cross reference to an object of type Type via a name that has the syntax of an ID”. The pipe symbol here is not indicating an alternative! It simply separates the referenced type from the string-returning-grammar-rule-describing-the-allowed-syntax. A more verbose version of the above snippet would be

EntityRule returns Entity: ... 'extends' extends=[Entity|ID]...;

Given you inherit the default ID-terminal rule, in your model file you can now write “… extends SuperType …” but not “… extends the.package.of.SuperType…”, i.e. you can use only “simple” names to establish the reference but not qualified ones. For that you would have to change the grammar to something like

Entity: ... 'extends' extends=[Entity|Fqn]...;
Fqn: ID ('.' ID)*;

For a reference you can only define one target type. That is, the following is not valid

Rule: ... ref=[(Type1|Type2) | ID]...

You would have to introduce a common super type of the two

SuperType: Type1|Type2;
Rule: ...ref=[SuperType|ID]...

Note that if you have no influence over the meta model (i.e. you cannot simply introduce a common super type for Type1 and Type2), you usually take the most specific existing common supertype (usually at least EObject is a candidate) and adapt the framework at the suitable hooks (i.e. validation, scoping, content assist).

A common language design problem is the following

Rule1: ... ( refToType1=[Type1] | refToType2=[Type2] )... //or
Rule2: ... (ref=[Type] | simpleValue=ID)...

The problem is that the parser has to decide which alternative is to be used. This is not possible as there is no syntactic indication — in each case an ID is read. Also, the developer cannot argue that “in case a Type1 is found link it, in case Type2 is found link that” should be the way to go. The parser is not responsible for linking, it only installs proxies to be resolved later. The usual pattern for resolving these problems would be the introduction of common super types or the introduction of keywords into the language.

Naming

As stated above, a name provider is responsible for giving a name to an object so that it may be referenced at all. In order to provide nice out-of-the-box behaviour, the following convention is followed. If an object has a feature with a string-type, whose name is “name”, the default name provider implementation will use the value of that feature for calculating the name of an object. This is why cross referencing works (immediately) for

Entity: 'entity' name=ID ...; //or
Entity: 'entity' name=STRING ...;

but not for

Entity: 'entity' id=ID ...;

If you look at the workflow for generating the language infrastructure, you will notice two naming fragments (SimpleNameFragment, QualifiedNameFragment) that may be used. When using the first one, only the name of the object itself is considered. When using the second, the actual name of an object is composed using all the names found in the containment hierarchy. Given the grammar snippet

Entity: 'entity' name=ID '{' attributes+=Attribute* '}';
Attribute: 'attribute' name=ID;

and the model file

entity X {attribute Att1 attribute Att2}
entity Y {attribute Att1 attribute Att2}

both fragments would yield entities with names X and Y. The simple name fragment would yield two attributes with name Att1 and two with name Att2. Note that here you might run into problems when trying to reference a specific Att1. The qualified name fragment would calculate the following names for the attributes: X.Att1, X.Att2, Y.Att1, Y.Att2.

The primary hook for adapting the naming is the IQualifiedNameProvider. Beware of trying to calculate names that require resolving links (i.e. follow any cross reference). Name calculation is done prior to indexing which has to be finished before linking; you see the problem here…

Imports

Import constructs are very common in languages. Xtext supports two import types out of the box, importing a particular file via a URI pointing to that file and importing a namespace (analogous to Java imports). Note that these imports only affect the visibility of objects. The import does not make the imported stuff part of the importing model. You choose the import semantic to be used by enabling the corresponding generator fragment (often the URI import goes along with the simple names and the name space import with qualified names). The grammar convention going along with the import semantics can also be found in the documentation. A URI import of a particular file usually looks as follows:

Import: 'import' importURI=STRING;

Again the name of the feature is important. If any feature with name “importURI” is found the value (in case it is a string-value) is interpreted as the URI of a model file whose objects should be made visible. Usually one uses STRING (rather than ID or some other datatype rule). A namespace import normally looks as follows:

Import: 'import' importedNamespace=FqnWithWildCard;
FqnWithWildCard: Fqn('.*')?;
Fqn:ID('.'ID)*

Now if you have a model like

import my.package.*
import my.other.package.*
entity X extends Y...

the cross reference resolution would not only look for elements with name Y, but also my.package.Y and my.other.package.Y. So it normalises the name to be used for cross referencing (Y) against all possible imports as well.