DIT++ Taxonomy of Dialogue Acts, Annotation Scheme, and DiAML Markup Language

Release 5.2 (2019)


DIT++ is a semantically based framework for the analysis of human and human-machine dialogue, and for annotating dialogue with information about the communicative acts ('dialogue acts') that are expressed by dialogue segments. DIT++ consists of (1) a comprehensive, application-independent multidimensional taxonomy of communicative functions which are semantically defined in terms of their information-state changing potential, (2) the definition of a set of 10 orthogonal dimensions to which a dialogue act may belong, which offers a basis for understanding the multifunctionality of utterances in dialogue, (3) the definition of various kinds of semantic and pragmatic relations between dialogue acts, and (4) the specification of a small set of 'qualifiers' that may be used to indicate a speaker's uncertainty, reservations ('conditionality'), or sentiment. The Dialogue Act Markup Language (DiAML) was designed to use the concepts of DIT++ in dialogue act annotation and in the specification of dialogue acts in online recognition, interpretation, or generation of spoken, written, or multimodal dialogue.

The DIT++ taxonomy was constructed by extending the taxonomy of Dynamic Interpretation Theory (DIT), originally developed for information dialogues (Bunt, 1989; 1994), with a number of dialogue act types from DAMSL (Allen & Core, 1997) and other annotation schemes and dialogue studies.

Release 5.1 was developed in tandem with the definition of ISO standard 24617-2:2012 (September 2012) for dialogue act annotation. This concerns in particular (1) the definitions of the communicative functions in the DIT++ taxonomy and those included in the ISO standard, which have been made identical, and (2) the definition of the DiAML markup language, which can be used both with the concepts defined in DIT++ and with those defined in the ISO standard (see the annotated examples). The DIT++ release 5.1 annotation scheme is thus fully compatible with ISO 24617-2:2012. It is in some respects more fine-grained than the ISO scheme; where the latter includes 56 communicative functions, the DIT++ scheme (release 5.1) contains 88 functions, including notably more detailed feedback functions, more functions for discourse structuring and for social aspects of interacting, and functions for contact management (a dimension that is not included in the ISO 24617-2:2012 standard).

Experiences in the use of DIT++ Release 5.1 and of the ISO 24617-2 annotation scheme have inspired some improvements and extensions to DIT++ release 5.1 which are included in Release 5.2 and which are the basis of the second edition of the ISO 24617-2 standard, established in December 2020. Release 5.2 is an upward compatible revision of Release 5.1 in the sense that all annotations made according to Release 5.1 are also valid according to Release 5.2. The taxonomy of communicative functions is only extended; some other aspects have been improved. Release 5.2 includes a number of new elements that allow more accurate annotation of relations among dialogue acts. Moreover, the concept of a 'plug-in annotation scheme' has been introduced (Bunt, 2019), which allows various ways of enriching and customizing dialogue act annotation/specification. In particular, plug-in schemes are defined for (1) enriching DIT++ descriptions of dialogue acts with semantic content information; (2) introducing task- or domain-specific communicative functions; (3) annotating casual talk, for example in the opening and closing phases of a dialogue; and (4) indicating speaker emotions, importing elements from EmotionML. See 'New in Release 5.2' for a summary description of what's new in this release.

The concepts of DIT++ have been applied and evaluated in a number of annotation efforts and in the design of the ISO 24617-2 standard for dialogue act annotation. For some of its applications to annotation, see Geertzen and Bunt (2006), Petukhova and Bunt (2007), Geertzen et al. (2007), Petukhova (2011), Fang et al. (2012), Petukhova and Bunt (2012), and Bunt et al. (2019).

Another application is in the design of a dialogue manager module that is capable of generating multifunctional contributions to a dialogue; see Keizer and Bunt (2006), Keizer and Bunt (2007), Keizer et al. (2011), Malchanau et al. (2015), Malchanau et al. (2018), and Malchanau (2019).

For the use of the DIT++ taxonomy and DIT more generally in other studies of dialogue see: Geertzen (2009), Morante (2007), Bunt (2011), Petukhova (2011), and the publications listed in Part 7 of this document.

The rest of this documentation consists of seven parts:

Part 1 explains in some detail the novel elements in Release 5.2.
Part 2 shows the taxonomy of communicative functions. The hierarchical relations in the taxonomy, indicated by indentation, represent relative degrees of specificity of dialogue acts, in the sense that a more specific act has stronger preconditions than a less specific act (which dominates it in the taxonomy); in other words, the preconditions of a more specific dialogue act logically entail those of any dominating act in the hierarchy. A communicative function inherits all the preconditions of its ancestors in the hierarchy. For instance, a Check Question is more specific than a Propositional Question because it has an additional precondition, concerning the speaker's expectation of the answer. Similarly, a Confirm(ation) is more specific than a Propositional Answer. This is reflected in the taxonomy by Check Question being dominated by Propositional Question, and Confirm by Propositional Answer.
Part 2 also contains the definitions of all the communicative functions; you can consult the definition of a communicative function by clicking on its name in the taxonomy.
Part 3 gives some examples (not yet updated for release 5.2) of the linguistic and/or nonverbal expression of these functions; to see examples, click on a definition.
Part 4 contains a specification of the Dialogue Act Markup Language (DiAML).
Part 5 contains guidelines for how to use the taxonomy in dialogue act annotation using DiAML and the DIT++ concepts.
Part 6 provides some brief information about the ISO 24617-2 dialogue act annotation standard.
Part 7 contains a list of publications relating to the DIT++ taxonomy or to the underlying theory (DIT).
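
The inheritance of preconditions described under Part 2 can be illustrated with a small sketch. This is not part of DIT++ or DiAML: the class `CommunicativeFunction` and the precondition labels are hypothetical placeholders, chosen only to show that a more specific function (such as Check Question) carries all preconditions of the function that dominates it (Propositional Question) plus at least one of its own.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class CommunicativeFunction:
    """A node in a taxonomy; the precondition strings are illustrative labels only."""
    name: str
    own_preconditions: FrozenSet[str]
    parent: Optional["CommunicativeFunction"] = None

    def preconditions(self) -> FrozenSet[str]:
        # A function inherits every precondition of its ancestors.
        inherited = self.parent.preconditions() if self.parent else frozenset()
        return inherited | self.own_preconditions

    def is_more_specific_than(self, other: "CommunicativeFunction") -> bool:
        # True if `other` dominates this function in the hierarchy.
        node = self.parent
        while node is not None:
            if node == other:
                return True
            node = node.parent
        return False


# Placeholder preconditions (assumptions, not the official DIT++ definitions).
propositional_question = CommunicativeFunction(
    "propositionalQuestion", frozenset({"placeholder: precondition of a propositional question"})
)
check_question = CommunicativeFunction(
    "checkQuestion",
    frozenset({"placeholder: speaker has an expectation about the answer"}),
    parent=propositional_question,
)
```

Under these assumptions, `check_question.preconditions()` is a proper superset of `propositional_question.preconditions()`, which mirrors the entailment relation between the two sets of preconditions.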

New in release 5.2:

Specification of semantic content: in previous releases the semantic content of dialogue acts could not be annotated. This is a limitation that restricts the usefulness of DIT++ for online dialogue act understanding and dialogue generation.

To overcome this limitation, two new concepts are introduced: tripartite plug-in annotation schemes and plug-in interfaces, which together allow DiAML annotations to be enriched with the annotation of their semantic content, by linking structures of the host annotation scheme to those of the plug-in scheme. More specifically, in order to add semantic content information to dialogue act annotations, an interface is specified which defines a new type of link structure, a 'content link structure' in the DiAML abstract syntax, and in the concrete syntax a '<contentLink>' element. This allows the functional specification of a dialogue act to be related to its semantic content, leading to representations in the following form (where "#z" refers to a semantic content annotation provided by a plug-in):
```
(N1) <dialogueAct xml:id="da1" target="#m1" speaker="#s" addressee="#a"
                  dimension="task" communicativeFunction="inform"/>
     <contentLink dialAct="#da1" content="#z"/>
```
The annotation of semantic content by means of such a plug-in is optional. The use of an explicit link between the functional aspects of a dialogue act and its semantic content allows the use of alternative plug-ins for content annotation, and offers the possibility to customize the content annotation to a specific application. It also enables the specification of additional information attached to the content link, such as (un)certainty scores and alternatives, and gives support to the management of ambiguities in the semantic content. See below for more about plug-in annotation schemes and interfaces.
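
As a rough illustration of how a consumer of such annotations might follow a content link, the sketch below parses a fragment like (N1) with Python's standard `xml.etree.ElementTree`. The wrapper element `<diaml>` and the helper names are assumptions made for this example; they are not defined by DiAML.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined "xml:" prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The <diaml> wrapper is hypothetical; it only gives the fragment a single root.
DIAML = """
<diaml>
  <dialogueAct xml:id="da1" target="#m1" speaker="#s" addressee="#a"
               dimension="task" communicativeFunction="inform"/>
  <contentLink dialAct="#da1" content="#z"/>
</diaml>
"""


def content_links(root):
    """Map each dialogue act id to the id of its linked content annotation."""
    return {
        link.get("dialAct").lstrip("#"): link.get("content").lstrip("#")
        for link in root.iter("contentLink")
    }


def acts_with_content(root):
    """Return (act id, communicative function, content id or None) triples."""
    links = content_links(root)
    return [
        (act.get(XML_ID), act.get("communicativeFunction"), links.get(act.get(XML_ID)))
        for act in root.iter("dialogueAct")
    ]


root = ET.fromstring(DIAML)
acts = acts_with_content(root)
```

Because the content annotation is optional, a dialogue act without a `contentLink` simply yields `None` as its content id in this sketch.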
Use of 'reference segments': stretches of dialogue to which a feedback act (or an own communication management (OCM) act, or a partner communication management (PCM) act) refers and which are not functional segments.
Feedback acts are about the processing of something that was said earlier in the dialogue; feedback dependence relations link a feedback act to this 'something'. The nature of this 'something' depends on the 'level' of the feedback.
- Feedback about paying attention is mostly given in nonverbal form, such as by eye contact.
- Positive feedback at the level of perception can be expressed by echoing what the previous speaker said; negative feedback by repeating part of what was said with a questioning intonation, like "Tuesday?", or "John WHO?".
- Feedback at the level of understanding can be expressed for example by "I see" or by paraphrasing something.
- Positive feedback at the level of evaluation can be expressed by "Excellent!", "True", or "Good question"; negative feedback for example by "Really?" or "Are you sure?".
- Positive feedback at the level of execution can be expressed by "Sure"; negative feedback for instance by "I don't know" in response to a question.
Feedback by means of expressions such as "OK", "Uh-huh", or "Really?" says something about a previous dialogue act, while feedback by means of "Tuesday?" or "John WHO?" is about a particular word or dialogue segment. DIT++ therefore allows both dialogue acts and functional segments as antecedents for feedback dependence relations. This is not always accurate, since segment-related feedback is not necessarily about a functional segment; it may be about any previous segment, functional or not, such as a single word. Reference segments are introduced in release 5.2 for more accurate markup of feedback dependences, which can also be used for OCM and PCM acts.
Argument roles in rhetorical relations: In the previous release of DIT++ and DiAML, only a limited provision was available for indicating the existence of a rhetorical relation between two dialogue acts, as illustrated in (N2).
```
(N2) a.  A: Have you seen Pete today?
         B: He didn't come in; he has the flu.

     b.  <dialogueAct xml:id="da1" target="#fs1" sender="#a" addressee="#b" dimension="task" communicativeFunction="propositionalQuestion"/>
         <dialogueAct xml:id="da2" target="#fs2" sender="#b" addressee="#a" dimension="task" communicativeFunction="answer"
           functionalDependence="#da1"/>
         <dialogueAct xml:id="da3" target="#fs3" sender="#b" addressee="#a" dimension="task" communicativeFunction="inform"/>
         <rhetoricalLink dact="#da3" rhetoAntecedent="#da2" rhetoRel="cause"/>
```
One important limitation of annotating rhetorical relations in this way is that it is not possible to indicate which argument of a relation has which role. For example, the annotation in (N2) merely says that a causal relation exists between the two dialogue acts, but it cannot indicate that the second argument causes the first, rather than the other way round. To overcome this limitation, the <drLink> construct has been introduced in Release 5.2, inspired by the way semantic relations in discourse are annotated in ISO 24617-8:2016 ('Semantic relations in discourse'). It allows the bottom line in (N2b) to be replaced by (N3):
```
(N3) <drLink arg1="#da2" arg2="#da3" rel="cause">
        <argRole arg="#da2" role="result"/>
        <argRole arg="#da3" role="reason"/>
     </drLink>
```
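What the argument roles add can be seen by reading them back from the markup. The following sketch assumes a `<drLink>` that relates the dialogue acts da2 and da3 of (N2), and uses only Python's standard XML parser; the function name `roles_by_argument` is hypothetical.

```python
import xml.etree.ElementTree as ET

# A drLink relating the answer (da2) and the inform act (da3) of example (N2).
DR_LINK = """
<drLink arg1="#da2" arg2="#da3" rel="cause">
  <argRole arg="#da2" role="result"/>
  <argRole arg="#da3" role="reason"/>
</drLink>
"""


def roles_by_argument(dr_link):
    """Map each role name to the id of the dialogue act that fills it."""
    return {
        arg_role.get("role"): arg_role.get("arg").lstrip("#")
        for arg_role in dr_link.iter("argRole")
    }


link = ET.fromstring(DR_LINK)
roles = roles_by_argument(link)
```

With the roles made explicit, a consumer can tell that da3 ("he has the flu") is the reason and da2 ("He didn't come in") the result, which the bare `rhetoRel="cause"` attribute cannot express.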
More dialogue act types for social talk: Like most annotation schemes, DIT++ was originally intended for the analysis and description of task-oriented dialogues where the participants have a clear purpose, such as finding a train connection, making an appointment, designing a remote control, or finding a route on a map. The DIT++ taxonomy in previous releases therefore included only a very small number of communicative functions for 'social' activities such as greeting, thanking, and leavetaking.
Everyday conversations such as a chat with a neighbour or with a colleague at the coffee machine often do not have such a well-delineated task as their motivation, but are aimed at a social purpose, such as establishing a pleasant atmosphere or maintaining a good relationship. Task-related dialogues often have an initial phase in which the participants exchange small talk before getting to a specific task, and such initial phases have often been omitted in dialogue corpora, where the initial small talk is viewed as occurring 'before' the 'actual' dialogue. An exception is the ADELE corpus of casual conversations (Gilmartin et al., 2018) in the form of textual chat dialogues. The dialogues in this corpus often have rather elaborate initial phases with sequences of greetings and discussions of each other's health, and sometimes also an extended leavetaking phase with various kinds of greetings and well-wishing. In order to annotate the communicative functions in such phases in a satisfactory way, Release 5.2 includes several additional communicative functions in the Social Obligations Management dimension.

Plug-in annotation schemes are mini annotation schemes that can be added on to a host annotation scheme. In order to see how this can be done, it should be kept in mind that, according to the fundamental principles of semantic annotation as formulated in Bunt (2015) and as laid down in ISO 24617-6, a semantic annotation scheme has a three-part definition, consisting of

an abstract syntax which specifies the possible annotation structures at a conceptual level as set-theoretical constructs, such as pairs and triples of concepts;
a semantics which specifies the meaning of the annotation structures defined by the abstract syntax;
a concrete syntax which specifies a representation format for annotation structures (for example using XML).

Formally, the definition of an annotation scheme is thus a triple L = 〈AS, CS, Sm〉, formed by specifications of an abstract syntax (AS), a concrete syntax (CS), and a semantics (Sm). Each of these components is further structured as follows:

The abstract syntax specification AS is a pair 〈CI, AC〉, consisting of the conceptual inventory (CI) and the specification of conceptual entity structures and link structures (AC); together, these define the class of well-formed annotation structures.
The concrete syntax specification CS is a triple 〈V, CC, F〉, where V is a vocabulary, CC is the specification of a class of syntactic structures (such as XML elements), and F is an encoding function that maps AS-annotation structures to CS-representations. The components V and CC together define a class of well-formed representations, and F assigns such a representation to every well-formed annotation structure.
The semantic specification Sm is a pair 〈M, I〉, consisting of a model and an interpretation function. For the DIT++ host annotation scheme, the semantics uses a context model (or 'information state') and an interpretation function defined in terms of context updates (see Bunt, 2014, for details).

Augmenting a host annotation scheme with a 'plug-in' requires augmentation at all three levels: abstract syntax, concrete syntax, and semantics. In order to make this possible, the specification of a plug-in includes specifications at all three levels. For example, see the specification below of a simple plug-in for semantic content expressed as attribute-value pairs. More details and discussion can be found in Bunt (2019).

Plug-in annotation schemes and interfaces for semantic content
The degree of detail in which semantic content of a dialogue act is appropriately represented depends on the application domain. For some domains a simple representation as a list of attribute-value pairs may be adequate; for others a representation in terms of events with their participants, time and place may be more appropriate; for more advanced applications it may be necessary to take general aspects of natural language utterance meaning into account, including quantification and modification phenomena.

In any case, the use of a semantic content plug-in PL_c for the host annotation scheme L_a requires a plug-in interface _aY_c, which can be defined as shown in (P1): the abstract syntax introduces the content link structure as a pair consisting of a dialogue act entity structure ('a') and a content entity structure ('c'); the concrete syntax specifies its XML encoding; and the semantics specifies its meaning as the application of the function I_a(a), defined by the semantics of the host annotation scheme, to the argument I_c(c), defined by the plug-in semantics. This semantics reflects the dialogue act theory underlying DIT++, according to which the semantics of a full-blown dialogue act is an update operation on information states, defined by applying the semantics of the functional part of the dialogue act to its semantic content (which is computed as the semantic interpretation of the content annotation).

(P1) _aY_c = 〈_aAS_c, _aCS_c, _aSm_c〉, with:

_aAS_c = 〈_aCI_c, _aAC_c〉 = 〈∅, {〈a, c〉 content link structure}〉
_aCS_c = 〈_aV_c, _aCC_c, _aF_c〉 = 〈∅, {<contentLink> element}, _aF_c(〈a, c〉) = <contentLink dialAct=F_a(a) content=F_c(c)/>〉
_aSm_c = 〈_aM_c, _aI_c〉 = 〈∅, _aI_c(〈a, c〉) = I_a(a)(I_c(c))〉

The DIT++ (release 5.2) host annotation scheme L_a = 〈AS_a, CS_a, Sm_a〉 = 〈〈CI_a, AC_a〉, 〈V_a, CC_a, F_a〉, 〈M_a, I_a〉〉, together with a content plug-in PL_c = 〈〈CI_c, AC_c〉, 〈V_c, CC_c, F_c〉, 〈M_c, I_c〉〉 and an interface _aY_c = 〈〈∅, {〈a, c〉}〉, 〈∅, {<contentLink>}, _aF_c〉, 〈∅, _aI_c〉〉 (with _aI_c(〈a, c〉) = I_a(a)(I_c(c))), defines an extended annotation scheme L_a+c formed by the unions of their components:

(P2) L_a+c = 〈〈CI_a ∪ CI_c, AC_a ∪ AC_c ∪ {〈a, c〉}〉, 〈V_a ∪ V_c, CC_a ∪ CC_c ∪ {<contentLink>}, F_a ∪ F_c ∪ _aF_c〉, 〈M_a ∪ M_c, I_a ∪ I_c ∪ _aI_c〉〉

The union of these components forms a useful annotation scheme only if two important properties of the conceptual inventory of the host annotation scheme are preserved: the orthogonality of the set of dimensions and the taxonomic structure of the set of communicative functions. In other words, if a plug-in introduces a new dimension, then this should be orthogonal to the dimensions defined in the host scheme, and if it introduces additional communicative functions, then these should fit in the taxonomy of the host scheme. For semantic content plug-ins no such issues arise, but for other plug-ins they may.
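
The component-wise union that produces the extended scheme can be mimicked with plain sets. The sketch below is a deliberately flattened toy model, not a formalization from the DIT++ documentation: each of the seven components (CI, AC, V, CC, F, M, I) is represented as a set of names, and all names inside the sets are placeholders chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import FrozenSet


@dataclass(frozen=True)
class Scheme:
    """Toy model of an annotation scheme; every component is a set of names."""
    CI: FrozenSet[str] = frozenset()  # conceptual inventory
    AC: FrozenSet[str] = frozenset()  # kinds of entity and link structures
    V: FrozenSet[str] = frozenset()   # vocabulary
    CC: FrozenSet[str] = frozenset()  # classes of XML elements
    F: FrozenSet[str] = frozenset()   # encoding functions
    M: FrozenSet[str] = frozenset()   # models
    I: FrozenSet[str] = frozenset()   # interpretation functions

    def union(self, *others: "Scheme") -> "Scheme":
        # Combine schemes component by component.
        result = self
        for other in others:
            result = Scheme(**{
                f.name: getattr(result, f.name) | getattr(other, f.name)
                for f in fields(Scheme)
            })
        return result


# Placeholder contents (assumptions made for this example only).
host = Scheme(CI=frozenset({"task", "inform"}), AC=frozenset({"dialogueAct"}),
              V=frozenset({"task", "inform"}), CC=frozenset({"<dialogueAct>"}),
              F=frozenset({"F_a"}), M=frozenset({"M_a"}), I=frozenset({"I_a"}))
plugin = Scheme(CI=frozenset({"departureTime"}), AC=frozenset({"avContent"}),
                V=frozenset({"departureTime"}), CC=frozenset({"<avContent>"}),
                F=frozenset({"F_c"}), M=frozenset({"M_c"}), I=frozenset({"I_c"}))
# The interface contributes only a link structure, its XML element,
# its encoding function, and its interpretation function.
interface = Scheme(AC=frozenset({"contentLink"}), CC=frozenset({"<contentLink>"}),
                   F=frozenset({"aF_c"}), I=frozenset({"aI_c"}))

extended = host.union(plugin, interface)
```

The sketch does not check the two preservation properties discussed above (orthogonality of dimensions and the taxonomic structure of communicative functions); those would need richer representations than sets of names.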

Three plug-ins for semantic content are defined below, in order of increasing richness: one where dialogue act content is described as a set of attribute-value pairs, one where events, participants, and semantic roles are distinguished, and one where natural language quantification is additionally taken into account.

Attribute-Value plug-in
A simple domain-specific plug-in for semantic content described as (lists of) attribute-value pairs could for example be useful in a travel planning domain where a journey can be described by a few attribute-value pairs, specifying departure place, destination, travel date, etc. In such a context, the semantic content of the utterance "I’d like to leave around ten in the morning", could be annotated as in (P3b):
```
(P3) a. I'd like to leave around ten in the morning (= markable m1)
     b. <avContent xml:id="c1" target="#m1" attribute="departureTime" value="10:00"/>
```
According to the annotation theory that underlies the DIT++ scheme (Bunt, 2010; 2013; 2015; Pustejovsky et al., 2017), semantic annotations must have, besides a concrete representation format, also a format-independent abstract syntax and a semantics. Underlying the representation in (P3b) is a conceptual inventory that lists the attributes and their possible values, and the definition of an entity structure containing one or more attribute-value pairs 〈A_i, v_ij〉. The semantics of such an entity structure can be defined as a feature structure [A_i': v_ij'] or, equivalently, as the property λx. A_i'(x) = v_ij'. The variable 'x' in the lambda abstraction can in this domain be thought of as ranging over journeys. The syntax and semantics of such AV-entity structures define a very simple annotation language L_AV, the semantics of which is defined by:
```
(P4) 	I_AV(〈A_i, v_ij〉) = [I_AV(A_i): I_AV(v_ij)] = [A_i': v_ij'] 
```
To link an AV-content annotation to dialogue act annotations, the XML element <contentLink>, defined in the interface specified in (P1), can be used to obtain representations of the form (P5).
```
(P5) <dialogueAct xml:id="da1" target="#m1" speaker="#s" addressee="#a" dimension="task" communicativeFunction="inform"/>
     <avContent xml:id="c1" target="#m1" attribute="departureTime" value="10:00"/>
     <contentLink dialAct="#da1" content="#c1"/>
```
The formal specification of the tripartite attribute-value content plug-in PL_AV is as follows:
- Abstract syntax: AS_AV = 〈CI_AV, AC_AV〉:
  - the conceptual inventory CI_AV lists attributes and their possible values;
  - AC_AV: content entity structures are pairs of the form 〈m, 〈A_i, v_ij〉〉, with m a markable and 〈A_i, v_ij〉 an attribute-value pair;
- Concrete syntax: CS_AV = 〈V_AV, CC_AV, F_AV〉:
  - the vocabulary V_AV lists names of XML attributes and values;
  - CC_AV: specification of the XML element <avContent>; see (P3b);
  - encoding function F_AV: mapping from CI_AV to V_AV; encoding of AC_AV entity structures: F_AV(〈m, 〈A_i, v_ij〉〉) = <avContent xml:id="c1" target="#m" attribute="F_AV(A_i)" value="F_AV(v_ij)"/>.
- Semantics: Sm_AV uses interpretation function I_AV as defined in (P4).
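
The encoding function F_AV and the interpretation function I_AV of this plug-in are simple enough to sketch directly. The code below is an illustration under stated assumptions: the conceptual inventory is mapped to the vocabulary by the identity function, the `xml:id` is passed in by the caller, and the feature structure [A_i': v_ij'] is modelled as a Python dictionary. The function names are hypothetical.

```python
from typing import Dict, Tuple
from xml.sax.saxutils import quoteattr

AVPair = Tuple[str, str]              # an attribute-value pair
ContentEntity = Tuple[str, AVPair]    # a markable paired with an AV pair


def encode_av(entity: ContentEntity, xml_id: str) -> str:
    """Sketch of F_AV: render a content entity structure as an <avContent> element."""
    markable, (attribute, value) = entity
    return (
        f"<avContent xml:id={quoteattr(xml_id)} target={quoteattr('#' + markable)} "
        f"attribute={quoteattr(attribute)} value={quoteattr(value)}/>"
    )


def interpret_av(entity: ContentEntity) -> Dict[str, str]:
    """Sketch of I_AV: interpret the AV pair as a one-feature feature structure."""
    _, (attribute, value) = entity
    return {attribute: value}


# The content of "I'd like to leave around ten in the morning" (markable m1).
entity = ("m1", ("departureTime", "10:00"))
encoded = encode_av(entity, "c1")
meaning = interpret_av(entity)
```

Note that the markable plays a role only in the encoding; the interpretation depends on the attribute-value pair alone.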
Note that the interface _aY_AV for connecting the AV plug-in with DiAML, defined in (P1), introduces in the abstract syntax 'content link structures', which are just pairs 〈a, c〉 consisting of a dialogue act entity structure and a content entity structure. The semantic component of the interface combines the interpretation functions of the host annotation scheme and the plug-in by (P6), which says that the interpretation of the dialogue act annotation is applied (as a function) to the argument formed by the interpretation of the content annotation.
(P6) _aI_AV(〈a, c〉) = I_a(a)(I_AV(c))
This combination of the two interpretation functions is possible only if the interpretation function I_a of the host language is applicable to the output of the plug-in interpretation function. The interpretation function I_a makes use of elementary context update operators (see Bunt, 2014, for details) which are defined in a representation-neutral way, just stipulating that the given semantic content should be added to that part of the addressee's information state which contains information about the task that still has to be verified for consistency with other available information (the addressee's 'pending semantic context'). To apply this approach in a dialogue system, the elementary update operators must be instantiated for the representation formalism of the system's information state. The semantic content of dialogue acts has to be represented in a form that fits in with that formalism, and if necessary has to be converted to it. For content expressed in the form of feature structures, as is the case for I_AV, this is not an obstacle. Existing DiAML implementations in dialogue systems, such as Keizer et al. (2011), Malchanau et al. (2019), and Malchanau (2019), use typed feature structures for information state representation, making the implementation of (P6) a straightforward matter.
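
The function application in (P6) can be made concrete with a toy information state. The sketch below is a strong simplification and not the update semantics of Bunt (2014): it assumes that the information state is a dictionary with a single 'pending' list, and that interpreting any dialogue act just appends its content to that list. All names are hypothetical.

```python
from typing import Callable, Dict, List

Content = Dict[str, str]                      # a feature structure, as produced by I_AV
InformationState = Dict[str, List[dict]]      # toy context model
Update = Callable[[InformationState], InformationState]


def interpret_act(function: str, dimension: str) -> Callable[[Content], Update]:
    """Toy I_a(a): a function from semantic content to an information-state update."""
    def apply_to_content(content: Content) -> Update:
        def update(state: InformationState) -> InformationState:
            # Copy the state, then add the content to the pending context.
            new_state = {key: list(items) for key, items in state.items()}
            new_state.setdefault("pending", []).append(
                {"function": function, "dimension": dimension, "content": content}
            )
            return new_state
        return update
    return apply_to_content


def interpret_content_link(act_meaning: Callable[[Content], Update],
                           content_meaning: Content) -> Update:
    """Toy version of (P6): apply the act interpretation to the content interpretation."""
    return act_meaning(content_meaning)


initial_state: InformationState = {"pending": []}
update = interpret_content_link(interpret_act("inform", "task"), {"departureTime": "10:00"})
updated_state = update(initial_state)
```

The point of the sketch is the shape of the computation: the host scheme supplies a function awaiting content, the plug-in supplies the content, and the interface does nothing more than apply one to the other.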

Plug-in for events and semantic roles

The following more general content plug-in is based on ISO standard 24617-4 for the annotation of semantic roles. The annotation scheme of this standard, a.k.a. 'SemAF-SR', marks up semantic information related to the question "Who did what to whom?", assigning semantic roles to the participants in an event. For instance, the example sentence "The soprano sang an aria" is analysed as mentioning a singing event and would be annotated as shown in (P7b), where "sing.01" refers to a verb sense in VerbNet:

(P7) a. "The soprano sang an aria"
        Markables: m1="The soprano", m2="sang", m3="an aria"

     b. <entity xml:id="x1" target="#m1" pred="soprano"/>
        <event xml:id="e1" target="#m2" eventFrame="sing.01" eventualityType="accomplishment"/>
        <srLink event="#e1" participant="#x1" semRole="agent"/>
        <entity xml:id="x2" target="#m3" pred="aria"/>
        <srLink event="#e1" participant="#x2" semRole="theme"/>
SemAF-SR interprets such annotations as expressing the existence (or denied existence, in case of a clause with negative polarity) of certain states or events and participants in certain roles. For the example in (P7) the semantics can be expressed by the following DRS:

(P8) [ e1, x1, x2 | sing01(e1), soprano(x1), aria(x2), agent(e1,x1), theme(e1,x2) ]

The plug-in consists in this case of the abstract and concrete syntax of the SemAF-SR markup language and the semantic interpretation function which produces DRSs like those in (P8). The abstract syntax has a conceptual inventory that lists semantic roles and verb senses by reference to VerbNet, defines entity structures for eventualities and their participants, and defines link structures for relating participants to eventualities in a certain role. The concrete syntax defines XML encodings of the annotation structures defined by the abstract syntax, as illustrated in (P7b).

When defining a content plug-in for information about semantic roles, the question arises whether all the information encoded in SemAF-SR annotations should be carried over into the plug-in. This issue regards in particular the reference to event frames for VerbNet verb senses. While this seems appropriate for the purposes of SemAF-SR, it would bring a level of detail to the interpretation of verbs and deverbal nouns which is not pursued for other content words; it may therefore be more appropriate to make this optional in a plug-in, allowing users to choose whether they want to use a conceptual inventory with that level of granularity or a less fine-grained one. The annotation of time and events also needs to be considered: ISO-TimeML (ISO 24617-1) uses a classification of event types that differs from that of SemAF-SR, and includes other detailed information about events that is not considered in SemAF-SR (like tense and aspect). Again, it is not obvious how much of that information it would be appropriate to carry over into a plug-in for DiAML.

The simplest content plug-in for semantic roles is one that takes a minimalist approach to event classifications, and uses a simple form like <event xml:id="e2" target="#m3" pred="sing"/> rather than the more fine-grained representations of SemAF-SR or ISO-TimeML. This plug-in ('PL_SR') is informally characterized by the following schema:

Abstract syntax: the conceptual inventory lists the semantic roles defined in the ISO 24617-4 standard and a set of verb senses that only distinguishes between senses which differ in the semantic roles that they take; two kinds of entity structures are distinguished, for eventualities and their participants (other than eventualities), and just one kind of link structure, for indicating a semantic role.
Concrete syntax: specifies names for the elements of the conceptual inventory; XML elements for encoding the entity and link structures.
Semantics: translation of entity and link structures and their combination to DRSs.

Formally, this plug-in is defined as follows:

(P9) PL_SR = 〈〈CI_SR, AC_SR〉, 〈V_SR, CC_SR, F_SR〉, 〈M_SR, I_SR〉〉, with

CI_SR = R_SR ∪ EP ∪ DP, i.e. the conceptual inventory is made up of a set R_SR of semantic roles, such as the ones defined in ISO 24617-4 (see Table 1), a set EP of event predicates defined in VerbNet (such as 'sing01'), and a set DP of domain predicates defined in a domain ontology (like 'soprano' and 'aria');
AC_SR = {e_EV = 〈m, p〉 with m a markable and p ∈ EP, e_P = 〈m, p〉 with p ∈ DP, 〈e_EV, e_P, R〉 with R ∈ R_SR};
V_SR = names of CI_SR elements;
CC_SR = {<event>, <entity>, <srLink>};
F_SR assigns a value in V_SR to every element in CI_SR;
F_SR(e_P) = F_SR(〈m, p〉) = <entity xml:id="x" target="#m" pred="F_SR(p)"/> if p ∈ DP;
F_SR(e_EV) = F_SR(〈m, p〉) = <event xml:id="x" target="#m" pred="F_SR(p)"/> if p ∈ EP;
F_SR(〈e_EV, e_P, R〉) = <srLink event="F_SR(e_EV)" participant="F_SR(e_P)" semRole="F_SR(R)"/>;
I_SR(〈m, p〉) = [x | p'(x)], where p' designates I_SR(p);
I_SR(〈e_EV, e_P, R〉) = [e, x | I_SR(R)(e, x)] ∪ I_SR(e_EV) ∪ I_SR(e_P)
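
The interpretation function I_SR can be sketched as DRS merging. In the code below a DRS is modelled as a pair of sets (discourse referents and conditions), and the union in the definition of I_SR becomes a component-wise set union. This is an illustrative reading, not the official SemAF-SR semantics: discourse referents are named after the `xml:id` values of the annotations (an assumption), and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

Structure = Tuple[str, str]  # (discourse referent, predicate), standing in for an entity structure


@dataclass(frozen=True)
class DRS:
    referents: FrozenSet[str]
    conditions: FrozenSet[Tuple[str, ...]]

    def merge(self, other: "DRS") -> "DRS":
        # DRS union: pool the referents and the conditions.
        return DRS(self.referents | other.referents, self.conditions | other.conditions)


def interpret_structure(structure: Structure) -> DRS:
    """Sketch of I_SR for an event or participant entity structure: [x | p'(x)]."""
    referent, predicate = structure
    return DRS(frozenset({referent}), frozenset({(predicate, referent)}))


def interpret_sr_link(event: Structure, participant: Structure, role: str) -> DRS:
    """Sketch of I_SR for a link structure: [e, x | R'(e, x)] merged with both arguments."""
    e, x = event[0], participant[0]
    link = DRS(frozenset({e, x}), frozenset({(role, e, x)}))
    return link.merge(interpret_structure(event)).merge(interpret_structure(participant))


# "The soprano sang an aria"
sing = ("e1", "sing01")
soprano = ("x1", "soprano")
aria = ("x2", "aria")

drs = interpret_sr_link(sing, soprano, "agent").merge(
    interpret_sr_link(sing, aria, "theme")
)
```

Merging the interpretations of an agent link and a theme link for the example sentence yields the referents e1, x1, x2 and the five conditions listed in (P8).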

Table 1 Semantic roles defined in ISO 24617-4

	Role	Definition
1.	Agent	Participant in an event who intentionally or consciously initiates an event, and who exists independently of the event.
2.	Beneficiary	Participant in an eventuality that is advantaged or disadvantaged by the eventuality, and that exists independently of the event.
3.	Cause	Participant in an event that initiates the event, but does not act with any intentionality or consciousness; the participant exists independently of the event.
4.	Goal	Participant in an event that is the (non-locative, non-temporal) end point of an action; the participant exists independently of the event.
5.	Instrument	Participant in an event that is manipulated by an agent, and with which an intentional act is performed; it exists independently of the event.
6.	Partner	Participant in an event that is intentionally or consciously involved in carrying out the event. Participant is not the principal agent of the event, and exists independently of the event.
7.	Patient	Participant in an event that undergoes a change of state, location or condition, is causally involved or directly affected by other participants, and exists independently of the event.
8.	Pivot	Participant in a state that is characterised as being in a certain position or condition throughout that state, and has a major or central role or effect in that state. A pivot is more central to the state than a participant in a theme role, and exists independently of the state.
9.	Purpose	Set of facts or circumstances that an agent wishes or intends to accomplish by performing some intentional action.
10.	Reason	Set of facts or circumstances explaining why a state exists or an event occurs.
11.	Result	Participant in an event that comes into existence through the event; it indicates a terminal point for the event: when that is reached, the event does not continue.
12.	Setting	Set of (non-locative and non-temporal) facts or circumstances of the occurrence of an event or a state.
13.	Source	Non-locative, non-temporal starting point of an event. The source exists independently of the event.
14.	Theme	Participant in a state or an event that (i) in the case of an event, is essential to the event taking place, but does not have control over the way the event occurs and is not structurally changed by the event, and (ii) in the case of a state, is characterised as being in a certain position or condition throughout the state, and is essential to the state being in effect but not as central to the state as a participant in a pivot role. The theme of a state or event exists independently of the state or event.
15.	Manner	The way or style of performing an action or the degree/strength of a cognitive or emotional state.
16.	Medium	The physical setting, device or channel that allows an event to take place.
17.	Means	Procedure for performing an action in terms of component steps, or a methodology by which an intentional act is performed by an agent. A means does not necessarily exist independently of the event.
18.	Location	Place where an event occurs, or a state is true, or a thing exists.
19.	Initial Location	Participant in an event that indicates the location where an event begins or a state becomes true; initial-location exists independently of the event.
20.	Final Location	Location where an event ends or a state becomes false; final-location exists independently of the event.
21.	Path	Intermediate location or trajectory between two locations, or in a designated space, where an event occurs.
22.	Distance	Length or extent of space that plays a role in an eventuality.
23.	Time	Participant that indicates an instant or a time interval during which a state exists or an event takes place.
24.	Duration	Length or extent of time during which an event occurs or a state is true.
25.	Initial Time	Indication of the point in time when an event begins or a state becomes true.
26.	Final Time	Indication of a point in time when an event ends or a state ceases to be true.
27.	Amount	Quantity of something other than time or space, or number of objects of a certain kind, which plays a role in an event or a state.
28.	Attribute	Property that an event or state associates with one of the other participants.

The interface for this plug-in is the same as the one defined above in (P1).

Plug-in for events, participants, and quantification
A plug-in for the semantic content of dialogue acts becomes more general and more powerful as it takes into account more aspects of the meanings of phrases, clauses, sentences, and other natural language structures that may express semantic content. On top of the identification of events with their time and place and participants with their respective roles, the interpretation of quantifier and modifier structures forms the most important source of semantic information. The ISO standard 24617-12, currently under development, can be the basis of a powerful plug-in for this type of information. See Bunt et al. (2018) and Bunt (2019) for the design of an annotation scheme for quantification and modification, and Bunt (2018) for a preliminary version of a standard annotation scheme.

Plug-in for rhetorical relations

DIT++ release 5.2 supports the marking up of rhetorical relations in dialogue in a more fine-grained way than previous releases, but does not specify any particular set of relations to be used. A plug-in for such relations does not require the introduction of any entity structures or link structures, since these have been defined in this release of DiAML (the <drLink> element in the concrete syntax and the corresponding link structure in the abstract syntax). Thus no specification of an interface is required, only the specification of a set of rhetorical relations and their argument roles. Such a plug-in still has the general tripartite structure, but has a very simple form: the specification of a rhetorical relation appears in three places. For example, ‘Cause’ occurs as the causal relational concept in the conceptual inventory of the abstract syntax; the string “cause” occurs as the value of an XML attribute in the vocabulary of the concrete syntax; and ‘Cause’ occurs as a binary predicate constant in the semantics.

The tripartite plug-in specified here takes the set of relations in ISO 24617-8:2016 (DR-Core) as its point of departure. The DR-Core set contains 18 core relations, to which the relation ‘Evaluation’ has been added. Table 2 lists the resulting 19 relations with their argument roles, and Table 3 gives their definitions, which describe their semantics in an informal way. Many other relations that are distinguished in other annotation schemes can be seen as special cases of these relations, for example ‘Explanation’ as a case of ‘Cause’, ‘Juxtaposition’ as a case of ‘Contrast’, and ‘Specification’ as a case of ‘Elaboration’.

The ISO 24617-8:2016 standard for annotating semantic relations in discourse distinguishes between ‘semantic’ and ‘pragmatic’ variants of discourse relations. This distinction is illustrated by the difference between (P11a) on the one hand and (P11b) and (P11c) on the other. Where in (P11a) having the flu is the reason for not coming in, in (P11b) beating his wife is not a reason for Jim to be an idiot, but for the speaker to say that Jim is an idiot, and in (P11c) “this” water being from yesterday is the reason for the request to give fresh water. In (P11a) the causal relation holds between the semantic contents of B’s two Inform acts (‘semantic cause’); in (P11b,c) there is a causal relation between the semantic content of the second dialogue act and the performance of the first (‘pragmatic cause’).

  (P11) a. John did not come in today. He's still struggling with the flu.
        b. B: Jim is an idiot. He beats his wife.  
        c. A: Could you give me a glass of fresh water please? This is from yesterday.

This distinction could be expressed in DiAML by introducing an attribute whose values represent the 'semantic'/'pragmatic' distinction, but such an extension would not be semantically interpretable unless the semantic content of dialogue acts were available for interpreting the 'semantic' case. In the presence of a semantic content plug-in, the distinction can be expressed directly by allowing the arguments of a rhetorical relation to be dialogue act structures as well as semantic content structures.

In any case, a tripartite plug-in annotation scheme PL_DR for discourse relations in dialogue can be simple, as follows:

Abstract syntax:

CI_DR = the relations defined in DR-Core, i.e. the set {Cause, Condition, Negative Condition,..., Expansion, Evaluation}, corresponding to the left column in Table 2, and their argument roles, as listed in Table 2. (Any other well-defined set of relations and arguments could equally well be chosen.)

Concrete syntax:

XML names for the relations in the conceptual inventory and for their argument roles. (Note that not all discourse relations have distinct argument roles. For example, the relations Contrast and Similarity have two arguments that play semantically identical roles. In such cases, the two arguments are named “arg1” and “arg2”, and "arg1" is conventionally the argument that occurs first in the discourse.)

Semantics:

Table 3 lists the meanings of the rhetorical relations and argument roles which are specified in the conceptual inventory.

Table 2. Rhetorical relations and argument roles in PL_DR.

Relation	First argument	Second argument
Cause	reason	result
Condition	antecedent	consequent
NegativeCondition	negatedAntecedent	consequent
Purpose	enablement	goal
Manner	means	achievement
Concession	expectationRaiser	expectationDenier
Exception	regular	exclusion
Substitution	disfavoredAlternative	favoredAlternative
Exemplification	set	instance
Elaboration	broad	specific
Asynchrony	before	after
Expansion	narrative	expander
Evaluation	situation	judgement

Symmetrical relations, with 'arg1' and 'arg2' roles: Contrast, Similarity, Conjunction, Disjunction, Restatement, Synchrony.
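
As an illustration only, the argument roles of Table 2 can be held in a small lookup table. The Python sketch below is not part of DiAML; the names `ASYMMETRIC_ROLES`, `SYMMETRIC_RELATIONS`, and `argument_roles` are hypothetical, and the relation names are spelled as in Table 2.

```python
# Illustrative lookup table for Table 2; all identifiers are hypothetical.
ASYMMETRIC_ROLES = {
    "Cause": ("reason", "result"),
    "Condition": ("antecedent", "consequent"),
    "NegativeCondition": ("negatedAntecedent", "consequent"),
    "Purpose": ("enablement", "goal"),
    "Manner": ("means", "achievement"),
    "Concession": ("expectationRaiser", "expectationDenier"),
    "Exception": ("regular", "exclusion"),
    "Substitution": ("disfavoredAlternative", "favoredAlternative"),
    "Exemplification": ("set", "instance"),
    "Elaboration": ("broad", "specific"),
    "Asynchrony": ("before", "after"),
    "Expansion": ("narrative", "expander"),
    "Evaluation": ("situation", "judgement"),
}

# Relations whose two arguments play semantically identical roles.
SYMMETRIC_RELATIONS = frozenset(
    {"Contrast", "Similarity", "Conjunction", "Disjunction", "Restatement", "Synchrony"}
)


def argument_roles(relation):
    """Return the (first, second) argument role names for a relation."""
    if relation in SYMMETRIC_RELATIONS:
        # By convention, arg1 is the argument that occurs first in the discourse.
        return ("arg1", "arg2")
    try:
        return ASYMMETRIC_ROLES[relation]
    except KeyError:
        raise ValueError(f"unknown rhetorical relation: {relation!r}") from None
```

For example, `argument_roles("Cause")` yields the role pair of the first table row, while any symmetrical relation yields `("arg1", "arg2")`.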

Table 3. Binary predicates as interpretations of rhetorical relations and argument roles in PL_DR.

	Relation	Definition
1.	Cause	The second argument provides a reason why the first argument occurs or holds true.
2.	Condition	The first argument is an unrealized situation which, when realized, would lead to the situation that forms the second argument.
3.	Negative Condition	The first argument is an unrealized situation which, when not realized, would lead to the situation that forms the second argument.
4.	Purpose	The second argument is the goal or purpose of the situation that forms the first argument.
5.	Manner	The second argument describes how the first argument comes about or occurs.
6.	Concession	The second argument cancels or denies an expected causal relation between the first argument and the negation of the second.
7.	Contrast	One or more differences between the two arguments are highlighted with respect to what each predicates as a whole or about some entities they mention.
8.	Exception	The second argument indicates one or more circumstances in which the situation that forms the first argument does not hold.
9.	Similarity	One or more similarities between the two arguments are highlighted with respect to what each predicates as a whole or about some entities they mention.
10.	Substitution	The two arguments are alternatives, the situation of the second argument being the favored or chosen alternative.
11.	Conjunction	The two arguments bear the same relation to some other situation evoked in the discourse. Their conjunction indicates that they both hold with respect to that situation.
12.	Disjunction	The two arguments bear the same relation to some other situation evoked in the discourse. Their disjunction indicates that they are non-exclusive alternatives with respect to that situation.
13.	Exemplification	The second argument is a situation that is an element of the set of situations described by the first argument. Arg1 describes a set of situations.
14.	Elaboration	The two arguments are the same situation, but the second argument is specified in more detail.
15.	Restatement	The two arguments are the same situation, but viewed from different perspectives.
16.	Synchrony	The two arguments form two temporally overlapping situations. All forms of overlap are included.
17.	Asynchrony	The first argument temporally precedes the second.
18.	Expansion	The two arguments are distinct situations that involve some shared entities; the second argument expands a narrative of which the first argument is a part, or expands on the setting relevant for interpreting the first argument.
19.	Evaluation	The second argument provides an opinion on the social, esthetic, economic, or other qualities of the first argument.

Note that this plug-in is especially powerful in combination with a plug-in for semantic content, but it can of course also be used without.

Plug-in for application-specific dialogue act types
DIT++ was designed to be domain-independent, and for this reason does not include communicative functions that would be specific for a certain application domain. All its communicative functions are either general-purpose or belong to one of the dialogue control dimensions. The general-purpose functions of DiAML form a powerful battery of functions for use in any application, but still many applications could benefit from the availability of additional, domain-specific communicative functions. This is another area where plug-ins can be useful.
One important question that arises when designing a plug-in for domain-specific types of dialogue acts is how these communicative functions relate to the general-purpose functions of the host annotation scheme. In a negotiation domain, for example, one finds offers, counter-offers, accepts and rejects of counter-offers, and so on. Such offers and their various kinds of responses and continuations can be viewed as special cases of the general-purpose functions Offer and AddressOffer, and they would thus fit well within the taxonomy of the ISO standard.
According to the general structure of a plug-in, PL_a = 〈AS_a, CS_a, Sm_a〉, with AS_a = 〈CI_a, AC_a〉; CS_a = 〈V_a, CC_a, F_a〉, and Sm_a = 〈M, I_a〉, a plug-in PL_CF for adding certain communicative functions would have a very simple specification, since no new entity structures or link structures are needed, but only the following components:
- Abstract syntax: conceptual inventory CI_CF listing the new functions;
- Concrete syntax: CV_CF: corresponding XML vocabulary items, and mapping from CI_CF to CV_CF;
- Semantics: the context-update semantics I_a(f_j) for every f_j ∈ CI_CF.
No plug-in interface is needed in this case.
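
The three components of such a plug-in can be sketched as a small data structure. The Python fragment below is a minimal illustration under stated assumptions, not a DIT++ or DiAML implementation: the class name `CommunicativeFunctionPlugin`, the function `CounterOffer`, its XML value `counterOffer`, and the modelling of a context as a dictionary are all hypothetical, and the update function is a placeholder rather than a real context-update semantics.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet

# Assumption of this sketch: an information state is a plain dictionary.
Context = Dict[str, list]
Update = Callable[[Context], Context]


@dataclass
class CommunicativeFunctionPlugin:
    conceptual_inventory: FrozenSet[str]  # CI_CF: the new functions
    vocabulary: Dict[str, str]            # mapping from CI_CF to CV_CF
    semantics: Dict[str, Update]          # I_a(f_j) for every f_j in CI_CF

    def __post_init__(self) -> None:
        # Every function needs exactly one XML name and one update function.
        for name, table in (("vocabulary", self.vocabulary),
                            ("semantics", self.semantics)):
            if set(table) != set(self.conceptual_inventory):
                raise ValueError(f"{name} must cover exactly the conceptual inventory")


def counter_offer(context: Context) -> Context:
    """Placeholder update: records that a counter-offer was made."""
    updated = dict(context)
    updated["commitments"] = context.get("commitments", []) + ["counter-offer"]
    return updated


PL_CF = CommunicativeFunctionPlugin(
    conceptual_inventory=frozenset({"CounterOffer"}),
    vocabulary={"CounterOffer": "counterOffer"},
    semantics={"CounterOffer": counter_offer},
)
```

The consistency check in `__post_init__` reflects the requirement that each new function appears in all three components of the plug-in.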
Plug-in for emotions
The sender of a dialogue act may express a certain emotion associated with the performance of the dialogue act, such as amusement, irritation, or disappointment. DIT++ in previous releases used qualifiers for this purpose, in particular as values of the @sentiment attribute, but this assumes that an emotional state can be characterized in a one-dimensional way, through a single predicate. That may be reasonable for some use cases, but is in general too simple.
The W3C recommendation EmotionML is a flexible scheme, designed with the aim of being combined with other annotation schemes. It characterizes emotions as complex entities, including ‘emotion categories’ such as “anger”, “happiness”, or “surprise”, an intensity value (called ‘valence’), and a confidence value, as well as various alternative ways of describing emotions, notably in terms of ‘action tendencies’, ‘appraisals’, and multiple ‘dimensions’. An emotion in EmotionML may have components of various categories; for instance, in the snippet (P13), taken from the document https://www.w3.org/TR/emotionml/, an emotion is annotated as being a form of anger with elements of sadness and fear.
```
(P13)	<emotion category-set="http://www.w3.org/TR/emotion-voc/xml#big6">
	     <category name="sadness" value="0.3"/>
	     <category name="anger" value="0.8"/>
	     <category name="fear" value="0.3"/>
	</emotion>
```
Observing that there is no general agreement in the community, EmotionML does not provide a single repository of emotion descriptors, but gives users a choice to select a suitable emotion vocabulary in their annotations. In order to promote interoperability, EmotionML provides a number of emotion vocabularies that can be used for this purpose. The guiding principle for selecting emotion vocabularies has been to list vocabularies that are either commonly used in technological contexts, or represent current emotion models from the scientific literature. One of the best known repositories is Ekman’s ‘big six’ (Ekman, 1972), a set of basic emotions with universal facial expressions, that is, emotions that are recognized and produced in all human cultures. Example (P13) shows how this repository, or one of the others listed by EmotionML, is referenced in an annotation.
EmotionML is defined only at the level of concrete syntax, so it cannot directly be used as a tripartite plug-in. However, an abstract syntax for EmotionML can be developed using the CASCADES methodology in reverse engineering mode (Bunt, 2016), and a semantics can be added for those parts of EmotionML markups that are truly semantic in nature (in contrast with e.g. confidence values).
An emotion has an experiencer and an object that the emotion is about. The emotional aspect associated with a dialogue act is a relation between the speaker, as the experiencer of the emotion, and (the semantic content of) the dialogue act as the object of the emotion. For example, in (P14) the experiencer of the emotion associated with the acceptance of the preceding offer is participant P2 and the object is the semantic content of this offer and its acceptance, viz. P2 having a cup of coffee.
```
(P14)	a. P1: Would you like to have a cup of coffee? ( = markable m1)
	   P2: That would be wonderful! ( = markable m2)

	b. <dialogueAct xml:id="da1" target="#m1" speaker="#p1" addressee="#p2" dimension="social" communicativeFunction="offer"/>
	   <contentLink dialAct="#da1" content="#e1"/>
	   <dialogueAct xml:id="da2" target="#m2" speaker="#p2" addressee="#p1" dimension="social" communicativeFunction="acceptOffer" funcDep="#da1"/>
	   <event xml:id="e1" target="#m2" pred="have-coffee"/>
	   <srLink event="#e1" participant="#p2" semRole="agent"/>
	   <contentLink dialAct="#da2" content="#e1"/>
	   <emotion xml:id="em1" target="#m2" category="happiness" value="0.8"/>
	   <emoLink holder="#p2" object="#e1" emotion="#em1"/>
```
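To illustrate how the links in (P14b) tie an emotion to its experiencer and object, the following Python sketch parses a reduced version of the markup and resolves each emoLink element. It is a minimal sketch, not a DiAML tool: the `<diaml>` wrapper element and the function name `resolve_emotion_links` are assumptions introduced here, since a fragment needs a single root element to be parsed.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predeclared "xml" prefix to this namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Reduced version of (P14b); the <diaml> root is a hypothetical wrapper.
FRAGMENT = """<diaml>
  <dialogueAct xml:id="da2" target="#m2" speaker="#p2" addressee="#p1"
               dimension="social" communicativeFunction="acceptOffer" funcDep="#da1"/>
  <event xml:id="e1" target="#m2" pred="have-coffee"/>
  <contentLink dialAct="#da2" content="#e1"/>
  <emotion xml:id="em1" target="#m2" category="happiness" value="0.8"/>
  <emoLink holder="#p2" object="#e1" emotion="#em1"/>
</diaml>"""


def resolve_emotion_links(xml_text):
    """Return one record per emoLink, with its references resolved."""
    root = ET.fromstring(xml_text)
    # Index every element that carries an xml:id.
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    records = []
    for link in root.iter("emoLink"):
        emotion = by_id[link.get("emotion").lstrip("#")]
        content = by_id[link.get("object").lstrip("#")]
        records.append({
            "holder": link.get("holder").lstrip("#"),
            "object": content.get("pred"),
            "category": emotion.get("category"),
            "value": float(emotion.get("value")),
        })
    return records
```

Applied to `FRAGMENT`, the function returns a single record in which participant p2 is the holder of a happiness emotion about the have-coffee event.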
A simple plug-in PL_e for adding emotion annotation to dialogue acts, based on EmotionML, can be defined as follows.

Abstract syntax:
- CI_e = a set of emotion categories; a set of intensity values (any floating point number in the interval [0,1]);
- Entity structures: nested pairs 〈m,〈c, v〉〉 consisting of a markable (m) and a pair formed by an emotion category (c) and an intensity value (v).
Concrete syntax:
- XML names for the emotion categories in the conceptual inventory; numerical values representing emotion intensities;
- encodings of entity structures using <emotion> elements (as defined in EmotionML, but simplified).
Semantics:
- Pairs 〈c, v〉 are interpreted as attribute-value pairs where c denotes a two-place function, applicable to the experiencer and the object of an emotion, so I_e(〈c, v〉) is defined as the two-place predicate λx. λy. I_e(c)(x,y) = I_e(v).
For linking emotion specifications to dialogue act annotations, a plug-in interface is needed that defines the <emoLink> element used in (P14b) with its underlying abstract syntax and semantics. In the abstract syntax, an emotion link structure is a triple 〈p, s, e〉 formed by a dialogue participant ‘p’ who is the sender of a dialogue act, the semantic content ‘s’ of this dialogue act, and an emotion ‘e’. These components correspond in the concrete syntax to the values of the attributes @holder, @object, and @emotion in an <emoLink> element, as illustrated in (P14). The semantics of the emotion link structures is defined by (P15).
(P15) _{a+c}I_e(〈p, s, e〉) = I_e(e)(I_a(p), I_c(s))
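
The semantics of PL_e and of the emotion link in (P15) can be made concrete with a toy model. The Python sketch below is illustrative only: the model `INTENSITY`, the content table `CONTENT`, and all function names are assumptions introduced here, and the interpretation functions I_a and I_c of the host annotation scheme are replaced by trivial stand-ins.

```python
import math

# Toy model (assumption): each emotion category denotes a two-place function
# from (experiencer, object) to an intensity in [0, 1].
INTENSITY = {
    "happiness": {("p2", "have-coffee(p2)"): 0.8},
}


def interpret_category(c):
    """I_e(c): a function from an experiencer x and an object y to an intensity."""
    table = INTENSITY[c]
    return lambda x, y: table.get((x, y), 0.0)


def interpret_emotion(c, v):
    """I_e(<c, v>): the predicate that holds of x and y if I_e(c)(x, y) = I_e(v)."""
    f = interpret_category(c)
    return lambda x, y: math.isclose(f(x, y), v)


def interpret_emo_link(p, s, e, i_a, i_c):
    """(P15): apply I_e(e) to the interpretations of participant p and content s."""
    c, v = e
    return interpret_emotion(c, v)(i_a(p), i_c(s))


# Stand-ins for I_a (participants) and I_c (semantic content).
def i_a(ref):
    return ref.lstrip("#")


CONTENT = {"#e1": "have-coffee(p2)"}


def i_c(ref):
    return CONTENT[ref]


holds = interpret_emo_link("#p2", "#e1", ("happiness", 0.8), i_a, i_c)
```

In this toy model `holds` is true, because the intensity of p2's happiness about having coffee equals the annotated value 0.8; with any other annotated value the predicate fails.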

Release 5.1:

Differences between release 5.1 and the previous release 4 (from February 2010): most importantly, release 5.1 has been developed in tandem with the definition of ISO standard 24617-2 for dialogue act annotation. In particular, the definitions of the communicative functions in the DIT++ taxonomy and those included in the ISO standard have been made identical. Since the latter form a subset of the former, DIT++ release 5.1 is a fully ISO-compatible dialogue act annotation scheme which is somewhat more fine-grained than the ISO scheme.

DIT++ Taxonomy of Communicative Functions

The DIT++ taxonomy forms a multidimensional system not only in the sense that it supports the assignment of multiple communicative functions to dialogue segments, but also in the sense that dimensions have a well-defined conceptual status in dialogue analysis, as different aspects of communication that may be addressed independent of each other (see Bunt, 2006). For annotation, the multidimensionality of the schema means that a functionally relevant segment of dialogue behaviour may be tagged as having more than one communicative function -- maximally one in each dimension if the tagging of implied functions is avoided. Dimensions are represented in the presentation of the taxonomy in boldface italic.

Some communicative functions can only be used in a particular dimension. For example, Turn Take and Turn Release are two functions that can only be used for turn management, and Stalling and Pausing are two functions that can only be used for Time Management. Such functions are called dimension-specific. Other functions can be used in any dimension; for instance, a Request can be related to the performance of the task that motivates the dialogue, but it can also be used for time management (Could you give me just a few minutes?) or for feedback (Could you please clarify that?). Such functions are called general-purpose communicative functions. The DIT++ taxonomy thus consists of two parts: (A) that of the general-purpose functions and (B) that of the dimension-specific functions. In the presentation of the DIT++ communicative functions below, first the general-purpose functions are shown and subsequently the dimension-specific functions. For convenience, the taxonomy is structured not only in dimensions but also in some additional groupings that do not have a theoretical significance, but that are convenient for seeing the structure of the set of communicative functions, as well as for referring to certain groups of functions. Such groupings are represented in italics.
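
The difference between the two kinds of functions, and the constraint of at most one function per dimension, can be expressed as a simple consistency check on annotations. The Python sketch below is a hedged illustration: the sets contain only a small excerpt of the taxonomy, and the names `GENERAL_PURPOSE`, `DIMENSION_SPECIFIC`, `is_valid_tag`, and `check_segment` are hypothetical.

```python
# Small excerpt of the taxonomy, for illustration only.
GENERAL_PURPOSE = {"Question", "Inform", "Offer", "Request", "Suggestion"}

DIMENSION_SPECIFIC = {
    "Turn Take": "Turn Management",
    "Turn Release": "Turn Management",
    "Stalling": "Time Management",
    "Pausing": "Time Management",
}


def is_valid_tag(function, dimension):
    """A general-purpose function fits any dimension; others fit only their own."""
    if function in GENERAL_PURPOSE:
        return True
    return DIMENSION_SPECIFIC.get(function) == dimension


def check_segment(tags):
    """Return a list of problems for one segment's (dimension, function) tags."""
    problems = []
    seen = set()
    for dimension, function in tags:
        if dimension in seen:
            problems.append(f"more than one function in dimension {dimension!r}")
        seen.add(dimension)
        if not is_valid_tag(function, dimension):
            problems.append(f"{function!r} cannot be used in dimension {dimension!r}")
    return problems
```

A segment tagged with Inform in the Task/Activity dimension and Turn Release in Turn Management passes the check, whereas Stalling tagged in Turn Management is reported as a problem.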

General-Purpose Communicative Functions

Information Transfer Functions
- Information Seeking Functions
  - Question

Information Providing Functions
- Inform

Action Discussion Functions

Commissives
- Offer
- Address Suggestion
- Other commissives, as expressible by means of performative verbs or by addressing other directives

Directives
- Request
- Suggestion (a.k.a. Open-option)
- Other directives, as expressible by means of performative communicative verbs, such as Advice, Proposal, Permission, Urge,...

Dimension-Specific Communicative Functions

Domain-related Functions
Functions expressible either by means of performative verbs denoting actions for performing activities in a specific domain, or by means of graphical actions such as highlighting or pointing to something in a picture. For example:
- Open Meeting, Suspend Meeting, Resume Meeting, Close Meeting (in meeting situations)
- Bet, AcceptBet
- Congratulation, Condolence
- Hire, Fire, Appoint,... (in a human resource management domain)
- Show, Highlight, Point, List,... for performing graphical/multimodal dialogue acts
Dialogue Control Functions
- Feedback Functions
  - Auto-Feedback

Definitions of DIT++ communicative functions

In the definitions of communicative functions, the hierarchical relations will be exploited by only specifying the way in which the preconditions of a communicative function strengthen or are additional to those of its ancestors.

General-Purpose Communicative Functions are functions that can be applied to any kind of semantic content.
They are often applied to information concerning the task or activity that motivates the dialogue, and in that case they form a dialogue act in the Task/Activity dimension.
In information-seeking dialogues, advice-giving dialogues, and other dialogues whose primary motivation is to exchange certain information, the general-purpose functions are the only functions that are needed in the Task/Activity dimension. In other types of dialogue one finds, besides the general-purpose functions, also certain activity-specific communicative functions, some examples of which are mentioned above.
General-purpose functions can also be applied to content concerning the communication, in which case they form a 'dialogue control act'. For example, the utterance I did not hear what you said has the general-purpose function Inform, and in view of the type of its semantic content, it provides (negative) feedback about the speaker's perception of the previous utterance (forming a dialogue act in the Auto-Feedback dimension).

Information Transfer Functions
The class of information-transfer functions consists of all those functions whose primary aim is to obtain or to provide information. The class divides into information-seeking and information-providing functions.
- Information-seeking functions:
  All functions in this class have in common that the speaker wants to know something, which he assumes the addressee to know, and puts pressure on the addressee to provide this information.
  So-called Check Questions carry the additional assumption that the speaker expects the answer to be that the proposition under discussion is true.
  Still more specifically, some check questions carry the additional assumption that the addressee's beliefs confirm the speaker's expectation (Posi-Check) or that they contradict this expectation (Nega-Check). Other special types of questions, which are not considered here, are the Exam Question and the Rhetorical Question (which only looks like a question, but is not really one).
- Information-providing functions
  All information-providing acts have in common that the speaker provides the addressee certain information which he believes the addressee not to know or not to be aware of, and which he assumes to be correct.
  The various subtypes of information-providing functions differ in the speaker's motivation for providing the information, and in different additional beliefs about the information that the addressee possesses.
  - Inform: Speaker S wants to make the information p that forms the semantic content of the inform act known to addressee A; S assumes that the information p is correct.
    - Agreement: S believes that A weakly believes the semantic content to be true
    - Disagreement: S believes that A weakly believes the semantic content to be false
      - Correction: S wants the semantic content, which he believes to be correct, to replace a belief by A that S believes to be incorrect.
    - Answer: S believes that A wants to possess the information which forms the semantic content of the Answer act.
- Action Discussion Functions:
  Action Discussion functions have a semantic content consisting of an action, and possibly also a description of a manner or frequency of performing the action. This frequency may be zero, so e.g. an Instruct to perform an action with frequency zero is the same as prohibiting that action, and committing oneself to perform an action with zero frequency is the same as committing oneself to not perform the action.
  - Commissives:
    S is committed to performing a certain action in a certain manner or with a certain frequency, possibly dependent on certain conditions, and possibly dependent on A's consent that S do so.
    - Offer: S is committed to perform the action in the manner or with the frequency described in the semantic content, if A would like S to do so
  - Directives:
    S wants A to consider a certain action which A might carry out (possibly together with S), potentially wanting to put pressure on A to do so
    - Instruct: S wants A to perform the action in the manner or with the frequency described in the semantic content; S assumes that A is able to do so
    - Request: S wants A to perform the requested action in the manner or with the frequency described, conditional on A's consent; S assumes that A is able to do so
    - Suggestion: S wants A to know that the action, in the manner or with the frequency described in the semantic content, is potentially promising for achieving a certain goal, which either S believes A to have, or which is specified as part of the semantic content; S assumes that A (possibly together with S) is able to perform the action in the manner or with the frequency described.
- Dialogue Control Functions: The functions of communicative acts that serve to create or maintain the conditions for successful interaction.
  - Feedback Functions:
    Feedback acts provide or elicit information about the processing of the previous utterance(s), where at least five levels of attending to an utterance and processing it are distinguished:
    - attention, i.e. paying attention to the dialogue partner sufficiently to fully enable the perception of the partner's contributions (e.g. listening, looking).
    - perception, i.e. the recognition of the auditive, visual, or tactile components of communicative behaviour.
    - interpretation, i.e. the assignment of meaning to the recognized communicative behaviour. In terms of dialogue acts, this is the assignment of semantic content and communicative functions to utterances.
    - evaluation, i.e. comparing the information that an utterance encodes, due to its communicative functions and semantic content, with what was already known. For instance, when a question was asked to which, according to the addressee, the questioner already knows the answer, then the addressee cannot accept the information conveyed by the question, as this would put him in an inconsistent belief state.
    - execution, also called 'application' or 'dispatch'. For instance, execution of a request or instruct is performing the requested or instructed action; execution of a question is gathering the information to answer; executing an answer is integrating its semantic content with the belief state.
    Auto-Feedback acts are about the speaker's own attention and processing of an utterance in the addressee's last turn; Allo-Feedback acts are about the speaker's beliefs about the addressee's attention and processing of an utterance in the speaker's last turn.
    Dimension-specific Auto-Feedback functions are intended to signal that the processing of the utterance in question failed at a certain level or was successful up to a certain level, ranging from attending via perceiving, understanding, and evaluating to doing something with the result of the processing at all these levels ("execution"). (More articulate feedback acts, signalling or requesting help for a specific processing problem, are constructed with general-purpose functions and a specific processing-related semantic content.)
    - Auto-feedback:
      - AutoPositive (= Unspecified Positive): S successfully processed the previous utterance, but provides no information about the level(s) of processing being reported
      - AutoNegative (= Unspecified Negative): S was unsuccessful in processing the previous utterance, but provides no information about the level(s) of processing being reported
      - ExecPositiveAutoFeedback (= Overall positive auto-feedback): S's perception, interpretation, evaluation, and execution of the previous utterance were successful.
      - EvalPositiveAutoFeedback: S's perception, interpretation, and evaluation of the previous utterance were successful.
      - InterprPositiveAutoFeedback: S's perception and interpretation of the previous utterance were successful.
      - PerceptPositiveAutoFeedback: S's perception of the previous utterance was successful.
      - AttentPositiveAutoFeedback: S is paying full attention
      - ExecNegativeAutoFeedback: S's perception, interpretation, and evaluation of the previous utterance were successful, but he encountered a problem in applying the information from that utterance (for example, S was unable to carry out an instruction, or to find the information needed for answering a question).
      - EvalNegativeAutoFeedback: S encountered a problem in evaluating the semantic content of the previous utterance (for example, the utterance provided information that is in conflict with information already available to S).
      - InterprNegativeAutoFeedback: S's perception of the previous utterance was successful, but he encountered a problem in trying to assign an interpretation to the utterance (for example, S was unable to make sense of the semantic content).
      - PerceptNegativeAutoFeedback: S's perception of the previous utterance encountered a problem (S did not hear the utterance well, or was unable to read it).
      - AttentNegativeAutoFeedback (Overall negative auto-feedback): S did not pay (full) attention to the previous utterance (e.g., S did not listen carefully).
    - Allo-Feedback:
      - Feedback-Giving Functions:
        
        - AlloPositive (= Unspecified positive allo-feedback): S believes that A successfully processed the previous utterance, but provides no information about the level(s) of processing being reported
        - AlloNegative (= Unspecified negative allo-feedback): S believes that A was unsuccessful in processing the previous utterance, but provides no information about the level(s) of processing being reported
        - Overall Positive Allo_Feedback: S believes that A's execution of S's previous utterance was successful
        - Execution Negative Allo_Feedback: S believes that A's execution of S's previous utterance was unsuccessful
        - Evaluation Positive Allo_Feedback: S believes that A's evaluation of S's previous utterance was successful
        - Evaluation Negative Allo_Feedback: S believes that A's evaluation of S's previous utterance was unsuccessful
        - Interpretation Positive Allo_Feedback: S believes that A's interpretation of S's previous utterance was successful
        - Interpretation Negative Allo_Feedback: S believes that A's interpretation of S's previous utterance was unsuccessful
        - Perception Positive Allo_Feedback: S believes that A's perception of S's previous utterance was successful
        - Perception Negative Allo_Feedback: S believes that A's perception of S's previous utterance was unsuccessful
        - Attention Negative Allo_Feedback: S believes that A did not pay attention to S's previous utterance.
    - Turn management:
      Turn management acts are those dialogue acts which are performed in order to keep or to reallocate the speaker role. The beginning and end of a turn, defined as an instance of communicative behaviour bounded by lack of activity or another communicator's activity, are associated with a reallocation of the speaker role. A turn ends either because the current speaker assigns the speaker role to the addressee, or because he offers the speaker role without putting any pressure on the addressee to take the turn, or because the addressee interrupts the speaker and 'grabs' the speaker role. Turn Assign and Turn Release are thus two of the possible turn-final functions. A turn may also include smaller units with boundaries where a reallocation of the speaker role might have occurred, but where in fact it does not occur because the speaker indicates that he wants to keep the turn. Such a smaller unit then has Turn Keep function as the unit-final Turn Management function.
      A turn may also have a turn-initial function, indicating whether the speaker of this turn obtained the speaker role by 'grabbing' it (Turn Grab), by taking it when it was available (Turn Take) or by accepting the addressee's assignment of the speaker role to him (Turn Accept).
      The units of Turn Management can thus have both a turn-unit-initial and a turn-unit-final function, which is captured by giving them a pair of functions, an initial and a final one.
      - Turn-unit-initial functions:
        
        Turn Take: S wants to have the turn, which is available
        Turn Accept: S agrees to take the turn, which A has given to him
        Turn Grab: S wants to get the turn, which A currently has, before A assigns the turn to him or releases it.
      - Turn-unit-final functions:
        
        Turn Keep: S wants to keep the turn
        Turn Assign: S wants A to take the turn
        Turn Release: S wants to make the turn available to any participant
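The pairing of a turn-unit-initial and a turn-unit-final function can be sketched as a small data structure. This is only an illustration: the class names, the camel-case value strings, and the `ends_turn` helper are assumptions made for this sketch and are not part of the DIT++ definitions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TurnInitial(Enum):
    # Value strings are assumed spellings, chosen for this sketch.
    TURN_TAKE = "turnTake"
    TURN_ACCEPT = "turnAccept"
    TURN_GRAB = "turnGrab"


class TurnFinal(Enum):
    TURN_KEEP = "turnKeep"
    TURN_ASSIGN = "turnAssign"
    TURN_RELEASE = "turnRelease"


@dataclass(frozen=True)
class TurnUnit:
    """A Turn Management unit carrying an optional initial and final function."""
    speaker: str
    initial: Optional[TurnInitial] = None
    final: Optional[TurnFinal] = None

    def ends_turn(self) -> bool:
        # Turn Assign and Turn Release are turn-final functions; Turn Keep
        # marks a unit boundary inside a turn. A turn that ends because the
        # addressee grabs it is not visible in the speaker's own final function.
        return self.final in (TurnFinal.TURN_ASSIGN, TurnFinal.TURN_RELEASE)


unit = TurnUnit("S", TurnInitial.TURN_GRAB, TurnFinal.TURN_KEEP)
print(unit.ends_turn())
```

Keeping both fields optional reflects that a unit need not express a function at both of its boundaries.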
    - Time management functions:
      - Stalling: S needs a little bit of time to formulate an utterance
      - Pausing: S needs some time to do something (either in preparation of continuing the dialogue, or because something else came up which is more urgent for him to attend to) and therefore wants to suspend the dialogue for a while
    - Contact management functions
      - Contact Check: S wants to establish whether A is ready to receive messages from and to send messages to S
      - Contact Indication: S wants A to know that S is ready to send messages to and receive messages from A
    - Own communication management functions:
      - Self-error signal: S wants A to know that S has made a mistake in speaking
    - Partner communication management functions:
      - Completion: S wants to help A to complete an utterance that A is struggling to complete
      - Correct-misspeaking: S wants to correct (part of) an utterance by A, believing that A made a speaking error
    - Dialogue structuring functions:
      - Opening: S wants A to know that S is ready and willing to engage in a dialogue with A, of which the present utterance precedes any utterance with an activity-oriented function in the current dialogue
      - Dialogue act announcement: S plans to perform the dialogue act mentioned in the semantic content
      - Dialogue act invitation: S wants A to know that A is welcome to perform a certain dialogue act
      - Topic management functions:
        
        Topic introduction: S wants to introduce the topic mentioned in the semantic content.
        Topic shift announcement: S wants to change the topic.
        Topic shift: S wants to shift the topic to the one mentioned in the semantic content.
      - Preclosing: S wants to start moving towards ending the dialogue
    - Social obligations management functions:
      - Salutation
        Initial greeting: S wants A to be aware of S's presence; S is aware of A's presence; S believes that S and A are in a position to exchange messages; S puts pressure on A to acknowledge this.
        
        Return greeting: S wants A to be aware that S is aware of A's presence; S is aware of A's presence; S believes that S and A are in a position to exchange messages; S is pressured to respond to an initial greeting by A addressed to S.
        
        Follow-on greeting: S has established A's identity; S wants A to be aware of that.
      - Self-introduction
        Initial self-introduction: S wants to make himself known to A; S puts pressure on A to acknowledge this.
        
        Return self-introduction: S wants to make himself known to A; S is pressured to do so by an initial self-introduction by A addressed to S.
      - Opening politeness:
        Politeness question: S wants to know A's state of well-being; S puts pressure on A to provide this information.
        
        Return politeness question: S wants to know A's state of well-being; S puts pressure on A to provide this information; S responds to a Politeness Question by A.
        
        Opening politeness statement: S wants A to know that S is pleased to meet A.
      - Apologizing
        Apology: S wants A to know that S regrets having made an error in perceiving, understanding, evaluating, or executing an utterance by A, or not having paid attention to, perceived well, or misunderstood an utterance from A, or being unable to evaluate or execute an utterance from A; S puts pressure on A to acknowledge this.
        
        Apology-downplay: S wants to mitigate A's feelings of regret; S has been pressured to respond to an apology by A addressed to S.
      - Empathy expression
        Compliment: S likes A's appearance, certain of A's qualities, or something that A has achieved, and wants A to know that.
        
        Congratulation: S takes pleasure in the successful achievement of something by A, or in A's good fortune, and wants A to know that.
        
        Sympathy expressions: S wants A to know that S is sorry for something that happened to A.
      - Gratitude
        Thanking: S wants A to know that S is grateful for what A has done in the current dialogue; S puts pressure on A to acknowledge this.
        
        Thanking-downplay: S wants to mitigate A's feelings of gratitude; S has been pressured to respond to a thanking act by A addressed to S.
      - Valediction
        Closing politeness statement: S wants A to know that S is pleased or has enjoyed meeting A and/or communicating with A.
        
        Farewell wish: S wishes A to enjoy a state of success or well-being.
        
        Relay greeting: S wants A to pass on S's greetings to one or more specified other people.
        
        Initial goodbye: S wants A to know that S intends the current utterance to be his final contribution to the dialogue; S puts pressure on A to acknowledge this.
        
        Return goodbye: S wants A to know that S intends the current utterance to be his final contribution to the dialogue; S wants to acknowledge his awareness that A wants his (A's) last utterance to be his final contribution to the dialogue; S has been pressured to respond to an initial goodbye by A addressed to S.

Examples of DIT++ dialogue acts

Sources:

'LIRICS' = from the LIRICS project multilingual test suite of dialogues (in English, Dutch, and Italian);
'DIAMOND' = from the DIAMOND project corpus of dialogues (in Dutch);
'IMIX' = from the IMIX project corpus of dialogues (in Dutch);
'AMI' = from the AMI project corpus of dialogues (in English);
'SCHISMA' = from the SCHISMA project corpus of dialogues (in Dutch);
'OVIS' = from the OVIS project corpus of dialogues (in Dutch).

General-Purpose Communicative Functions

Information-transfer functions
- Information-seeking functions
  - Nega-check: NOT tonight??
  - Posi-check: Direct Questions: In the evening, right?
- Information-providing acts
  - Inform: You didn't get it. (AMI; example of an Inform used for negative feedback); Ik zie dat ik ook woorden kan aanwijzen ('I see that I can also point at words'; IMIX; example of an Inform with a semantic content in the Activity Management domain (see DAMSL))
  - Agreement: Exactly.
  - Disagreement: No!
  - Confirm: It is
  - Confirm[uncertain]: I think so.
  - Disconfirm: It's not
  - Disconfirm[uncertain]: I don't think so.
  - Answer[uncertain]: Maybe.
  - Answer: Yes; Sure.

Action-discussion function

Commissives:
Directives:
- Suggestion: Shall we go? (AMI); Maybe we can do that later (AMI); Let's wait for them. (AMI); Ik kan kaartjes voor u reserveren ('I can reserve tickets for you'; SCHISMA)
- Request: Please give me some more light.
- AcceptOffer: Yes please
- DeclineOffer: No thank you.
- Other directives, as expressible by means of performative communicative verbs, such as Advice, Proposal, Permission,...

Feedback Elicitation acts:

Perception Feedback Elicitation: Goed zo? ('All right like this?'; IMIX)
Evaluation Feedback Elicitation: OK? (IMIX)

Partner Communication Management acts

Completion: A: ... to Tempelhof, ... to the airport of, er, ... S: of Berlin.
Correct-misspeaking: [A: On Thursday] S: On TUESday

Own Communication Management acts

Self-error: Oops!

Time management acts

Stalling: Let me see
Pausing: Just a moment; Een ogenblik alstublieft ('One moment please'; OVIS)

Discourse structure management acts:

Opening: Okay.....
Topic introduction: Now for the next meeting,...
Topic shift announcement: Iets anders ('Something else'); Verder ('Moving on').
- Topic shift

Social obligations management acts

Politeness Question: How do you do? How are you? Comment allez-vous? Ça va? Wie geht's?
Return Politeness Question: And you? How about you? Et vous?

DiAML: the Dialogue Act Markup Language

Basic DIT concepts

Dialogue acts

The term ‘dialogue act' is often used rather loosely in the sense of speech act used in dialogue. Indeed, the idea of interpreting communicative behaviour in terms of actions, such as questions, promises, and requests goes back to speech act theory (Austin, 1962; Searle, 1969). But where speech act theory is primarily an action-based approach to meaning within the philosophy of language, dialogue act theory is an empirically-based approach to the computational modeling of linguistic and nonverbal communicative behaviour in dialogue. Dialogue acts offer a way of characterizing the meaning of communicative behaviour in terms of update operations, to be applied to the information states of participants in the dialogue; this approach is commonly known as the ‘information-state update’ or ‘context-change’ approach -- see e.g. Bunt (1989; 2000a); Traum and Larsson (2003). For instance, when an addressee understands the utterance “Do you know what time it is?” as a question about the time, then the addressee’s information state is updated to contain (among other things) the information that the speaker does not know what time it is and would like to know that. If, by contrast, it is understood that the speaker is reproaching the addressee for being late, then the addressee’s information state is updated to include (among other things) the information that the speaker does know what time it is. Distinctions such as that between a question and a reproach concern the communicative function of a dialogue act, which is one of its two main components. The other main component is its semantic content, which describes the objects, properties, relations, situations, actions or events that the dialogue act is about. The communicative function of a dialogue act specifies how an addressee updates his information state with the information expressed in the semantic content when he understands the dialogue act.

This approach to the definition of communicative functions is strictly semantic, in contrast to approaches based on linguistic form. For example, the behaviour of a speaker who repeats something that was said by someone else may be characterised as a ‘repetition’ (which is a communicative function in some annotation schemes); however, this only says something about the form of the behaviour compared to the repeated behaviour, not about its function. A repetition often has a feedback function, as in (D1).a, but it can also have other functions, as in (D1).b, where it is used as a confirmation in response to a check question:

(D1) S: There are evening flights at seven-fifteen and eight-thirty 
     a. C: Seven-fifteen and eight-thirty
     b. C: And that’s on Sunday too 
        S: And that’s on Sunday too

A form-related requirement for introducing a communicative function is however that there are observable features of communicative (linguistic and/or nonverbal) behaviour which are indicative for that function in the context in which the behaviour occurs. This requirement puts all communicative functions on an empirical basis. Dialogue act annotation is the marking up of stretches of dialogue with information about the dialogue acts they contain. Spoken dialogues are traditionally segmented into turns, defined as stretches of communicative behaviour produced by one speaker, bounded by periods of inactivity of that speaker. Turns can be quite long and complex, and are therefore not the most useful units of behaviour to assign communicative functions to. Communicative functions can be assigned more accurately to smaller units, which are called functional segments, and which are defined as the minimal stretches of communicative behaviour that are functionally relevant.

Inherent to the notion of a dialogue act is that there is an agent who produces the dialogue act, called the ‘sender’, and one or more agents who are addressed, called the ‘addressee(s)’. Dialogue studies often focus on two-person dialogues, in which case the dialogue acts have only one addressee. Besides sender and addressee(s), there may be various types of side-participants who are present but do not or only marginally participate (see Clark, 1996).

Dialogue act annotation is often limited to assigning communicative functions to dialogue segments, which corresponds intuitively to indicating the type of communicative action that is performed. A semantically more complete characterization additionally provides information about the category of semantic content. The DAMSL annotation scheme distinguishes three categories of semantic content: Task, Task Management, and Communication, which indicate whether the semantic content of the dialogue act advances the task which underlies the dialogue, or discusses how to perform the task, or concerns the communication process. DIT++ distinguishes 10 subcategories of communication-related information, such as feedback information, turn allocation information, and speech management information. These categories of semantic content are also called ‘dimensions’.

Example (D2) illustrates the use of the key attributes of a dialogue act in the DiAML annotation of a task-related yes-no question addressed by speaker ‘a’ to addressee ‘b’, expressed by the functional segment ‘m1’:

(D2)	<dialogueAct xml:id="da1" target="#m1" sender="#a" addressee="#b" dimension="task" communicativeFunction="propositionalQuestion"/>
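As an illustration of how the attributes in (D2) might be read by a program, the following minimal sketch parses the element with Python's standard library. The function name `read_act` and the keys of the returned dictionary are assumptions made for this example; they are not part of DiAML.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

D2 = ('<dialogueAct xml:id="da1" target="#m1" sender="#a" addressee="#b" '
      'dimension="task" communicativeFunction="propositionalQuestion"/>')


def read_act(xml_text):
    """Return the key attributes of one <dialogueAct> element as a dict."""
    act = ET.fromstring(xml_text)
    return {
        "id": act.get(XML_ID),
        # The '#' prefix marks a reference; strip it to get the bare identifier.
        "segment": act.get("target").lstrip("#"),
        "sender": act.get("sender").lstrip("#"),
        "addressee": act.get("addressee").lstrip("#"),
        "dimension": act.get("dimension"),
        "function": act.get("communicativeFunction"),
    }


print(read_act(D2))
```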

Dependence relations

Some types of dialogue acts are inherently dependent for their full meaning on one or more dialogue acts that occurred earlier in the dialogue. This is for example the case for answers, whose meaning is partly determined by the question that is being answered, and also for the acceptance or rejection of offers, suggestions, requests, and apologies. This is illustrated in example (D3), where the meaning of the answer in turn 3 depends on whether it is an answer to the question in turn 1 or to the one in turn 2.

(D3) 1. B: Do you know who’s coming tonight? 
     2. B: Which of the project members do you think will be there? 
     3. A: I’m expecting Jan, Alex, Claudia, and David, and maybe Olga and Andrei.

As an answer to the question in 1, A’s answer says that nobody else is expected to come than the people that are mentioned, but as an answer to the question in 2 it leaves open the possibility that other people will come, who are not members of ‘the project’. This kind of semantic dependence, which is due to the responsive character of some communicative functions, is called a functional dependence relation. Marking up this relation between a dialogue act with a responsive communicative function and its ‘antecedent’ dialogue acts allows the annotation to not just indicate e.g. that an utterance has the function of an answer, but also to indicate to which question it is an answer, as illustrated in (D4).

(D4) a. B: Which of the project members do you think will be there?  
	A: I’m expecting Jan, Alex, Claudia, and David, and maybe Olga and Andrei.

     b. <dialogueAct xml:id="da1" target="#m1" sender="#b" addressee="#a" dimension="task" communicativeFunction="setQuestion"/>
	<dialogueAct xml:id="da2" target="#m2" sender="#a" addressee="#b" dimension="task" communicativeFunction="answer" functionalDependence="#da1"/>
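A functional dependence relation such as the one in (D4)b can be followed programmatically from the responsive act to its antecedent. The sketch below assumes that the two elements are wrapped in a plain `<diaml>` root without a namespace declaration; the helper name `antecedents` is hypothetical.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The two acts of (D4)b, wrapped in an assumed namespace-free root element.
D4 = """<diaml>
  <dialogueAct xml:id="da1" target="#m1" sender="#b" addressee="#a"
               dimension="task" communicativeFunction="setQuestion"/>
  <dialogueAct xml:id="da2" target="#m2" sender="#a" addressee="#b"
               dimension="task" communicativeFunction="answer"
               functionalDependence="#da1"/>
</diaml>"""


def antecedents(root, act_id):
    """Return the dialogue acts that act_id functionally depends on."""
    acts = {a.get(XML_ID): a for a in root.iter("dialogueAct")}
    # The attribute may list several references, separated by whitespace.
    refs = acts[act_id].get("functionalDependence", "").split()
    return [acts[ref.lstrip("#")] for ref in refs]


root = ET.fromstring(D4)
for question in antecedents(root, "da2"):
    print(question.get(XML_ID), question.get("communicativeFunction"))
```

An act without the attribute, such as the question itself, simply yields an empty list.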

Positive and negative feedback-providing acts depend for their interpretation also on what happened earlier in the dialogue, but in a different way. They are concerned with the processing of what was said before - such as its perception or its interpretation. This is illustrated by the examples in (D5).

(D5) 1. A: The flight on Tuesday would suit me really well. 
       	B: Okay.

     2. A: The flight on Tuesday would suit me really well.
	B: On Tuesday?

In the first example B indicates that he has correctly understood A’s remark; in the second he checks whether he heard (or remembers) correctly what A said. This relation between a positive or negative feedback act and its ‘antecedent’ is called a feedback dependence relation.

A feedback dependence relation indicates one or more preceding dialogue acts if the feedback concerns high-level processing, such as understanding, and it indicates a dialogue segment in the case of low-level processing, such as hearing what was said. In the latter case, the feedback dependence relation was annotated according to Release 5.1 as referring to the smallest functional segment containing the segment that the feedback act is about. This way of annotating feedback dependence relations is not quite accurate, since feedback about a stretch of communicative behaviour smaller than a functional segment is not about the entire segment. For example, negative feedback that signals a problem in hearing certain words may imply positive feedback about the rest of the segment. Similarly for feedback-eliciting acts and for dialogue acts in the Own Communication Management (OCM) dimension or in the Partner Communication Management (PCM) dimension. In particular, Self-Corrections and Partner Corrections frequently refer to a single word or phrase which does not form a functional segment. To make more accurate annotation possible, Release 5.2 introduces a ‘reference segment’ as being a stretch of communicative behaviour that is the object of a feedback dependence relation and that is not a functional segment.

Rhetorical relations

Dialogue acts may also be semantically and pragmatically related through rhetorical relations. These have been studied extensively for their occurrence in written discourse, and are also known as 'discourse relations'. They also occur in (spoken and multimodal) dialogue, as in the examples shown in (D6).

(D6)	1. A: It ties you on in terms of the technology and the complexity that you want 
   	2. A: like for example voice recognition 
   	3. A: because you might need to power a microphone and other things 
        4. A: so that’s one constraint there

In this example we see a sequence of four functional segments contributed by the same participant. The segments in lines 2-4 are all related to the dialogue act expressed in the first segment. Segment 2 is related to the initial statement through an Exemplification relation, segment 3 through a Cause relation, and segment 4 through a Restatement relation.

A wide diversity of sets of rhetorical relations has been proposed (see e.g. Hobbs, 1979; Mann and Thompson, 1988; Lascarides and Asher, 1991; Hovy and Maier, 1993; Prasad et al., 2008; Sanders et al., 1992), which has inspired a great deal of discussion, comparisons, and attempts to specify mappings between various sets (Benamara and Taboada, 2015; Bunt and Prasad, 2016; Scheffler and Stede, 2016; Demberg et al., 2017; Sanders et al., 2018). In view of this situation, DIT++ does not propose any specific set of relations to be used, but only provides a conceptual category for which a set of relations may be specified. In Release 5.1, this provision plays out at the level of concrete DiAML syntax in the definition of an XML element called ‘<rhetoricalLink>’, which has attributes referring to two dialogue acts and an attribute whose value specifies a rhetorical relation. Example (N@) above illustrates the use of this provision for indicating a causal relation between two dialogue acts.

In 2015, Prasad & Bunt defined a set of 18 ‘core’ rhetorical relations which occur in some form in most annotation schemes for rhetorical relations (see Prasad and Bunt, 2015), and proposed using this set for defining an ISO annotation standard. This has become ISO 24617-8:2016, also known as DR-Core. The DR-Core relations have been used in DiAML annotations as values of the @rhetoRel attribute in several annotation efforts (see e.g. Petukhova et al., 2014 and Bunt et al., 2019). The <rhetoricalLink> element was found to be rather coarse-grained, however, because of the two limitations already mentioned: (1) it is not possible to indicate the roles of the arguments; and (2) it is not possible to distinguish between ‘semantic’ and ‘pragmatic’ variants of a relation. The distinction is illustrated in (P11) above.

In Release 5.2 the constructs <drLink> and <argRole> are introduced in the DiAML-XML concrete syntax, and the conceptual structures that they encode are added to the DiAML abstract syntax with their semantics. Semantically the structure is similar to the <rhetoricalLink> element; the interpretation consists of an update operation that inserts the semantic relation in an addressee’s information state, adding a specification of the argument roles.

Qualifiers

The examples in (D7) illustrate another phenomenon that is frequently found in dialogue, namely that speakers may be uncertain about the information they provide, as in B’s utterance in (D7)a, or about their commitment to the performance of an action, as in (D7)b1. Speakers may also express a certain sentiment about the information or event that is being discussed, as in (D7)b3, or express a reservation in the form of a condition, as in (D7)b2, where an offer is conditionally accepted. For the annotation of conditions, uncertainty, and sentiment, DIT++ makes use of so-called qualifiers. Example (D7)c, annotating (D7)b2, illustrates their use.

(D7) 	a. A: Do you know what time the meeting starts?
 	   B: At 4 p.m. I think.

	b. A: Would you like to have some coffee? 
	   1. B: Maybe later.
	   2. B: Only if you have it ready.
           3. B: Yes please!

        c. <dialogueAct xml:id="da2" target="#m2" sender="#b" addressee="#a" dimension="task" communicativeFunction="acceptOffer"
             functionalDependence="#da1" conditionality="conditional"/>

According to the annotation theory that underlies dialogue act annotation with the DIT++ scheme (Bunt, 2010; 2013; 2015; Pustejovsky et al., 2017) semantic annotations must have besides a concrete representation format also a format-independent abstract syntax and a semantics. The annotation theory implements the distinction made in the ISO Linguistic Annotation Framework (LAF, ISO 24612:2009) between annotations and representations. The term ‘annotation' refers to the linguistic information that is added to segments of language data, independent of the format in which the information is presented; ‘representation' refers to the format in which an annotation is rendered. This distinction is implemented in the DiAML definition by a syntax specification that defines, besides a class of XML-based representation structures, also a class of more abstract annotation structures. These specifications are called the concrete syntax and the abstract syntax, respectively. Annotation structures are set-theoretical structures. The concrete syntax defines a reference format for rendering annotation structures in XML. Alternative representation formats for DiAML annotation structures are discussed in Bunt et al. (2019). For a detailed specification of the semantics of DiAML annotation structures see Bunt (2014).

DiAML Abstract syntax

The abstract syntax of DiAML consists of: (a) a specification of the elements from which annotation structures are built up, called a ‘conceptual inventory', and (b) a specification of the possible ways of combining these elements to form annotation structures.

The conceptual inventory of DiAML consists of sets of dialogue participants, dimensions, communicative functions, functional segments, and qualifiers.

An annotation structure is a collection of entity structures and link structures. Entity structures contain semantic information about a dialogue segment; link structures describe semantic relations between entity structures. Entity structures are always of the general form 〈m,z〉, where ‘m’ is a markable and ‘z’ designates a structure that describes some linguistic information. Link structures are typically of the form 〈e1, e2, R〉, consisting of two entity structures and a relation.

The entity structure of central interest in DiAML is a pair 〈m,α〉 of which the linguistic information ‘α’ is a so-called ‘dialogue act structure’. A dialogue act structure contains the information that characterizes a single dialogue act. This includes minimally a specification of the sender, the addressee(s), and the communicative function. For dialogue acts with a general-purpose communicative function, the dimension of the semantic content is another component; for dialogue acts with a dimension-specific function the dimension does not need to be specified, since it is inherent in the definition of the function. General-purpose functions may additionally have one or more qualifiers. For a dialogue act which depends semantically on (the interpretation of) one or more previous dialogue segments, a sixth component is a set E of elements that the act depends on through functional or feedback dependence relations. In a setting in which other participants than the sender and the addressees should be taken into account, an additional element is a set H of ‘other participants’. A dialogue act structure is therefore in the simplest case a triple 〈S, A, f_d〉, consisting of a sender S, a (set of) addressee(s) A, and a dimension-specific function f_d, and in the most complex case a 7-tuple as in (D8), with a set H of other participants, a general-purpose function f, a dimension d, a set q of one or more qualifiers, and a set E of one or more dialogue units that the act depends on.

(D8) α = 〈S, A, H, f, d, q, E〉

A link structure in DiAML is a triple 〈ε, E, ρ〉 consisting of an entity structure ε, a set E of one or more entity structures, and a rhetorical relation ρ, which relates the dialogue act in ε to those in E.
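The abstract structures described above can be mirrored in code as immutable records. The following is a sketch under stated assumptions: the class and field names are invented for this illustration, and a tuple is used for the set E so that the records stay hashable and keep their order.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class DialogueActStructure:
    """The tuple alpha = <S, A, H, f, d, q, E>; optional parts default to empty."""
    sender: str                                       # S
    addressees: FrozenSet[str]                        # A
    function: str                                     # f, or f_d if dimension-specific
    dimension: Optional[str] = None                   # d, omitted for f_d
    qualifiers: FrozenSet[str] = frozenset()          # q
    others: FrozenSet[str] = frozenset()              # H
    dependences: Tuple["EntityStructure", ...] = ()   # E


@dataclass(frozen=True)
class EntityStructure:
    """A pair <m, alpha> of a markable and a dialogue act structure."""
    markable: str
    act: DialogueActStructure


@dataclass(frozen=True)
class LinkStructure:
    """A triple <epsilon, E, rho>: one entity, its relata, and a relation."""
    source: EntityStructure
    relata: Tuple[EntityStructure, ...]
    relation: str


question = EntityStructure(
    "m1", DialogueActStructure("p1", frozenset({"p2"}), "setQuestion", "Task"))
answer = EntityStructure(
    "m3", DialogueActStructure("p2", frozenset({"p1"}), "answer", "Task",
                               qualifiers=frozenset({"uncertain"}),
                               dependences=(question,)))
print(answer.act.dependences[0].markable)
```

Embedding the antecedent entity structure inside the dependent one follows the way dependence is expressed in the abstract syntax, where a dependent act contains the structures it depends on.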

Concrete syntax

The DiAML concrete syntax is defined in accordance with the CASCADES methodology for developing semantic annotation languages, described in Bunt (2013) and Pustejovsky et al. (2017). This methodology includes the notion of an ideal representation format, defined as one which is (1) ‘complete’ in the sense that every annotation structure defined by the abstract syntax can be represented, and (2) ‘unambiguous’ in the sense that every representation defined by the concrete syntax represents one and only one annotation structure defined by the abstract syntax. Since the semantics of DiAML is defined for the structures defined by the abstract syntax, any two representation formats which are ‘ideal’ in this sense are semantically equivalent, and every representation in one such format can be converted by a meaning-preserving mapping into any other such format. The DiAML concrete syntax specifies a reference representation format based on XML, often referred to as 'DiAML-XML'. This specification lists names of XML tags, attributes, and values corresponding to the various ingredients in the conceptual inventory, and defines the possible ways of combining these elements in XML structures. In particular, XML elements are defined for entity structures and link structures.

Entity structures for dialogue acts are represented by a DiAML-XML element called <dialogueAct>, which has the following attributes:

@xml:id, whose value is a unique identifier of a dialogue act structure.
@target, whose value refers to a functional segment.
@sender, @addressee, and @otherParticipant, whose values refer to dialogue participants, identified in the metadata of the annotated primary data; the attribute otherParticipant is optional.
@dimension, whose value names one of the DIT++ dimensions.
@communicativeFunction, whose value names one of the communicative functions defined in the DIT++ taxonomy.
@certainty, @conditionality, and @sentiment, whose values are qualifiers defined in DIT++. These attributes are optional.
@functionalDependence, whose values refer to one or more dialogue acts that the given dialogue act has a functional dependence relation with. This attribute has a value only for dialogue acts with a responsive communicative function.
@feedbackDependence, whose values refer to one or more dialogue acts or reference segments that the given dialogue act has a feedback dependence relation with. This attribute has a value only for certain feedback acts and for dialogue acts in the Own Communication Management or Partner Communication Management dimensions.

Link structures are represented either by the DiAML-XML element <rhetoricalLink> or by the element <drLink> (annotators can choose either). The <rhetoricalLink> element has the following attributes:

@dact, whose value refers to a dialogue act that is rhetorically related to other dialogue acts;
@rhetoRelatum, whose value refers to one or more dialogue acts that the given dialogue act is rhetorically related to;
@rhetoRel, whose value names a rhetorical relation.

The <drLink> element has the following attributes:

@arg1 and @arg2, whose values refer to two rhetorically related dialogue acts or their semantic content, if a plug-in for semantic content is used;
@rel, whose value names a rhetorical relation.

The <drLink> element also makes use of embedded <argRole> elements, which have an attribute @arg, whose value identifies a dialogue act, and an attribute @role, whose value names an argument role.
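To make the shape of a link with argument roles concrete, here is a minimal sketch that builds such an element with Python's standard library. Only the element and attribute names come from the description above; the helper name `make_dr_link`, the relation value "cause", and the role names "reason" and "result" are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def make_dr_link(arg1, arg2, rel, roles):
    """Build a drLink element with one embedded argRole per entry in roles.

    roles maps a dialogue act identifier to the name of its argument role.
    """
    link = ET.Element("drLink",
                      {"arg1": "#" + arg1, "arg2": "#" + arg2, "rel": rel})
    for act_id, role in roles.items():
        ET.SubElement(link, "argRole", {"arg": "#" + act_id, "role": role})
    return link


# Hypothetical annotation of segments 3 and 1 of (D6): the act in segment 3
# gives a reason for the act in segment 1.
link = make_dr_link("da3", "da1", "cause", {"da3": "reason", "da1": "result"})
print(ET.tostring(link, encoding="unicode"))
```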

Example (D9c-d) shows the abstract annotation structure and its DiAML-XML representation for the dialogue fragment in (D9a), segmented as shown in (D9b).

(D9a) 	P1: What time does the next train to Utrecht leave?
    	P2: The next train to Utrecht leaves I think at 8:32.

Annotations may be attached to primary dialogue data in a variety of ways; they may be attached directly to stretches of speech, defined by temporal begin- and end points, or to structures at lower levels of description, such as the output of a tokenizer. Here it is assumed that functional segments are identified at another level of XML representation. P2's utterance is segmented into two overlapping functional segments: m2 in the Auto-Feedback dimension (reflecting that the repetition of a large part of an utterance signals positive feedback on understanding it) and m3 in the Task dimension. Following the guidelines of the Text Encoding Initiative (TEI P5, 2010), the prefix '#' is used to indicate that the prefixed value is identified either in the metadata of the primary data or in another layer of annotation, or elsewhere within the same representation. Note that the abstract annotation structure in (D9c) is a set of three elements, corresponding to the three dialogue acts in this fragment, where the second and the third element both have the first element embedded, indicating their dependence on the first dialogue act.

(D9b) 	Segmentation of the exchange in (D9a):
	m1 = What time does the next train to Utrecht leave? (Task dimension)
      	m2 = The next train to Utrecht leaves (Auto-Feedback dimension)
        m3 = “The next train to Utrecht leaves I think at 8:32.“ (Task dimension).
(D9c)	Annotation structure according to DiAML abstract syntax:
	{〈m1,〈p1,p2,setQuestion,Task〉, 
	 〈m2,〈p2,p1,autoPositive,{〈m1,〈p1,p2,setQuestion,Task〉〉}〉, 
         〈m3,〈p2,p1,aswer,Task,{uncertain},{〈m1,〈p1,p2,setQuestion,Task〉〉}〉〉}   
(D9c) 	DiAML-XML annotation representation:
	<diaml xmlns="http://www.iso.org/diaml/">
	  <dialogueAct xml:id="da1" target="#m1" sender="#p1" addressee="#p2"
	    communicativeFunction="setQuestion" dimension="task"/>
	  <dialogueAct xml:id="da2" target="#m2" sender="#p2" addressee="#p1"
	    communicativeFunction="autoPositive" feedbackDependence="#da1"/>
	  <dialogueAct xml:id="da3" target="#m3" sender="#p2" addressee="#p1"
	    communicativeFunction="answer" certainty="uncertain" dimension="task"
	    functionalDependence="#da1"/>
	</diaml>
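
As an illustration of how a DiAML-XML representation of this kind might be processed, the following is a minimal Python sketch that reads the three dialogue acts of the example and lists their dependence relations. It uses only the standard-library xml.etree.ElementTree module. The embedded document is a well-formed rendering of the example above; the helper names load_dialogue_acts and dependences are hypothetical and are not part of DiAML or of the ISO standard.

```python
import xml.etree.ElementTree as ET

# Namespace URI as it appears in the example; treated here as an assumption.
DIAML_NS = "http://www.iso.org/diaml/"
# ElementTree exposes the xml:id attribute under its expanded name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Well-formed rendering of the DiAML-XML example for the train-time exchange.
DIAML_DOC = """<diaml xmlns="http://www.iso.org/diaml/">
  <dialogueAct xml:id="da1" target="#m1" sender="#p1" addressee="#p2"
    communicativeFunction="setQuestion" dimension="task"/>
  <dialogueAct xml:id="da2" target="#m2" sender="#p2" addressee="#p1"
    communicativeFunction="autoPositive" feedbackDependence="#da1"/>
  <dialogueAct xml:id="da3" target="#m3" sender="#p2" addressee="#p1"
    communicativeFunction="answer" certainty="uncertain" dimension="task"
    functionalDependence="#da1"/>
</diaml>"""


def load_dialogue_acts(xml_text):
    """Map each xml:id to the remaining attributes of its dialogueAct element."""
    root = ET.fromstring(xml_text)
    acts = {}
    for elem in root.iter("{%s}dialogueAct" % DIAML_NS):
        attrs = dict(elem.attrib)
        act_id = attrs.pop(XML_ID)
        acts[act_id] = attrs
    return acts


def dependences(acts):
    """Return (dependent, relation, antecedent) triples, dropping the '#' prefix."""
    triples = []
    for act_id, attrs in acts.items():
        for relation in ("functionalDependence", "feedbackDependence"):
            ref = attrs.get(relation)
            if ref is not None:
                triples.append((act_id, relation, ref.lstrip("#")))
    return triples


acts = load_dialogue_acts(DIAML_DOC)
deps = dependences(acts)
for dependent, relation, antecedent in deps:
    print(dependent, relation, antecedent)
```

The '#' prefix is stripped only to look up the referenced dialogue act within the same representation; references to other annotation layers (such as the functional segments m1-m3) would need to be resolved against those layers.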

Semantics

DiAML annotation structures have a semantics in terms of information-state updates. The most important kind of structure defined by the DiAML abstract syntax, the dialogue act structure, is a functional characterization of a dialogue act. It does not correspond to a complete dialogue act, since it does not include the semantic content (but only a semantic content category, a ‘dimension’). The semantics of a complete dialogue act is obtained by combining the interpretation of a dialogue act structure with a semantic content. This is accomplished by applying the interpretation I_a(α) of the dialogue act structure α contained in an entity structure 〈s,α〉 to the semantic content κ(s) of the functional segment s that expresses the dialogue act. The result is an information state update operation, as shown in (D10) for a dialogue act that has no functional dependences on other dialogue acts.

(D10) I_a(〈s,α〉) = I_a(α)(κ(s))

The interpretation I_a(α) of a dialogue act structure α is defined as follows for dialogue acts without qualifiers:

(D11) I_a(〈S,A,f,d〉) = I_a(f)(I_a(S), I_a(A), I_a(d))

i.e. the interpretation of a dialogue act structure is the interpretation of its communicative function, applied to the interpretations of its sender, its addressee, and its dimension. For more details see Bunt (2014).
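
The compositional pattern of (D10) and (D11) can be sketched in Python with higher-order functions. This is only a toy illustration under strong assumptions: the information state is a dictionary of fact lists per participant, the interpretations of sender, addressee and dimension are taken to be the identity, and the update effect of setQuestion is a placeholder rather than the update semantics defined in Bunt (2014). All names (set_question, FUNCTIONS, interpret_act_structure, interpret_entity_structure, kappa) are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

InfoState = Dict[str, List[str]]            # toy information state: facts per participant
Update = Callable[[InfoState], InfoState]   # an information-state update operation
ActStructure = Tuple[str, str, str, str]    # <S, A, f, d> without qualifiers


def set_question(sender: str, addressee: str, dimension: str) -> Callable[[str], Update]:
    """Toy I_a(f) for setQuestion: a function from semantic content to an update."""
    def with_content(content: str) -> Update:
        def update(state: InfoState) -> InfoState:
            # Copy the state so that the update does not mutate its argument.
            new_state = {k: list(v) for k, v in state.items()}
            new_state.setdefault(addressee, []).append(
                f"{sender} wants to know [{dimension}]: {content}")
            return new_state
        return update
    return with_content


# Placeholder table of communicative-function interpretations (assumption).
FUNCTIONS = {"setQuestion": set_question}


def interpret_act_structure(alpha: ActStructure) -> Callable[[str], Update]:
    """(D11): I_a(<S,A,f,d>) = I_a(f)(I_a(S), I_a(A), I_a(d)), with identity for S, A, d."""
    sender, addressee, function, dimension = alpha
    return FUNCTIONS[function](sender, addressee, dimension)


def interpret_entity_structure(segment: str, alpha: ActStructure,
                               kappa: Callable[[str], str]) -> Update:
    """(D10): I_a(<s,alpha>) = I_a(alpha)(kappa(s))."""
    return interpret_act_structure(alpha)(kappa(segment))


# kappa maps a functional segment to its semantic content (toy lookup table).
kappa = {"m1": "departure time of the next train to Utrecht"}.__getitem__
update = interpret_entity_structure("m1", ("p1", "p2", "setQuestion", "task"), kappa)
state = update({"p1": [], "p2": []})
print(state)
```

The point of the sketch is the order of application: the communicative function is first applied to sender, addressee and dimension, the result is then applied to the semantic content of the segment, and only the final value is an operation on information states.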

ISO dialogue act annotation standard 24617-2, First and Second Edition

First Edition, September 2012

ISO 24617-2 is an ISO international standard for the annotation of dialogue with dialogue act information. The technical content of the standard was formally approved (by the national standardization bodies participating in ISO), and the official registration of the standard occurred on 4 September 2012 when the document which describes the standard was published by the ISO Central Secretariat in Geneva. A pre-final version of the document describing the standard can be found here; the final version can be obtained from ISO and from the national standardization institutes.

A variety of English and Dutch dialogues, annotated according to the ISO 24617-2 standard, has been collected in the DialogBank resource.

The ISO scheme for dialogue act annotation is a subset of release 5 of the DIT++ scheme, or rather, release 5 of the DIT++ annotation scheme is an extension of the scheme described in the ISO standard. The DIT++ annotation scheme can thus be said to be strictly ISO-compatible, and in some respects more fine-grained.

The project team that has developed this standard consists of: Jan Alexandersson, Harry Bunt (project leader), Jean Carletta, Jae-Woong Choe, Alex Chengyu Fang, Koiti Hasida, Volha Petukhova, Andrei Popescu-Belis, Claudia Soria, and David Traum.

The project team was supported by an Expert Consultancy Group, consisting of: James Allen, Jens Allwood, Nick Campbell, Roberta Catizone, Thierry Declerck, Anna Esposito, Raquel Fernandez, Giacomo Ferrari, Gil Francopoulo, Dirk Heylen, Julia Hirschberg, Kristiina Jokinen, Maciej Karpinski, Staffan Larsson, Kiyong Lee, Oliver Lemon, Carlos Martinez-Hinarejos, Paul Mc Kevitt, Michael McTear, David Novick, Tim Paek, Patrizia Paggio, Catherine Pelachaud, Massimo Poesio, German Rigau, Laurent Romary, Nicla Rossini, Milan Rusko, Candice Sidner, Pavel Smrz, Marieke van Erp, Ielka van der Sluis, Kristinn Thorisson, Aesoon Yoon, Yorick Wilks.

The following papers summarize the standard and describe its use:

Harry Bunt: Guidelines for using ISO standard 24617-2 for dialogue annotation. TiCC Technical Report 2019-1, Tilburg Center for Cognition and Communication and Department of Cognitive Science and Artificial Intelligence, Tilburg University.
Harry Bunt, Volha Petukhova, Andrei Malchanau, Alex Chengyu Fang, and Kars Wijnhoven: 'The DialogBank: Dialogues with Interoperable Annotations'. Language Resources and Evaluation 2019. Available online at DOI: 10.1007/s10579-018-9436-9
Harry Bunt, Volha Petukhova, David Traum, and Jan Alexandersson: 'Dialogue Act Annotation with the ISO 24617-2 Standard'. In Deborah Dahl (ed.) Multimodal Interaction with W3C Standards: Towards Natural User Interfaces to Everything. Springer, Cham (Switzerland), 2017, pp. 109-135.
Harry Bunt, Jan Alexandersson, Jean Carletta, Jae-Woong Choe, Alex Chengyu Fang, Koiti Hasida, Kiyong Lee, Volha Petukhova, Andrei Popescu-Belis, Laurent Romary, Claudia Soria, and David Traum: "Towards an ISO standard for dialogue act annotation". In Proceedings of LREC 2010, May 2010, Malta.
Harry Bunt, Jan Alexandersson, Jae-Woong Choe, Alex Chengyu Fang, Koiti Hasida, Volha Petukhova, Andrei Popescu-Belis, and David Traum: "ISO 24617-2: A semantically-based standard for dialogue annotation". In Proceedings of LREC 2012, May 2012, Istanbul.

Second Edition, December 2020

Following the ISO practice of reviewing its standards every five years, the first edition, ISO 24617-2:2012, was examined in 2017-2018 to determine whether a revision was needed. At a meeting in September 2017 it was concluded that some minor revisions would be desirable, as well as some extensions. These were discussed in a meeting of users of the first edition in Tilburg, April 2018, including Pierere Andre, Shammur Chowdhury, Emer Gilmartin, Simon Keizer, Andrei Malchanau, Catherine Pelachaud, Volha Petukhova, Laurent Prévot, Mariet Theune, Kars Wijnhoven and Harry Bunt, and at the ISA-14 workshop in Santa Fe, New Mexico, August 2018.

Additionally, possibilities for extending the standard were discussed at the ISA-15 workshop in Gothenburg, May 2019, in particular the notion of 3-part layered plug-ins, which was introduced in DIT++ release 5.2. A proposal for a revised, second edition of the standard was submitted to the ISO organization in 2019 and was approved in an international ballot in February 2020 (see the ISO/DIS document). The following papers are about the revision of the first edition of the standard, discussing limitations and proposing improvements and extensions:

Harry Bunt

Plug-ins for content annotation of dialogue acts.
In Proceedings of the 15th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-15) at IWCS 2019, Gothenburg, Sweden, pp. 34-45.

Harry Bunt, Emer Gilmartin, Catherine Pelachaud, Volha Petukhova, Laurent Prévot and Mariet Theune (2018)

Downward Compatible Revision of Dialogue Act Annotation.
In Proceedings 14th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-14) at COLING 2018, Santa Fé, New Mexico, pp. 21-34.

Harry Bunt, Volha Petukhova and Alex Chengyu Fang

'Revisiting the ISO Standard for Dialogue Act Annotation.'
In Proceedings 13th Joint ISO-ACL Workshop on Interoperable Semantic Annotation (ISA-13) at IWCS 2017, Montpellier, France, September 2017.

DIT++ or ISO 24617-2 related publications

Tatiana Anikina and Ivana Kruijff-Korbayova: 'Dialogue Act Classification in Team Communication for Robot Assisted Disaster Response'. In Proceedings 20th Annual SIGdial Meeting on Discourse and Dialogue, Stockholm, Sweden, September 2019, pp. 399-410.
Harry Bunt: 'Plug-ins for content annotation of dialogue acts'. In Proceedings 15th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISBN 978907402384), Gothenburg, Sweden, May 2019, pp. 33-45.
Simon Keizer, Ondrej Dusek, Xingkun Liu and Verena Rieser: 'User Evaluation of a Multi-dimensional Statistical Dialogue System'. In Proceedings 20th Annual SIGdial Meeting on Discourse and Dialogue, Stockholm, Sweden, September 2019, pp. 392-398.
Harry Bunt, Volha Petukhova, Andrei Malchanau, Alex Chengyu Fang, and Kars Wijnhoven: 'The DialogBank: Dialogues with Interoperable Annotations'. Language Resources and Evaluation 2019. Available online at DOI: 10.1007/s10579-018-9436-9
Andrei Malchanau: 'Cognitive Architecture for Multimodal Multidimensional Dialogue Management.' PhD Thesis, University of Saarland, Saarbruecken 2019.
Andrei Malchanau, Volha Petukhova and Harry Bunt: 'Towards Integration of Cognitive Models in Dialogue Management: Designing the Virtual Negotiation Coach Application.' Dialogue and Discourse 9(2), 35-79, 2018. DOI: 10.5087/dad.2018.202.
Alex Chengyu Fang, Yanjao Liu, Jing Cao and Harry Bunt: 'Chinese Multimodal Resources for DA Analysis.' In Chu-Ren Huang, Zhuo Jing-Schmidt, and Barbara Meisterernst (eds) The Handbook of Chinese Applied Linguistics, Chapter 17. Routledge, Oxford 2018.
Harry Bunt, Emer Gilmartin, Catherine Pelachaud, Volha Petukhova, Laurent Prévot and Mariet Theune: 'Downward Compatible Revision of Dialogue Act Annotation'. In Proceedings 14th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-14), Santa Fé, New Mexico, 2018, pp. 21-34.
Volha Petukhova, Andrei Malchanau and Harry Bunt: 'Modelling argumentation in parliamentary debates: data collection, analysis and test case'. In Matteo Baldoni, Cristina Baroglio, Floris Bex, The Duy Bui, Floriana Grasso, Nancy Green, Mohammad Namazi, Masayuki Numao, Mercedes Rodrigo, Merlin Teodosia Suarez (eds.) Principles and Practice of Multi-Agent Systems. Springer Lecture Notes in Artificial Intelligence, Springer, Berlin, 2017, pages 26-46.
James Pustejovsky, Harry Bunt and Annie Zaenen: 'Designing Annotation Schemes: From Theory to Model.' In Nancy Ide and James Pustejovsky (eds): Handbook of Linguistic Annotation, Springer, Berlin, 2017.
Simon Keizer and Verena Rieser: 'Towards Learning Transferable Conversational Skills using Multi-dimensional Dialogue Modelling.' In Proceedings 21st Workshop on the Semantics and Pragmatics of Dialogue (SemDial/SaarDial), Saarbrucken 2017.
Harry Bunt, Volha Petukhova, David Traum and Jan Alexandersson: 'Dialogue Act Annotation with the ISO 24617-2 Standard'. In Deborah Dahl (ed.) Multimodal Interaction with W3C Standards: Towards Natural User Interfaces to Everything. Springer, Cham (Switzerland), 2017, pp. 109-135.
Andrei Malchanau, Volha Petukhova, Harry Bunt and Dietrich Klakow: 'Multidimensional dialogue management for tutoring systems.' In Proceedings of the 7th Language and Technology Conference (LTC 2015), Poznan, Poland.
Harry Bunt: 'On the principles of semantic annotation.' Proceedings 11th Joint ACL-ISO Workshop on Interoperable Semantic Annotation, London, April 2015, pp. 1-13.
Volha Petukhova, Harry Bunt, Andrei Malchanau and Ramkumar Aruchamy: 'Experimenting with grounding strategies in dialogue'. Proceedings of GoDial, The 19th International Workshop on the Semantics and Pragmatics of Dialogue (SEMDIAL 2015), Gothenburg, August 2015.
Harry Bunt: 'A context-change semantics for dialogue acts'. In Computing Meaning, Vol. 4, Harry Bunt, Johan Bos and Stephen Pulman, editors. Springer, Dordrecht 2014.
Harry Bunt and Volha Petukhova: 'Incremental Recognition and Prediction of Dialogue Acts'. In Computing Meaning, Vol. 4, Harry Bunt, Johan Bos and Stephen Pulman, editors. Springer, Dordrecht, 2014.
Harry Bunt: 'A methodology for designing semantic annotations'. TiCC Technical Report 2013-001, Tilburg Center for Cognition and Communication and Department of Cognitive Science and Artificial Intelligence, Tilburg University, 2013.
Harry Bunt: 'The semantics of feedback.' In Proceedings of SeineDial, the 2012 Workshop on the Semantics and Pragmatics of Dialogue, Paris, September 2012.
Harry Bunt: 'Multifunctionality in dialogue.' Computer, Speech and Language 25 (2011), 225-245.
Harry Bunt: 'The semantics of dialogue acts'. In Proceedings of the 9th International Conference on Computational Semantics (IWCS-9), Oxford, January 12-14, 2011, pp. 1-13.
Volha Petukhova and Harry Bunt: 'Incremental dialogue act understanding'. In Proceedings of the 9th International Conference on Computational Semantics (IWCS-9), Oxford, January 12-14, 2011.
Keizer, S., H. Bunt and V. Petukhova (2011): Multidimensional dialogue management. In A. van den Bosch and G. Bouma (eds.) Interactive Multimodal Question Answering. Berlin: Springer, pp. 57-86.
Harry Bunt and Volha Petukhova: 'On the many benefits of a multidimensional approach to the analysis of spoken dialogue.' In Peter Juel Henrichsen (ed.) Linguistic Theory and Raw Sound. Copenhagen Studies in Language 40, 209-250.
Harry Bunt, Jan Alexandersson, Jean Carletta, Jae-Woong Choe, Alex Chengyu Fang, Koiti Hasida, Kiyong Lee, Volha Petukhova, Andrei Popescu-Belis, Laurent Romary, Claudia Soria, and David Traum: `Towards an ISO standard for dialogue act annotation'. In Proceedings of LREC 2010, the Seventh International Conference on Language Resources and Evaluation, Malta, May 16-23, 2010.
Harry Bunt: 'Interpretation and generation of dialogue with multidimensional context models.' In Anna Esposito (ed.) Toward Autonomous, Adaptive, and Context-Aware Multimodal Interfaces. Springer, Berlin, pp. 214-242. See also online version.
Harry Bunt: `A methodology for designing semantic annotation languages exploiting syntactic-semantic iso-morphisms.' In Proceedings of ICGL 2010, Second International Conference on Global Interoperability for Language Resources, Hong Kong, 18-20 January 2010.
Volha Petukhova, Harry Bunt, and Andrei Malchanau: 'Empirical and theoretical constraints on dialogue act combinations'. In Proceedings of the 14th International Workshop on the Semantics and Pragmatics of Dialogue (PozDial), Poznan, June 16-18, 2010.
Volha Petukhova and Harry Bunt: 'Introducing communicative function qualifiers.' In Proceedings of ICGL 2010, Second International Conference on Global Interoperability for Language Resources, Hong Kong, January 2010.
Volha Petukhova and Harry Bunt: 'Grounding by nodding.' In Proceedings of GESPIN 2009, Conference on Gestures and Speech in Interaction, Poznan, September 2009.
Harry Bunt: `Multifunctionality and multidimensional dialogue semantics'. In Proceedings of DiaHolmia 2009, (invited talk), Stockholm, June 2009.
Volha Petukhova and Harry Bunt: 'Who's next? Speaker-selection mechanisms in multiparty dialogue'. In Proceedings of DiaHolmia 2009, 8th International Workshop on the Semantics and Pragmatics of Dialogue, Stockholm, June 2009.
Harry Bunt: 'The DIT++ taxonomy for functional dialogue markup'. In Proceedings of the AAMAS 2009 Workshop "Towards a Standard Markup Language for Embodied Dialogue Acts" (EDAML 2009), Dirk Heylen, Catherine Pelachaud, Roberta Catizone, and David Traum, editors, Budapest, May 12, 2009.
Volha Petukhova and Harry Bunt: 'The independence of dimensions in multidimensional dialogue act annotation'. In Proceedings of the NAACL 2009 conference, Boulder, Colorado, June 2009.
Harry Bunt: 'Semantic Annotations as Complementary to Underspecified Semantic Representations'. In Proceedings of the 8th International Conference on Computational Semantics (IWCS-8), Tilburg, January 7-9, 2009.
Volha Petukhova and Harry Bunt: 'Towards a multidimensional semantics for discourse markers'. In Proceedings of the 8th International Conference on Computational Semantics (IWCS-8), Tilburg, January 7-9, 2009.
Harry Bunt and Chwhynny Overbeeke: `An Extensible, Compositional Semantics of Temporal Annotation.' In Proceedings of LAW-II, the Second Linguistic Annotation Workshop, Marrakech, Morocco, May 26-27, 2008.
Jeroen Geertzen, Volha Petukhova, and Harry Bunt: `Evaluating Dialogue Act Tagging with Naive and Expert Annotators.' In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, May 28-30, 2008.
Harry Bunt and Chwhynny Overbeeke: `Towards formal interpretation of semantic annotation.' In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, May 28-30, 2008.
Volha Petukhova and Harry Bunt: `LIRICS semantic role annotation: design and evaluation of a set of data categories.' In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco, May 28-30, 2008.
Harry Bunt: `The Semantics of Semantic Annotation.' Invited paper presented at PACLIC-21, November 2, 2007. In Proceedings of PACLIC-21, the 21st Pacific Asia Conference on Language, Information and Computation, Seoul, Korea, November 1-3, 2007.
Harry Bunt: `Multifunctionality and Multidimensional Dialogue Act Annotation.' In E. Ahlsen et al. (ed.) Communication - Action - Meaning, A Festschrift to Jens Allwood. Gothenburg University Press, August 2007, pp. 237 -- 259.
Harry Bunt and Roser Morante: `The Weakest Link.' In V. Matousek and P. Mautner (2007) (eds.) Text, Speech and Dialogue. Lecture Notes in Artificial Intelligence (LNAI) 4629, Springer, Berlin.
Harry Bunt, Roser Morante, Simon Keizer: `An empirically based computational model of grounding in dialogue.' In Proceedings of the Eighth SIGDIAL Conference on Discourse and Dialogue (SIGDIAL 2007). Antwerp, 1-2 September, 2007, pp. 283-290.
Jeroen Geertzen, Volha Petukhova, and Harry Bunt: `A multidimensional approach to utterance segmentation and dialogue act classification.' In Proceedings of the Eighth SIGDIAL Conference on Discourse and Dialogue (SIGDIAL 2007). Antwerp, 1-2 September, 2007, pp. 140-149.
Roser Morante, Simon Keizer and Harry Bunt: `Dialogue simulation and context dynamics for dialogue management.' In J. Nivre, H.-J. Kaalep, K. Muischnek, and M. Keit (eds) Proceedings of the 16th Nordic Conference on Computational Linguistics (NODALIDA 2007). Tartu, Estonia, pp. 310-317.
Simon Keizer and Harry Bunt: `Evaluating combinations of dialogue acts'. In Proceedings of the Eighth SIGDIAL Workshop on Discourse and Dialogue (SIGDIAL 2007), Antwerp, September, 2007, pp. 158-165.
Roser Morante, Simon Keizer and Harry Bunt: `A dialogue act based model for context updating.' In Proceedings of the Eleventh International Conference on the Semantics and Pragmatics of Dialogue (DECALOG 2007). Trento, 30 May - 3 July, 2007, pp. 9-16.
Harry Bunt and Amanda Schiffrin: `Interoperable concepts for dialogue act annotation.' In Proceedings of the Seventh International Workshop on Computational Semantics (IWCS-7). Tilburg, January 10-12, 2007, pp. 16-27.
Volha Petukhova and Harry Bunt: `A Multidimensional Approach to Multimodal Dialogue Act Annotation.' In Proceedings Seventh International Workshop on Computational Semantics (IWCS-7). Tilburg, January 10-12, 2007, pp. 142-153.
Volha Petukhova, Harry Bunt and Amanda Schiffrin: `Defining Semantic Roles.' In Proceedings Seventh International Workshop on Computational Semantics (IWCS-7). Tilburg, January 10-12, 2007, pp. 362-365.
Jeroen Geertzen and Harry Bunt: `Measuring annotator agreement in a complex hierarchical dialogue act scheme'. In Proceedings of SIGDIAL 2006, Sydney, July 15-16, 2006.
Simon Keizer and Harry Bunt: `Multidimensional dialogue management'. In Proceedings of SIGDIAL 2006, Sydney, July 15-16, 2006.
Harry Bunt: 'Dimensions in Dialogue Act Annotation'. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006). Genova, Italy, May 24-26, 2006.
Harry Bunt and Amanda Schiffrin: `Methodological aspects of semantic annotation'. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC 2006). Genova, Italy, May 24-26, 2006.
Jacques Terken, Hans van Dam and Harry Bunt: `Cooperative assistance for human-system interaction'. In Proceedings of the 16th World Conference on Ergonomics (IEA2006), Maastricht, July 10-14, 2006.
Harry Bunt: `Mass Terms'. In Keith Brown (ed.) Encyclopaedia of Language and Linguistics, Second Edition. Elsevier, Amsterdam, pp. 5757-5760.
Rieks op den Akker, Harry Bunt, Simon Keizer and Boris van Schooten: `From Question Answering to Spoken Dialogue: Towards an Information Search Assistant for Interactive Multimodal Information Extraction'. In Proceedings of the Ninth European Conference on Speech Communication and Technology, Interspeech 2005, Lisbon, September 2005, pp. 2793-2797.
Harry Bunt and Yann Girard: `Designing an Open, Multidimensional Dialogue Act Taxonomy'. In Claire Gardent & Bertrand Gaiffe (eds) DIALOR'05, Proceedings of the Ninth International Workshop on the Semantics and Pragmatics of Dialogue. Nancy, June 9-11, 2005.
Harry Bunt: `A Framework for Dialogue Act Specification'. Paper presented at ISO-SIGSEM workshop, Tilburg, January 10-11, 2005.
Harry Bunt, Michael Kipp, Mark Maybury and Wolfgang Wahlster: `Fusion and Coordination for Multimodal Interactive Information Presentation'. In: O. Stock and M. Zancanaro (eds) Multimodal Intelligent Information Presentation. Springer, Dordrecht 2005, pp. 325-339.
Harry Bunt and Laurent Romary: `Standardization in Multimodal Content Representation: Some Methodological Issues'. In: Proceedings of LREC 2004, Lisbon, Portugal, May 2004, pp. 2219-2222.
Harry Bunt and Laurent Romary: `Towards multimodal semantic representation'. In: Key-Sun Choi (ed.) Proceedings of LREC 2002 Workshop on International Standards of Terminology and Language Resources Management, Las Palmas, Spain, 29 May 2002. ELRA, Paris, pp. 54-60.
Leen Kievit, Paul Piwek, Robbert-Jan Beun and Harry Bunt: `Multimodal Cooperative Resolution of Referential Expressions in the DenK System.' In: H.C. Bunt & R.J. Beun (eds.) Cooperative Multimodal Communication, Lecture Notes in Artificial Intelligence 2155, Springer Verlag, Berlin, 2001, pp. 197-214.
Harry Bunt: `Dialogue pragmatics and context specification.' In: Harry Bunt and William Black (eds) Abduction, Belief and Context in Dialogue. Studies in Computational Pragmatics. John Benjamins, Amsterdam, 2000, Series Natural Language Processing, Volume 1, pp. 81-150.
Harry Bunt and William Black: `The ABC of Computational Pragmatics.' In: Harry Bunt and William Black (eds) Abduction, Belief and Context in Dialogue. Studies in Computational Pragmatics. John Benjamins, Amsterdam, 2000, Series Natural Language Processing, Volume 1, pp. 1-46.
Harry Bunt: `Non-problems and social obligations in human-computer conversation'. In: Proceedings Third International Workshop on Human-Computer Conversation, Bellagio (Italy), July 2000.
Harry Bunt, Rene Ahn, Robbert-Jan Beun, Teun Borghuis, & Cees van Overveld: `The DenK architecture: a pragmatic approach to user interfaces.' Artificial Intelligence Review 8 (3), 1995, 431-445.
Harry Bunt, Rene Ahn, Robbert-Jan Beun, Teun Borghuis, & Cees van Overveld: `Multimodal Cooperation with the DenK System.' In: H.C. Bunt, R.J. Beun & T. Borghuis (eds.) Multimodal Human-Computer Communication. Systems, Techniques and Experiments. Lecture Notes in Artificial Intelligence 1374, Springer Verlag, Berlin, pp. 39-67.
Harry Bunt: `Dynamic Interpretation and Dialogue Theory'. In: M.M. Taylor, D.G. Bouwhuis & F. Neel (eds.) The Structure of Multimodal Dialogue, Vol 2., Amsterdam: John Benjamins, 2000, pp 139-166.
Harry Bunt: `Dialogue control functions and interaction design'. In: R.J. Beun, M. Baker & M. Reiner (eds.) Dialogue in Instruction. Springer Verlag, Heidelberg, 1995, pp. 197 -- 214.
Harry Bunt: `Interaction management functions and context representation requirements'. In: S. LuperFoy, A. Nijholt, & G. Veldhuizen van Zanten (eds.) Dialogue Management in Natural Language Systems. Proc. of 11th Twente Workshop on Language Technology, University of Twente, Enschede, June 1996, pp. 187 -- 198.
Harry Bunt: Context and Dialogue Control. Think Quarterly 3(1), 19-31.
Harry Bunt: `Belief Contexts in Human-Computer Dialogue'. In: D. Nauta, A. Nijholt & J. Schaake (eds.) Pragmatics in Language Technology. Proc. of 4th Twente Workshop on Language Technology, University of Twente, Enschede, June 1992, pp. 106 -- 114.
Harry Bunt: Information Dialogues as Communicative Action in Relation to User Modelling and Information Processing. In: M.M. Taylor, D.G. Bouwhuis & F. Neel (eds.) The Structure of Multimodal Dialogue, Vol. 1, Amsterdam: North-Holland, 1989, pp. 47-74.

Other references:

Bunt (2019) An annotation scheme for quantification. In Proceedings of the 14th International Conference on Computational Semantics (IWCS 2019), Gothenburg, Sweden.

Bunt, H. (2018) Semantic Annotation of Quantification in Natural Language -- preparatory study for developing an ISO standard TiCC Technical Report 2018-15, Tilburg Center for Cognition and Communication and Department of Cognitive Science and Artificial Intelligence, Tilburg University.

Bunt, H., J. Pustejovsky and K. Lee (2018) Towards an ISO Standard for the Annotation of Quantification. In Proceedings 11th International Conference on Language Resources and Evaluation (LREC 2018), Miyazaki, Japan, May 2018.

Burkhardt, F., Pelachaud, C., Schuller, Bj., and Zovato, E. (2017) EmotionML. In D. Dahl (ed.) Multimodal Interaction with W3C Standards. Springer, Cham (Switzerland), pp. 65-80.

Ekman, P. (1972). Universals and Cultural Differences in Facial Expressions of Emotion. In J. Cole (Ed.), Nebraska Symposium on Motivation (Vol. 19, pp.207-282). University of Nebraska Press.

Different in release 5.1 from previous release (release 4, February 2010)
DIT++ release 5.1 offers the same coverage as release 4, and is fully compatible with it: annotations using release 4 are easily converted into annotations using release 5.1 and vice versa. The changes (improvements!) have been inspired by the fact that the DIT++ taxonomy has been the basis for the ISO standard for dialogue act annotation. In the course of defining the ISO standard proposal, some points were noted where the DIT++ taxonomy (release 4) could be improved. The most important of these improvements are the introduction of (1) communicative function qualifiers; and (2) rhetorical relations among dialogue acts.

Last modified: Tue Jan 19 15:32:52 CET 2021
<harry.bunt@uvt.nl>
