
DSTC5

Fifth Dialog State Tracking Challenge @ SLT2016

Task Descriptions

Main task: Dialog State Tracking at Sub-dialog Level

The goal of the main task of the challenge is to track dialog states for sub-dialog segments. For each turn in a given sub-dialog, the tracker should fill out a frame of slot-value pairs, taking into account all dialog history prior to the turn. The performance of a tracker will be evaluated by comparing its outputs with reference annotations. Weighted accuracy will be used as the evaluation metric, giving a higher score when a correct frame structure is completed in an earlier turn.
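As a rough illustration of per-turn frame filling, the following minimal sketch accumulates slot-value pairs over the turns of one sub-dialog and emits a frame snapshot at every turn. The slot names, values, trigger phrases, and the functions `track_sub_dialog` and `frame_accuracy` are hypothetical and chosen only for illustration; they are not the baseline system or the official scoring script. In particular, `frame_accuracy` is a plain exact-match ratio, not the weighted accuracy used in the challenge.

```python
from typing import Dict, List, Set

# A frame maps each slot name to the set of values filled so far.
Frame = Dict[str, Set[str]]

# Hypothetical keyword ontology (an assumption for this sketch only):
# slot -> {value: [trigger phrases]}
ONTOLOGY = {
    "INFO": {"Pricerange": ["how much", "price"]},
    "PLACE": {"Sentosa": ["sentosa"]},
}


def track_sub_dialog(utterances: List[str]) -> List[Frame]:
    """Return one cumulative frame per turn of a single sub-dialog."""
    frame: Frame = {}
    outputs: List[Frame] = []
    for utterance in utterances:
        text = utterance.lower()
        for slot, values in ONTOLOGY.items():
            for value, triggers in values.items():
                if any(trigger in text for trigger in triggers):
                    frame.setdefault(slot, set()).add(value)
        # Snapshot the frame so later turns do not mutate earlier outputs.
        outputs.append({slot: set(vals) for slot, vals in frame.items()})
    return outputs


def frame_accuracy(predicted: List[Frame], reference: List[Frame]) -> float:
    """Fraction of turns whose predicted frame exactly matches the reference.

    This is an unweighted simplification, not the challenge metric.
    """
    if not reference:
        return 0.0
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


if __name__ == "__main__":
    turns = ["I want to visit Sentosa.", "How much is the ticket?"]
    for index, frame in enumerate(track_sub_dialog(turns)):
        print(index, frame)
```

Because each turn's output reflects everything said up to that point, a tracker that fills the correct frame early keeps it correct on later turns, which is the behaviour an earlier-is-better metric rewards.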

In the development phase, participants will be provided with a training set of English dialogs and a development set of Chinese dialogs with manual annotations over frame structures. In the test phase, each tracker will be evaluated on the results generated for a test set of unlabeled Chinese dialogs.

A baseline system and evaluation scripts are available in the DSTC5 GitHub repository.

A more comprehensive description of the main task, the available datasets, and the evaluation protocol can be found in the DSTC5 Main Task Handbook.

Pilot tasks: four pilot tasks are available in DSTC5

* Spoken language understanding: Tag a given utterance with speech acts and semantic slots.
* Speech act prediction: Predict the speech act of the next turn, imitating the policy of one speaker.
* Spoken language generation: Generate a response utterance for one of the participants.
* End-to-end system: Develop an end-to-end system playing the part of a guide or a tourist.

The pilot tasks will follow the same cross-language approach as the main task. Systems are to be trained on English dialogs, and a small subset of Chinese dialogs will be provided for development purposes. Evaluations will be conducted on Chinese dialogs.

The resources and the handbook for the pilot tasks are also available in the DSTC5 GitHub repository.

Open track: an optional open track is available in DSTC5

Registered DSTC5 teams and individuals are free to work on, and report results for, any task they propose themselves over the provided dataset.