Human-machine conversation is one of the most important topics in artificial intelligence (AI) and has received much attention across academia and industry in recent years. Dialogue systems are still in their infancy: they usually converse passively, uttering words in response rather than on their own initiative, which differs from human-human conversation. Therefore, we set up this competition on a new conversation task, named knowledge-driven dialogue, where machines converse with humans based on a constructed knowledge graph. It aims to test machines' ability to conduct human-like conversations.
Please refer to the competition website for details of the competition.
Given a dialogue goal g and a set of topic-related background knowledge M = {f_1, f_2, ..., f_n}, a participating system is expected to output an utterance u_t for the current conversation history H = u_1, u_2, ..., u_{t-1}, keeping the conversation coherent and informative under the guidance of the given goal. During the dialogue, a participating system is required to proactively lead the conversation from one topic to another. The dialogue goal g is given in the form "Start->Topic_A->Topic_B", which means the machine should lead the conversation from any start state to topic A and then to topic B. The given background knowledge includes knowledge related to topic A and topic B, and the relations between these two topics.
Please refer to the task description for details of the task.
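The task input can be pictured with a minimal Python sketch. It is an illustration only, not the official data schema: the `DialogueSample` class, its field names, the representation of background knowledge as (subject, relation, object) triples, and the `pending_topics` helper are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DialogueSample:
    """Hypothetical container for one task input (not the dataset schema)."""
    goal: List[str]                         # ordered steps parsed from the goal g
    knowledge: List[Tuple[str, str, str]]   # background knowledge M, assumed to be triples
    history: List[str] = field(default_factory=list)  # H = u_1 ... u_{t-1}


def parse_goal(goal: str) -> List[str]:
    """Split a goal string such as 'Start->Topic_A->Topic_B' into its steps."""
    return [step.strip() for step in goal.split("->") if step.strip()]


def pending_topics(sample: DialogueSample) -> List[str]:
    """Return goal topics (skipping the start marker) not yet mentioned in the history.

    Plain substring matching is a simplification chosen for this sketch.
    """
    said = " ".join(sample.history)
    return [topic for topic in sample.goal[1:] if topic not in said]


if __name__ == "__main__":
    sample = DialogueSample(
        goal=parse_goal("Start->Topic_A->Topic_B"),
        knowledge=[("Topic_A", "related_to", "Topic_B")],  # made-up triple
        history=["Have you heard of Topic_A?"],
    )
    print(pending_topics(sample))
```

A system built along these lines would use the first pending topic to decide where to steer the next utterance u_t; how the utterance itself is produced is left to the retrieval or generation model.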
To facilitate the development of proactive conversation, we created a new conversation dataset named DuConv, which has around 30k conversations containing 270k utterances. Each conversation is created by two crowdsourced workers, where one acts as the conversation leader and the other as the follower. The leader is provided with part of a knowledge graph and is asked to change the discussion topics sequentially, following the given conversation goal. The follower is provided with nothing but the conversation history and only has to respond to the leader.
We provide retrieval-based and generation-based baseline systems. Both systems are implemented in PaddlePaddle (Baidu's deep learning framework) and PyTorch (Facebook's deep learning framework). The performance of the two systems is as follows:
baseline system | F1/BLEU1/BLEU2 | DISTINCT1/DISTINCT2 |
---|---|---|
retrieval-based | 31.72/0.291/0.156 | 0.118/0.373 |
generation-based | 32.65/0.300/0.168 | 0.062/0.128 |
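
The DISTINCT columns measure the diversity of system responses. The sketch below assumes the commonly used definition, the number of distinct n-grams divided by the total number of n-grams over all generated responses; the official evaluation script may tokenize or aggregate differently, and the function name `distinct_n` is chosen for this example.

```python
from typing import Iterable, List


def distinct_n(responses: Iterable[List[str]], n: int) -> float:
    """Ratio of unique n-grams to total n-grams across tokenized responses.

    Assumes corpus-level aggregation; returns 0.0 when no n-gram exists.
    """
    unique = set()
    total = 0
    for tokens in responses:
        # Slide a window of size n over the tokens of each response.
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    return len(unique) / total if total else 0.0


if __name__ == "__main__":
    # Toy responses, already tokenized (made-up data for illustration).
    outputs = [["i", "like", "it"], ["i", "like", "films"]]
    print(distinct_n(outputs, 1))
    print(distinct_n(outputs, 2))  # 3 unique bigrams out of 4 -> 0.75
```

Under this definition, the lower DISTINCT scores of the generation-based baseline indicate that it repeats n-grams across responses more often than the retrieval-based one.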