D4 is a novel Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat, a new type of dialogue called Task-Oriented Chat. It consists of 1,339 multi-turn dialogues (21.6 turns on average), each with a dialogue summary and diagnosis results. The dataset is mainly intended for building more empathetic and diagnostically accurate consultation dialogue systems.
Get Access
To get access to the data, you must sign a Data Usage Agreement (DUA), which is meant to prevent privacy violations and other potential misuse.
Please read the DUA carefully, and pay particular attention to the following:
1. You cannot transfer or share any part of the dataset.
2. You cannot identify or contact any user in the dataset.
3. You can only use the data for research purposes.
Please send an email to binnieyao@gmail.com with the message "I consent to the Data Usage Agreement (DUA)." and attach a copy of the DUA bearing your handwritten signature.
Download DUA from Google Drive
Oct 6, 2022
Our paper was accepted to EMNLP 2022.
Oct 27, 2022
The baseline is available at D4_baseline.
If you have any questions about this dataset, please contact binnieyao@gmail.com, mengyuewu@sjtu.edu.cn or chenlusz@sjtu.edu.cn
Depression Severity Stats
| Category | Overall | Control | Mild | Moderate | Severe |
|---|---|---|---|---|---|
| Dialogues | 1,339 | 430 | 342 | 368 | 199 |
| Avg. Turns | 21.6 | 17.9 | 21.3 | 23.7 | 26.0 |
| Avg. Tokens of Symptom Summary | 84.4 | 59.8 | 82.0 | 100.5 | 111.9 |
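As a quick sanity check on the table, the Overall column should follow from the per-severity columns: the dialogue counts should sum to the total, and the overall averages should equal the dialogue-weighted means of the per-class averages. The sketch below only re-uses the numbers printed in the table; the dictionary layout and the helper names (`STATS`, `weighted_mean`, `class_shares`) are illustrative assumptions and are not part of the dataset or the baseline code.

```python
# Per-class figures copied from the table above.
# The structure and key names are hypothetical, chosen for this example only.
STATS = {
    "Control":  {"dialogues": 430, "avg_turns": 17.9, "avg_summary_tokens": 59.8},
    "Mild":     {"dialogues": 342, "avg_turns": 21.3, "avg_summary_tokens": 82.0},
    "Moderate": {"dialogues": 368, "avg_turns": 23.7, "avg_summary_tokens": 100.5},
    "Severe":   {"dialogues": 199, "avg_turns": 26.0, "avg_summary_tokens": 111.9},
}


def weighted_mean(stats, key):
    """Average of `key` across classes, weighted by each class's dialogue count."""
    total = sum(row["dialogues"] for row in stats.values())
    return sum(row["dialogues"] * row[key] for row in stats.values()) / total


def class_shares(stats):
    """Fraction of all dialogues that falls into each severity class."""
    total = sum(row["dialogues"] for row in stats.values())
    return {name: row["dialogues"] / total for name, row in stats.items()}


total_dialogues = sum(row["dialogues"] for row in STATS.values())
overall_turns = weighted_mean(STATS, "avg_turns")
overall_tokens = weighted_mean(STATS, "avg_summary_tokens")
shares = class_shares(STATS)

if __name__ == "__main__":
    print(f"Total dialogues: {total_dialogues}")
    print(f"Overall avg. turns: {overall_turns:.1f}")
    print(f"Overall avg. summary tokens: {overall_tokens:.1f}")
    for name, share in shares.items():
        print(f"{name}: {share:.1%} of dialogues")
```

The class shares also make the label imbalance explicit, which may be worth keeping in mind when choosing evaluation metrics for severity classification.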
Topic Stats
Our paper introduces four tasks on D4: Response Generation, Topic Prediction, Dialog Summary, and Severity Classification. The baseline is available at D4_baseline.