What is D4?

D4 is a novel Chinese Dialogue Dataset for Depression-Diagnosis-Oriented Chat, a new type of dialogue called Task-Oriented Chat. It consists of 1,339 multi-turn dialogues (21.6 turns on average) with dialogue summaries and diagnosis results. The dataset is mainly intended for building more empathetic and diagnostically accurate consultation dialogue systems.

Get Access

To get access to the data, you must sign a Data Usage Agreement (DUA), which is in place to prevent invasion of privacy and other potential misuse.
Please read the DUA carefully, and pay particular attention to the following:
1. You cannot transfer or share any part of the dataset.
2. You cannot identify or contact any user in the dataset.
3. You can only use the data for research purposes.
Please send an email to binnieyao@gmail.com with the message "I consent to the Data Usage Agreement (DUA)." and attach the DUA bearing your handwritten signature.

Download DUA from Google Drive


Oct 6, 2022: Our paper was accepted by EMNLP 2022.
Oct 27, 2022: Baseline is available on

Contact Us

If you have any questions about this dataset, please contact binnieyao@gmail.com, mengyuewu@sjtu.edu.cn, or chenlusz@sjtu.edu.cn.


Depression Severity Stats

Category                         Overall   Control   Mild    Moderate   Severe
Dialogues                        1,339     430       342     368        199
Avg. Turns                       21.6      17.9      21.3    23.7       26.0
Avg. Tokens of Symptom Summary   84.4      59.8      82.0    100.5      111.9
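
The figures in the table above can be cross-checked with a few lines of Python. This is only a sketch: it assumes the "Overall" averages are dialogue-weighted means of the per-category averages, which the page does not state. The dictionary keys and the helper name weighted_mean are illustrative and are not part of the dataset. Because the per-category averages are already rounded to one decimal, the recomputed values are approximate.

```python
# Per-category figures copied from the Depression Severity Stats table.
# Key names are illustrative; they are not field names from the dataset.
stats = {
    "Control":  {"dialogues": 430, "avg_turns": 17.9, "avg_summary_tokens": 59.8},
    "Mild":     {"dialogues": 342, "avg_turns": 21.3, "avg_summary_tokens": 82.0},
    "Moderate": {"dialogues": 368, "avg_turns": 23.7, "avg_summary_tokens": 100.5},
    "Severe":   {"dialogues": 199, "avg_turns": 26.0, "avg_summary_tokens": 111.9},
}


def weighted_mean(field):
    """Average of `field` across categories, weighted by dialogue count."""
    total = sum(s["dialogues"] for s in stats.values())
    return sum(s["dialogues"] * s[field] for s in stats.values()) / total


# Assumption: the overall averages are dialogue-weighted means.
total_dialogues = sum(s["dialogues"] for s in stats.values())
overall_turns = round(weighted_mean("avg_turns"), 1)
overall_tokens = round(weighted_mean("avg_summary_tokens"), 1)

print(total_dialogues, overall_turns, overall_tokens)
```

Under that assumption, the category counts sum to the reported 1,339 dialogues, and the weighted means round to the reported overall values of 21.6 turns and 84.4 summary tokens.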

Topic Stats

Topic Ratio


Our paper introduces four tasks on D4: Response Generation, Topic Prediction, Dialog Summary, and Severity Classification. The baseline will be available soon.