|
Jianshuo Dong (董建硕)
I have been a Ph.D. student at INSC, Tsinghua University, since 09/2023 and am now in my third year.
I focus on machine learning security, trustworthy AI systems, and explainable AI.
My advisor is Prof. Han Qiu.
Contact: dongjs23@mails.tsinghua.edu.cn
Office: Room 205, FIT Building, Tsinghua University, Beijing, China
|
| 10/17/2025 |
I became a Ph.D. candidate at Tsinghua University!
|
| 08/23/2025 |
One paper was accepted to the EMNLP 2025 main conference (oral)!
|
| 08/10/2025 |
I created a new website for my research!
|
Publications
Conference Papers
|
|
[EMNLP'25]
|
“I’ve Decided to Leak”: Probing Internals Behind Prompt Leakage Intents
Jianshuo Dong, Yutong Zhang, Yan Liu, Zhenyu Zhong, Tao Wei, Ke Xu, Minlie Huang, Chao Zhang, Han Qiu
PDF /
Code /
Oral
TL;DR: Diving into the internals to understand LLMs' prompt leakage intents.
|
|
[ICLR'25]
|
An Engorgio Prompt Makes Large Language Model Babble on
Jianshuo Dong, Ziyuan Zhang, Qingjie Zhang, Tianwei Zhang, Hao Wang, Hewu Li, Qi Li, Chao Zhang, Ke Xu, Han Qiu
PDF /
Code
TL;DR: An inference cost attack targeting modern auto-regressive LLMs.
|
|
[ICCV'23]
|
One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training
Jianshuo Dong, Han Qiu, Yiming Li, Tianwei Zhang, Yuanjie Li, Zeqi Lai, Chao Zhang, Shu-Tao Xia
PDF /
Code
TL;DR: Insert an unactivated backdoor during model training, then activate it with a bit-flip attack.
|
|
[arXiv 2509.23694]
|
SafeSearch: Automated Red-Teaming for the Safety of LLM-Based Search Agents
Jianshuo Dong, Sheng Guo, Hao Wang, Xun Chen, Zhuotao Liu, Tianwei Zhang, Ke Xu, Minlie Huang, Han Qiu
arXiv /
Code
TL;DR: Proactively discovering potential vulnerabilities of LLM-based search agents.
|
|
[arXiv 2506.21571]
|
Towards Understanding the Cognitive Habits of Large Reasoning Models
Jianshuo Dong, Yujia Fu, Chuanrui Hu, Chao Zhang, Han Qiu
arXiv /
Code
TL;DR: Do reasoning models have human-like cognitive habits?
|
|
[arXiv 2507.04214]
|
Can Large Language Models Automate the Refinement of Cellular Network Specifications?
Jianshuo Dong, Tianyi Zhang, Feng Yan, Yuanjie Li, Hewu Li, Han Qiu
arXiv
TL;DR: Evaluate and improve how well LLMs automatically refine cellular network specifications with respect to security and trustworthiness issues.
|
| 2019-2023 |
Wuhan University, Wuhan, China
|
| 2023-2028 (expected) |
Tsinghua University, Beijing, China
|
| 2025 Spring |
Trustworthy Machine Learning, Tsinghua University, Teaching Assistant
|
|
[ICLR'26]
|
Reviewer
|
|
[NeurIPS'25]
|
Top Reviewer
|
|
[ICLR'25]
|
Notable Reviewer
|
|