Biography
Hi! I'm a PhD student at Tsinghua University, where I explore
machine learning security, trustworthy AI systems, and explainable AI.
I'm advised by Prof. Han Qiu and collaborate closely with
Prof. Tianwei Zhang.
I'm always excited to connect with fellow researchers and practitioners—feel free to reach out!
News
2025-10-17
I became a PhD candidate at Tsinghua University!
2025-08-23
One paper was accepted to the EMNLP 2025 main conference (oral)!
2025-08-10
I created this website for my research!
Publications
Conference Papers
[EMNLP'25]
Oral
"I've Decided to Leak": Probing Internals Behind Prompt Leakage Intents
TL;DR: Diving into the internals to understand LLMs' prompt leakage intents.
@inproceedings{dong-etal-2025-ive,
title = "``{I}{'}ve Decided to Leak'': Probing Internals Behind Prompt Leakage Intents",
author = "Dong, Jianshuo and Zhang, Yutong and Liu, Yan and Zhong, Zhenyu and Wei, Tao and Xu, Ke and Huang, Minlie and Zhang, Chao and Qiu, Han",
booktitle = "Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.emnlp-main.1082/",
doi = "10.18653/v1/2025.emnlp-main.1082",
pages = "21318--21348"
}
[ICLR'25]
Poster
An Engorgio Prompt Makes Large Language Model Babble on
TL;DR: An inference cost attack targeting modern auto-regressive LLMs.
@inproceedings{dong2025engorgio,
title={An Engorgio Prompt Makes Large Language Model Babble on},
author={Jianshuo Dong and Ziyuan Zhang and Qingjie Zhang and Tianwei Zhang and Hao Wang and Hewu Li and Qi Li and Chao Zhang and Ke Xu and Han Qiu},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=m4eXBo0VNc}
}
[ICCV'23]
Poster
One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training
TL;DR: Insert a dormant backdoor during model training, then activate it with a bit-flip attack.
@inproceedings{dong2023onebit,
title={One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training},
author={Dong, Jianshuo and Qiu, Han and Li, Yiming and Zhang, Tianwei and Li, Yuanjie and Lai, Zeqi and Zhang, Chao and Xia, Shu-Tao},
booktitle={ICCV},
year={2023}
}
Pre-prints
[arXiv:2512.14754]
Revisiting the Reliability of Language Models in Instruction-Following
TL;DR: Investigating nuance-oriented reliability of LLMs in instruction-following with varied phrasings.
@article{dong2025revisiting,
title={Revisiting the Reliability of Language Models in Instruction-Following},
author={Dong, Jianshuo and Zhang, Yutong and Liu, Yan and Zhong, Zhenyu and Wei, Tao and Zhang, Chao and Qiu, Han},
journal={arXiv preprint arXiv:2512.14754},
year={2025},
url={https://arxiv.org/abs/2512.14754}
}
[arXiv:2509.23694]
SafeSearch: Automated Red-Teaming for the Safety of LLM-Based Search Agents
TL;DR: Proactively discovering potential vulnerabilities of LLM-based search agents.
@article{dong2025safesearch,
title={SafeSearch: Automated Red-Teaming of LLM-Based Search Agents},
author={Dong, Jianshuo and Guo, Sheng and Wang, Hao and Chen, Xun and Liu, Zhuotao and Zhang, Tianwei and Xu, Ke and Huang, Minlie and Qiu, Han},
journal={arXiv preprint arXiv:2509.23694},
year={2025},
url={https://arxiv.org/abs/2509.23694}
}
[arXiv:2506.21571]
Towards Understanding the Cognitive Habits of Large Reasoning Models
TL;DR: Do reasoning models have human-like cognitive habits?
@article{dong2025cognitive,
title={Towards Understanding the Cognitive Habits of Large Reasoning Models},
author={Dong, Jianshuo and Fu, Yujia and Hu, Chuanrui and Zhang, Chao and Qiu, Han},
journal={arXiv preprint arXiv:2506.21571},
year={2025},
url={https://arxiv.org/abs/2506.21571}
}
[arXiv:2507.04214]
Can Large Language Models Automate the Refinement of Cellular Network Specifications?
TL;DR: Evaluate and improve the performance of LLMs in refining cellular network specifications concerning security issues.
@article{dong2025cellular,
title={Can Large Language Models Automate the Refinement of Cellular Network Specifications?},
author={Dong, Jianshuo and Li, Yuanjie and Liu, Jun and Li, Hewu and Qiu, Han},
journal={arXiv preprint arXiv:2507.04214},
year={2025},
url={https://arxiv.org/abs/2507.04214}
}
Education
2023 - 2028 (expected)
Tsinghua University, Beijing, China — Ph.D. Student
2019 - 2023
Wuhan University, Wuhan, China — B.E.
Teaching
2025 Spring
Trustworthy Machine Learning, Tsinghua University — Teaching Assistant
Professional Services
[ICLR'26]
Reviewer
[ICML'26]
Reviewer
[NeurIPS'25]
Reviewer (Top Reviewer)
[ICLR'25]
Reviewer (Notable)