Youliang Yuan 袁尤良😄
PhD student of Computer Science, The Chinese University of Hong Kong (Shenzhen)
Verified email at link.cuhk.edu.cn - Homepage
Title
Cited by
Year
GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
Y Yuan, W Jiao, W Wang, J Huang, P He*, S Shi, Z Tu
ICLR 2024, 2023
Cited by: 160; Year: 2023
On the Humanity of Conversational AI: Evaluating the Psychological Portrayal of LLMs
J Huang, W Wang, EJ Li, MH Lam, S Ren, Y Yuan, W Jiao, Z Tu, MR Lyu
ICLR 2024 (Oral), 2023
Cited by: 59*; Year: 2023
All Languages Matter: On the Multilingual Safety of LLMs
W Wang, Z Tu, C Chen, Y Yuan, J Huang, W Jiao, MR Lyu
ACL 2024 Findings, 2023
Cited by: 47*; Year: 2023
How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments
J Huang, EJ Li, MH Lam, T Liang, W Wang, Y Yuan, W Jiao, X Wang, Z Tu, ...
ICLR 2025, 2024
Cited by: 37; Year: 2024
LogicAsker: Evaluating and Improving the Logical Reasoning Ability of Large Language Models
Y Wan, W Wang, Y Yang, Y Yuan, J Huang, P He, W Jiao, MR Lyu
EMNLP 2024, 2024
Cited by: 22*; Year: 2024
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training
Y Yuan, W Jiao, W Wang, J Huang, J Xu, T Liang, P He, Z Tu
arXiv preprint arXiv:2407.09121, 2024
Cited by: 10; Year: 2024
New Job, New Gender? Measuring the Social Bias in Image Generation Models
W Wang, H Bai, J Huang, Y Wan, Y Yuan, H Qiu, N Peng, MR Lyu
ACM MM 2024 (Oral), 2024
Cited by: 7; Year: 2024
The Earth Is Flat? Unveiling Factual Errors in Large Language Models
W Wang, J Shi, Z Tu, Y Yuan, J Huang, W Jiao, MR Lyu
arXiv preprint arXiv:2401.00761, 2024
Cited by: 5; Year: 2024
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability
H Li, X Han, Z Zhai, H Mu, H Wang, Z Zhang, Y Geng, S Lin, R Wang, ...
arXiv preprint arXiv:2412.18551, 2024
Cited by: 3; Year: 2024
Does ChatGPT Know That It Does Not Know? Evaluating the Black-Box Calibration of ChatGPT
Y Yuan, W Wang, Q Guo, Y Xiong, C Shen, P He
COLING 2024 (Oral), 5191-5201, 2024
Cited by: 3; Year: 2024
Learning to Ask: When LLMs Meet Unclear Instruction
W Wang, J Shi, C Wang, C Lee, Y Yuan, J Huang, MR Lyu
arXiv preprint arXiv:2409.00557, 2024
Cited by: 2; Year: 2024
On the Resilience of Multi-Agent Systems with Malicious Agents
J Huang, J Zhou, T Jin, X Zhou, Z Chen, W Wang, Y Yuan, M Sap, MR Lyu
arXiv preprint arXiv:2408.00989, 2024
Cited by: 2; Year: 2024
Difficult Task Yes but Simple Task No: Unveiling the Laziness in Multimodal LLMs
S Zhao, Y Yuan, X Tang, P He
EMNLP 2024 Findings, 2024
Cited by: 1; Year: 2024
Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step
W Wang, K Gao, Z Jia, Y Yuan, J Huang, Q Liu, S Wang, W Jiao, Z Tu
arXiv preprint arXiv:2410.03869, 2024
Cited by: 1; Year: 2024
Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs
X Liu, W Wang, Y Yuan, J Huang, Q Liu, P He, Z Tu
arXiv preprint arXiv:2410.08145, 2024
Year: 2024