Portfolio

Selected research publications and projects.

Selected Publications

Chen, B., Fang, S., Ji, J., Zhu, Y., Wen, P., Wu, J., ... & Yao, A. (2025). AI Deception: Risks, Dynamics, and Controls. arXiv preprint arXiv:2511.22619. (core contributor)
Sana, S.*, Wu, J.*, & Wells, M. T. (2026). Democratic Preference Alignment via Sortition-Weighted RLHF. arXiv preprint arXiv:2602.05113. (*equal contribution; co-first authors)

Projects

Woods (withwoods.ai)

Cornell AI Alignment Website (cornell-aia.org)
