Fan Yin (银帆)

Hi, I am a Research Scientist at Google DeepMind, working on Gemini post-training and tool use. I obtained my PhD from the Department of Computer Science at UCLA, where I was fortunate to be part of the UCLA-NLP group, advised by Prof. Kai-Wei Chang. During my PhD, I worked on making NLP systems more reliable and robust. Before that, I received my B.S. degree in Computer Science from Peking University in 2020.

Email / LinkedIn / GitHub / CV / Google Scholar

Education

Ph.D.          September 2020 - May 2025
                 University of California, Los Angeles (UCLA), Los Angeles, CA, U.S.
                 Ph.D. in Computer Science

B.S.              September 2016 - June 2020
                 Peking University (PKU), Beijing, China.
                 B.S. in Computer Science

Intern Experience

Oct 2024 -- Apr 2025, Google LLC
Student Researcher.
Jun 2024 -- Sep 2024, Salesforce, Palo Alto
Research Intern. Mentors: Philippe Laban and Becky Xiangyu Peng. Manager: Jason Wu.
Jun 2023 -- Sep 2023, Amazon AWS, Santa Clara
Applied Scientist Intern. Mentors: He He and Samson Tan. Manager: Aditya Rawal.
Jun 2022 -- Sep 2022, Salesforce Research, Palo Alto
Research Intern. Mentors: Jesse Vig and Philippe Laban. Manager: Jason Wu.

Preprints

Alignment Data Curation with Influence Function

In progress
[ paper | code ]

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

Yihe Deng, Hritik Bansal, Fan Yin, Nanyun Peng, Wei Wang, Kai-Wei Chang
arXiv
[ paper | code ]

Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller

Min Cai, Yuchen Zhang, Shichang Zhang, Fan Yin, Difan Zou, Yisong Yue, Ziniu Hu
arXiv
[ paper | code ]

Evaluating Human Alignment and Model Faithfulness of LLM Rationale

Mohsen Fayyaz, Fan Yin, Jiao Sun, Nanyun Peng
arXiv
[ paper | code ]

Publications

Magnet: Multi-turn Tool-use Data Synthesis and Distillation via Graph Translation

Fan Yin, Zifeng Wang, I-Hung Hsu, Jun Yan, Ke Jiang, Yanfei Chen, Jindong Gu, Long T. Le, Kai-Wei Chang, Chen-Yu Lee, Hamid Palangi, Tomas Pfister
ACL 2025
[ paper | code ]

BingoGuard: LLM Content Moderation Tools with Risk Levels

Fan Yin, Philippe Laban, Xiangyu Peng, Yilun Zhou, Yixin Mao, Vaibhav Vats, Linnea Ross, Divyansh Agarwal, Caiming Xiong, Chien-Sheng Wu
ICLR 2025
[ paper | code ]

Synchronous Faithfulness Monitoring for Trustworthy Retrieval-Augmented Generation

Di Wu, Jia-Chen Gu, Fan Yin, Nanyun Peng, Kai-Wei Chang
EMNLP 2024
[ paper | code ]

Enhancing Large Vision Language Models with Self-Training on Image Comprehension

Yihe Deng, Pan Lu, Fan Yin, Ziniu Hu, Sheng Shen, James Zou, Kai-Wei Chang, Wei Wang
NeurIPS 2024
[ paper | code ]

Characterizing Truthfulness in Large Language Model Generations with Local Intrinsic Dimension

Fan Yin, Jayanth Srinivasa, Kai-Wei Chang
ICML 2024
[ paper | code ]

Prompt-Driven LLM Safeguarding via Directed Representation Optimization

Chujie Zheng, Fan Yin, Hao Zhou, Fandong Meng, Jie Zhou, Kai-Wei Chang, Minlie Huang, Nanyun Peng
ICML 2024
[ paper | code ]

Red Teaming Language Model Detectors with Language Models

Fan Yin*, Zhouxing Shi*, Yihan Wang*, Xiangning Chen, Kai-Wei Chang, Cho-Jui Hsieh
TACL, equal contribution, ordered alphabetically
[ paper | code ]

Active Instruction Tuning: Improving Cross-Task Generalization by Training on Prompt Sensitive Tasks

Po-Nien Kung, Fan Yin, Di Wu, Kai-Wei Chang, Nanyun Peng
EMNLP 2023
[ paper | code ]

Dynosaur: A Dynamic Growth Paradigm for Instruction-Tuning Data Curation

Da Yin*, Xiao Liu*, Fan Yin*, Ming Zhong*, Hritik Bansal, Jiawei Han, Kai-Wei Chang
EMNLP 2023, equal contribution
[ paper | code ]

Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learning

Fan Yin, Jesse Vig, Philippe Laban, Shafiq Joty, Caiming Xiong, Chien-Sheng Jason Wu
ACL 2023
[ paper | code ]

Efficient Shapley Values Estimation by Amortization for Text Classification

Chenghao Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, Bing Xiang
ACL 2023
[ paper | code ]

CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning

Hritik Bansal, Nishad Singhi, Yu Yang, Fan Yin, Aditya Grover, Kai-Wei Chang
ICCV 2023, Oral | Best Paper Award at Trustworthy and Reliable Large-Scale ML@ICLR 2023
[ paper | code ]

ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation

Fan Yin, Yao Li, Cho-Jui Hsieh, Kai-Wei Chang
EMNLP 2022
[ paper | code ]

On the Sensitivity and Stability of Model Interpretations in NLP

Fan Yin, Zhouxing Shi, Cho-Jui Hsieh, Kai-Wei Chang
ACL 2022
[ paper | code ]

On the Robustness of Language Encoders against Grammatical Errors

Fan Yin, Quanyu Long, Tao Meng, Kai-Wei Chang
ACL 2020
[ paper | code ]

Entity-relation Extraction as Multi-turn Question Answering

Xiaoya Li*, Fan Yin*, Zijun Sun, Xiayu Li, Arianna Yuan, Duo Chai, Mingxin Zhou, Jiwei Li
ACL 2019, equal contribution
[ paper | code ]

Glyce: Glyph-vectors for Chinese Character Representations

Yuxian Meng, Wei Wu, Fei Wang, Xiaoya Li, Ping Nie, Fan Yin, Muyu Li, Qinghong Han, Xiaofei Sun, Jiwei Li
NeurIPS 2019
[ paper | code ]

Academic Services

Journal/Conference Reviewer: NeurIPS 2022, ACL 2022, NAACL 2022, NAACL Student Research Workshop 2022

Teaching

Teaching Associate, UCLA CS M146, Introduction to Machine Learning, Fall 2022, with Prof. Kai-Wei Chang
Teaching Assistant, UCLA CS M146, Introduction to Machine Learning, Fall 2021, with Prof. Kai-Wei Chang
Teaching Assistant, UCLA CS M146, Introduction to Machine Learning, Winter 2022, with Prof. Sriram Sankararaman
Teaching Assistant, UCLA CS M146, Introduction to Machine Learning, Spring 2022, with Prof. Aditya Grover

Awards

Excellent Graduate, Peking University, 2020.
Merit Student, Peking University, 2019.