publications

* indicates equal contribution; ^ denotes a student I mentored.

2024

  1. preprint
    Dual Modalities of Text: Visual and Textual Generative Pre-training
    Yekun Chai ,  Qingyi Liu^ ,  Jingwu Xiao^ ,  Shuohuan Wang , and 2 more authors
    2024
  2. preprint
    On Training Data Influence of GPT Models
    Qingyi Liu*^ ,  Yekun Chai* ,  Shuohuan Wang ,  Yu Sun , and 3 more authors
    2024
  3. preprint
    Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
    Taishi Nakamura ,  Mayank Mishra ,  Simone Tedeschi ,  Yekun Chai , and 41 more authors
    2024
  4. preprint
    StarCoder 2 and The Stack v2: The Next Generation
    Anton Lozhkov ,  Raymond Li ,  Loubna Ben Allal ,  Federico Cassano , and 62 more authors
    2024
  5. ICML
    Interpreting Natural Language Generation via Optimal Transport
    Xuhong Li* ,  Jiamin Chen* ,  Yekun Chai* ,  and  Haoyi Xiong
    In The Forty-first International Conference on Machine Learning , 2024
  6. LREC-COLING
    HumanEval-XL: A Multilingual Code Generation Benchmark for Cross-lingual Natural Language Generalization
    Qiwei Peng* ,  Yekun Chai* ,  and  Xuhong Li
    In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) , May 2024
  7. ICLR Spotlight
    Tool-Augmented Reward Modeling
    Lei Li*^ ,  Yekun Chai* ,  Shuohuan Wang ,  Yu Sun , and 3 more authors
    In The Twelfth International Conference on Learning Representations , May 2024

2023

  1. ACL Findings
    ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages
    Yekun Chai ,  Shuohuan Wang ,  Chao Pang ,  Yu Sun , and 2 more authors
    In Findings of the Association for Computational Linguistics: ACL 2023 , Jul 2023
  2. NeurIPS Datasets and Benchmarks
    M4: A Unified XAI Benchmark for Faithfulness Evaluation of Feature Attribution Methods across Metrics, Modalities and Models
    Xuhong Li ,  Mengnan Du ,  Jiamin Chen ,  Yekun Chai , and 2 more authors
    In Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track , Dec 2023
  3. IJCNLP-AACL Demos
    ERNIE-Music: Text-to-Waveform Music Generation with Diffusion Models
    Pengfei Zhu ,  Chao Pang ,  Yekun Chai ,  Lei Li^ , and 4 more authors
    In Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics: System Demonstrations , Nov 2023
  4. ICASSP Oral
    Improved Training of Mixture-of-Experts Language GANs
    Yekun Chai ,  Qiyue Yin ,  and  Junge Zhang
    In 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , Jun 2023

2022

  1. EMNLP Findings
    Clip-Tuning: Towards Derivative-free Prompt Learning with a Mixture of Rewards
    Yekun Chai ,  Shuohuan Wang ,  Yu Sun ,  Hao Tian , and 2 more authors
    In Findings of the Association for Computational Linguistics: EMNLP 2022 , Dec 2022
  2. ACL Oral
    Predicate-Argument Based Bi-Encoder for Paraphrase Identification
    Qiwei Peng ,  David Weir ,  Julie Weeds ,  and  Yekun Chai
    In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) , May 2022
  3. SemEval
    X-PuDu at SemEval-2022 Task 6: Multilingual Learning for English and Arabic Sarcasm Detection
    Yaqian Han ,  Yekun Chai ,  Shuohuan Wang ,  Yu Sun , and 4 more authors
    In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022) , Jul 2022

2021

  1. EMNLP Findings
    Counter-Contrastive Learning for Language GANs
    Yekun Chai ,  Haidong Zhang ,  Qiyue Yin ,  and  Junge Zhang
    In Findings of the Association for Computational Linguistics: EMNLP 2021 , Nov 2021

2020

  1. ACL
    Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
    Yekun Chai ,  Shuo Jin ,  and  Xinwen Hou
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics , Jul 2020