About

Daoyuan Chen (陈道源)

Hi, I do research and development at the Data Analytics and Intelligence Lab, Alibaba Tongyi. I earned my Master's degree in Computer Application Technology from Peking University in June 2019, co-supervised by Ying Shen and Kai Lei (academic mentors) and Yaliang Li (industry mentor).


I've published over 30 technical papers, including more than 10 that I led as first author and that were presented at top-tier conferences such as ICML, NeurIPS, ICLR, SIGMOD, KDD, ACL, and SIGIR.


I’ve learned a lot from the open-source community and am glad to have the opportunity to deeply engage with several open-source projects:

  • Data-Juicer:
    • Data processing for and with foundation models.
    • Founder, maintainer; co-authored papers: [1-8, 11].
  • AgentScope:
    • A developer platform for LLM-empowered multi-agent applications.
    • Committer; co-authored paper: [13].
  • FederatedScope:
    • An easy-to-use FL platform.
    • Founder, committer; co-authored papers: [14-25].

My interests broadly lie in insight- and theory-informed research, simple yet effective systems, and real-world applications related to:

  • Large Language Models (LLMs)
  • Multimodal LLMs
  • Efficient Machine Learning (ML)
  • Data- and Knowledge-Driven ML
  • Human-centric ML
  • Federated Learning (FL)

More specifically, these include, but are not limited to:

  • Data-model co-development, e.g., building dedicated infrastructure and exploring generalized feedback signals between data and models
  • Algorithms for enhancing data quality, diversity, and usability
  • Synthetic data for model training and evaluation
  • Better human-computer interaction, e.g., empathetic dialog, multimodal AIGC, and personalized modeling
  • On-device solutions using small models, and addressing privacy issues with FL

Collaborations are welcome; we're currently hiring full-time researchers/developers and self-motivated interns!
Feel free to reach out if you are interested.

Selected Research

Full lists: Google Scholar and DBLP

Remark: # indicates equal contribution with the first author; ^ indicates that I served as industrial mentor to the student first author.

(Multimodal) LLMs, Data-Driven & Human-Centric ML

FL+LLM, On-device & Personalized FL

Efficient, Adaptive, & Knowledge-Driven ML

Work Experience

  • 2023 - Now, Data Analytics and Intelligence Lab, Alibaba Tongyi
  • July 2019 - 2023, Data Analytics and Intelligence Lab, Alibaba DAMO Academy
  • Research Intern, March 2018 - June 2018, Tencent Medical AI Lab
  • Research Assistant, October 2016 - August 2017, Multimedia Software Engineering Research Center @ City University of Hong Kong

Professional Activities

Tutorial Organizer:
  • KDD 2022
  • KDD 2024
Competition Organizer: data leaderboards for (multimodal) LLMs
Competition Participant:
Conference Reviewer:
  • NeurIPS/ICML/ICLR (2022-2025)
  • CVPR/ICCV/ECCV (2023-2025)
  • COLM (2024-2025)
  • KDD (2021-2024)
  • ACL/EMNLP/NAACL (2021-2024)
  • IJCAI/CIKM (2021-2022)
Journal Reviewer:
  • Expert Systems with Applications
  • Neurocomputing
  • Neural Networks
  • Knowledge-Based Systems
  • IEEE Transactions on Big Data
  • Patterns
  • Artificial Intelligence (AIJ)
  • Artificial Intelligence In Medicine

Misc.

Creativity is intelligence having fun.

I enjoy learning new things, playing basketball and guitar, and music (especially R&B and hip-hop).

Contact

Collaborations are welcome; we're currently hiring full-time researchers/developers and self-motivated interns!

Feel free to reach out if you’re interested: daoyuanchen.cdy@alibaba-inc.com, chendaoyuan@pku.edu.cn