Zhengyuan (Dora) Dong

Ph.D. Student, Data Systems Group, Cheriton School of Computer Science, University of Waterloo

Research Interests: Data Lakes, Model Lakes, Multi-agent Systems, AI for Science

I like jogging, AI for music, and AI for productivity, and I have two parrots.

๐Ÿƒ๐ŸŽน๐Ÿง‘โ€๐Ÿ’ป๐Ÿฆœ๐Ÿฆœ

I am open to collaboration and always welcome discussion.

News

  • 2025 Aug. Attending KDD 2025; looking forward to connecting with you there.
  • 2025 Jul. Completed a Ph.D. Seminar Talk.

Publications

  • LazyVLM: Neuro-Symbolic Approach to Video Analytics. Xiangru Jian*, Wei Pang*, Zhengyuan Dong*, Chao Zhang*, M. Tamer Özsu. arXiv preprint arXiv:2505.21459 (2025)
  • GraphOmni: A Comprehensive and Extendable Benchmark Framework for Large Language Models on Graph-theoretic Tasks. Hao Xu*, Xiangru Jian*, Xinjian Zhao*, Wei Pang*, Chao Zhang, Suyuchen Wang, Qixin Zhang, Zhengyuan Dong, Joao Monteiro, Bang Liu, Qiuzhuang Sun, Tianshu Yu. arXiv preprint arXiv:2504.12764 (2025)
  • BioMANIA: Simplifying bioinformatics data analysis through conversation. Zhengyuan Dong, Victor Zhong, and Yang Lu. bioRxiv (2023)

Service

Open Source Projects

LazyVLM

Status: Completed ✅ in Mar 2025. To be released.

LazyVLM is a neuro-symbolic video analytics system that combines the flexibility of Vision Language Models (VLMs) with the efficiency of symbolic methods. It allows users to query open-domain video data at scale using a semi-structured text interface, decomposing complex video queries into efficient operations for robust and scalable analytics.

BioMANIA

Status: Completed ✅ in Oct 2023. Updated in Oct 2024.

An AI-driven chatbot platform that simplifies bioinformatics data analysis through conversation. Features include front-end and back-end components, extensive data setup, model fine-tuning, and deployment via Docker, Railway, and a command-line interface.

DocLocal

Status: Completed ✅ in Jun 2023.

A GUI application that downloads and manages GitHub repository README files locally while offering integrated web search functionality through popular search engines. The tool streamlines documentation access by automatically fetching README files from repositories and displaying them in a user-friendly interface for offline browsing.

Teaching

  • Mentor, CS 399 Readings in Computer Science (F25)
  • Teaching Assistant, CS 348 Introduction to Database Systems (S24, S25, F25)
  • Teaching Assistant, CS 136 Elementary Algorithm Design and Data Abstraction (W24, F24, W25)

Honors

  • Prov-Doc Entrance Award, University of Waterloo, 2024
  • International Doctoral Student Award (IDSA), University of Waterloo, 2024

Talks

  • 2025 Jul. Model Discovery at DSG Talk & Ph.D. Seminar Talk, University of Waterloo
  • 2025 Mar. Introduction to Data Lakes and Tabular Data in NLP at R2L Lab
  • 2024 Oct. Scientific Discovery Agent at R2L Lab
  • 2024 Jul. BioMANIA: Simplifying Bioinformatics Data Analysis (Poster) at ISMB
  • 2024 Jan. Language Model Pretraining at CS886 Presentation