Repositories
Code for most of my research is open-sourced on GitHub.
Sy-Zhang
Code for the EMNLP 2022 paper "Learning a Grammar Inducer by Watching Millions of Instructional YouTube Videos."
mugen-org
Training, evaluation, and inference code for multimodal video-audio-text generation and retrieval baselines on MUGEN.
Sy-Zhang
Code for the NAACL 2021 paper "Video-aided Unsupervised Grammar Induction," which received the Best Long Paper award.
microsoft
A collection of video cross-modal models, including code for expanding language-image pretrained models to video.
Sy-Zhang
Code for the ACM MM 2019 paper "Exploiting Temporal Relationships in Video Moment Localization with Natural Language."
Sy-Zhang
Code for the WACV 2017 paper "On Geometric Features for Skeleton-Based Action Recognition Using Multilayer LSTM Networks."
Sy-Zhang
Reimplementation of "LIME: A Method for Low-light Image Enhancement" from ACM MM 2016.