Repositories

Code for most of my research is open-sourced on GitHub.

PTC-PCFG

Sy-Zhang

Python

Code for the EMNLP 2022 paper "Learning a Grammar Inducer from Massive Uncurated Instructional Videos."

EMNLP 2022
grammar induction, instructional video

Python

Training, evaluation, and inference code for multimodal video-audio-text generation and retrieval baselines on MUGEN.

ECCV 2022
multimodal generation, retrieval

MMC-PCFG

Sy-Zhang

Python

Video-aided unsupervised grammar induction, awarded NAACL 2021 Best Long Paper.

NAACL 2021
grammar induction, best paper

VideoX

microsoft

Python

A collection of video cross-modal models, including code for expanding language-image pretrained models to video.

ECCV 2022
video recognition, cross-modal learning

Python

Code for the ACM MM 2019 paper "Exploiting Temporal Relationships in Video Moment Localization with Natural Language."

ACM MM 2019
moment localization, video-language

Code for the WACV 2017 paper "On Geometric Features for Skeleton-Based Action Recognition Using Multilayer LSTM Networks."

WACV 2017
action recognition, skeleton features

LIME

Sy-Zhang

MATLAB

Reimplementation of "LIME: A Method for Low-light Image Enhancement" from ACM MM 2016.

ACM MM 2016
image enhancement, low-light imaging