About Me

I am currently a Researcher and Engineer at Zoom, based in Singapore. Before joining Zoom, I was a Postdoctoral Fellow at The Hong Kong Polytechnic University, supervised by Prof. Kin-Man Lam. Before completing my PhD, I interned at Microsoft Research Asia.

Research Interests

  • AIGC (post-training): image/video generation, world models, efficient generative models, and preference optimization.
  • Low-level vision processing: image/video super-resolution, image/video denoising, and high dynamic range imaging.
  • 3D vision: generative 3D reconstruction.

I am actively seeking opportunities for research collaboration. Please feel free to reach out via email at jun.xiao@connect.polyu.hk.

Work Experience

2024 – Present: Research Engineer, Zoom Video Communications, Singapore
Project 1: Logo-driven image generation — subject-consistency score 0.855; generation within 2s at 1280×720.
Project 2: Text-driven style transfer for human images — human-ID consistency 0.85; style consistency 0.63; generation within 1s at 512×512.
2022 – 2024: Postdoctoral Fellow, The Hong Kong Polytechnic University, Hong Kong SAR
Research project: Led the project “efficient old movie 4K photorealistic restoration” — designed an efficient video restoration model for low-resolution old movies without ground truth.
Results: 4× faster processing; ~85% acceptance rate; delivered a cinema-released film (Nomad).
2021 – 2022: Research Intern, Microsoft Research Asia, Shanghai
Research project: Contributed to the project “online video restoration and enhancement system”, designing an efficient, lightweight super-resolution model for online streaming videos.
Results: 110 FPS for 720p videos on a 2080Ti GPU; only 1.75M model parameters; published in IEEE Transactions on Multimedia.
Award: MSRA Stars of Tomorrow (Award for Excellent Intern).

Download my CV.

Education
  • Ph.D., The Hong Kong Polytechnic University (PolyU), 2018.09 -- 2022.10

  • M.Sc. (with distinction), The Hong Kong Polytechnic University (PolyU), 2016.09 -- 2018.03

  • BSc, Guangdong University of Technology, 2012.09 -- 2016.06

Publications

(2026). Dynamic Mutual Learning for Object Detection in Aerial Imagery. In IEEE TGRS.

(2026). Exploring Basic Expression Representation for Compound Facial Expression Recognition. In IEEE TMM.

(2025). Geometric Distortion Guided Transformer for Omnidirectional Image Super-resolution. In IEEE TCSVT.

(2025). See in Detail: Enhancing Sparse-view 3D Gaussian Splatting with Local Depth and Semantic Regularization. In ICASSP.

(2025). Photometric Regularization for 3D Gaussian Splatting in Multi-view Surface Projection. In IEEE J-STSP.

(2024). Point Cloud Densification for 3D Gaussian Splatting from Sparse Input Views. In ACM MM.

(2024). Bridging Text and Image for Artist Style Transfer via Contrastive Learning. In ECCV-W.

(2024). Frequency-Aware Guidance for Blind Image Restoration via Diffusion Models. In ECCV-W.

(2024). Learning Equilibrium Transformation for Gamut Expansion and Color Restoration. In ECCV.

(2024). Towards Multi-View Consistent Style Transfer with One-Step Diffusion via Vision Conditioning. In ECCV-W.

(2024). Towards Progressive Multi-Frequency Representation for Image Warping. In CVPR.

(2024). Deep Progressive Feature Aggregation Network for Multi-frame High Dynamic Range Imaging. In Neurocomputing.

(2024). Deep Multi-scale Feature Mixture Model for Image Super-resolution with Multiple-Focal-length Degradation. In Signal Processing: Image Communication.

(2023). Improving Robustness of Single Image Super-Resolution Models with Monte Carlo Method. In ICIP.

(2023). Boosting Object Detectors via Strong-Classification Weak-Localization Pretraining in Remote Sensing Imagery. In IEEE Transactions on Instrumentation and Measurement (TIM).

(2023). Online Video Super-Resolution with Convolutional Kernel Bypass Grafts. In IEEE Transactions on Multimedia (TMM).

(2023). Efficient Feature Fusion for Learning-based Photometric Stereo. In ICASSP.

(2021). Self-feature Learning: An Efficient Deep Lightweight Network for Image Super-resolution. In ACM MM.

(2021). Progressive and Selective Fusion Network for High Dynamic Range Imaging. In ACM MM.

(2021). Invertible image decolorization. In IEEE Transactions on Image Processing (TIP).

(2021). Feature redundancy mining: Deep light-weight image super-resolution model. In ICASSP.

(2021). Bayesian sparse hierarchical model for image denoising. In Signal Processing: Image Communication.

(2020). Progressive Motion Representation Distillation With Two-Branch Networks for Egocentric Activity Recognition. In IEEE Signal Processing Letters.

(2019). Deep Progressive Convolutional Neural Network for Blind Super-Resolution With Multiple Degradations. In ICIP.
