Xiao Zhang
I'm a final-year CS Ph.D. student at the University of Chicago, where I work primarily with Michael Maire on computer vision.
During my Ph.D., I have focused on unsupervised representation learning and generative models. My research aims to develop scalable representation learning algorithms for real-world scenarios and for complex image and video generation.
Before starting my Ph.D., I worked as a graduate research assistant with Jianbo Shi.
Email / Resume / Scholar / Twitter / Github
Nested Diffusion Models Using Hierarchical Latent Priors
Xiao Zhang*, Roxie Jiang*, Rebecca Willett, Michael Maire
CVPR, 2025
An efficient and powerful hierarchical generative framework that employs a series of diffusion models to progressively generate latent variables at different semantic levels. Each model in the series is conditioned on the output of the preceding higher-level models, culminating in image generation.
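A minimal toy sketch of this conditioning chain in PyTorch (the denoiser architecture, latent dimensions, and noise schedule below are illustrative assumptions, not the paper's implementation):

    import torch
    import torch.nn as nn

    class CondDenoiser(nn.Module):
        """Small noise predictor conditioned on the higher-level latent."""
        def __init__(self, dim, ctx_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(dim + ctx_dim + 1, 128), nn.SiLU(), nn.Linear(128, dim))

        def forward(self, x_t, t, ctx):
            t_feat = t.float().unsqueeze(-1) / 1000.0  # crude timestep embedding
            return self.net(torch.cat([x_t, ctx, t_feat], dim=-1))

    @torch.no_grad()
    def sample_level(model, ctx, dim, steps=50):
        """Plain DDPM-style ancestral sampling with a linear beta schedule."""
        betas = torch.linspace(1e-4, 0.02, steps)
        alphas = 1.0 - betas
        alpha_bar = torch.cumprod(alphas, dim=0)
        x = torch.randn(ctx.shape[0], dim)
        for i in reversed(range(steps)):
            t = torch.full((ctx.shape[0],), i)
            eps = model(x, t, ctx)
            mean = (x - betas[i] / (1 - alpha_bar[i]).sqrt() * eps) / alphas[i].sqrt()
            x = mean + betas[i].sqrt() * torch.randn_like(x) if i > 0 else mean
        return x

    # Three nested levels: coarse semantic latents down to an image-like output.
    dims = [8, 32, 128]
    models = [CondDenoiser(d, c) for d, c in zip(dims, [1] + dims[:-1])]
    ctx = torch.zeros(4, 1)               # dummy context for the top level
    for model, d in zip(models, dims):
        ctx = sample_level(model, ctx, d)  # each level conditions on the one above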
PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement
Tewodros Ayalew, Xiao Zhang*, Kevin Yuanbo Wu*, Tianchong Jiang, Michael Maire, Matthew R. Walter
In Submission, 2024
We learn a task-agnostic reward function from videos, enabling policy training through goal-conditioned reinforcement learning without manual supervision.
We train a model to estimate the distribution over task progress, and refine its rewards adversarially during online RL training by pushing back predictions on out-of-distribution observations.
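A heavily simplified sketch of the two training signals (the module names and the 0.1 weight are assumptions; the paper's adversarial refinement is more involved than this push-back term):

    import torch
    import torch.nn as nn

    class ProgressEstimator(nn.Module):
        def __init__(self, feat_dim=64):
            super().__init__()
            self.backbone = nn.Sequential(nn.Linear(feat_dim, 128), nn.ReLU())
            self.mu_head = nn.Linear(128, 1)      # mean progress in [0, 1]
            self.logvar_head = nn.Linear(128, 1)  # predictive uncertainty

        def forward(self, obs):
            h = self.backbone(obs)
            return torch.sigmoid(self.mu_head(h)), self.logvar_head(h)

    def gaussian_nll(mu, logvar, target):
        # Negative log-likelihood of progress labels under the predicted Gaussian.
        return 0.5 * (logvar + (target - mu) ** 2 / logvar.exp()).mean()

    model = ProgressEstimator()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    expert_obs = torch.randn(32, 64)                  # frame embeddings from demos
    progress = torch.linspace(0, 1, 32).unsqueeze(1)  # label: frame index / length
    online_obs = torch.randn(32, 64)                  # online rollouts, possibly OOD

    mu_e, lv_e = model(expert_obs)
    mu_o, _ = model(online_obs)
    # Supervised progress regression plus a push-back term that lowers predicted
    # progress (and hence reward) on out-of-distribution online observations.
    loss = gaussian_nll(mu_e, lv_e, progress) + 0.1 * mu_o.mean()
    opt.zero_grad()
    loss.backward()
    opt.step()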
Latent Intrinsics Emerge from Training to Relight
Xiao Zhang, Will Gao, Seemandhar Jain, Michael Maire, David Forsyth, Anand Bhattad
NeurIPS, 2024 (Spotlight)
Code
A relighting method that is entirely data-driven, where intrinsics and lighting are each represented as latent variables. Our approach produces SOTA relightings of real scenes. We show that albedo can be recovered from our latent intrinsics without using any example albedos, and that the recovered albedos are competitive with SOTA methods.
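A hedged sketch of the latent-variable structure (the encoder/decoder and code dimensions are assumptions; only the idea of swapping a pooled, low-dimensional lighting code is taken from the description above):

    import torch
    import torch.nn as nn

    class RelightAE(nn.Module):
        def __init__(self, ch=3, d_int=64, d_light=8):
            super().__init__()
            self.enc = nn.Sequential(nn.Conv2d(ch, 32, 3, 2, 1), nn.ReLU(),
                                     nn.Conv2d(32, d_int + d_light, 3, 2, 1))
            self.d_int = d_int
            self.dec = nn.Sequential(nn.ConvTranspose2d(d_int + d_light, 32, 4, 2, 1),
                                     nn.ReLU(),
                                     nn.ConvTranspose2d(32, ch, 4, 2, 1))

        def split(self, x):
            z = self.enc(x)
            z_int, z_light = z[:, :self.d_int], z[:, self.d_int:]
            # Treat lighting as a global code: pool away spatial detail.
            z_light = z_light.mean(dim=(2, 3), keepdim=True).expand_as(z_light)
            return z_int, z_light

        def forward(self, x, light_from=None):
            z_int, z_light = self.split(x)
            if light_from is not None:     # relight: borrow the lighting code
                _, z_light = self.split(light_from)
            return self.dec(torch.cat([z_int, z_light], dim=1))

    model = RelightAE()
    scene = torch.randn(1, 3, 64, 64)
    reference = torch.randn(1, 3, 64, 64)
    relit = model(scene, light_from=reference)  # scene under reference lighting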
Residual Connections Harm Self-Supervised Abstract Feature Learning
Xiao Zhang*, Roxie Jiang*, Will Gao, Rebecca Willett, Michael Maire
In Submission, 2024
We show that incorporating a weighting factor to reduce the strength of identity shortcuts within residual networks significantly enhances semantic feature learning in the masked autoencoding (MAE) framework. This modification promotes low-rank representations at the bottleneck layers and increases MAE linear probing accuracy on ImageNet from 67.8% to 72.8%.
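A minimal sketch of the described modification, assuming a transformer-style residual block; the alpha value and where it is applied are illustrative, not the paper's exact settings:

    import torch
    import torch.nn as nn

    class WeightedResidualBlock(nn.Module):
        def __init__(self, dim, alpha=0.7):
            super().__init__()
            self.alpha = alpha
            self.f = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim),
                                   nn.GELU(), nn.Linear(dim, dim))

        def forward(self, x):
            # Standard block computes x + f(x); here the identity path is
            # attenuated, encouraging lower-rank bottleneck representations.
            return self.alpha * x + self.f(x)

    block = WeightedResidualBlock(dim=256)
    tokens = torch.randn(8, 196, 256)   # e.g. ViT tokens inside an MAE encoder
    out = block(tokens)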
Deciphering 'What' and 'Where' Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations
Xiao Zhang*, David Yunis*, Michael Maire
CVPR, 2024 (Highlight)
Code
We introduce a method for examining the grouping information embedded in a neural network's activations, allowing for the extraction of spatial layout and semantic segmentation from the behavior of large pre-trained vision models. Our approach provides tools for a comprehensive analysis of the model's behavior across datasets. In the attention layers, we demonstrate that key-query similarity encodes scene spatial layout, while value vector similarity encodes object identity.
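An illustrative sketch of the spectral step (not the paper's pipeline): pool token affinities across layers, then embed tokens with eigenvectors of the normalized graph Laplacian.

    import torch
    import torch.nn.functional as F

    def spectral_embedding(feats_per_layer, k=4):
        """feats_per_layer: list of (N, D) token features (e.g. value vectors)
        collected from different layers of a pre-trained ViT."""
        n = feats_per_layer[0].shape[0]
        W = torch.zeros(n, n)
        for f in feats_per_layer:
            f = F.normalize(f, dim=-1)
            W = W + (f @ f.T).clamp(min=0)     # accumulate cosine affinities
        W = W / len(feats_per_layer)
        d_inv_sqrt = torch.diag(W.sum(dim=1).rsqrt())
        L = torch.eye(n) - d_inv_sqrt @ W @ d_inv_sqrt  # normalized Laplacian
        evals, evecs = torch.linalg.eigh(L)
        return evecs[:, 1:k + 1]               # drop the trivial eigenvector

    layers = [torch.randn(196, 64) for _ in range(12)]  # dummy 14x14 token grid
    emb = spectral_embedding(layers)                    # (196, k) spectral coords
    # Cluster `emb` (e.g. with k-means) to read off a token-level segmentation.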
Structural Adversarial Objectives for Self-Supervised Representation Learning
Xiao Zhang, Michael Maire
In Submission, 2024
Code
Within the GAN framework, we propose objectives that task the discriminator with additional structural modeling responsibilities, turning it into a self-supervised representation learner. Operating as a feature learner within the GAN framework frees our self-supervised system from reliance on hand-crafted data augmentation schemes.
Self-supervised Visual Representation Learning From Hierarchical Grouping
Xiao Zhang, Michael Maire
NeurIPS, 2020 (Spotlight)
Code
We develop a framework for visual representation learning using a basic grouping capability. This grouping is implemented through a contour detector that divides an image into regions, which are then merged into a tree hierarchy. We train a self-supervised feature learning system by generating per-pixel embeddings that respect the hierarchical relationships between regions.
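A toy sketch of one plausible per-pixel training signal under region labels from such a hierarchy (the contour detection and region-merging machinery is assumed to exist upstream; this contrastive form is an assumption):

    import torch
    import torch.nn.functional as F

    def region_contrastive_loss(emb, region_ids, temp=0.1):
        """emb: (N, D) per-pixel embeddings; region_ids: (N,) labels from one
        level of the region hierarchy (same id => positive pair)."""
        emb = F.normalize(emb, dim=-1)
        sim = emb @ emb.T / temp
        eye = torch.eye(len(emb), dtype=torch.bool)
        sim = sim.masked_fill(eye, float('-inf'))   # exclude self-pairs
        pos = (region_ids.unsqueeze(0) == region_ids.unsqueeze(1)) & ~eye
        log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        return -log_prob[pos].mean()                # pull same-region pixels together

    emb = torch.randn(64, 32, requires_grad=True)   # embeddings for 64 sampled pixels
    regions = torch.randint(0, 8, (64,))            # region assignment at one level
    loss = region_contrastive_loss(emb, regions)
    loss.backward()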
Normalized Diversification
Shaohui Liu*, Xiao Zhang*, Jianqiao Wangni, Jianbo Shi
CVPR, 2019
Code
We introduce the concept of normalized diversity, which forces the model to preserve the normalized pairwise distance between sparse samples from a latent parametric distribution and their corresponding high-dimensional outputs. Normalized diversification aims to unfold a manifold of unknown topology and non-uniform distribution, enabling safe interpolation between valid latent variables.
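A sketch of one plausible instantiation of the loss (the hinge-on-ratio form and the constants are assumptions; the paper's exact formulation may differ):

    import torch

    def normalized_pairwise_dist(x):
        """Pairwise L2 distances, normalized so each row sums to 1."""
        d = torch.cdist(x.flatten(1), x.flatten(1))      # (N, N)
        return d / d.sum(dim=1, keepdim=True).clamp(min=1e-8)

    def normalized_diversity_loss(z, g_z):
        """Penalize output pairs whose normalized distance collapses
        relative to the normalized distance of their latents."""
        Dz = normalized_pairwise_dist(z)
        Dx = normalized_pairwise_dist(g_z)
        n = z.shape[0]
        off_diag = ~torch.eye(n, dtype=torch.bool)
        ratio = Dx[off_diag] / Dz[off_diag].clamp(min=1e-8)
        return torch.relu(1.0 - ratio).mean()            # hinge: keep ratio >= 1

    z = torch.randn(16, 8)              # latent samples
    g_z = torch.randn(16, 3, 32, 32)    # corresponding generator outputs
    loss = normalized_diversity_loss(z, g_z)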