Learning to See Through Obstructions

See-Through Captions: Real-Time Captioning on Transparent Display for Deaf and Hard-of-Hearing People

Estimation of continuous valence and arousal levels from faces in naturalistic conditions

Fashionpedia: Ontology, Segmentation, and an Attribute Localization Dataset

C-Space Tunnel Discovery for Puzzle Path Planning

Filter Style Transfer between Photos

Dynamic facial asset and rig generation from a single scan

SpeedNet: Learning the Speediness in Videos

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

Monocular Real-Time Volumetric Performance Capture

Controlling Style and Semantics in Weakly-Supervised Image Generation

Learned Motion Matching

Neural Light Transport for Relighting and View Synthesis

Full-Body Awareness from Partial Observations

Complementary Dynamics

Non-Local Musical Statistics as Guides for Audio-to-Score Piano Transcription

Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

Multimodal Humor Dataset: Predicting Laughter tracks for Sitcoms

BARF: Bundle-Adjusting Neural Radiance Fields

Animating Pictures with Eulerian Motion Fields

End-to-End Object Detection with Transformers (DETR)

DDPM - Diffusion Models Beat GANs on Image Synthesis

DDPM - Denoising Diffusion Probabilistic Models

XCiT: Cross-Covariance Image Transformers

Involution: Inverting the Inherence of Convolution for Visual Recognition

Alias-Free Generative Adversarial Networks

MakeItTalk: Speaker-Aware Talking-Head Animation

One-Shot Free-View Neural Talking-Head Synthesis for Video Conferencing

TeethTap: Recognizing Discrete Teeth Gestures Using Motion and Acoustic Sensing on an Earpiece

Transferring Dense Pose to Proximal Animal Classes

A Simple Framework for Contrastive Learning of Visual Representations (self-supervised learning)

Whole-Body Human Pose Estimation in the Wild

Zero-Shot Text-to-Image Generation (DALL·E)