Codec Avatars

Codec Avatars is a technology for metric telepresence that enables immersive social presence indistinguishable from reality.

What are Codec Avatars?

Codec Avatars allow people in different places to interact using eye contact, subtle shifts in expression, posture, and gesture. This fully embodied interaction enables natural and expressive remote communication.

Why open source?

The Codec Avatars lab at Meta Reality Labs Research has been building the future of connection with lifelike avatars since 2015, and has shared many of its results and methods with the research community.

Through this site, Meta Reality Labs Research provides the research community with datasets and baseline reference implementations for Codec Avatars, supporting the advancement of metric telepresence research. Using the code and models we share, researchers are empowered to investigate open challenges in metric telepresence including:

  • Generalization of universal priors to new identities
  • Online encoder adaptation
  • Improving quality for clothing and hair

Research

4D Talking Avatar

Transforming text to audio-visual human-like interactions

[Image: A collection of images showing up to four people in a room acting out scenes, each overlaid with a digital mesh.]

Embody 3D

A large-scale multimodal motion and behavior dataset

[Image: A collage of three digital heads and shoulders, with close-ups of the hair and mouths.]

Ava-256 dataset

First dataset for end-to-end telepresence

[Image: Two pictures of a face overlaid, with a white line down the middle indicating where each image begins and ends. The left side is labeled "Real Person," the right "Avatar." There is almost no difference between the two.]

Goliath-4 dataset

First complete captures of full bodies, hands, and faces

[Image: Three rows of images showing the different digital maps, such as pose, segmentation, and depth, of three different people.]

Sapiens

Foundation models for human vision

Other OSS publications and releases

[Image: Twelve small images of the same digital face in various expressions.]
Multiface

High-quality recordings of the faces of 13 people

[Image: Six digital variations of hairstyles.]
CT2Hair

High-fidelity 3D hair modeling using computed tomography

[Image: Digital hands clasped together with the index and middle fingers pointing forward.]
InterHand2.6M

Dataset and baseline for 3D interacting hand pose estimation

[Image: Two lifelike digital hands on a black background. The fingers of the left hand are grouped together and press into the open palm of the right hand.]
Re:InterHand

Dataset of relightable 3D interacting hands

[Image: A wide, fisheye-style photo of objects on a table.]
Eyeful

High-quality indoor scenes for neural reconstruction

[Image: A person standing with a multicolored digital box surrounding them from the waist down.]
Sounding Bodies

Modeling 3D spatial sound of humans using body pose and audio

[Image: A close-up photo of folded cloth with tiny, brightly colored squares forming a random pattern.]
PatternedClothing

Four subjects wearing patterned clothes for high-quality registration