ViCo
Conversational Head Generation Challenge Leaderboard
Track #1: Talking Head Generation
| # | TeamName | SSIM | CPBD | PSNR | FID | CSIM | ExpFD | AngleFD | TransFD | AVOffset | AVConf | LipLMD | numTop1 |
|---|----------|------|------|------|-----|------|-------|---------|---------|----------|--------|--------|---------|
Track #2: Listening Head Generation
| # | TeamName | SSIM | CPBD | PSNR | FID | CSIM | ExpFD | AngleFD | TransFD | numTop1 |
|---|----------|------|------|------|-----|------|-------|---------|---------|---------|
People's Selection Awards
Best Talking Head Results from Team iLearn
Best Listening Head Results from Team en_train
Submission Format
For every sample in the test set, you must generate exactly one human face video.
- Talking Head Generation: For each audio audios/{uuid}.wav, given the first frame of the result first_frames/{uuid}.speaker.jpg and the speaker's reference image ref_images/{speaker_id}.jpg, generate a head video file named talkinghead_test_results/{uuid}.speaker.mp4.
- Listening Head Generation: For each speaker video videos/{uuid}.speaker.mp4, given the first frame of the result first_frames/{uuid}.listener.jpg and the related listener's reference image ref_images/{listener_id}.jpg, generate a listener head video file named listeninghead_test_results/{uuid}.listener.mp4.
All result videos must be in .mp4 format and compressed into one [team_name]_(\d+).zip file for each track.
Due to file storage limitations, competitors are asked to upload their [team_name]_(\d+).zip file to an online storage service (e.g. OneDrive, Google Drive, Dropbox, BaiduPan, etc.) and submit a public download link to our evaluation system.
**Note: The evaluation results will be sent to the team captain via email within a few hours after the download link is submitted, and the Leaderboard will be updated in real time.**
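A minimal packaging sketch is shown below, assuming the test uuids are listed in a local "test_uuids.txt" file and the results have already been rendered into the folders named above; the team name, submission numbers, and helper function are hypothetical and only illustrate the required file naming.

```python
# Sketch only: pack per-track results into a [team_name]_(N).zip archive.
import zipfile
from pathlib import Path

TEAM_NAME = "my_team"  # hypothetical team name

def pack_track(result_dir: str, suffix: str, submission_no: int) -> Path:
    """Zip all generated .mp4 files for one track into a single archive."""
    result_dir = Path(result_dir)
    uuids = Path("test_uuids.txt").read_text().split()
    zip_path = Path(f"{TEAM_NAME}_{submission_no}.zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for uuid in uuids:
            video = result_dir / f"{uuid}.{suffix}.mp4"
            if not video.exists():
                raise FileNotFoundError(f"missing result for sample {uuid}")
            # keep the result folder name inside the archive
            zf.write(video, arcname=f"{result_dir.name}/{video.name}")
    return zip_path

# Track #1 (talking head) and Track #2 (listening head) are packed separately.
pack_track("talkinghead_test_results", "speaker", submission_no=1)
pack_track("listeninghead_test_results", "listener", submission_no=1)
```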
Evaluation Metrics and Ranking Rules
The quality of generated videos will be quantitatively evaluated from the following perspectives:
- generation quality (image level): SSIM, CPBD, PSNR
- generation quality (feature level): FID
- identity preservation: cosine similarity of ArcFace features (CSIM)
- expression: L1 distance of 3DMM expression features (ExpFD)
- head motion: L1 distance of 3DMM angle and translation features (AngleFD, TransFD)
- lip sync (speaker only): AV offset and AV confidence (SyncNet)
- lip landmark distance: L1 distance of lip landmarks (LipLMD)
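The sketch below is a rough illustration of how the image-level metrics, feature-level L1 distances, and identity similarity could be computed; the function names and the use of scikit-image/numpy are our own assumptions, and the organizers' exact evaluation code may differ.

```python
# Illustration only: not the official evaluation implementation.
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

def video_ssim_psnr(gen_frames, gt_frames):
    """Average SSIM / PSNR over corresponding RGB frames of two videos."""
    ssims, psnrs = [], []
    for gen, gt in zip(gen_frames, gt_frames):
        ssims.append(structural_similarity(gt, gen, channel_axis=-1))
        psnrs.append(peak_signal_noise_ratio(gt, gen))
    return float(np.mean(ssims)), float(np.mean(psnrs))

def feature_l1(gen_feats, gt_feats):
    """Mean L1 distance between per-frame 3DMM coefficients (exp / angle / trans)."""
    return float(np.mean(np.abs(np.asarray(gen_feats) - np.asarray(gt_feats))))

def arcface_csim(gen_embed, gt_embed):
    """Identity preservation: cosine similarity between ArcFace embeddings."""
    gen_embed = np.asarray(gen_embed) / np.linalg.norm(gen_embed)
    gt_embed = np.asarray(gt_embed) / np.linalg.norm(gt_embed)
    return float(np.dot(gen_embed, gt_embed))
```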
The final ranking is based on the number of ``first place'' finishes across all metrics (numTop1); a toy sketch of this rule follows below. Teams in the first three places (teams with the same #Top-1 are tied for the same place) will receive award certificates. Individuals/teams with top submissions or novel solutions will present their work at the ACM MM2022 workshop. Besides the quantitative ranking results, we will also ask experts (from the production and user experience development areas) to select one team for the People's Selection award.
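As a toy illustration of the numTop1 rule, the following sketch counts first places per team and sorts teams by that count; the example scores, metric set, and higher_is_better flags are hypothetical, and ties within a single metric are ignored here.

```python
# Toy sketch of the ranking rule: count first places per team across metrics.
def rank_by_num_top1(scores, higher_is_better):
    """scores: {team: {metric: value}} -> [(team, num_top1), ...], best first."""
    top1 = {team: 0 for team in scores}
    for metric, higher in higher_is_better.items():
        pick = max if higher else min
        winner = pick(scores, key=lambda team: scores[team][metric])
        top1[winner] += 1
    return sorted(top1.items(), key=lambda kv: kv[1], reverse=True)

ranking = rank_by_num_top1(
    {"team_a": {"SSIM": 0.91, "FID": 12.3}, "team_b": {"SSIM": 0.89, "FID": 10.7}},
    {"SSIM": True, "FID": False},  # higher SSIM is better, lower FID is better
)
# -> [('team_a', 1), ('team_b', 1)]  (teams with the same count are tied)
```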
Schedule
We use AoE time (UTC-12) for the following schedule.
- June 3, 11:59pm: The website will reject all submissions, and we will evaluate the submitted results. The Leaderboard will be disabled.
- June 4, 11:59pm: All results will have been sent to participants, and each team can select one result as their "final submission" for each track.
- June 5, 11:59pm: The website will reject submission selection requests. Each team MUST send a brief description (including team name, members, method description, training details, and information about pretrained models or external data) of their final submission to vico-challenge@outlook.com before this time.
- June 6, 11:59pm: Verification of all submissions' compliance with the Competition Rules. Final Leaderboard release.
- June 8, 11:59pm: User study for all submissions. Challenge award announcement (including top-rank awards and the expert-selected People's Selection award).
- June 25, 12:00am: Paper submission deadline. We encourage each team to submit a paper to the "Conversational Head Generation Challenge" track of ACM MM2022.
- July 7: Grand challenge paper notification of acceptance. "Best Paper" award notification.
- July 20: MM Grand Challenge camera-ready papers due.
Competition Rules
- Pre-trained models are allowed in the competition. The pre-trained models must be publicly available when participants submit their results.
- Participants are restricted to training their algorithms on the ``ViCo'' training set. Collecting additional data for the target identities is not allowed. Collecting additional unlabeled data for pretraining is fine. Please specify any and all external data used for training when uploading results.
- Additional annotation on the provided training data is fine (e.g., bounding boxes, keypoints, etc.). Teams should specify that they collected additional annotations when submitting results.
- We ask that all participants respect the spirit of the competition and do not cheat. Hand-crafting results manually is forbidden.
- One account per participant. You cannot sign up from multiple accounts and therefore you cannot submit from multiple accounts.
- No private sharing outside teams. Privately sharing code or data outside of teams is not permitted.
Multimedia grand challenge papers will go through a single-blind review process. Submitted papers (.pdf format) must use the ACM Article Template (https://www.acm.org/publications/proceedings-template) as used by regular ACMMM submissions. Please limit each grand challenge paper submission to 4 pages with 1-2 extra pages for references only. Papers will be handled by a submission site (https://openreview.net/group?id=acmmm.org/ACMMM/2022/Track/Grand_Challenges).
Paper Submission Deadline: 25 June, 2022 12:00AM UTC-0