Pre-training for Video Understanding Challenge

Introduction

This challenge offers fertile ground for designing pre-training techniques that benefit a range of downstream video understanding tasks; this year these are video captioning and video categorization. The grand challenge therefore has two tracks: Pre-training for Video Captioning and Pre-training for Video Categorization. To further motivate and challenge the multimedia community, we provide two large-scale video pre-training datasets, one per track, for contestants to tackle this challenging, emerging task: Auto-captions on GIF (ACTION) and the Weakly-Supervised dataset.

In the first track, contestants are asked to develop a video captioning system based on the ACTION dataset (as pre-training data) and the public MSR-VTT benchmark (as training data for the downstream task). For evaluation, a contesting system must produce at least one sentence for each test video. Accuracy will be evaluated against sentence(s) written in advance by humans.

In the second track, contestants are asked to develop a video categorization system based on the Weakly-Supervised dataset (as pre-training data) and the provided Downstream dataset (as training data for the downstream task). During the evaluation stage, a contesting system must predict the category of each test video. Accuracy will be evaluated against human-annotated categories.
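As a rough illustration of how predictions in the second track could be scored against human-annotated categories, the sketch below computes top-1 accuracy. The exact metric and submission format are not stated here, so top-1 accuracy, the function name `top1_accuracy`, and the video ids are all assumptions made for illustration only.

```python
from typing import Mapping


def top1_accuracy(predictions: Mapping[str, str],
                  ground_truth: Mapping[str, str]) -> float:
    """Fraction of test videos whose predicted category matches the annotation.

    Assumption: a test video with no prediction counts as incorrect.
    """
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    correct = sum(
        1 for video_id, label in ground_truth.items()
        if predictions.get(video_id) == label
    )
    return correct / len(ground_truth)


# Hypothetical video ids; the category names are borrowed from the
# sample queries shown on this page.
ground_truth = {
    "video_001": "bowling",
    "video_002": "archery",
    "video_003": "brushing teeth",
    "video_004": "tiger cat",
}
predictions = {
    "video_001": "bowling",
    "video_002": "archery",
    "video_003": "bowling",  # wrong category
    # video_004 is missing, so it is scored as incorrect
}

score = top1_accuracy(predictions, ground_truth)
print(f"top-1 accuracy: {score:.2f}")
```

Scoring against the ground-truth set rather than the prediction set means a system cannot raise its score by skipping hard videos.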

Pre-training for Video Captioning Track

This monkey on the back of horse

Disney made the best cake of all time using projection

The dry driver returns to his car and presents his mate with kebab

Tiny squid flopping around on the rocky bottom of fish tank

Pre-training for Video Categorization Track

Query: brushing teeth

Sentence: Disney Jr Puppy Dog Pals Morning Routine Brushing Teeth, Taking a Bath, and Eating Breakfast!

Query: bowling

Sentence: Dude Perfect Thanksgiving Turkey Bowling | FACE OFF

Query: archery

Sentence: Can You Shoot an Apple Off Your Head? (William Tell Archery Challenge)

Query: tiger cat

Sentence: BIG CATS like boxes too!

Important Dates

· March 15, 2022: Web Site and Call for Participation Ready
· March 25, 2022: Dataset available for download (pre-training, training, and validation set)
· June 10, 2022: Test set available for download
· June 17, 2022: Results submission
· June 18 - June 19, 2022: Objective evaluation
· June 20, 2022: Evaluation results announced
· June 25, 2022: Paper submission deadline

Pre-training for Video Understanding Challenge @ACM Multimedia 2022