Long videos sampled on GQN-Mazes and MineRL by iterated application of our Hierarchy-2 sampling scheme, and a CARLA Town01 video sampled with an autoregressive sampling scheme.
Arrays of 30-1000 second videos
CARLA Town01. Blocks of sampled completions from (1) FDM with Autoreg, (2) FDM with Hierarchy-2, and (3) CWVAE. Within each block, each column shows completions for a different test video. The top row is the ground truth and the second row contains sampled completions.
MineRL. Blocks of sampled completions from (1) FDM with Autoreg, (2) FDM with Hierarchy-2, and (3) CWVAE. Within each block, each column shows completions for a different test video. The top row is the ground truth and all other rows are sampled completions.
GQN-Mazes. Blocks of sampled completions from (1) FDM with Autoreg, (2) FDM with Hierarchy-2, and (3) CWVAE. Within each block, each column shows completions for a different test video. The top row is the ground truth and all other rows are sampled completions.
Unconditional samples (i.e., not conditioned on the first few frames).