UNC Crowd Scene Analysis Dataset
Overview video |
Synthetic Videos Labels: High density, Max Pedestrian Count: 382, Good Light Condition, Indoor Environment, Simulated Background, Crowd Behavior: Aggressive, Congestion |
The main goal of our work is to produce a rich labeled dataset for machine learning. In order to learn high-level features from videos or images, for example performing crowd counting and motion segmentation, a ground truth is often necessary for training or verifies the results. Eventually, we will produce over a million videos, and over 20 million images with ground truth labeled. Our video dataset is generated with a diversity of variations, including:
Density | Pedestrian Count | Lighting Condition | Camera Viewpoint | Environment (Indoor vs Outdoor) | Background (Real vs Simulated) | Crowd Behaviour (Aggresive vs Shy) |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |
![]() |