NTU Outdoor Dataset (Partially Released)
Existing public datasets have a very limited number of cameras, ranging from 6 to 15. Such a small number of cameras reduces the variation and diversity in how a person appears under different backgrounds, illumination and camera colour profiles, and so lowers the difficulty of searching for and matching a person across cameras. As a result, recently proposed Person ReID algorithms can achieve more than 90% rank-1 accuracy on both Market1501 and DukeMTMC-reID. However, a real-world surveillance system usually consists of hundreds of cameras. Creating a more realistic dataset is the foundation for developing a more generalized and robust Person ReID model for real-world applications.
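For readers unfamiliar with the rank-1 metric mentioned above, the following is a minimal sketch of how it is commonly computed: a query counts as a hit when its nearest gallery feature belongs to the same identity. The function name `rank1_accuracy` and the toy features are illustrative assumptions, and the sketch omits the same-camera filtering that standard benchmark protocols apply.

```python
import math


def rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids):
    """Fraction of queries whose nearest gallery feature shares their identity.

    Simplified illustration: uses Euclidean distance and does not exclude
    same-camera gallery images, unlike the usual benchmark protocols.
    """
    hits = 0
    for feat, pid in zip(query_feats, query_ids):
        distances = [math.dist(feat, g) for g in gallery_feats]
        nearest = distances.index(min(distances))
        if gallery_ids[nearest] == pid:
            hits += 1
    return hits / len(query_ids)


# Toy 2-D features (hypothetical values, for illustration only).
queries = [[0.0, 0.0], [1.0, 1.0]]
query_ids = [1, 2]
gallery = [[0.1, 0.0], [5.0, 5.0], [0.9, 1.2]]
gallery_ids = [1, 3, 3]

print(rank1_accuracy(queries, query_ids, gallery, gallery_ids))
```

In this toy example the first query is matched correctly and the second is not, so the score is one hit out of two queries.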
NTU-Outdoor Dataset Collection
The new NTU-Outdoor dataset was collected within the NTU campus using actual surveillance cameras installed on lamp posts. There are a total of 34 camera groups, each containing two to four cameras pointing in different directions. The locations of all camera groups in NTU are shown in Figure 1.
Figure 1. Camera Group Locations
Based on the 34 camera groups, we designed 8 paths for participants to walk along so that they pass the cameras, as shown in Figure 2. The 8 paths cover 51 cameras from 23 camera groups. A total of 332 NTU students, staff and residents participated in the NTU-Outdoor dataset collection.
Figure 2. Eight Paths for Data Collection
Annotating a Person ReID dataset is a tedious, labour-intensive and time-consuming job. To increase annotation efficiency and reduce the annotation effort, we developed a mobile web app for the NTU-Outdoor dataset collection. When a participant runs the web app on a smartphone, it automatically records the participant's GPS information and the timestamps at which the participant passes each camera. This significantly reduces the time window that our annotators have to search. The design of the web app is shown in Figure 3 below.
Figure 3. NTU Outdoor Data Collection Web App Design
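The way a logged timestamp narrows the annotators' search can be sketched as follows. This is only an illustration: the function name `search_window`, the 90-second margin and the example timestamp are assumptions, not values taken from the actual web app.

```python
from datetime import datetime, timedelta


def search_window(passing_time, margin_seconds=90):
    """Return the (start, end) interval of footage to review around a logged pass.

    margin_seconds is a hypothetical tolerance for clock drift and GPS error.
    """
    margin = timedelta(seconds=margin_seconds)
    return passing_time - margin, passing_time + margin


# Hypothetical timestamp logged when a participant passed a camera.
logged = datetime(2020, 1, 15, 14, 30, 0)
start, end = search_window(logged)
print(start.time(), end.time())  # 14:28:30 14:31:30
```

Instead of scanning hours of raw footage per camera, an annotator would only need to review the short interval returned for each logged pass.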
After phase 1 annotation, 26,175 three-minute video clips have been extracted from the raw surveillance videos. From those clips, a total of 45,397 bounding box images of pedestrians have been annotated with 40 additional attribute labels. These images come from 278 different people (unique identities) with 805 different appearances, recorded on multiple days over an 8-week period.
Limitations of Existing Public Datasets
1. Limited Number of Cameras
Much attention has been paid in recent years to the problem of person re-identification (Person Re-ID). Existing Person Re-ID algorithms are typically trained and evaluated on three large-scale public datasets: Market1501, DukeMTMC-reID and MSMT17. All of these datasets were obtained from a very limited number of cameras, ranging from 6 to 15, as shown in Table 1 below. However, real-world surveillance systems usually consist of hundreds of cameras, so the images they produce vary far more. Our NTU-Outdoor campus dataset uses a total of 51 different cameras covering the entire 2 km² NTU campus area, which gives it the widest variation in image background among these datasets.
| Dataset | Market-1501 | DukeMTMC-reID | MSMT17 | NTU Outdoor |
| --- | --- | --- | --- | --- |
| Cameras | 6 | 8 | 15 | 51 |

Table 1. Number of Cameras for Recent Person ReID Datasets
2. Actual Surveillance Viewing Angles
Market1501 and MSMT17 were recorded with non-surveillance cameras mounted on tripods, which gives a near-horizontal point of view of all captured persons, as shown in Figure 4 (a). In actual surveillance systems, however, cameras usually have wide-angle lenses and are mounted on lamp posts or ceilings, which gives a distinctive top-down, wide-angle view of passing pedestrians, as shown in Figure 4 (b). In the NTU-Outdoor dataset, all images are captured by actual surveillance cameras mounted on lamp posts around the NTU campus.
Figure 4. Viewing Angles of Market1501, MSMT17 and NTU-Outdoor
3. Day Night Lighting Difference
Most real-world surveillance systems run 24/7. Daytime and nighttime videos have very different colour profiles and image quality, as shown in Figure 5 (a) and (c) below. However, all existing public datasets use only video recorded during the daytime, which limits the capability of the Person Re-ID models trained on them.
Figure 5. Same Person in the Afternoon, Evening and Night Time in NTU-Outdoor Dataset
To account for the difference in a person's appearance between day and night, we recorded all cameras 24 hours a day without interruption. If the period from 7 am to 7 pm is counted as daytime, 38.6% of the images in the NTU-Outdoor dataset were captured at night, as shown in Figure 6.
Figure 6. Percentage of Day Time Images and Night Time Images in NTU-Outdoor Dataset
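The day and night split described above can be sketched as a simple rule on the capture hour. The 7 am to 7 pm boundary comes from the text; the function names, the choice to count exactly 7 pm as night, and the sample timestamps are assumptions made for illustration.

```python
from datetime import datetime

DAY_START_HOUR = 7   # 7 am, as stated in the text
DAY_END_HOUR = 19    # 7 pm, as stated in the text


def is_daytime(ts):
    """True if the timestamp falls in [7 am, 7 pm); 7 pm itself counts as night (assumption)."""
    return DAY_START_HOUR <= ts.hour < DAY_END_HOUR


def night_fraction(timestamps):
    """Fraction of timestamps that fall outside the daytime period."""
    if not timestamps:
        return 0.0
    night = sum(1 for ts in timestamps if not is_daytime(ts))
    return night / len(timestamps)


# Hypothetical capture times, not taken from the dataset.
samples = [
    datetime(2020, 1, 15, 12, 0),
    datetime(2020, 1, 15, 23, 0),
    datetime(2020, 1, 16, 3, 0),
    datetime(2020, 1, 16, 8, 0),
]
print(night_fraction(samples))  # 0.5
```

Applying the same rule to the capture time of every image would yield the kind of day and night proportions reported in Figure 6.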
4. Attribute Labels
Appearance attribute labels are extremely valuable auxiliary information. Integrating attribute information during model training can give a more generalized and robust feature representation of different persons. Most person Re-ID datasets did not originally come with attribute labels; some attribute annotations (such as the Market1501 and DukeMTMC-reID attribute labels) were completed later by third-party organizations. The attribute annotation procedure is tedious, expensive and time-consuming. In our NTU-Outdoor dataset, most of the appearance and soft-biometric attributes are submitted by the participants themselves. This eliminates the manual attribute annotation process and gives a more accurate representation of each person.
The colour distribution of the upper and lower body is shown in Figure 7 below. White, black and blue are the dominant colours for the upper body; together these three colours account for around 70% of all upper body colour labels. The lower body colours are less varied. As the dataset was collected in Singapore with mainly university students as participants, nearly 80% of the participants wore black or blue jeans, pants or shorts.
Figure 7. Upper and Lower Body Colour Distribution of NTU-Outdoor Dataset
In the NTU-Outdoor dataset, appearance attributes are not limited to clothing colour. We have 5 upper body clothing types (t-shirt, polo shirt, shirt, jacket, dress) and 5 lower body clothing types (shorts, jeans, pants, skirt, dress). In addition, we consider accessories such as a hat, glasses, handbag, backpack and messenger bag. As the bike and the e-scooter are the two most popular modes of transport within the NTU campus, we also have attributes for people riding bikes and scooters. The distribution of clothing types, accessories and modes of transport in the NTU-Outdoor dataset is illustrated in Figure 8.
Figure 8. Distribution of Clothing Types, Accessory and Transport in NTU-Outdoor Dataset
In the NTU-Outdoor dataset, the male-to-female ratio is well balanced for both the participants and the images, as shown in Figure 9.
Figure 9. Male and Female Ratio of Participants and Images
In total, NTU-Outdoor provides 40 binary attribute labels, more than the existing Market1501 and DukeMTMC-reID attribute annotations, as shown in Table 2.
| Dataset | Market-1501 | DukeMTMC-reID | MSMT17 | NTU Outdoor |
| --- | --- | --- | --- | --- |
| Attributes | 30 | 23 | - | 40 |

Table 2. Number of Binary Attributes for Recent Person ReID Datasets
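A set of binary attribute labels like this is commonly stored as a fixed-length 0/1 vector per image. The sketch below illustrates the idea with a small subset of the accessory and transport attributes named in the text; the identifier names, their order and the function `encode_attributes` are assumptions and do not reflect the dataset's actual label schema.

```python
# Hypothetical subset of attributes; the released dataset defines 40 of them.
ATTRIBUTES = ["hat", "glasses", "handbag", "backpack", "messenger_bag", "bike", "scooter"]


def encode_attributes(present):
    """Encode the attributes present on a person as a 0/1 vector in ATTRIBUTES order."""
    unknown = set(present) - set(ATTRIBUTES)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    return [1 if name in present else 0 for name in ATTRIBUTES]


vector = encode_attributes({"glasses", "backpack"})
print(vector)  # [0, 1, 0, 1, 0, 0, 0]
```

A vector of this form can be used directly as a multi-label training target alongside the identity label.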
Overall Comparison With Other Datasets
Table 3 gives a detailed comparison of our NTU-Outdoor dataset with three recent large-scale Person Re-ID datasets: MSMT17, DukeMTMC-reID and Market1501. NTU-Outdoor has more bounding boxes of human images than Market1501 and DukeMTMC-reID. Although MSMT17 has many more identities and human images for training and testing, its limited number of cameras and horizontal viewing position cannot fully represent a real-world outdoor surveillance system. The NTU-Outdoor dataset, on the other hand, is obtained from 51 real surveillance cameras and contains people in both daytime and nighttime views, which makes it the most realistic of the datasets compared here for Person ReID tasks. In the next release, more identities will be annotated and added to the NTU-Outdoor dataset to bring it to the same level as Market1501 and DukeMTMC-reID.
| Dataset | Market-1501 | DukeMTMC-reID | MSMT17 | NTU-Outdoor | NTU-Outdoor-38 (Released) |
| --- | --- | --- | --- | --- | --- |
| Surveillance Camera | No | Yes | No | Yes | Yes |
| Number of Cameras | 6 | 8 | 15 | 51 | 38 |
| Collection Period | 1 Day | 1 Day | 4 Days | 8 Weeks | 8 Weeks |
| Time Variant | - | - | Morning, Noon, Afternoon | 24 Hours (Day and Night) | 24 Hours (Day and Night) |
| Number of Identities | 1501 | 1812 | 4101 | 805 | 549 |
| Number of BBoxes | 32,668 | 36,411 | 126,441 | 66,084 | 48,347 |
| Attributes | 30 | 23 | - | 40 | 40 |
| Person Detection | DPM | DPM | Faster RCNN | YOLO V3 | YOLO V3 |