Markatopoulou, Foteini; Mezaris, Vasileios (2017)
Publisher: Zenodo
Type: dataset
Subjects: concept detection, ad-hoc video search, TRECVID AVS task, video analysis

We provide concept detection scores for the IACC.3 dataset (600 hours of Internet Archive videos), which is used in the TRECVID Ad-hoc Video Search (AVS) task [1]. Concept detection scores for 1345 concepts (the 1000 ImageNet concepts provided for the ILSVRC challenge [2] and 345 TRECVID SIN concepts [3]) have been generated as follows:
1) To generate scores for the ImageNet concepts, five pre-trained ImageNet networks were applied to the IACC.3 dataset and their outputs were fused by arithmetic mean.
2) To generate scores for the TRECVID SIN concepts, two pre-trained ImageNet networks were fine-tuned on these concepts using a combination of our methods presented in [4] and [5]. We provide two different sets of concept scores for the TRECVID SIN concepts:
a) Direct output: the outputs of the two fine-tuned networks were fused by arithmetic mean to return a single score for each concept.
b) SVM: the last fully-connected layer was used as a feature to train SVM classifiers separately for each fine-tuned network and each concept. The SVM classifiers were then applied to the IACC.3 dataset, and the prediction scores of the SVMs for the same concept were fused by arithmetic mean to return a single score for each concept.
We evaluated the two sets of concept scores in terms of MXInfAP on a subset of 38 TRECVID SIN concepts for which ground-truth annotation exists. The MXInfAP is a) 30.04% for the networks' direct output and b) 35.81% for the SVM classifiers.
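The arithmetic-mean fusion used in both steps can be illustrated with a minimal sketch. This is not the authors' code: the function name `fuse_mean` and the toy score values are hypothetical, and the sketch only assumes that each network (or SVM) yields one score per concept for a given video shot.

```python
def fuse_mean(score_lists):
    """Fuse per-concept scores from several models by arithmetic mean.

    score_lists: one list of scores per model, all for the same video shot
    and all in the same concept order (hypothetical input layout).
    """
    if not score_lists:
        raise ValueError("at least one list of scores is required")
    n_concepts = len(score_lists[0])
    if any(len(scores) != n_concepts for scores in score_lists):
        raise ValueError("all models must score the same number of concepts")
    # zip(*...) groups the scores of the same concept across models.
    return [sum(column) / len(score_lists) for column in zip(*score_lists)]


# Toy example: two models, three concepts, one shot (made-up values).
model_a = [0.25, 0.50, 1.00]
model_b = [0.75, 1.00, 0.00]
fused = fuse_mean([model_a, model_b])
print(fused)
```

Because every input score lies in [0,1], the mean of the inputs also lies in [0,1], which is consistent with the range stated for the released score files.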

Three different files of concept detection scores can be downloaded (after unpacking the compressed file):
1) scores_ImageNet.txt
2a) scores_SIN_direct.txt
2b) scores_SIN_svm.txt
Each file contains 335944 rows. The first file has 1002 columns (two shot-ID columns plus 1000 concept columns), and each of the other two has 347 columns (two shot-ID columns plus 345 concept columns). Each row corresponds to a different video shot; the video shot IDs appear in the first two columns. (Note: the shot IDs are the ones from the mp7 files in the TRECVID AVS master shot reference, with the format shotFILENUMBER_SHOTNUMBER.) Each remaining column corresponds to a different concept, with all concept scores in the range [0,1]. The higher the score, the more likely it is that the corresponding concept appears in the video shot. The files "concept_names_ImageNet.txt" and "concept_names_SIN.txt" give the order of the concepts used in the concept score files.
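The layout described above can be read with a short sketch such as the following. Several details are assumptions rather than facts from the dataset description: that columns are separated by whitespace, that the concept-name files list one concept per line, and the helper names `read_concept_names`, `iter_shot_scores`, and `top_concepts`, which are hypothetical. The toy rows and ID values are made up for illustration.

```python
import io

N_ID_COLUMNS = 2  # the first two columns hold the video shot IDs


def read_concept_names(lines):
    """Return concept names in file order (assumes one name per line)."""
    return [line.strip() for line in lines if line.strip()]


def iter_shot_scores(lines, n_concepts):
    """Yield (id_fields, scores) per row (assumes whitespace-separated columns)."""
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        ids = tuple(fields[:N_ID_COLUMNS])
        scores = [float(value) for value in fields[N_ID_COLUMNS:]]
        if len(scores) != n_concepts:
            raise ValueError(f"expected {n_concepts} scores, got {len(scores)}")
        yield ids, scores


def top_concepts(scores, names, k=3):
    """Return the k (name, score) pairs with the highest scores."""
    ranked = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy stand-ins for a concept-name file and a score file (made-up content).
names = read_concept_names(io.StringIO("airplane\nbeach\ncar\n"))
score_file = io.StringIO("id_a id_b 0.10 0.90 0.50\nid_c id_d 0.70 0.20 0.30\n")
for ids, scores in iter_shot_scores(score_file, len(names)):
    print(ids, top_concepts(scores, names, k=2))
```

With the real files, the `io.StringIO` objects would be replaced by opened file handles, for example `open("scores_SIN_svm.txt")` paired with `open("concept_names_SIN.txt")`. Iterating row by row avoids holding all 335944 rows in memory at once.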

[1] G. Awad, J. Fiscus, M. Michel et al. 2016. TRECVID 2016: Evaluating Video Search, Video Event Detection, Localization, and Hyperlinking. In TRECVID 2016 Workshop. NIST, USA.
[2] O. Russakovsky, J. Deng, H. Su et al. 2015. ImageNet Large Scale Visual Recognition Challenge. Int. Journal of Computer Vision (IJCV) 115, 211–252.
[3] G. Awad, C. Snoek, A. Smeaton, and G. Quénot. 2016. TRECVid semantic indexing of video: a 6-year retrospective. ITE Transactions on Media Technology and Applications, 4 (3). pp. 187-208.
[4] N. Pittaras, F. Markatopoulou, V. Mezaris, I. Patras. 2017. Comparison of Fine-tuning and Extension Strategies for Deep Convolutional Neural Networks, Proc. 23rd Int. Conf. on MultiMedia Modeling (MMM'17), Reykjavik, Iceland, Springer LNCS vol. 10132, pp. 102-114, Jan. 2017.
[5] F. Markatopoulou, V. Mezaris, and I. Patras. 2016. Deep Multi-task Learning with Label Correlation Constraint for Video Concept Detection, Proc. ACM Multimedia 2016, Amsterdam, Oct. 2016.

Funded by projects

  • EC | InVID
