FSCOCO-Seg: Segmented Scene Sketches


This dataset is a part of the following publication:

"Open Vocabulary Scene Sketch Semantic Understanding"

Ahmed Bourouis¹, Judith Ellen Fan², Yulia Gryaditskaya¹

¹CVSSP and Surrey Institute for People-Centred AI, UK
²Department of Psychology, Stanford University, USA

CVPR 2024

The data
[Teaser figure: examples of stroke-level segmented scene sketches]

This dataset contains the training, validation and test data used in the paper "Open Vocabulary Semantic Scene Sketch Understanding" by Ahmed Bourouis, Judith Ellen Fan and Yulia Gryaditskaya, CVPR 2024.

It contains our split of the sketches from the FSCOCO dataset into training, validation and test sets. For the validation and test sets, we provide stroke-level category annotations, as shown in the teaser figure above. The details are provided below.

The FSCOCO dataset
The dataset comprises 10,000 sketch-caption pairs, associated with reference images from the MS-COCO dataset. The sketches were drawn from memory by 100 non-expert participants: each reference image was shown for 60 seconds, followed by a 3-minute sketching window. For more details, please refer to the FSCOCO dataset webpage.

Our Training/Validation/Test splits
For the test set, we first selected 500 sketches with distinct styles from five participants (100 sketches each). For validation, we then randomly sampled 5 sketches from each of the remaining 95 participants (475 sketches in total). The remaining 9,025 sketches are used for training.
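The released split files are authoritative; the test sketches were chosen manually by style, so the following minimal Python sketch only illustrates the random validation sampling step, using dummy participant and sketch ids. It reproduces the 475/9,025 split sizes.

import random

# Illustration only: dummy ids stand in for the 95 remaining participants,
# each with 100 sketches. The actual validation set was sampled once by the
# authors and is fixed in the released split files.
random.seed(0)
sketches_by_participant = {p: [f"p{p:02d}_s{i:03d}" for i in range(100)] for p in range(95)}

val_ids, train_ids = [], []
for sketch_ids in sketches_by_participant.values():
    picked = set(random.sample(sketch_ids, 5))  # 5 sketches per participant
    val_ids += sorted(picked)
    train_ids += [s for s in sketch_ids if s not in picked]

assert len(val_ids) == 475 and len(train_ids) == 9025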

Test/Validation set annotations
One of the co-authors manually annotated the test and validation sketches, relying on the reference images and category labels from the MS-COCO dataset. Each stroke is assigned a single category label. Candidate category labels are extracted from MS-COCO image captions rather than sketch captions to obtain richer ground-truth annotations. Our test set contains 185 different object classes, with an average of 3.54 objects per sketch.
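These statistics can be recomputed from the per-stroke label files (folder layout as described under "The structure of the data" below); the naming Test/classes/<sketch_id>.json is an assumption about the archive layout, and distinct labels per sketch are used as a proxy for the object count.

import json
from pathlib import Path

all_classes, objects_per_sketch = set(), []
for path in sorted(Path("Test/classes").glob("*.json")):
    with path.open() as f:
        stroke_labels = json.load(f)  # one class label per stroke
    sketch_classes = set(stroke_labels)
    all_classes |= sketch_classes
    objects_per_sketch.append(len(sketch_classes))

print(len(all_classes))                                   # expected: 185
print(sum(objects_per_sketch) / len(objects_per_sketch))  # expected: ~3.54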


Download

Test: test.tar.gz

Validation: val.tar.gz

Train: train.tar.gz
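The archives can be unpacked with any tar tool; below is a minimal Python sketch, assuming they extract into the Test, Val and Train folders described in the next section.

import tarfile

# Extract all three archives into the current directory.
for archive in ("test.tar.gz", "val.tar.gz", "train.tar.gz"):
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall()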

The structure of the data

The data is split into three folders: Test, Val and Train. Refer to this code to load the dataset; a minimal loading sketch is also given after the list below. The structure of the content is as follows:

  • Test:
    • sketches: 500 raster sketches of size 512 by 512 from the FSCOCO dataset, collected from 5 different artists.
    • vector_sketches: Vector representations of the respective raster sketches, as provided in the FSCOCO dataset.
    • seg_sketches: Visualisation of the 500 segmented sketches.
    • classes: A folder of JSON files, each containing class labels for individual strokes as a list of strings.
    • captions: Brief corresponding captions, provided in the FSCOCO dataset.
    • images: Corresponding images from the MS-COCO dataset, used as a reference during the annotation process.
    • all_classes.json: A JSON file with a list of all the 185 classes present in the segmented test set.
  • Val:
    • sketches: 475 raster sketches of size 512 by 512 from the FSCOCO dataset, collected from 95 different artists (5 sketches per artist).
    • vector_sketches: Vector representations of the respective raster sketches, as provided in the FSCOCO dataset.
    • classes: A folder of JSON files, each containing class labels for individual strokes as a list of strings.
    • captions: Brief corresponding captions, provided in the FSCOCO dataset.
    • images: Corresponding images from the MS-COCO dataset, used as a reference during the annotation process.
  • Train: For completeness, we include the 9,025 scene sketches from the FSCOCO dataset that were used during training.
    • sketches: 95 folders corresponding to 95 different artists, each containing 95 scene sketches.
    • text: Corresponding text descriptions for each sketch.
    • images: Corresponding reference images shown to participants during the drawing process.
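A minimal loading sketch for one test example follows. The shared <sketch_id> naming across subfolders and the file extensions (.png/.jpg/.txt) are assumptions about the released archives; adjust them to the actual file names.

import json
from pathlib import Path
from PIL import Image

root = Path("Test")
sketch_id = "000000001234"  # hypothetical id; use any file stem present in Test/sketches

raster = Image.open(root / "sketches" / f"{sketch_id}.png")    # 512 by 512 raster sketch
with (root / "classes" / f"{sketch_id}.json").open() as f:
    stroke_labels = json.load(f)                               # one class label per stroke
caption = (root / "captions" / f"{sketch_id}.txt").read_text().strip()
reference = Image.open(root / "images" / f"{sketch_id}.jpg")   # MS-COCO reference image
# vector_sketches/ holds the corresponding vector strokes in the FSCOCO format.

print(caption)
print(set(stroke_labels))  # object classes present in this sketch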


If you use this dataset, please cite:
@inproceedings{bourouis2023open,
	title={Open Vocabulary Semantic Scene Sketch Understanding}, 
	author={Ahmed Bourouis and Judith Ellen Fan and Yulia Gryaditskaya},
	booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},			
	year={2024}
}

@inproceedings{fscoco,
	title={FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context},
	author={Chowdhury, Pinaki Nath and Sain, Aneeshan and Bhunia, Ayan Kumar and Xiang, Tao and Gryaditskaya, Yulia and Song, Yi-Zhe},
	booktitle={ECCV},
	year={2022}
}

@inproceedings{lin2014microsoft,
	title={Microsoft {COCO}: Common objects in context},
	author={Lin, Tsung-Yi and Maire, Michael and Belongie, Serge and Hays, James and Perona, Pietro and Ramanan, Deva and Doll{\'a}r, Piotr and Zitnick, C Lawrence},
	booktitle={ECCV},
	year={2014},
	organization={Springer}
}
Questions:

For any questions about this dataset, please contact Ahmed Bourouis.