The main training dataset contains 25,331 dermoscopic images acquired at multiple sites, with different preprocessing methods applied beforehand. It covers eight classes: melanoma (MEL), melanocytic nevus (NV), basal cell carcinoma (BCC), actinic keratosis (AK), benign keratosis (BKL), dermatofibroma (DF), vascular lesion (VASC), and squamous cell carcinoma (SCC). One part of the training dataset is the HAM10000 dataset, which contains images of size 600 × 450 that were centered and cropped around the lesion; the dataset curators applied histogram corrections to some images [1]. Another part, the BCN_20000 dataset, contains images of size 1024 × 1024. This dataset is particularly challenging, as many images are uncropped and lesions appear in difficult and uncommon locations [2]. Last, the MSK dataset contains images of various sizes.
The dataset also contains metadata about the patient's age group (in steps of five years), the anatomical site (eight possible sites), and the sex (male/female). The metadata is partially incomplete, i.e., values are missing for some images.
In addition, we make use of external data. We use the 955 dermoscopic images from the 7-point dataset [3]. Moreover, we use an in-house dataset consisting of 986 images. For the unknown class, we use 353 images obtained from a web search, including images of healthy skin, angiomas, warts, cysts, and other benign alterations. The key idea is to build a broad class of skin variations that encourages the model to assign any image not belonging to the eight main classes to this ninth, broad pool of skin alterations. We also use the three types of metadata for our external data where available.
For internal evaluation, we split the main training dataset into five folds. Because the dataset contains multiple images of the same lesion, we ensure that all images of a lesion are placed in the same fold, as sketched below. We add all our external data to each of the training sets. Note that we do not include any of our unknown-class images in the evaluation, as we do not know whether they accurately represent the actual unknown class. Thus, all our models are trained to predict nine classes, but we evaluate only on the eight known classes.
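A minimal sketch of such a lesion-grouped five-fold split, assuming a pandas DataFrame `df` with hypothetical columns "image" and "lesion_id"; scikit-learn's GroupKFold guarantees that all rows sharing a group value land in the same fold. The paper does not name a specific library, so this is one plausible implementation, not the authors' exact code.

```python
import pandas as pd
from sklearn.model_selection import GroupKFold

df = pd.read_csv("train_metadata.csv")  # hypothetical metadata file

gkf = GroupKFold(n_splits=5)
for fold, (train_idx, val_idx) in enumerate(
        gkf.split(df, groups=df["lesion_id"])):
    # all images of one lesion appear in exactly one of the two sets
    train_images = df.iloc[train_idx]["image"]
    val_images = df.iloc[val_idx]["image"]
    # external data (7-point, in-house, unknown-class images) would be
    # appended to every training set here, never to the validation set
```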
We use the mean sensitivity for our internal evaluation, which is defined as

$$S = \frac{1}{C} \sum_{i=1}^{C} \frac{TP_i}{TP_i + FN_i},$$

where $TP_i$ are the true positives of class $i$, $FN_i$ are the false negatives, and $C$ is the number of classes. The metric is also used for the final challenge ranking.
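As a brief illustration, the mean sensitivity is the macro-averaged per-class recall, so it can be computed with scikit-learn's `recall_score` (it coincides with `balanced_accuracy_score`); the toy labels below are only for demonstration.

```python
import numpy as np
from sklearn.metrics import recall_score

y_true = np.array([0, 0, 1, 1, 2, 2])  # toy labels, C = 3 classes
y_pred = np.array([0, 1, 1, 1, 2, 0])

# per-class sensitivity TP_i / (TP_i + FN_i), averaged over classes:
# class 0: 1/2, class 1: 2/2, class 2: 1/2  ->  S = (0.5 + 1.0 + 0.5) / 3
mean_sensitivity = recall_score(y_true, y_pred, average="macro")
print(mean_sensitivity)  # 0.666...
```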