Task Description

We invite you to participate in our ongoing challenge on the detection of clickbait posts in social media. Clickbait refers to social media posts that are designed to entice readers into clicking an accompanying link, at the expense of being informative and objective.
The task of the challenge is to develop a classifier that rates how clickbaiting a social media post is. For each social media post, the content of the post itself as well as the main content of the linked target web page are provided as JSON objects in our datasets.

 "id": "608999590243741697",
 "postTimestamp": "Thu Jun 11 14:09:51 +0000 2015",
 "postText": ["Some people are such food snobs"],
 "postMedia": ["608999590243741697.png"],


 "targetTitle": "Some people are such food snobs",
 "targetDescription": "You'll never guess one...",
 "targetKeywords": "food, foodfront, food waste...",
 "targetParagraphs": [
   "What a drag it is, eating kale that isn't ...",
   "A new study, published this Wednesday by ...", 
 "targetCaptions": ["(Flikr/USDA)"]
instances.jsonl (cont'd)
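Since instances.jsonl is in JSON Lines format, each post can be parsed independently, line by line. A minimal sketch in Python (the file path is just an example):

```python
import json

def read_jsonl(path):
    """Yield one JSON object per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)

# e.g. index all posts by their id:
# posts = {p["id"]: p for p in read_jsonl("instances.jsonl")}
```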

Classifiers have to output a clickbait score in the range [0,1], where a value of 1.0 denotes that a post is heavily clickbaiting.

{"id": "608999590243741697", "clickbaitScore": 1.0}

Performance is measured against a crowdsourced test set. The posts in the training and test sets have been judged on a 4-point scale [0, 0.33, 0.66, 1] by at least five annotators.

[Figure: crowdsourcing task design, showing the post "Some people are such food snobs" as presented to the annotators]

  {"id": "608999590243741697", 
   "truthJudgments": [0.33, 1.0, 1.0, 0.66, 0.33],
   "truthMean"  : 0.6666667,
   "truthMedian": 0.6666667,
   "truthMode"  : 1.0,
   "truthClass" : "clickbait"}

As primary evaluation metric, the Mean Squared Error (MSE) with respect to the mean judgments of the annotators is used. For informational purposes, we compute further evaluation metrics such as the Median Absolute Error (MedAE), the F1 score (F1) with respect to the truth class, as well as the runtime of the classification software. For your convenience, you can download the official Python evaluation program.
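The two error metrics are straightforward to reproduce; this is only a sketch for sanity-checking your own scores, not the official evaluation program:

```python
import statistics

def mse(predicted, truth_means):
    """Mean Squared Error of predicted scores against mean annotator judgments."""
    return sum((p - t) ** 2 for p, t in zip(predicted, truth_means)) / len(truth_means)

def medae(predicted, truth_means):
    """Median Absolute Error of predicted scores against mean annotator judgments."""
    return statistics.median(abs(p - t) for p, t in zip(predicted, truth_means))
```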

MSE     MedAE   F1     Acc     Runtime    Team
0.024   0.174   0.91   0.760   17:11:16   Team 1
0.052   0.201   0.88   0.533   02:47:50   Team 2

How To Participate

  1. Register for the challenge to get a TIRA virtual machine.
  2. Develop and train a clickbait classifier on the training data.
  3. Deploy the trained classifier on the TIRA virtual machine assigned to you.
  4. Use tira.io to self-evaluate the deployed classifier on the test set.
  5. Write and submit a paper that describes how you approached the task.
  6. Present your approach at the next workshop.


You can find the datasets for the clickbait challenge by following this link. The dataset is provided as a zip archive with the following resources (the unlabeled dataset lacks the truth file):

  • instances.jsonl: A line-delimited JSON file (JSON Lines). Each line is a JSON object containing the information we extracted for a specific post and its target article. Have a look at the dataset schema file for an overview of the available fields.
  • truth.jsonl: A line-delimited JSON file. Each line is a JSON object containing the crowdsourced clickbait judgments of a specific post. Have a look at the dataset schema file for an overview of the available fields.
  • media/: A folder that contains all the images referenced in the instances.jsonl file.
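For training, the two files can be joined on the shared id field. A minimal sketch, assuming the labeled split where both files are present:

```python
import json

def load_labeled(instances_path, truth_path):
    """Pair every post with its mean annotator judgment, joined on the id field."""
    with open(truth_path, encoding="utf-8") as f:
        truth = {obj["id"]: obj["truthMean"]
                 for obj in map(json.loads, filter(str.strip, f))}
    with open(instances_path, encoding="utf-8") as f:
        return [(obj, truth[obj["id"]])
                for obj in map(json.loads, filter(str.strip, f))]
```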

Software Submission

We use the Evaluation as a Service platform TIRA to evaluate the performance of your classifier. TIRA requires that you deploy your classifier as a program that can be executed via a command-line call with two arguments, one for the input directory and one for the output directory. For example, the syntax could be:

 > myClassifier -i path/to/input/directory -o path/to/output/directory
example command line call for tira.io

At runtime, the input directory contains the unzipped dataset (i.e., instances.jsonl and the media/ folder) your classifier has to process. The predictions of your classifier should be written to a file called results.jsonl in the given output directory. Each line of results.jsonl should contain a valid JSON object with the id and the predicted clickbaitScore of a post (cf. the dataset schema file).

{"id": "608999590243741697", "clickbaitScore": 1.0}
{"id": "609408598704128000", "clickbaitScore": 0.25}

We will ask you to deploy your classifier onto a virtual machine that will be made accessible to you after registration. You can choose freely among the available programming languages and between the operating systems Microsoft Windows and Ubuntu. You will be able to reach the virtual machine via SSH and via remote desktop. More information about how to access the virtual machines can be found in the user guide below:

Virtual Machine User Guide »

Once deployed on your virtual machine, we ask you to access TIRA at www.tira.io, where you can self-evaluate your software on the test data.

Note: By submitting your software you retain full copyrights. You agree to grant us usage rights only for the purpose of the Clickbait Challenge. We agree not to share your software with a third party or use it for other purposes than the Clickbait Challenge.

Paper Submission

  1. Prepare a paper about your approach and its variants using our paper template.
  2. Publish the finished paper on arXiv, and let us know where we can find it.
  3. Publish your code in our GitHub repository. We can fork a repo you already have, or create a new one for you and invite you as owners.


The first workshop on clickbait detection took place on November 27, 2017 at Bauhaus-Universität Weimar, Germany.


Digital Bauhaus Lab
Bauhausstr. 9a, 3rd floor
99423 Weimar


09:00 - 09:30  Welcome Reception
09:30 - 10:30  Clickbait-Challenge 2017: Overview
               Martin Potthast, Tim Gollub
10:30 - 11:00  A Neural Clickbait Detection Engine
               Yash Kumar Lal
11:00 - 11:30  Clickbait Identification using Neural Networks
               Philippe Thomas
11:30 - 12:00  The Emperor Clickbait Detector
               Erdan Genc
12:00 - 14:00  Lunch Break
14:00 - 14:30  Detecting Clickbait in Online Social Media: You Won’t Believe How We Did It
               Aviad Elyashar
14:30 - 15:00  Heuristic Feature Selection for Clickbait Scoring
               Matti Wiegmann
15:00 - 16:00  Discussion and Outlook


The following list presents the current performances achieved by the participants. As primary evaluation measure, Mean Squared Error (MSE) with respect to the mean judgments of the annotators is used. For further metrics, see the full result table on tira.io. If provided, paper and code of the submissions are linked in each row.

MSE     F1      Precision  Recall  Accuracy  Runtime    Team
0.032   0.670   0.732      0.619   0.855     00:01:10   albacore (paper, code)
0.033   0.683   0.719      0.650   0.856     00:03:27   zingel (paper, code)
0.036   0.641   0.714      0.581   0.845     00:04:03   emperor (code)
0.036   0.638   0.728      0.568   0.847     00:08:05   carpetshark (paper, code)
0.043   0.565   0.699      0.474   0.826     00:04:31   whitebait (paper, code)
0.079   0.650   0.530      0.841   0.785     00:04:55   torpedo (paper, code)
0.252   0.434   0.287      0.893   0.446     19:05:31   snapper (paper, code)


In case of questions, don't hesitate to contact us via clickbait@webis.de.

Task Committee

  • Tim Gollub, Bauhaus-Universität Weimar.
  • Martin Potthast, Bauhaus-Universität Weimar.
  • Matthias Hagen, Bauhaus-Universität Weimar.
  • Benno Stein, Bauhaus-Universität Weimar.