Task Description

We invite you to participate in our challenge on the detection of clickbait posts in social media. Clickbait refers to social media posts that are designed, at the expense of being informative and objective, to entice their readers into clicking an accompanying link. More on clickbait.
The task of the challenge is to develop a classifier that rates how clickbaiting a social media post is. For each social media post, the content of the post itself as well as the main content of the linked target web page are provided as JSON objects in our datasets.

           
{
  "id": "608999590243741697",
  "postTimestamp": "Thu Jun 11 14:09:51 +0000 2015",
  "postText": ["Some people are such food snobs"],
  "postMedia": ["608999590243741697.png"],
  "targetTitle": "Some people are such food snobs",
  "targetDescription": "You'll never guess one...",
  "targetKeywords": "food, foodfront, food waste...",
  "targetParagraphs": [
    "What a drag it is, eating kale that isn't ...",
    "A new study, published this Wednesday by ...",
    ...],
  "targetCaptions": ["(Flikr/USDA)"]
 }
instances.jsonl
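To give an idea of how such a file can be processed, here is a minimal sketch (ours, not part of the official tools) that iterates over an instances.jsonl file in Python; the file path and the selected fields are just examples.

import json

# Sketch: iterate over the posts in instances.jsonl (JSON Lines: one JSON object per line).
# The path is an example; point it at the unzipped dataset.
with open("instances.jsonl", encoding="utf-8") as f:
    for line in f:
        instance = json.loads(line)
        post_id = instance["id"]
        post_text = " ".join(instance["postText"])        # the post itself
        target_title = instance.get("targetTitle", "")    # main content of the linked page
        print(post_id, "|", post_text, "|", target_title)
example: reading instances.jsonl (sketch)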

Classifiers have to output a clickbait score in the range [0,1], where a value of 1.0 denotes that a post is heavily clickbaiting.


{"id": "608999590243741697", "clickbaitScore": 1.0}
          
results.jsonl
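A single line of results.jsonl can be produced, for example, as in the following sketch; clamping the score to [0, 1] is our own precaution rather than a stated requirement.

import json

def to_result_line(post_id, score):
    # Clamp the prediction to the required [0, 1] range, then serialize one JSON object per line.
    score = max(0.0, min(1.0, float(score)))
    return json.dumps({"id": post_id, "clickbaitScore": score})

print(to_result_line("608999590243741697", 1.0))
# {"id": "608999590243741697", "clickbaitScore": 1.0}
example: serializing one results.jsonl line (sketch)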

Performance is measured on a crowdsourced test set. The posts in the training and test sets have been judged on a 4-point scale [0, 0.33, 0.66, 1] by at least five annotators.

[Screenshot: crowdsourcing task design, shown for the example post "Some people are such food snobs"]

  {"id": "608999590243741697", 
   "truthJudgments": [0.33, 1.0, 1.0, 0.66, 0.33],
   "truthMean"  : 0.6666667,
   "truthMedian": 0.6666667,
   "truthMode"  : 1.0,
   "truthClass" : "clickbait"}
              
truth.jsonl
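The aggregate fields in truth.jsonl can be recomputed from the raw judgments roughly as in the sketch below. The displayed judgments appear to be rounded, so recomputed aggregates may differ slightly from the stored values; the tie-breaking for the mode and the 0.5 threshold for the class are our assumptions, not official definitions.

import json
from statistics import mean, median, multimode

# Sketch: recompute the aggregate truth fields from the raw judgments (requires Python 3.8+).
with open("truth.jsonl", encoding="utf-8") as f:
    for line in f:
        truth = json.loads(line)
        judgments = truth["truthJudgments"]
        truth_mean = mean(judgments)
        truth_median = median(judgments)
        truth_mode = max(multimode(judgments))                              # assumed tie-break
        truth_class = "clickbait" if truth_mean > 0.5 else "no-clickbait"   # assumed threshold
        print(truth["id"], round(truth_mean, 7), truth_median, truth_mode, truth_class)
example: aggregating the crowd judgments (sketch)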

As the primary evaluation metric, the Mean Squared Error (MSE) with respect to the mean judgments of the annotators is used. For informational purposes, we compute further evaluation metrics such as the Median Absolute Error (MedAE), Accuracy (ACC) and the F1 score (F1) with respect to the truth class, as well as the runtime of the classification software. For your convenience, you can download the official Python evaluation program.

MSE    MedAE  ACC   F1     Runtime   Team
0.024  0.174  0.91  0.760  17:11:16  Team 1
0.052  0.201  0.88  0.533  02:47:50  Team 2
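The official evaluation program remains the reference implementation; the following sketch merely illustrates how these metrics can be computed from truthMean, truthClass, and the predicted scores, assuming that scores above 0.5 are counted as the positive "clickbait" class.

import json
from sklearn.metrics import mean_squared_error, median_absolute_error, accuracy_score, f1_score

def load_jsonl(path):
    # Read a JSON Lines file into a dict keyed by post id.
    with open(path, encoding="utf-8") as f:
        return {obj["id"]: obj for obj in map(json.loads, f)}

truth = load_jsonl("truth.jsonl")
results = load_jsonl("results.jsonl")
ids = sorted(truth)

# Regression metrics against the mean judgment.
y_true = [truth[i]["truthMean"] for i in ids]
y_pred = [results[i]["clickbaitScore"] for i in ids]
mse = mean_squared_error(y_true, y_pred)
medae = median_absolute_error(y_true, y_pred)

# Classification metrics against the truth class; the 0.5 threshold is an assumption.
c_true = [truth[i]["truthClass"] == "clickbait" for i in ids]
c_pred = [results[i]["clickbaitScore"] > 0.5 for i in ids]
acc = accuracy_score(c_true, c_pred)
f1 = f1_score(c_true, c_pred)

print(f"MSE={mse:.3f}  MedAE={medae:.3f}  ACC={acc:.2f}  F1={f1:.3f}")
example: computing the evaluation metrics (sketch)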

How To Participate

  1. Register for the challenge to get a TIRA virtual machine.
  2. Develop and train a clickbait classifier on the training data.
  3. Deploy the trained classifier on the TIRA virtual machine assigned to you.
  4. Use tira.io to self-evaluate the deployed classifier on the test set.
  5. Write and submit a paper that describes how you approached the task.
  6. Present your approach at the workshop.

Important Dates

March 31, 2017: Registration begins.
March 31, 2017: Release of training dataset.
May 31, 2017: Release of validation dataset.
July 10 to August 31, 2017: Software evaluation phase.
September 08, 2017: Result notification.
September 29, 2017: Paper submission deadline.
October 13, 2017: Notification of acceptance.
October 27, 2017: Camera-ready deadline.
November 27, 2017: Workshop.

Datasets

Over the course of the competition, three datasets will be released. Each dataset is provided as a zip archive following the naming pattern clickbait17-<dataset>-<version>.zip and contains the following resources (the unlabeled dataset lacks the truth file):

  • instances.jsonl: A line-delimited JSON file (JSON Lines). Each line is a JSON object containing the information we extracted for a specific post and its target article. Have a look at the dataset schema file for an overview of the available fields.
  • truth.jsonl: A line-delimited JSON file. Each line is a JSON object containing the crowdsourced clickbait judgments of a specific post. Have a look at the dataset schema file for an overview of the available fields. A sketch of how to load and join both files is given below the dataset table.
  • media/: A folder that contains all the images referenced in the instances.jsonl file.
Dataset                #posts  #clickbait  #no-clickbait  Download Link                     Release Date
Training                2495         762           1697   clickbait16-train-170331.zip      March 31, 2017
Unlabeled              80012           ?              ?    clickbait17-unlabeled-170429.zip  April 30, 2017
Training / Validation  19829        9656          10173   clickbait17-train-170616.zip      June 16, 2017
Training / Validation  19538        4761          14777   clickbait17-train-170630.zip      June 30, 2017
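As referenced in the file descriptions above, the following sketch loads instances.jsonl and truth.jsonl and joins them by post id to obtain (text, mean judgment) training pairs; the directory name and the choice of fields are examples only.

import json

def read_jsonl(path):
    # One JSON object per line (JSON Lines).
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]

# Example directory; adjust to wherever you unzipped the training archive.
instances = {i["id"]: i for i in read_jsonl("clickbait17-train-170630/instances.jsonl")}
truth = {t["id"]: t for t in read_jsonl("clickbait17-train-170630/truth.jsonl")}

# Join post text and regression target (the mean judgment) by post id.
training_pairs = [(" ".join(instances[i]["postText"]), truth[i]["truthMean"])
                  for i in truth if i in instances]
print(len(training_pairs), "training examples")
example: joining instances and truth for training (sketch)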

Software Submission

We use the Evaluation as a Service platform TIRA to evaluate the performance of your classifier. TIRA requires that you deploy your classifier as a program that can be executed via a command line call with two arguments specifying the input and output directories. For example, the syntax could be:

 > myClassifier -i path/to/input/directory -o path/to/output/directory
example command line call for tira.io

At runtime, the input directory contains the unzipped dataset (i.e., the instances.jsonl file and the media/ folder) your classifier has to process. The predictions of your classifier should be written to a file called results.jsonl in the given output directory. Each line of results.jsonl should be a valid JSON object containing the id and the predicted clickbaitScore of a post (cf. the dataset schema file).


{"id": "608999590243741697", "clickbaitScore": 1.0}
{"id": "609408598704128000", "clickbaitScore": 0.25}
...
results.jsonl
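Putting the pieces together, a classifier deployed on TIRA could be structured roughly as in the sketch below. The command-line interface mirrors the example call above; the constant baseline score is a placeholder for your actual model, and the script name is arbitrary.

import argparse
import json
import os

def main():
    # Command-line interface as in the example call above: -i <input dir>, -o <output dir>.
    parser = argparse.ArgumentParser(description="Clickbait classifier (baseline sketch)")
    parser.add_argument("-i", "--input", required=True, help="directory containing instances.jsonl")
    parser.add_argument("-o", "--output", required=True, help="directory to write results.jsonl to")
    args = parser.parse_args()

    in_path = os.path.join(args.input, "instances.jsonl")
    out_path = os.path.join(args.output, "results.jsonl")

    with open(in_path, encoding="utf-8") as fin, open(out_path, "w", encoding="utf-8") as fout:
        for line in fin:
            instance = json.loads(line)
            # Placeholder prediction: replace with your trained model's score in [0, 1].
            score = 0.5
            fout.write(json.dumps({"id": instance["id"], "clickbaitScore": score}) + "\n")

if __name__ == "__main__":
    main()
example: classifier skeleton for deployment on TIRA (sketch)

Invoked as, e.g., python myClassifier.py -i path/to/input/directory -o path/to/output/directory, the script writes one prediction per input post.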

We will ask you to deploy your classifier onto a virtual machine that will be made accessible to you after registration. You can choose freely among the available programming languages and among the operating systems Microsoft Windows and Ubuntu. You will be able to reach the virtual machine via ssh and via remote desktop. More information about how to access the virtual machines can be found in the user guide below:

Virtual Machine User Guide »

Once your classifier is deployed on your virtual machine, we ask you to access TIRA at www.tira.io, where you can self-evaluate your software on the test data.

Note: By submitting your software you retain full copyrights. You agree to grant us usage rights only for the purpose of the Clickbait Challenge. We agree not to share your software with a third party or use it for other purposes than the Clickbait Challenge.


Paper Submission

Paper submission information and paper templates will be provided as soon as the organization of the workshop has been finalized.


Workshop

The workshop takes place on November 27, 2017 at Bauhaus-Universität Weimar, Germany.


Contact

In case of questions, don't hesitate to contact us via clickbait@webis.de.


Organizers

  • Tim Gollub, Bauhaus-Universität Weimar.
  • Martin Potthast, Bauhaus-Universität Weimar.
  • Matthias Hagen, Bauhaus-Universität Weimar.
  • Benno Stein, Bauhaus-Universität Weimar.

Supporter