Final assignment

This final assignment is the last part of the Python for Text Analysis course. Now that you have learned the basics of the Python programming language, it's time to put your skills into practice and work on your own code project.

This is a group assignment in which you will work together with one other student. You can form your own team.

For this assignment, you will choose a classification task and a corresponding dataset (see complete description) for which you are asked to:

  1. download/obtain the data;
  2. split the dataset into train/test sets;
  3. read and process the files in your dataset;
  4. extract relevant statistics from those files;
  5. store the computed statistics in a useful format (e.g. CSV/TSV);
  6. present the statistics to the user by means of visualization;
  7. use the computed statistics as features for the classification task;
  8. save the predictions of your model on the test data in a separate file;
  9. BONUS: evaluate your system's accuracy on the test set.
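Steps 2 to 9 could be organised roughly as follows. This is only a minimal sketch that uses the standard library: every function name, the three example statistics, and the toy data are hypothetical, and the nearest-centroid classifier is a simple stand-in rather than a method prescribed by the course. Visualization (step 6) is left out because it depends on the plotting library you choose.

```python
import csv
import io
import random


def split_train_test(items, test_fraction=0.2, seed=42):
    """Step 2: shuffle a copy of `items` and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def text_statistics(text):
    """Steps 3-4: compute a few illustrative statistics for one text."""
    tokens = text.split()  # naive whitespace tokenisation (an assumption)
    n_tokens = len(tokens)
    if n_tokens == 0:
        return {"n_tokens": 0, "avg_token_length": 0.0, "type_token_ratio": 0.0}
    return {
        "n_tokens": n_tokens,
        "avg_token_length": sum(len(t) for t in tokens) / n_tokens,
        "type_token_ratio": len({t.lower() for t in tokens}) / n_tokens,
    }


def write_tsv(rows, fieldnames, handle):
    """Steps 5 and 8: write a list of dicts as tab-separated values."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)


def train_centroids(vectors, labels):
    """Step 7: average the feature vectors of each class."""
    sums, counts = {}, {}
    for vector, label in zip(vectors, labels):
        total = sums.setdefault(label, [0.0] * len(vector))
        for i, value in enumerate(vector):
            total[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


def predict(centroids, vector):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vector, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def accuracy(gold, predicted):
    """Step 9: fraction of predictions that match the gold labels."""
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


if __name__ == "__main__":
    # Invented toy data, for illustration only: (text, label) pairs.
    data = [
        ("yes", "short"),
        ("no thanks", "short"),
        ("this sentence is clearly a somewhat longer example", "long"),
        ("another long example sentence with quite a few tokens", "long"),
    ]
    feature_names = ["n_tokens", "avg_token_length", "type_token_ratio"]

    def to_vector(text):
        stats = text_statistics(text)
        return [stats[name] for name in feature_names]

    train, test = split_train_test(data, test_fraction=0.25)
    centroids = train_centroids(
        [to_vector(text) for text, _ in train], [label for _, label in train]
    )
    predictions = [predict(centroids, to_vector(text)) for text, _ in test]

    # An in-memory buffer keeps the sketch free of side effects; for a real
    # file, use open("predictions.tsv", "w", newline="") instead.
    out = io.StringIO()
    rows = [{"text": t, "predicted": p} for (t, _), p in zip(test, predictions)]
    write_tsv(rows, ["text", "predicted"], out)
    print(out.getvalue())
    print("accuracy:", accuracy([label for _, label in test], predictions))
```

Keeping each step in its own small function makes it easier to test the steps separately and to swap in a different classifier or a richer set of statistics later.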

Important dates

| When? | What? |
| --- | --- |
| Friday 22 December 2017 (13:00) | Decision about the task & dataset (please inform us by e-mail) |
| Monday 29 January 2018 (15:30-17:15) | 5-minute presentation |
| Sunday 4 February 2018 (23:59) | Deadline submission final assignment |

Grading

| Criterion | Weight |
| --- | --- |
| Code Accuracy | 20 |
| Code Structure | 20 |
| Content & Features | 35 |
| Visualizations | 10 |
| Documentation | 10 |
| Presentation | 5 |
| BONUS | 5 |