- Project Motivation
- Installation
- Instructions
- Screenshot
- File Descriptions
- Results
- Licensing, Authors, and Acknowledgements
## Project Motivation

The goal of this project is to build a model that supports disaster response efforts. By analyzing real disaster messages, the model classifies them into multiple categories, enabling better communication with disaster relief agencies. The project provides a final web app where the user can input new messages and have them classified directly.
## Installation

In order to run this project, you will need:
- Anaconda
- Python 3.*
- the following libraries:
  - re
  - pickle
  - sys
  - nltk and its modules (specified in train_classifier.py)
  - numpy
  - pandas
  - sqlalchemy
  - scikit-learn modules (specified in train_classifier.py)
  - Flask
  - plotly
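A quick way to confirm that the third-party libraries above are available is to probe for them with the standard library. This helper is not part of the repository; it is just a small sketch for checking your environment before running the pipelines:

```python
import importlib.util

def missing_libraries(names):
    """Return the subset of module names that cannot be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Import names for the libraries listed above (scikit-learn imports as "sklearn").
REQUIRED = ["nltk", "numpy", "pandas", "sqlalchemy", "sklearn", "flask", "plotly"]

if __name__ == "__main__":
    print("Missing libraries:", missing_libraries(REQUIRED))
```

If the printed list is empty, your environment has everything the pipelines need.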
## Instructions

1. Clone the following repository to your local machine: https://github.com/MILM-stack/Project_Disaster_Response_Pipeline.git
2. Run the ETL pipeline (it loads, cleans, and stores the data in a database):
   `python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponseProject.db`
3. Run the NLP pipeline (it loads the DataFrame from the database, sets the parameters X, y, and categories, tokenizes the messages, creates & evaluates the model, and finally saves it to a pickle file):
   `python models/train_classifier.py data/DisasterResponseProject.db models/classifier.pkl`
4. Go to the app directory: `cd app`
5. Run your web app: `python run.py`
6. Click the PREVIEW button to open the homepage, or go to http://0.0.0.0:3000/
7. Browse the app to see the data and how the model classification works!
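To give an idea of what the tokenizing step in step 3 involves, here is a minimal, dependency-free sketch. The actual `tokenize` function in train_classifier.py relies on nltk and may normalize text differently (e.g. with lemmatization), so treat this as an illustration only:

```python
import re

def tokenize(text):
    """Sketch of a message-tokenizing step: lowercase the text,
    replace anything that is not a letter or digit with a space,
    and split on whitespace."""
    text = re.sub(r"[^a-z0-9]", " ", text.lower())
    return text.split()
```

For example, `tokenize("Water needed, please HELP!")` yields a clean list of lowercase word tokens that the model's vectorizer can consume.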
## Screenshot

Visualizations:
1. Distribution of Message Genres
2. Distribution of Messages by Category
3. Top 10 Message Categories
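Conceptually, the "Distribution of Message Genres" chart boils down to counting the values of the genre column. Here is a tiny sketch using hypothetical sample values; the real app reads the column from the database and renders the counts with plotly:

```python
from collections import Counter

# Hypothetical sample of the "genre" column from disaster_messages.csv
genres = ["direct", "news", "direct", "social", "news", "direct"]

# Tally how many messages fall into each genre
genre_counts = Counter(genres)
```

The resulting counts (genre names and frequencies) are exactly what the bar chart plots.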

## File Descriptions

Here is an overview of the provided files:

- Workspace_NoteBooks
  - ETL Pipeline Preparation.ipynb  # loads, cleans, and stores the data in a database
  - ML Pipeline Preparation.ipynb  # loads data, sets the parameters X, y, and categories, tokenizes the messages, creates & evaluates the model, and saves it to a pickle file
- app
  - template
    - master.html  # main page of web app
    - go.html  # classification result page of web app
  - run.py  # Flask file that runs the app
- data
  - disaster_categories.csv  # data to process
  - disaster_messages.csv  # data to process
  - process_data.py  # ETL pipeline script
  - DisasterResponseProject.db  # database to save clean data to
- models
  - train_classifier.py  # NLP pipeline script
  - classifier.pkl  # saved model, created once you run train_classifier.py
- README.md
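Part of the cleaning that process_data.py performs is expanding the raw categories column into one 0/1 flag per category. Assuming the standard Appen format, where the column looks like `"related-1;request-0;offer-0"` (this format is an assumption here, not quoted from the repository), the core of that step can be sketched as:

```python
def split_categories(raw):
    """Sketch of the category-expansion step: turn a raw string like
    "related-1;request-0" into a dict mapping category name -> 0/1 flag.
    The input format is assumed from the standard Appen dataset."""
    flags = {}
    for pair in raw.split(";"):
        # Split on the last "-" so category names containing "-" survive
        name, _, value = pair.rpartition("-")
        flags[name] = int(value)
    return flags
```

Applied to every row, this produces the per-category columns that the classifier is trained on.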
## Results

The results are shown in the Flask web app, pictured in the screenshots above. The user is given an overview of the distribution of the messages through visualizations. Furthermore, there is a message classification input that can be used to enter new messages and have them classified into the appropriate categories.

Based on the F1 score results, we can also see that there is room for improvement in the model. Further details can be found in ML Pipeline Preparation.ipynb.
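For reference, the per-category F1 score combines precision and recall on binary labels. The project presumably computes it via scikit-learn's metrics; the following is just a minimal pure-Python sketch of the formula:

```python
def f1_score(y_true, y_pred):
    """F1 score for binary 0/1 labels:
    F1 = 2 * precision * recall / (precision + recall)."""
    tp = sum(t == p == 1 for t, p in zip(y_true, y_pred))  # true positives
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # false positives
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Computing this per category makes it visible which categories the model handles poorly, which is where the room for improvement lies.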
## Licensing, Authors, and Acknowledgements

Thank you Appen for providing the datasets. The datasets used in this project are licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). This allows for the use of the data for non-commercial purposes, provided that appropriate credit is given, a link to the license is included, and any changes made are indicated.

Thank you Udacity for the training materials, the fantastic support from the mentors on the Knowledge Base, as well as Udacity's AI. All tools provided by Udacity were extremely helpful.


