			Male or Female?
 		Choose the best Name Gender Classifier

We started with a template file where we split the data so we would have comparable results.

https://github.com/groupGit/IS620GroupProject3/blob/master/Project3_version_a.ipynb

We then set up a chat on Slack to exchange ideas and experimented with different techniques and features. Our goal was to find the best machine learning classifier for the names corpus dataset. Using Slack, we divided the work, and we used GitHub to check in files as each classifier was completed.

The dataset is fairly small, with 7944 rows. It was split into three data frames that keep the same ratio of male to female names: a validation set with 500 rows, a test set with 500 rows, and a training set with 6944 rows. This split was the basis for all the classifiers, and our team built the classifiers listed below.
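A ratio-preserving split like the one described above can be sketched as follows. This is a minimal illustration, not the code from the template notebook: the function name `stratified_split`, the `holdout_size` and `seed` parameters, and the use of plain `(name, gender)` tuples instead of data frames are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(labeled_names, holdout_size=500, seed=0):
    """Split (name, gender) pairs into train, validation, and test lists.

    Hypothetical helper: each gender contributes to the validation and
    test sets in proportion to its share of the full dataset.
    """
    rng = random.Random(seed)

    # Group the rows by label so each gender can be sampled separately.
    by_label = defaultdict(list)
    for name, gender in labeled_names:
        by_label[gender].append((name, gender))

    total = len(labeled_names)
    train, validation, test = [], [], []
    for gender, items in by_label.items():
        rng.shuffle(items)
        # Rows of this gender that go into each holdout set.
        # Rounding means the holdout sizes can be off by a row or two.
        k = round(holdout_size * len(items) / total)
        validation.extend(items[:k])
        test.extend(items[k:2 * k])
        train.extend(items[2 * k:])
    return train, validation, test
```

Called on a list of 7944 labeled names with the default `holdout_size`, this returns validation and test lists of about 500 rows each and puts the remaining rows in the training list. Fixing the seed keeps the split identical across notebooks, which is what makes the classifier results comparable.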

  1. Max Entropy: https://github.com/groupGit/IS620GroupProject3/blob/master/Project3_MaxEntropy.ipynb

  2. Random Forest: https://github.com/groupGit/IS620GroupProject3/blob/master/RandomForest.ipynb

  3. Decision Tree: https://github.com/groupGit/IS620GroupProject3/blob/master/DecisionTreeT2.ipynb

  4. Naive Bayes: a) https://github.com/groupGit/IS620GroupProject3/blob/master/naive_bayes_2a.ipynb b) https://github.com/groupGit/IS620GroupProject3/blob/master/naive_bayes_2.ipynb c) https://github.com/groupGit/IS620GroupProject3/blob/master/naive_bayes_2c.ipynb

The results are summarized in the notebook for each classifier. Overall, the feature set that used the first letter and the last letter of the name performed best.
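A feature extractor built on the first and last letter might look like the sketch below. The function name and dictionary keys are hypothetical, and the notebooks may define their features differently.

```python
def first_last_letter_features(name):
    """Return a feature dictionary with the first and last letter of a name.

    Hypothetical helper for illustration. Letters are lowercased so that
    "Anna" and "anna" yield the same features.
    """
    lowered = name.strip().lower()
    # Slicing (rather than indexing) returns "" for an empty name
    # instead of raising IndexError.
    return {
        "first_letter": lowered[:1],
        "last_letter": lowered[-1:],
    }
```

For example, `first_last_letter_features("Anna")` gives `{"first_letter": "a", "last_letter": "a"}`. Feature dictionaries in this shape can be paired with gender labels and passed to an NLTK classifier, for instance `nltk.NaiveBayesClassifier.train`, which accepts a list of `(features, label)` tuples.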
