###########################################
CS229 Final Project
Authors:
Pouya Rezazadeh Kalehbasti (pouyar@stanford.edu)
Liubov Nikolenko (liubov@stanford.edu)
Hoormazd Rezaei (hoormazd@stanford.edu)
###########################################
In order to run the code make sure you pre-instal all the dependecies such as TextBlob and sklearn
- INITIAL DATA PREPROCESSING:
- Generate a fine with review sentiment: python sentiment_analysis.py
- Clean the data: python data_cleanup.py
- Normalize and split the data: data_preprocessing_reviews.py
- GENERATE THE FEATURE SELECTION .NPY FILE:
- For P-value feature selection: python feature_selection.py
- For Lasso CV: python cv.py
- TRAIN AND RUN THE MODELS: python run_models.py Note that by commenting/uncommenting certain lines of code you will be able to run different configurations of the models.
-
To run the models with Lasso CV feature selection comment out line 240 (coeffs = np.load('../Data/selected_coefs_pvals.npy') and uncomment line 241 (coeffs = np.load('../Data/selected_coefs.npy').
-
To run the models with p-value feature selection uncomment line 240 (coeffs = np.load('../Data/selected_coefs_pvals.npy') and comment out line 241 (coeffs = np.load('../Data/selected_coefs.npy').
-
To run the baseline uncomment the lines 277, 278 ( print("--------------------Linear Regression--------------------") LinearModel(X_concat, y_concat, X_test, y_test) ) and comment out everything below these lines. Also, comment out the lines 268, 269 and 270 ( X_train = X_train[list(col_set)] X_val = X_val[list(col_set)] X_test = X_test[list(col_set)]
)
Warning: certain models take a while to train and run!