Update README.rst

kk7nc · web-flow · commit 57780274c73f · 2018-07-20T12:44:28.000-04:00
diff --git a/README.rst b/README.rst
@@ -731,10 +731,99 @@ Matthew correlation coefficient (MCC)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 
+Compute the Matthews correlation coefficient (MCC)
+
+The Matthews correlation coefficient (MCC) is used in machine learning to measure the quality of binary (two-class) classifications. It takes true and false positives and negatives into account and is generally regarded as a balanced measure that can be used even when the classes are of very different sizes. The MCC is in essence a correlation coefficient between the observed and predicted labels, with values between -1 and +1: +1 represents a perfect prediction, 0 a prediction no better than random, and -1 total disagreement between prediction and observation. The statistic is also known as the phi coefficient.
+
+
+.. code:: python
+
+    from sklearn.metrics import matthews_corrcoef
+
+    # Ground-truth and predicted labels for four samples
+    y_true = [+1, +1, +1, -1]
+    y_pred = [+1, -1, +1, +1]
+    print(matthews_corrcoef(y_true, y_pred))  # about -0.33
+
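+The same value can be derived by hand from the confusion matrix. The following is a minimal sketch, assuming the standard binary definition ``MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))``; the variable names (``tn``, ``fp``, ``fn``, ``tp``, ``mcc_manual``) are illustrative and not part of the original example.
+
+.. code:: python
+
+    from math import sqrt
+
+    from sklearn.metrics import confusion_matrix, matthews_corrcoef
+
+    y_true = [+1, +1, +1, -1]
+    y_pred = [+1, -1, +1, +1]
+
+    # With labels=[-1, 1], rows are true classes and columns are predictions,
+    # so ravel() yields the counts in the order tn, fp, fn, tp.
+    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[-1, 1]).ravel()
+
+    # Assumed convention: return 0 when the denominator vanishes.
+    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
+    mcc_manual = (tp * tn - fp * fn) / denominator if denominator else 0.0
+
+    # Both lines should print the same value.
+    print(mcc_manual)
+    print(matthews_corrcoef(y_true, y_pred))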
+
+
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Receiver operating characteristics (ROC)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
+ROC curves are typically used in binary classification to study the output of a classifier. To extend the ROC curve and ROC area to multi-class or multi-label classification, the output must first be binarized. One ROC curve can then be drawn per label, or a single curve can be drawn by treating each element of the label indicator matrix as a binary prediction (micro-averaging).
+
+Macro-averaging is another option for multi-class evaluation: it computes the metric for each label separately and averages the results, giving every label equal weight. [`source <http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html>`__]
+
+.. code:: python
+
+    import numpy as np
+    import matplotlib.pyplot as plt
+
+    from sklearn import svm, datasets
+    from sklearn.metrics import roc_curve, auc
+    from sklearn.model_selection import train_test_split
+    from sklearn.preprocessing import label_binarize
+    from sklearn.multiclass import OneVsRestClassifier
+
+    # Import some data to play with
+    iris = datasets.load_iris()
+    X = iris.data
+    y = iris.target
+
+    # Binarize the output
+    y = label_binarize(y, classes=[0, 1, 2])
+    n_classes = y.shape[1]
+
+    # Add noisy features to make the problem harder
+    random_state = np.random.RandomState(0)
+    n_samples, n_features = X.shape
+    X = np.c_[X, random_state.randn(n_samples, 200 * n_features)]
+
+    # Shuffle and split into training and test sets
+    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=.5,
+                                                        random_state=0)
+
+    # Learn to predict each class against the others (one-vs-rest)
+    classifier = OneVsRestClassifier(svm.SVC(kernel='linear', probability=True,
+                                     random_state=random_state))
+    y_score = classifier.fit(X_train, y_train).decision_function(X_test)
+
+    # Compute ROC curve and ROC area for each class
+    fpr = dict()
+    tpr = dict()
+    roc_auc = dict()
+    for i in range(n_classes):
+        fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
+        roc_auc[i] = auc(fpr[i], tpr[i])
+
+    # Compute micro-average ROC curve and ROC area
+    fpr["micro"], tpr["micro"], _ = roc_curve(y_test.ravel(), y_score.ravel())
+    roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])
+   
+
+
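+The macro-average described above can be computed by interpolating every per-class ROC curve onto a common grid of false positive rates and averaging the resulting true positive rates. The following is a minimal, self-contained sketch: the synthetic ``y_test`` and ``y_score`` arrays are made up for illustration (they are not the iris data), and ``np.interp`` is used for the interpolation.
+
+.. code:: python
+
+    import numpy as np
+    from sklearn.metrics import roc_curve, auc
+
+    # Hypothetical data: one-hot true labels plus noisy scores per class
+    rng = np.random.RandomState(0)
+    n_samples, n_classes = 200, 3
+    labels = rng.randint(n_classes, size=n_samples)
+    y_test = np.eye(n_classes, dtype=int)[labels]
+    y_score = y_test + rng.randn(n_samples, n_classes)
+
+    # Per-class ROC curves
+    fpr, tpr = {}, {}
+    for i in range(n_classes):
+        fpr[i], tpr[i], _ = roc_curve(y_test[:, i], y_score[:, i])
+
+    # Collect every false positive rate that occurs in any class curve
+    all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))
+
+    # Interpolate each curve at those points and average with equal weights
+    mean_tpr = np.zeros_like(all_fpr)
+    for i in range(n_classes):
+        mean_tpr += np.interp(all_fpr, fpr[i], tpr[i])
+    mean_tpr /= n_classes
+
+    macro_auc = auc(all_fpr, mean_tpr)
+    print("macro-average AUC: %0.2f" % macro_auc)
+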
+Plot the ROC curve for a single class (class 2 here):
+
+
+.. code:: python
+
+    plt.figure()
+    lw = 2
+    plt.plot(fpr[2], tpr[2], color='darkorange',
+             lw=lw, label='ROC curve (area = %0.2f)' % roc_auc[2])
+    plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')
+    plt.xlim([0.0, 1.0])
+    plt.ylim([0.0, 1.05])
+    plt.xlabel('False Positive Rate')
+    plt.ylabel('True Positive Rate')
+    plt.title('Receiver operating characteristic example')
+    plt.legend(loc="lower right")
+    plt.show()
+
+
+.. image:: /docs/pic/sphx_glr_plot_roc_001.png
+
 
 ~~~~~~~~~~~~~~~~~~~~~~~
 Area under curve~(AUC)