|
2 | 2 |
|
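3 | 3 | The Beauty of Python | 1. Python Basics | 2. Python Strings and Regular Expressions | 3. Python Files and Dates | 4. Python's Three Power Tools | 5. Python Plotting | 6. Python Pitfalls | 7. Python Third-Party Packages | 8. Must-Know Machine Learning and Deep Learning Algorithms | 9. Python in Practice | 10. Pandas Data Analysis Case Studies |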
4 | 4 |
|
| 5 | + |
| 6 | + |
5 | 7 | > Currently writing Chapter 11: Mastering Flask Web Development Step by Step |
6 | 8 |
|
7 | 9 |
|
@@ -7079,6 +7081,91 @@ Out[6]: |
7079 | 7081 |
|
7080 | 7082 | In other words, the length of the dummy vector equals the number of unique characters in the input string. |
7081 | 7083 |
|
| 7084 | +#### 15 The annoying SettingWithCopyWarning!!! |
| 7085 | +
|
| 7086 | +Pandas is a joy for data processing, as anyone who has used it knows! |
| 7087 | +
|
| 7088 | +Almost everyone who has used Pandas has run into this warning: |
| 7089 | +
|
| 7090 | +*SettingWithCopyWarning* |
| 7091 | +
|
| 7092 | +It is very annoying! |
| 7093 | +
|
| 7094 | +Newcomers to Pandas in particular have no idea why a message like this pops up: |
| 7095 | +
|
| 7096 | +```python |
| 7097 | +d:\source\test\settingwithcopy.py:9: SettingWithCopyWarning: |
| 7098 | +A value is trying to be set on a copy of a slice from a DataFrame. |
| 7099 | +Try using .loc[row_indexer,col_indexer] = value instead |
| 7100 | +
|
| 7101 | +See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy |
| 7102 | +``` |
| 7103 | +
|
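| 7104 | +Ultimately, the warning appears because the code contains a `chained operation`: a value is assigned to an object that was itself obtained by indexing another DataFrame, so the assignment may land on a copy rather than on the original. |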
| 7105 | +
|
| 7106 | +Which raises the question: what is a `chained operation`? |
| 7107 | +
|
| 7108 | +Something like this: |
| 7109 | +
|
| 7110 | +```python |
| 7111 | +tmp = df[df.a<4] |
| 7112 | +tmp['c'] = 200 |
| 7113 | +``` |
| 7114 | +
|
| 7115 | +For now, it is enough to remember this most typical case. |
| 7116 | +
|
| 7117 | +The next question: when this warning appears, do you need to act on it? |
| 7118 | +
|
| 7119 | +If the result is wrong, you certainly do; if the result is what you intended, the warning can be ignored. |
| 7120 | +
|
| 7121 | +Here is an example: |
| 7122 | +
|
| 7123 | +```python |
| 7124 | +import pandas as pd |
| 7125 | +
|
| 7126 | +df = pd.DataFrame({'a':[1,3,5],'b':[4,2,7]},index=['a','b','c']) |
| 7127 | +df.loc[df.a<4,'c'] = 100  # one .loc call on df itself: sets c where a < 4 |
| 7128 | +print(df) |
| 7129 | +print('it\'s ok') |
| 7130 | +
|
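| 7131 | +tmp = df[df.a<4]  # boolean indexing returns a new object (a copy of the matching rows) |
| 7132 | +tmp['c'] = 200  # writes to tmp only, so df is left unchanged |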
| 7133 | +print('-----tmp------') |
| 7134 | +print(tmp) |
| 7135 | +print('-----df-------') |
| 7136 | +print(df) |
| 7137 | +``` |
| 7138 | +
|
| 7139 | +Output: |
| 7140 | +```python |
| 7141 | + a b c |
| 7142 | +a 1 4 100.0 |
| 7143 | +b 3 2 100.0 |
| 7144 | +c 5 7 NaN |
| 7145 | +it's ok |
| 7146 | +d:\source\test\settingwithcopy.py:9: SettingWithCopyWarning: |
| 7147 | +A value is trying to be set on a copy of a slice from a DataFrame. |
| 7148 | +Try using .loc[row_indexer,col_indexer] = value instead |
| 7149 | +
|
| 7150 | +See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy |
| 7151 | + tmp['c'] = 200 |
| 7152 | +-----tmp------ |
| 7153 | + a b c |
| 7154 | +a 1 4 200 |
| 7155 | +b 3 2 200 |
| 7156 | +-----df------- |
| 7157 | + a b c |
| 7158 | +a 1 4 100.0 |
| 7159 | +b 3 2 100.0 |
| 7160 | +c 5 7 NaN |
| 7161 | +``` |
| 7162 | +
|
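| 7163 | +The code after the `it's ok` line performs a chained assignment and produces the wrong result: tmp changes, but df never receives the value, so here the warning must be heeded. |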
| 7164 | +
|
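One way to confirm the behaviour described above is to compare df before and after the assignment. The following is a minimal sketch, not part of the original example: the `before` snapshot is a name introduced here for illustration, and depending on the pandas version the assignment to `tmp` may or may not print a SettingWithCopyWarning.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 3, 5], 'b': [4, 2, 7]}, index=['a', 'b', 'c'])
before = df.copy()       # snapshot of df (illustrative helper, not in the original)

tmp = df[df.a < 4]       # boolean indexing produces a separate object
tmp['c'] = 200           # adds column c to tmp only

print(df.equals(before))   # True: df was not modified
print('c' in df.columns)   # False: the new column never reached df
print('c' in tmp.columns)  # True: only tmp received it
```

If the intent was to update df, a check like this shows straight away that the assignment went to the wrong object.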
| 7165 | +The code before the `it's ok` line is the correct approach. |
| 7166 | +
|
| 7167 | +In short, avoid chained operations where possible. How? Prefer `.loc[row_indexer,col_indexer]`, exactly as the warning message suggests. |
| 7168 | +
|
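The advice above can be sketched for the two usual intentions. This is an illustrative example under stated assumptions rather than the author's own code: the first case is the `.loc` form recommended by the warning, while the explicit `.copy()` in the second case is an addition not mentioned in the text, useful when an independent subset is what you actually want.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 3, 5], 'b': [4, 2, 7]}, index=['a', 'b', 'c'])

# Case 1: the goal is to modify df itself.
# Select rows and column in a single .loc call on df.
df.loc[df.a < 4, 'c'] = 100

# Case 2: the goal is an independent subset.
# Take an explicit copy first (assumption: df should stay untouched).
tmp = df[df.a < 4].copy()
tmp['c'] = 200           # changes tmp only

print(df)
print(tmp)
```

In the first case the value lands in df; in the second, `tmp` is explicitly a separate object, so it is clear that df is not meant to change.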
7082 | 7169 |
|
7083 | 7170 |
|
7084 | 7171 | ### 11. Mastering Flask Web Development Step by Step |
|